flat assembler
Message board for the users of flat assembler.

The real way to bootstrap fasm with no prior binary
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 19254
Location: In your JS exploiting you and your system
revolution 05 Mar 2023, 04:14
It's ugly and stupid, but it works. Here is a preview of a bash script that creates fasm v1.0 without needing any pre-existing binary.
Code:
#!/bin/bash
function p () { for BYTE in $@ ; do printf "\x$BYTE" >>fasm.com ; done ; }

;...

p                             # 0000    org     100h
p                             # 0100    use16
p                             #      
p                             #      start:
p                             #      
p B4 4A                       #         mov     ah,4Ah
p BB 10 10                    # 0102    mov     bx,1010h
p CD 21                       # 0105    int     21h
p BA A2 7D                    # 0107    mov     dx,_logo
p B4 09                       # 010A    mov     ah,9

;...

p 0F 87 51 CF                 # 3735    ja      invalid_operand_size
p 67 AC                       # 3739    lods    byte [esi]
p 3C 28                       # 373B    cmp     al,'('
p 0F 85 36 CF                 # 373D    jne     invalid_operand
p E8 81 DD                    # 3741    call    get_byte_value
p 88 C4                       # 3744    mov     ah,al
p B0 CD                       # 3746    mov     al,CDh
p 67 AB                       # 3748    stos    word [edi]

;...

p                             # 7E23 real_code_size dd ?
p                             #      
p                             # 7E27 start_time dd ?
p                             # 7E2B written_size dd ?
p                             #      
p                             # 7E2F params rb 100h
p                             # 7F2F buffer rb 4000h
p                             #    
I used another script to create the file automatically. The generator script is this:
Code:
#!/bin/bash

mkdir -p worktemp || exit 1
pushd worktemp || exit 1

unzip -LL -n -q ../fasm10.zip || exit 1

ln -s expressi.inc source/expressions.inc
ln -s preproce.inc source/preprocessor.inc
ln -s assemble.inc source/assembler.inc

cd source || exit 1

cat > make.asm <<PREMACROS || exit 1
irp j,ja,jb,jc,je,jg,jl,jo,jp,js,jz,jae,jbe,jge,jle,jmp,jna,jnb,jnc,jne,jng,jnl,jno,jnp,jns,jnz,jpe,jpo {
        macro j args \\{
                match byte, short \\\\{
                        j args
                \\\\}
        \\}
}
irp v,  Ah,Bh,Ch,Dh,Eh,Fh,\\
        A0h,A1h,A2h,A3h,A4h,A5h,A6h,A7h,A8h,A9h,AAh,ABh,ACh,ADh,AEh,AFh,\\
        B0h,B1h,B2h,B3h,B4h,B5h,B6h,B7h,B8h,B9h,BAh,BBh,BCh,BDh,BEh,BFh,\\
        C0h,C1h,C2h,C3h,C4h,C5h,C6h,C7h,C8h,C9h,CAh,CBh,CCh,CDh,CEh,CFh,\\
        D0h,D1h,D2h,D3h,D4h,D5h,D6h,D7h,D8h,D9h,DAh,DBh,DCh,DDh,DEh,DFh,\\
        E0h,E1h,E2h,E3h,E4h,E5h,E6h,E7h,E8h,E9h,EAh,EBh,ECh,EDh,EEh,EFh,\\
        F0h,F1h,F2h,F3h,F4h,F5h,F6h,F7h,F8h,F9h,FAh,FBh,FCh,FDh,FEh,FFh,\\
        AF0Fh,BA0Fh,D9DEh,E0DFh,FFFFh,FFFFFh,FFFFF0h { v equ 0#v }
include 'fasm.asm'
PREMACROS

fasm -s make.fas make.asm || exit 1
../../listing -a -b 9 make.fas make.dis || exit 1

cat > fasm-make.sh <<PREFIX || exit 1
#!/bin/bash
function p () { for BYTE in \$@ ; do printf "\\x\$BYTE" >>fasm.com ; done ; }
PREFIX
sed 's/.......................\(....\)..\(...........................\)\(.*\)/p \2 # \1 \3/g' < make.dis >> fasm-make.sh || exit 1
bash fasm-make.sh || exit 1
diff fasm.com make.com || exit 1
mv fasm-make.sh ../../ || exit 1
popd || exit 1
rm -r worktemp || exit 1    
Of course you do need a working binary to do the initial generation. You could use a secure VM or something if there is a concern that needs to be addressed. But once you have the generated script it can be used stand-alone on another system with no binary.

You need to have a working copy of the listing generator, and fasm to create the .fas file. You might need to adjust the paths in the script.
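As a rough illustration of what the sed step in the generator does, here is a minimal sketch that rearranges one synthetic listing line into a `p` line. The column layout is an assumption taken from the offsets in the thread's scripts (address at column 24, hex bytes at column 30, source text at column 57). The `FILLER` text and the `to_p_line` name are hypothetical and only pad the line to the right width.

```shell
#!/bin/sh
# Build one synthetic listing line. The first 23 columns and the 2-column
# separator are hypothetical filler; only the column positions matter here.
line=$(printf '%-23s%-4s%-2s%-27s%s' 'FILLER' '0102' ': ' 'BB 10 10' 'mov bx,1010h')

# Same rearrangement as the sed command in the generator, written with
# explicit repeat counts: skip 23, keep 4 (address), skip 2, keep 27 (hex),
# keep the rest (source text).
to_p_line () {
  sed 's/^.\{23\}\(.\{4\}\).\{2\}\(.\{27\}\)\(.*\)$/p \2 # \1 \3/'
}

result=$(printf '%s\n' "$line" | to_p_line)
printf '%s\n' "$result"
```

The hex bytes end up as arguments to `p`, while the address and the assembly source land behind `#`, so the shell ignores them but a reader can still audit them.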

The output is a 646710 byte file called fasm-make.sh; the preview of it is above. Gzipped, it is 100439 bytes.

To not abuse the board with large attachments I'll leave off attaching it for now. If requested I can attach it, but you can create your own instead.
revolution 05 Mar 2023, 04:57
For those that need something other than a DOS bootstrap.
  • The first Win32 console version is 1.04
  • The first Linux version is 1.37
  • The first Win32 GUI is version 1.41
  • The first libc is version 1.64
  • The first Linux x64 version is 1.71.58
And a special mention for those wanting an even older HDOS version: you can start with version 0.90 and work your way up from there.
sylware



Joined: 23 Oct 2020
Posts: 264
Location: Marseille/France
sylware 05 Mar 2023, 16:13
Bootstrapping is a big subject.

The current "elf/linux" stack is a disaster, making "bootstrapping" literally insane.

Now you need a linux-like kernel with a modern gcc toolchain, which you cannot build without a c++11 compiler... yeah... a c++11 compiler. Those guys are seriously toxic for the open source software world, like they want to destroy it.
revolution 05 Mar 2023, 18:02
I'm only bootstrapping fasm here.

It is someone else's problem if they also need to bootstrap their OS from zero.
revolution 06 Mar 2023, 00:20
For compatibility with sh, the "PREFIX" here-document is this:
Code:
#!/bin/sh
p () { for BYTE in \$@ ; do printf \\\\"\$(printf %o 0x\$BYTE)" >>fasm.com ; done ; }    
It is a lot slower because of the extra 31995 sub-processes that need to be started. It still takes only a few seconds though, and it extends availability to systems that only have sh.
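A quick way to convince yourself that the octal trick emits the intended bytes is to run it on a few values and dump the result. This is only a sketch: it writes to a scratch file from `mktemp` instead of fasm.com, and because it is written outside a here-document it needs one less level of escaping than the line above.

```shell
#!/bin/sh
out=$(mktemp)

# Convert each hex byte to octal with an inner printf, then let the outer
# printf turn the \NNN escape into the raw byte.
p () { for BYTE in "$@" ; do printf "\\$(printf %o "0x$BYTE")" >>"$out" ; done ; }

p                 # no arguments: nothing is written (comment-only lines)
p B4 4A           # mov ah,4Ah
p BB 10 10        # mov bx,1010h

# Dump the scratch file as lowercase hex with all whitespace removed.
dump=$(od -An -tx1 "$out" | tr -d ' \n')
rm -f "$out"
echo "$dump"
```

Each byte costs one command substitution, which is where the extra sub-processes come from.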
Furs



Joined: 04 Mar 2016
Posts: 2103
Furs 06 Mar 2023, 14:35
sylware wrote:
Bootstrapping is a big subject.

The current "elf/linux" stack is a disaster, making "bootstrapping" literally insane.

Now you need a linux-like kernel with a modern gcc toolchain, which you cannot build without a c++11 compiler... yeah... a c++11 compiler. Those guys are seriously toxic for the open source software world, like they want to destroy it.
GCC can bootstrap itself, but it will take forever.
sylware 06 Mar 2023, 18:25
The reality is more complex.

Up to gcc 4.7.4, plain and simple C compilers could bootstrap gcc c++98 compilers. But some geniuses on the GNU steering committee changed that and made gcc require a c++98 compiler to build.

And gcc 12.2.0 requires a c++11 compiler...

I guess you get the picture now: inductive vendor-lock-in by complexity and size.

Linux kernel: if you don't have the latest gcc extensions, it won't build...
revolution 06 Mar 2023, 19:31
Furs wrote:
GCC can bootstrap itself, but it will take forever.
Being able to self compile is not bootstrapping IMO.

You need something outside of that thing itself to initially create it.

So for GCC (and C compilers in general also I guess) you need something that is not GCC (and is not a C compiler at all) to make the first stage.
sylware 06 Mar 2023, 20:11
It seems the convention about "The Bootstrap" is the set of binaries (which can build themselves) required to have the minimum runtime and SDK to build a modern system.

The problem with elf/linux on PC is that this set is just insane and disgusting.

Some are trying to clean that up: https://bootstrappable.org but I am pessimistic about it.

If I am not mistaken, they use an assembly-like byte code to write a lisp interpreter (which seems to be orders of magnitude simpler than a C compiler). Then they write a minimal C compiler with this lisp interpreter in order to build tinycc, then gcc 4.7.4 (c++98), then the last gcc you can compile with c++98 (which provides c++11), then gcc 12.2.0.

The interpreter for the byte code is written directly in binary (no assembler) and is specific to a machine ISA; it requires literally a near-zero runtime.

Personally, I don't write assembly for that bootstrappable thingy: I write assembly because mainstream "high level" languages failed at pertinent syntax stability in the long run and their obscene syntax complexity kills most real-life alternative development effort right from the start: they are a planned obsolescence scam tainted with vendor lock-in. Saying otherwise would be bluntly hypocritical and would be negating reality.
revolution 06 Mar 2023, 20:46
In simple terms, if A needs B and B needs A then it isn't a bootstrap IMO. The chain could be of arbitrary length A-B-C-D-A etc. Any time you return back to a tool you started with then you are stuck in the loop forever. If your C compiler needs a C compiler then you need a C compiler. Erm, yeah, good luck with that. :)

bootstrappable.org appears to have the right idea.
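The loop argument above can be sketched as a small chain walker. Everything in it is hypothetical: the `needs` table, the tool names `tool` and `script`, and the simplification that each tool needs exactly one other tool. It only illustrates the rule "if you ever return to a tool you already visited, it is not a bootstrap".

```shell
#!/bin/sh
# Hypothetical dependency table: prints the one tool needed to build "$1",
# or nothing if "$1" needs no prior tool from the table.
needs () {
  case "$1" in
    A) echo B ;;
    B) echo C ;;
    C) echo D ;;
    D) echo A ;;          # closes the A-B-C-D-A loop
    tool) echo script ;;  # e.g. a binary emitted by a plain-text script
    script) echo "" ;;    # end of the chain
  esac
}

# Follow the chain from "$1" and report whether it ever returns to a tool
# that was already visited.
check () {
  seen=" "
  cur=$1
  while [ -n "$cur" ]; do
    case "$seen" in
      *" $cur "*) echo "loop"; return 0 ;;
    esac
    seen="$seen$cur "
    cur=$(needs "$cur")
  done
  echo "bootstrappable"
}

check A      # prints "loop"
check tool   # prints "bootstrappable"
```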
Furs 07 Mar 2023, 14:57
revolution wrote:
Furs wrote:
GCC can bootstrap itself, but it will take forever.
Being able to self compile is not bootstrapping IMO.

You need something outside of that thing itself to initially create it.

So for GCC (and C compilers in general also I guess) you need something that is not GCC (and is not a C compiler at all) to make the first stage.
Yeah. You can bootstrap GCC, starting with the very first version, and progressively build more versions of it. That's why I said it will take ages.
al_Fazline



Joined: 24 Oct 2018
Posts: 54
al_Fazline 07 Mar 2023, 16:29
Regarding the script, I'm not sure it really qualifies as bootstrapping. The bash script in question is more like a fancier version of a hexdump.

Suppose you used base64 of a fasm binary: would that qualify as no prior binary?
revolution 07 Mar 2023, 17:39
A plain hex dump, or base64, isn't auditable. If it is only raw hex then you still need a program on the target system to convert it to binary. I'm not sure there is any way to avoid that, unless you want to toggle switches by hand or something. So it might as well be [ba]sh.

With the script, which also includes all the assembly and labels and things in plain text, any person suitably knowledgeable in x86 encoding can certify the correctness, or at a minimum satisfy themselves that they are getting what they expect.


Last edited by revolution on 07 Mar 2023, 22:10; edited 2 times in total
sylware 07 Mar 2023, 21:52
Yep, and the smaller the better. That's why the binary bootstrap (which is supposed to be minuscule) is a "hexdump" with comments.
revolution 15 Mar 2023, 21:54
For DOS and Windows it is less clear to me how to make a nicely auditable bootstrap.

debug can be used to create a binary file from an input script, but it doesn't support comments. :(
Code:
; in bootfasm.txt:
e100 b4 4a bb 10 10 cd 21 ba a2 7d b4 09 cd 21 fc e8
e110 04 01 e8 8a 01 e8 c7 00 80 3e 2f 7e 00 0f 84 b2
;...
rcx
7cfb
n bootfasm.com
w
q    
Then:
Code:
debug < bootfasm.txt    
That works. But it is just plain hex. How to show it isn't malware by including the accompanying source?
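For reference, the structure of such an input script can be sketched with a small generator: `e` lines of up to 16 bytes starting at 100h, then the length in CX, the file name, and the write and quit commands. The function name `gen_debug_script` and the file name `demo.com` are made up for illustration, and the sketch only prints the text; it never runs DEBUG.

```shell
#!/bin/sh
# Print a DEBUG input script for the given hex bytes.
# $1 = name of the .com file to write, remaining arguments = hex bytes.
gen_debug_script () {
  name=$1
  shift
  count=0
  line=""
  for b in "$@"; do
    # Start a new "e" (enter) line every 16 bytes; the image starts at 100h.
    if [ $((count % 16)) -eq 0 ]; then
      if [ -n "$line" ]; then echo "$line"; fi
      line=$(printf 'e%03x' $((256 + count)))
    fi
    line="$line $b"
    count=$((count + 1))
  done
  if [ -n "$line" ]; then echo "$line"; fi
  echo rcx                 # set CX to the number of bytes to write
  printf '%x\n' "$count"
  echo "n $name"           # name of the output file
  echo w                   # write
  echo q                   # quit
}

script=$(gen_debug_script demo.com b4 4a bb 10 10 cd 21)
printf '%s\n' "$script"
```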
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8132
Location: Kraków, Poland
Tomasz Grysztar 15 Mar 2023, 21:59
revolution wrote:
That works. But it is just plain hex. How to show it isn't malware by including the accompanying source?
Maybe a batch file with interleaved comments and ECHO commands that would generate the bootfasm.txt and finally launch DEBUG?
revolution 15 Mar 2023, 22:22
Like this?
Code:
rem                                     0000    org     100h
rem                                     0100    use16
rem
rem                                          start:
rem
rem                                             mov     ah,4Ah
echo e0100 B4 4A          > bootfasm.txt
rem                                     0102    mov     bx,1010h
echo e0102 BB 10 10      >> bootfasm.txt
rem                                     0105    int     21h
echo e0105 CD 21         >> bootfasm.txt
rem                                     0107    mov     dx,_logo
echo e0107 BA A2 7D      >> bootfasm.txt
rem                                     010A    mov     ah,9
echo e010A B4 09         >> bootfasm.txt
;...
echo rcx                 >> bootfasm.txt
echo 7CFB                >> bootfasm.txt
echo n bootfasm.com      >> bootfasm.txt
echo w                   >> bootfasm.txt
echo q                   >> bootfasm.txt
debug < bootfasm.txt    
The formatting needs some work. But, yes, all the elements are there. :)
revolution 17 Mar 2023, 21:47
So we went with this script using awk. It relies on gensub and strtonum, so it needs GNU awk.
Code:
#!/bin/bash

mkdir -p worktemp || exit 1
pushd worktemp || exit 1

unzip -LL -n -q ../fasm10.zip || exit 1

ln -s expressi.inc source/expressions.inc
ln -s preproce.inc source/preprocessor.inc
ln -s assemble.inc source/assembler.inc

cd source || exit 1

cat > make.asm <<PREMACROS || exit 1
irp j,ja,jb,jc,je,jg,jl,jo,jp,js,jz,jae,jbe,jge,jle,jmp,jna,jnb,\\
    jnc,jne,jng,jnl,jno,jnp,jns,jnz,jpe,jpo,jnae,jnbe,jnge,jnle {
        macro j args \\{
                match byte, short \\\\{
                        j args
                \\\\}
        \\}
}
irp v,  Ah,Bh,Ch,Dh,Eh,Fh,\\
        A0h,A1h,A2h,A3h,A4h,A5h,A6h,A7h,A8h,A9h,AAh,ABh,ACh,ADh,AEh,AFh,\\
        B0h,B1h,B2h,B3h,B4h,B5h,B6h,B7h,B8h,B9h,BAh,BBh,BCh,BDh,BEh,BFh,\\
        C0h,C1h,C2h,C3h,C4h,C5h,C6h,C7h,C8h,C9h,CAh,CBh,CCh,CDh,CEh,CFh,\\
        D0h,D1h,D2h,D3h,D4h,D5h,D6h,D7h,D8h,D9h,DAh,DBh,DCh,DDh,DEh,DFh,\\
        E0h,E1h,E2h,E3h,E4h,E5h,E6h,E7h,E8h,E9h,EAh,EBh,ECh,EDh,EEh,EFh,\\
        F0h,F1h,F2h,F3h,F4h,F5h,F6h,F7h,F8h,F9h,FAh,FBh,FCh,FDh,FEh,FFh,\\
        AF0Fh,BA0Fh,D9DEh,E0DFh,FFFFh,FFFFFh,FFFFF0h { v equ 0#v }
include 'fasm.asm'
PREMACROS

fasm -s make.fas make.asm || exit 1
../../listing -a -b 9 make.fas make.lst || exit 1

awk 'BEGIN{
    print "@echo off                              >fasmmake.txt\r";
  }{
    addr=substr($0,24,4); hex=substr($0,30,26); source=gensub("\r","","g",substr($0,57)); bytes=int(length(gensub(" ","","g",hex))/2);
    eaddr=addr; if (addr == "    ") eaddr=prev_addr;
    prev_addr=sprintf("%04X",strtonum("0x" eaddr)+bytes);
    if (bytes == 0)
      printf "%58s %s %s\r\n", "rem", addr, source;
    else{
      print "echo e" eaddr, hex, ">>fasmmake.txt & rem", addr, source "\r";
      end_addr=prev_addr;}
  }END{
    print  "echo r cx                             >>fasmmake.txt\r";
    printf "echo %04X                             >>fasmmake.txt\r\n", strtonum("0x" end_addr)-0x100;
    print  "echo n fasm10.com                     >>fasmmake.txt\r";
    print  "echo w                                >>fasmmake.txt\r";
    print  "echo q                                >>fasmmake.txt\r";
    print  "debug < fasmmake.txt\r";
  }' < make.lst >> fasmmake.bat || exit 1

mv fasmmake.bat ../../ || exit 1
popd || exit 1
rm -r worktemp || exit 1    
It generates a reasonably readable output IMO.
Code:
;...
echo e010A B4 09                      >>fasmmake.txt & rem 010A         mov     ah,9
echo e010C CD 21                      >>fasmmake.txt & rem 010C         int     21h
                                                       rem      
echo e010E FC                         >>fasmmake.txt & rem 010E         cld
                                                       rem      
echo e010F E8 04 01                   >>fasmmake.txt & rem 010F         call    init_flatrm
echo e0112 E8 8A 01                   >>fasmmake.txt & rem 0112         call    init_memory
                                                       rem      
echo e0115 E8 C7 00                   >>fasmmake.txt & rem 0115         call    get_params
echo e0118 80 3E 2F 7E 00             >>fasmmake.txt & rem 0118         cmp     [params],0
echo e011D 0F 84 B2 00                >>fasmmake.txt & rem 011D         je      information
echo e0121 66 8D 06 30 7E             >>fasmmake.txt & rem 0121         lea     eax,[params+1]
;...
                                                       rem 7E2F params rb 100h
                                                       rem 7F2F buffer rb 4000h
                                                       rem      
                                                       rem      
echo r cx                             >>fasmmake.txt
echo 7CFB                             >>fasmmake.txt
echo n fasm10.com                     >>fasmmake.txt
echo w                                >>fasmmake.txt
echo q                                >>fasmmake.txt
debug < fasmmake.txt    
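The address bookkeeping in the awk script boils down to two bits of hex arithmetic: the next address is the current one plus the number of bytes emitted, and the value given to `r cx` is the end address minus 100h. A minimal sketch, using values from the output above; `next_addr` is a hypothetical helper name, and the end address 7DFB is inferred from 7CFB + 100h rather than shown in the listing.

```shell
#!/bin/sh
# Next address = current hex address ($1) plus the number of bytes ($2).
next_addr () { printf '%04X\n' $(( 0x$1 + $2 )) ; }

a1=$(next_addr 010A 2)   # B4 09    -> next instruction at 010C
a2=$(next_addr 010F 3)   # E8 04 01 -> next instruction at 0112

# File length for "r cx": end address minus the 100h load offset (org 100h).
len=$(printf '%04X' $(( 0x7DFB - 0x100 )))

echo "$a1 $a2 $len"
```

An auditor can run the same arithmetic down the whole file to check that every address in the rem column agrees with the byte counts before it.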
Now to submit and see if we can get this across the hatchway.
Copyright © 1999-2023, Tomasz Grysztar.