flat assembler
Message board for the users of flat assembler.
PSR Invaders 1.1
rugxulo 08 Feb 2017, 09:30
Ever played PSR Invaders 1.1 (circa 1995)? invadr11.zip (27 kb) mirror #2
In 2004, I manually (but sloppily) converted it to work with free assemblers. But this new script is much simpler, smaller, and more accurate than the full quirky, modified source. Tested with either GNU sed or Cheap sed.
EDIT: Minor .BAT cleanups, now builds in Windows, too.
EDIT#2: Minor simplifications.
Code:

@echo off
::#--- fix2.sed begins ---
:: 1i\
:: OFFSET equ\
:: Offset equ\
:: Ptr equ\
:: LEA equ MOV
:: /^;/b
:: /CODE_SEG/d
:: /END /d
:: / ENDP/d
:: s/40://
:: s/\]\[/+/
:: s/\[0\]//
:: s/\][+]BX/+BX/
:: s/ES:\[/[ES:/
:: s/[ ][ ]*PROC .*/:/
:: s/\[\[/[/
::#--- fix2.sed ends ---
::#--- fix3.sed begins ---
:: /RemoveNewInt9:/,/ RET/s/OldInt9Addr/cs:&/
:: /NewInt9Handler:/,/NotIntercept:/s/\[/[cs:/
:: /NotIntercept:/,/CLC/s/StoreAX/cs:&/
::#--- fix3.sed ends ---
::#--- vars.sed begins ---
:: /^;/b
:: / D[BWD] /b
:: /,OFFSET/b
:: /LEA /b
::#--- vars.sed ends ---
if not exist invaders.asm goto end
if not exist %0 %0.bat %1
if "%SED%"=="" set SED=sed
echo %%SED%% = '%SED%'
set B1=begins ---
set E1=ends ---
set S1=fix2.sed fix3.sed vars.sed
for %%a in (%S1%) do %SED% -n -e "/%%a %B1%/,/ %E1%$/s/^::[ ][ ]*//w %%a" %0
for %%z in (S1 E1 B1) do set %%z=
for %%a in (fix2 fix3 vars) do if not exist %%a.sed goto end
set I0=invaders.asm
%SED% -n -e "/ D[BWD] /s@^\([^ ][^ ]*\).*@s/\\<\1\\>/[\&]/@p" %I0%>>vars.sed
%SED% -f vars.sed %I0% | %SED% -f fix2.sed -f fix3.sed >inv-fasm.asm
set I0=
REM cwsdpmi
fasm inv-fasm.asm inv-fasm.com >NUL
if not exist inv-fasm.com goto end
echo.
echo INV-FASM.COM FFF22EF9
crc32 inv-fasm.com
echo.
if "%1"=="notclean" goto end
del vars.sed >NUL
del fix?.sed >NUL
del inv-fasm.asm >NUL
:end
if "%SED%"=="sed" set SED=

Since I'm also sometimes using antiX Linux, which comes with DOSBox, I also whipped up a quick makefile in order to cross-build.
EDIT: Very minor makefile cleanups.
EDIT#2: Minor simplifications.
Code:

# GNUmakefile
.RECIPEPREFIX := _
#=== fix2.sed begins ===
# 1i\
# OFFSET equ\
# Offset equ\
# Ptr equ\
# LEA equ MOV
# /^;/b
# /CODE_SEG/d
# /END /d
# / ENDP/d
# s/40://
# s/\]\[/+/
# s/\[0\]//
# s/\][+]BX/+BX/
# s/ES:\[/[ES:/
# s/[ ][ ]*PROC .*/:/
# s/\[\[/[/
#=== fix2.sed ends ===
#=== fix3.sed begins ===
# /RemoveNewInt9:/,/ RET/s/OldInt9Addr/cs:&/
# /NewInt9Handler:/,/NotIntercept:/s/\[/[cs:/
# /NotIntercept:/,/CLC/s/StoreAX/cs:&/
#=== fix3.sed ends ===
.PHONY: all check clean cleanall
PROG=inv-fasm
GAMEZIP=invadr11.zip
OLDASM=INVADERS.ASM
SED=sed
FASM=fasm
MD5SUM=md5sum
WGET=wget
WGETOPT=-q
UNZIPPER=unzip
UNZIPFLAGS=-qjan
#GAMEURL=ftp.lanet.lv/ftp/mirror/x2ftp/msdos/programming/gamesrc/
GAMEURL=www.ibiblio.org/pub/micro/pc-stuff/freedos/files/games/invaders/
all: $(PROG).com check
$(PROG).com: $(PROG).asm
_@$(FASM) $< $@
fix2.sed fix3.sed: $(lastword $(MAKEFILE_LIST))
_@$(SED) -n -e '/$@ begins ===/,/$@ ends ===/s/^#[ ][ ]*//w $@' $<
asmvars.sed: $(OLDASM)
_@$(SED) -n -e 's|^\([^ ][^ ]*\)[ ][ ]*D[BWD] ..*|s/\\<\1\\>/[\&]/|p' $^ >$@
$(PROG).asm: $(OLDASM) asmvars.sed fix2.sed fix3.sed
_@$(SED) -e '/^;/b' -e '/ D[BWD] /b' -e '/,OFFSET/b' -e '/LEA /b'\
 -f asmvars.sed $< | $(SED) -f fix2.sed -f fix3.sed >$@
$(GAMEZIP):
_@$(WGET) $(WGETOPT) $(GAMEURL)$(GAMEZIP)
$(OLDASM): $(GAMEZIP)
_@$(UNZIPPER) $(UNZIPFLAGS) $< INVADERS/$@ >/dev/null
check: $(PROG).com
_@$(MD5SUM) $<
_@echo 5d6fa26af01606feb90f17e014390139 \ $<
clean:
_@$(RM) $(PROG).asm fix?.sed asmvars.sed
cleanall: clean
_@$(RM) $(PROG).com $(OLDASM)
# EOF

Last edited by rugxulo on 20 Feb 2018, 04:55; edited 3 times in total
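The vars.sed / asmvars.sed step is a sed script that writes another sed script: every "label DB/DW/DD ..." line in the source becomes one substitution that wraps later uses of that label in brackets. A minimal sketch of just that step, assuming GNU sed (for "\<" and "\>") and a few made-up input lines; Score, Lives, gen_rules and bracket_vars are hypothetical names, not taken from INVADERS.ASM:

```shell
#!/bin/sh
# Hypothetical TASM-style input: two data definitions and two uses.
sample='Score DW 0
Lives DB 3
    MOV AX,Score
    MOV BL,Lives'

# Step 1: for each "label DB/DW/DD value" line, emit the rule  s/\<label\>/[&]/
gen_rules() {
    sed -n -e 's|^\([^ ][^ ]*\)[ ][ ]*D[BWD] ..*|s/\\<\1\\>/[\&]/|p'
}

# Step 2: apply the generated rules, skipping the definition lines themselves.
bracket_vars() {
    rules=$(printf '%s\n' "$1" | gen_rules)
    printf '%s\n' "$1" | sed -e '/ D[BWD] /b' -e "$rules"
}

bracket_vars "$sample"
```

The definitions pass through untouched, while the two MOV lines come out with [Score] and [Lives], which is the memory-operand form fasm expects.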
rugxulo 07 Feb 2018, 21:34
I thought GNU sed and Cheap sed would be good enough for everyone (not that anyone complained). Both are GPL with ports to DOS, Windows, Linux, et al. But apparently "\<" "\>" is not truly portable nor standardized (yet??).
Cheap sed (2004) is a modified version of HHsed (1991), which was based upon Eric Raymond's sed. Another improved derivative of his by Rene Rebe is called minised (2014, BSD), which lacks the non-standard feature mentioned above. So, in the interest of perfection, I toyed with the idea of making a working script that didn't need the non-standard kludge. (I already did the same for NASM but instead using its preprocessor.) Just for completeness ....
P.S. Happy 40th anniversary, Space Invaders!
INV-FAS2.BAT
Code:

@echo off
::#--- fix2.sed begins ---
:: 1i\
:: OFFSET equ\
:: Offset equ\
:: Ptr equ\
:: LEA equ MOV
:: /^;/b
:: /CODE_SEG/d
:: /END /d
:: / ENDP/d
:: s/[ ][ ]*PROC .*/:/
:: s/ES:\[/[ES:/
:: s/ DD / DW 0,/
:: s/\[0\]$//
:: s/40://
:: s/,,,*/,/
:: s/,\[\([0-9]\)\],\(\[.*\)\]/,\2+\1]/
:: s/\],\[/+/
:: /INC /s/\[\([0-9]\)\]/+\1/
:: /[+]BX/!s/\([ID].C\)[ ][ ]*\([^ ][^ ][^ ][^ ]*\)/\1 [\2]/
::#--- fix2.sed ends ---
::#--- fix3.sed begins ---
:: /RemoveNewInt9:/,/ RET/s/OldInt9Addr/cs:&/
:: /NewInt9Handler:/,/NotIntercept:/s/\[/[cs:/
:: /NotIntercept:/,/CLC/s/StoreAX/cs:&/
::#--- fix3.sed ends ---
::#--- vars1.sed begins ---
:: /^;/b
:: / D[BWD] /b
:: /,OFFSET/b
:: /LEA /b
:: s/ *;.*$//
:: s/Word Ptr //
:: s/\[[0-9]\],/,&/
:: s/\(,.*\)\(\[[0-9]\]\)/,\2\1/
::#--- vars1.sed ends ---
set INV=invaders.asm
if not exist %INV% goto end
if not exist %0 %0.bat %1
if "%SED%"=="" set SED=minised
echo %%SED%% = '%SED%'
set B1=begins ---
set E1=ends ---
set S1=fix2.sed fix3.sed vars1.sed
for %%a in (%S1%) do %SED% -n -e "/%%a %B1%/,/ %E1%$/s/^::[ ][ ]*//w %%a" %0
for %%z in (S1 E1 B1) do set %%z=
for %%a in (fix2 fix3 vars1) do if not exist %%a.sed goto end
set V1=vars1.sed
set V2=vars2.sed
set F2=fix2.sed
set F3=fix3.sed
%SED% -n -e "/ D[BWD] /s|^\([^ ][^ ]*\).*|s/,\\(\1\\)$/,[\\1]/|p" %INV%>>%V1%
%SED% -e "s|^s/,|s/ |" -e "s|\$/,\[\\1\]/|,/[\\1],/|" %V1% >%V2%
%SED% -f %V1% %INV% | %SED% -f %V2% | %SED% -f %F2% -f %F3% >inv-fasm.asm
for %%z in (V1 V2 F2 F3) do set %%z=
REM cwsdpmi
fasm inv-fasm.asm inv-fasm.com >NUL
if not exist inv-fasm.com goto end
echo.
echo INV-FASM.COM FFF22EF9
crc32 inv-fasm.com
echo.
if "%1"=="notclean" goto end
del vars?.sed >NUL
del fix?.sed >NUL
del inv-fasm.asm >NUL
:end
set INV=
if "%SED%"=="minised" set SED=

FASM2.MAK
Code:

# GNUmakefile
.NOTPARALLEL:
.RECIPEPREFIX := _
#=== fix2.sed begins ===
# 1i\
# OFFSET equ\
# Offset equ\
# Ptr equ\
# LEA equ MOV
# /^;/b
# /CODE_SEG/d
# /END /d
# / ENDP/d
# s/[ ][ ]*PROC .*/:/
# s/ES:\[/[ES:/
# s/ DD / DW 0,/
# s/\[0\]$//
# s/40://
# s/,,,*/,/
# s/,\[\([0-9]\)\],\(\[.*\)\]/,\2+\1]/
# s/\],\[/+/
# /INC /s/\[\([0-9]\)\]/+\1/
# /[+]BX/!s/\([ID].C\)[ ][ ]*\([^ ][^ ][^ ][^ ]*\)/\1 [\2]/
#=== fix2.sed ends ===
#=== fix3.sed begins ===
# /RemoveNewInt9:/,/ RET/s/OldInt9Addr/cs:&/
# /NewInt9Handler:/,/NotIntercept:/s/\[/[cs:/
# /NotIntercept:/,/CLC/s/StoreAX/cs:&/
#=== fix3.sed ends ===
#=== vars0.sed begins ===
# /^;/b
# / D[BWD] /b
# /,OFFSET/b
# /LEA /b
# s/ *;.*$//
# s/Word Ptr //
# s/\[[0-9]\],/,&/
# s/\(,.*\)\(\[[0-9]\]\)/,\2\1/
#=== vars0.sed ends ===
.PHONY: all check clean cleanall
#http://exactcode.com/opensource/minised/
#http://dl.exactcode.de/oss/minised/minised-1.15.tar.gz
SED=minised
PROG=inv-fasm
GAMEZIP=invadr11.zip
OLDASM=INVADERS.ASM
FASM=fasm
MD5SUM=md5sum
WGET=wget
WGETOPT=-q
UNZIPPER=unzip
UNZIPFLAGS=-qjan
#GAMEURL=ftp.lanet.lv/ftp/mirror/x2ftp/msdos/programming/gamesrc/
GAMEURL=www.ibiblio.org/pub/micro/pc-stuff/freedos/files/games/invaders/
all: $(PROG).com check
$(PROG).com: $(PROG).asm
_$(FASM) $< $@
fix2.sed fix3.sed vars0.sed: $(lastword $(MAKEFILE_LIST))
_@$(SED) -n -e '/$@ begins ===/,/$@ ends ===/s/^#[ ][ ]*//w $@' $<
vars1.sed: $(OLDASM)
_@$(SED) -n -e '/ D[BWD] /s|^\([^ ][^ ]*\).*|s/,\\(\1\\)$$/,[\\1]/|p' $< >$@
vars2.sed: vars1.sed
_@$(SED) -e 's|^s/,|s/ |' -e 's|\$$/,\[\\1\]/|,/[\\1],/|' $< >$@
$(PROG).asm: vars0.sed vars1.sed vars2.sed fix2.sed fix3.sed $(OLDASM)
_$(SED) -f $(word 1,$^) -f $(word 2,$^) $(lastword $^)\
 | $(SED) -f $(word 1,$^) -f $(word 3,$^)\
 | $(SED) -f $(word 4,$^) -f $(word 5,$^) >$@
$(GAMEZIP):
_@$(WGET) $(WGETOPT) $(GAMEURL)$(GAMEZIP)
$(OLDASM): $(GAMEZIP)
_@$(UNZIPPER) $(UNZIPFLAGS) $< INVADERS/$@ >/dev/null
check: $(PROG).com
_@$(MD5SUM) $<
_@echo 5d6fa26af01606feb90f17e014390139 \ $<
clean:
_@$(RM) $(PROG).asm vars?.sed fix?.sed
cleanall: clean
_@$(RM) $(PROG).com $(OLDASM)
# EOF

Last edited by rugxulo on 16 Feb 2018, 20:12; edited 1 time in total
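Both the .BAT and the makefile carry their sed scripts inside their own comment lines and pull them out with a range address plus the "w" flag of the "s" command. A small sketch of that extraction on its own, assuming a throwaway temp directory and a made-up host file; host.txt, demo.sed and the two rules inside it are hypothetical:

```shell
#!/bin/sh
# Work in a scratch directory that is removed when the shell exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A host file with an embedded sed script between "begins" and "ends" markers.
cat >"$work/host.txt" <<'EOF'
echo this line is not part of the embedded script
::#--- demo.sed begins ---
:: s/foo/bar/
:: /^;/d
::#--- demo.sed ends ---
echo neither is this one
EOF

# Inside the marker range, strip the ":: " prefix and write out only the lines
# where that substitution succeeded. The marker lines have no space after "::",
# so the substitution fails on them and they are not written.
sed -n -e '/demo\.sed begins ---/,/ ends ---$/s/^::[ ][ ]*//w '"$work/demo.sed" "$work/host.txt"

cat "$work/demo.sed"
```

The extracted file can then be used like any other script, for example with sed -f "$work/demo.sed".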
Furs 13 Feb 2018, 13:26
Reading about all these limitations of various sed versions makes me understand why autoconf is such a nightmare, I suppose. (I've only used GNU sed)
rugxulo 16 Feb 2018, 19:37
It was all in the spirit of minimalism, keeping things small with few dependencies, doing so as simply and portably as possible.
I guess I could've used Devore's NOMYSO (Perl), but Perl is much heavier, and the DJGPP port is abandoned (stuck at old 5.8.8). GNU sed still gets ports to DJGPP (barely), but it looks annoying to rebuild (haven't tried). Also, it's bloated. But it does work. But so does Cheap Sed.
Latest GNU sed 4.4 (32-bit, DJGPP 2.05) is 291 kb (or 140 kb UPX'd).
Old GNU sed 4.2.2 (32-bit, DJGPP 2.03p2) is 220 kb (or 105 kb UPX'd).
Cheap sed (16-bit, 2004) is 26 kb (or 15 kb UPX'd).
I was naive and forgot that "\<" wasn't standard. I guess that's not BRE, only ERE? Introduced in ex/vi, presumably (while sed originally came from ed).
It just seems silly, to me, to require GNU sed when *BSD sed is 90% compatible already. Nobody wants to require two different seds just because software is too stupid to be compatible. It's like relying on both Perl and Python, or multiple assemblers. Sure, some projects do it, but it's "bad". Redundancy is bad.
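The INV-FAS2 scripts get by without "\<" and "\>" by anchoring each generated rule on what surrounds an operand: a comma in front and end of line behind, or a space in front and a comma behind. A minimal sketch of that pair of rules for one made-up variable, using only plain basic regular expressions; Score, ScoreMax and bracket_score are hypothetical:

```shell
#!/bin/sh
# Bracket the variable "Score" without relying on \< or \>.
# Rule 1 handles it as the last operand (",Score" at end of line),
# rule 2 handles it as the first operand (" Score," after the mnemonic).
bracket_score() {
    sed -e 's/,\(Score\)$/,[\1]/' \
        -e 's/ \(Score\),/ [\1],/'
}

printf '%s\n' ' MOV AX,Score' ' MOV Score,AX' ' MOV AX,ScoreMax' | bracket_score
```

The third line is left alone because "Score" is neither followed by end of line nor by a comma there. The end-of-line anchor only works once trailing comments are gone, which is presumably why vars1.sed strips them first.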
rugxulo 05 Apr 2019, 19:03
Okay, so I wrote a much-simplified sed script for this. It's much cleaner and smaller. FYI, neither Minised nor old SEDMOD liked the "\{2,\}" construct, so I avoided that here. (BTW, these scripts are all p.d. or MIT or whatever, I don't care. Just in case that wasn't obvious.)
EDIT: Very minor simplifications.
EDIT: A few more minor simplifications.
Code:

@echo off
::#--- fix.sed begins ---
:: 1i\
:: OFFSET equ\
:: Offset equ\
:: LEA equ MOV
:: /^;/b
:: / D[BW] /b
:: /LEA /b
:: s/\[0\]//
:: /,O[fF]/b
:: /CODE_SEG/d
:: /END /d
:: / ENDP/d
:: s/ DD / DW 0,/
:: s/Word Ptr //
:: s/ *PROC .*/:/
:: s/ES:\[/[ES:/
:: s/40://
:: s/\[\([0-9]\)\]/_\1/
:: s/\([A-Z][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9_]*\),/[\1],/
:: s/,\([A-Z][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9_]*\)/,[\1]/
:: s/\([ID].C\)[ ][ ]*\([A-Z][^ ][^ ][^ ]*\)/\1 [\2]/
:: /_[0-9]\]/s/_/+/
:: /RemoveNewInt9:/,/CLC/s/\[\([^0]\)/[cs:\1/
::#--- fix.sed ends ---
if not exist invaders.asm goto end
if not exist %0 %0.bat %1
if "%SED%"=="" set SED=minised
echo %%SED%% = '%SED%'
%SED% -n -e "/fix\.sed begins ---/,/ ends ---$/s/^::[ ][ ]*//w fix.sed" %0
if not exist fix.sed goto end
%SED% -f fix.sed invaders.asm >inv-fasm.asm
%DPMION%
fasm inv-fasm.asm inv-fasm.com >NUL
%DPMIOFF%
if not exist inv-fasm.com goto end
echo.
echo INV-FASM.COM FFF22EF9
crc32 inv-fasm.com
echo.
if "%1"=="notclean" goto end
del fix.sed >NUL
del inv-fasm.asm >NUL
:end
if "%SED%"=="minised" set SED=

But I'm still thinking (too hard?) about other possible solutions. I'm still not fully satisfied.
Last edited by rugxulo on 28 Jul 2019, 03:02; edited 2 times in total
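Where "\{2,\}" is not available, a minimum repeat count can be spelled out by repeating the atom and ending with a starred copy, as fix.sed does for identifiers. A short sketch of both forms, assuming made-up operands; bracket_src and collapse_spaces are hypothetical helper names:

```shell
#!/bin/sh
# "Uppercase letter, then at least two more identifier characters", written
# without interval expressions. Two-letter register names such as BX are too
# short to match, so only the longer name gets bracketed.
bracket_src() {
    sed -e 's/,\([A-Z][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9_]*\)/,[\1]/'
}

# Same idea for whitespace: "[ ][ ][ ]*" means two or more spaces,
# i.e. what "[ ]\{2,\}" would say.
collapse_spaces() {
    sed -e 's/[ ][ ][ ]*/ /g'
}

printf '%s\n' ' MOV AX,BX' ' MOV AX,Score' | bracket_src
printf '%s\n' 'a   b  c d' | collapse_spaces
```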
Tomasz Grysztar 05 Apr 2019, 19:22
rugxulo wrote: But I'm still thinking (too hard?) about other possible solutions. I'm still not fully satisfied.
I think I may attempt this in my spare time. This could be another possible niche for fasmg: allowing legacy sources to be assembled with an adaptive tool instead of altering the original text. Note that with the help of fasmg's linear polynomials and metadata features, even things like ASSUME could be supported in full (with the right segment prefix being chosen automatically depending on where the label is defined).
Tomasz Grysztar 05 Apr 2019, 21:13
I managed to get INVADERS.ASM to assemble with fasmg in unmodified form. This is a quick and dirty set of headers that does the job (save it as LEGACY.ASH):
Code:

include 'cpu/80186.inc'

; use an added term to distinguish labels from absolute values:
element CODE
macro ORG? address
    org CODE + address
end macro
ORG 0

; ignore some constructions for now:
struc SEGMENT?
end struc
struc ENDS?
end struc
macro END?.BEGIN
end macro

; register ASSUMEs, though emulation is currently limited to insertion of CS prefix under DS:NOTHING setting
macro ASSUME? statement&
    iterate each, statement
        match reg:seg, each
            define ASSUMED?.reg? seg
        end match
    end iterate
end macro

; very limited PROC emulation, just to get things going:
struc (name) PROC? type
    match =NEAR?, type
        name:
    else match =FAR?, type
        name:
    else
        err 'unsupported PROC type'
    end match
    struc (name2) ENDP?
        match =name, name2
            restruc ENDP?
        else
            err 'unexpected ENDP'
        end match
    end struc
end struc

; parse legacy operand syntax:
macro x86.parse_operand ns,op
    ns.size = 0
    match =OFFSET? value, op
        x86.parse_legacy_address ns,value
        if ns.type = 'mem'
            ns.type = 'imm'
            ns.imm = ns.address
            ns.size = 0
            ns.displacement_size = 0
        end if
    else match sz =PTR? value, op
        ns.size = sz
        x86.parse_legacy_address ns,value
    else
        x86.parse_legacy_address ns,op
    end match
    if ns.type = 'imm'
        ns.displacement_size = 0
        if ns.imm eq 1 elementof ns.imm
            if 1 metadataof (1 metadataof ns.imm) relativeto x86.reg
                ns.type = 'reg'
                ns.mod = 11b
                ns.rm = 1 metadataof ns.imm - 1 elementof (1 metadataof ns.imm)
                if ns.size & ns.size <> 1 metadataof (1 metadataof ns.imm) - x86.reg
                    err 'operand sizes do not match'
                else
                    ns.size = 1 metadataof (1 metadataof ns.imm) - x86.reg
                end if
            else if 1 metadataof ns.imm relativeto x86.sreg
                ns.type = 'sreg'
                ns.rm = 1 metadataof ns.imm - x86.sreg
                if ns.size & ns.size <> 2
                    err 'operand sizes do not match'
                else
                    ns.size = 2
                end if
            end if
        end if
    end if
end macro

macro x86.parse_legacy_address ns,op
    ns.segment_prefix = 0
    local buffer,prefix
    buffer equ op
    define prefix
    match seg:offs, buffer
        if ~ seg relativeto 0 ; ignore numeric segment prefix
            redefine prefix seg:
        end if
        redefine buffer offs
    else match =Nothing?, ASSUMED?.DS
        redefine prefix CS:
    end match
    match base[add], buffer
        ns.type = 'mem'
        if elementsof(base+add) > elementsof(base+add-CODE)
            x86.parse_address ns,prefix base+add-CODE
        else
            x86.parse_address ns,prefix base+add
        end if
    else match [add], buffer
        ns.type = 'mem'
        if elementsof(add) > elementsof(add-CODE)
            x86.parse_address ns,prefix add-CODE
        else
            x86.parse_address ns,prefix add
        end if
    else
        if elementsof(buffer) > elementsof(buffer-CODE)
            ns.type = 'mem'
            x86.parse_address ns,prefix buffer-CODE
        else
            ns.imm = buffer
            ns.type = 'imm'
            ns.displacement_size = 0
        end if
    end match
end macro

macro x86.parse_jump_operand ns,op
    match =far? dest, op
        x86.parse_operand_value ns,dest
        ns.jump_type = 'far'
    else match =near? dest, op
        x86.parse_operand_value ns,dest
        ns.jump_type = 'near'
    else match =short? dest, op
        x86.parse_operand_value ns,dest
        ns.jump_type = 'short'
    else
        x86.parse_operand_value ns,op
        ns.jump_type = ''
    end match
end macro

Then assemble on Windows with a command like:
Code:

fasmg -iInclude('legacy.ash') invaders.asm invaders.com

Code:

fasmg -i include\ \'legacy.ash\' invaders.asm invaders.com

Much more work would be needed to get segment definitions and the ASSUME to work correctly in general, but at least I managed to quickly demonstrate that this approach with fasmg is possible. And this prototype already hints at methods needed to provide a broader solution.
Tomasz Grysztar 06 Apr 2019, 08:45
For a little more compatibility we also need a LEA to MOV optimization, which TASM did (and INVADERS.ASM seems to be using it a lot). For a 16-bit instruction set, a modified version of the LEA macro that does it is:
Code:

macro lea? dest*,src*
    x86.parse_operand @dest,dest
    x86.parse_operand @src,src
    if @dest.size <> 0 & @dest.size <> 2
        err 'invalid operand size'
    end if
    if @src.type = 'mem' & @dest.type = 'reg'
        if @src.address_registers eq 0
            db 0B8h + @dest.rm ; optimize to MOV
            dw @src.address
        else
            x86.store_instruction 8Dh,@src,@dest.rm
        end if
    else
        err 'invalid combination of operands'
    end if
end macro

But there is also a simpler option of redefining LEA on top of existing instruction handlers, which should be universal:
Code:

macro lea? dest*,src*
    x86.parse_operand @src,src
    if @src.type = 'mem' & @src.address_registers eq 0
        mov dest,@src.address
    else
        lea dest,src
    end if
end macro
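For a plain 16-bit address with no base or index register, the MOV form is one byte shorter than LEA, which is what the "db 0B8h + @dest.rm / dw @src.address" pair emits. A small sketch that prints both encodings as hex, assuming the standard 16-bit register numbers (AX=0 ... BX=3 ... DI=7); the function names and the address 0x1234 are made up for illustration:

```shell
#!/bin/sh
# MOV r16, imm16  ->  B8+r, low byte, high byte            (3 bytes)
mov_r16_imm16() {
    printf '%02X %02X %02X\n' $((0xB8 + $1)) $(($2 & 0xFF)) $((($2 >> 8) & 0xFF))
}

# LEA r16, [disp16]  ->  8D, ModRM (mod=00, reg=r, r/m=110), low, high  (4 bytes)
lea_r16_disp16() {
    printf '%02X %02X %02X %02X\n' $((0x8D)) $((($1 << 3) | 6)) $(($2 & 0xFF)) $((($2 >> 8) & 0xFF))
}

mov_r16_imm16 3 0x1234    # BX as destination
lea_r16_disp16 3 0x1234
```

Both instructions leave the same value in the register, so the shorter one is a safe substitute whenever the address expression contains no registers.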
Tomasz Grysztar 06 Apr 2019, 17:50
Encouraged by the first success, I decided to try it with a different source from the same era, one that holds more sentimental value for me - StarPort BBS Intro II (fcsp2src.zip) by Future Crew (1993).
This one required many more tricks, including emulation of @@-prefixed locals. But I finally got it to assemble with this updated LEGACY.ASH:
Code:

include 'cpu/80386.inc'

; use an added term to distinguish labels from absolute values:
element CODE
macro ORG? address
    org CODE + address
end macro
ORG 0

; register ASSUMEs, though emulation is currently limited to insertion of CS prefix under DS:NOTHING setting
macro ASSUME? statement&
    iterate each, statement
        match reg:seg, each
            define ASSUMED?.reg? seg
        end match
    end iterate
end macro

; just an empty shell of SEGMENT:
struc (name) SEGMENT? attr
    struc (name2) ENDS?
        match =name, name2
            restruc ENDS?
        else
            err 'mismatched ENDS, expected one for ',`name
        end match
    end struc
end struc

; use ALIGN variant that ignores variable bases:
macro ALIGN? pow2*,value:?
    db (-($ scale 0)) and (pow2-1) dup value
end macro

; there is one DB statement that uses a wild nested DUP syntax, we need to catch it and convert to something fasm understands:
macro DB? definitions&
    match N =dup(M =dup(value)), definitions
        DB N*M dup (value)
    else
        DB definitions
    end match
end macro

; handle OFFSET in DW definitions with a simple trick:
macro DW? definitions&
    define OFFSET? -CODE+ ; could also be "0 scaleof"
    DW definitions
    restore OFFSET?
end macro
struc DW? definitions&
    label . : WORD
    DW definitions
end struc

; several constructions have a very straightforward emulation:
struc MACRO? definition&
    purge . ; disable forward-referencing
    macro . definition
end struc
macro ENDM?!
    esc end macro
end macro
macro IFDEF?! symbol
    if defined symbol
end macro
macro ENDIF?!
    end if
end macro
macro REPT? count
    macro ENDM?!
        end repeat
        purge ENDM?
    end macro
    repeat count
end macro
struc LABEL? size
    label . : size
end struc

; a convoluted implementation of TASM's @@-prefixed local labels:
macro __CMD
end macro
struc (name) ? definition& ; this intercepts all labels (including unknown instructions)
    match , definition
        display 'ignoring unknown directive ',`name,13,10
    else if `name and 0FFFFh = '@@'
        REGISTER_LOCAL name ; this should cause the name to be defined in PROC as a symbolic link
        match true_symbol, name ; use MATCH to extract linked symbol
            macro __CMD
                purge __CMD
                true_symbol definition
            end macro
        end match
    else
        if $ eq CODE + 100h
            macro END?.name ; allow this label to be used as an entry point with END directive
                macro ?! line&
                end macro
            end macro
        end if
        macro __CMD
            purge __CMD
            name definition
        end macro
    end if
    __CMD
end struc

macro START_LOCALS namespace
    namespace.LOCALS
    ; the namespace.LOCALS macro is constructed to contain definitions like:
    ;   define @@1 namespace.@@1
    macro LOCALS_BUILDER
        esc macro namespace.LOCALS
    end macro
    macro REGISTER_LOCAL label
        macro LOCALS_BUILDER
            LOCALS_BUILDER
            define label namespace.label
        end macro
    end macro
    macro END_LOCALS
        LOCALS_BUILDER
        esc end macro
    end macro
end macro

struc (name) PROC? type
    match =NEAR?, type
        name:
    else match =FAR?, type
        name:
    else
        err 'unsupported PROC type'
    end match
    START_LOCALS name ; define symbolic links
    struc (name2) ENDP?
        match =name, name2
            restruc ENDP?
            END_LOCALS
        else
            err 'mismatched ENDP, expected one for ',`name
        end match
    end struc
end struc

; parse legacy operand syntax:
macro x86.parse_operand ns,op
    ns.size = 0
    ns.segment_prefix = 0
    ns.prefix = 0
    ns.opcode_prefix = 0
    match =OFFSET? value, op
        x86.parse_legacy_address ns,value
        if ns.type = 'mem'
            ns.type = 'imm'
            ns.imm = ns.address
            ns.size = 0
            ns.displacement_size = 0
        end if
    else match sz =PTR? value, op
        ns.size = sz
        x86.parse_legacy_address ns,value
    else
        x86.parse_legacy_address ns,op
    end match
    if ns.type = 'imm'
        ns.segment_prefix = 0
        ns.displacement_size = 0
        if ns.imm eq 1 elementof ns.imm
            if 1 metadataof (1 metadataof ns.imm) relativeto x86.reg
                ns.type = 'reg'
                ns.mod = 11b
                ns.rm = 1 metadataof ns.imm - 1 elementof (1 metadataof ns.imm)
                if ns.size & ns.size <> 1 metadataof (1 metadataof ns.imm) - x86.reg
                    err 'operand sizes do not match'
                else
                    ns.size = 1 metadataof (1 metadataof ns.imm) - x86.reg
                end if
            else if 1 metadataof ns.imm relativeto x86.sreg
                ns.type = 'sreg'
                ns.rm = 1 metadataof ns.imm - x86.sreg
                if ns.size & ns.size <> 2
                    err 'operand sizes do not match'
                else
                    ns.size = 2
                end if
            end if
        end if
    end if
end macro

macro x86.parse_legacy_address ns,op
    local buffer,prefix
    buffer equ op
    define prefix
    match seg:offs, buffer
        if ~ seg relativeto 0 ; ignore numeric segment prefix
            redefine prefix seg:
        end if
        redefine buffer offs
    else match =Nothing?, ASSUMED?.DS
        redefine prefix CS:
    end match
    match seg:, prefix
        x86.parse_segment_prefix ns,seg
    end match
    match base[add], buffer
        ns.type = 'mem'
        if elementsof(base+add) > elementsof(base+add-CODE)
            x86.parse_address ns,base+add-CODE
        else
            x86.parse_address ns,base+add
        end if
    else match [add], buffer
        ns.type = 'mem'
        if elementsof(add) > elementsof(add-CODE)
            x86.parse_address ns,add-CODE
        else
            x86.parse_address ns,add
        end if
    else
        if elementsof(buffer) > elementsof(buffer-CODE)
            ns.type = 'mem'
            x86.parse_address ns,buffer-CODE
        else
            ns.imm = buffer
            ns.type = 'imm'
            ns.displacement_size = 0
        end if
    end match
end macro

macro x86.parse_jump_operand ns,op
    ns.size = 0
    match =far? dest, op
        x86.parse_operand_value ns,dest
        ns.jump_type = 'far'
    else match =near? dest, op
        x86.parse_operand_value ns,dest
        ns.jump_type = 'near'
    else match =short? dest, op
        x86.parse_operand_value ns,dest
        ns.jump_type = 'short'
    else
        x86.parse_operand_value ns,op
        ns.jump_type = ''
    end match
    if ns.type = 'imm'
        if ns.size = 0
            ns.size = x86.mode shr 3
        end if
        if ns.imm relativeto 0 & (ns.imm < 0 | ns.imm >= 1 shl (ns.size*8))
            err 'value out of range'
        end if
    end if
end macro

; optimize LEA to MOV where possible:
macro lea? dest*,src*
    x86.parse_operand @src,src
    if @src.type = 'mem' & @src.address_registers eq 0
        mov dest,@src.address
    else
        lea dest,src
    end if
end macro

PS. The comments in the beginning of SP2.ASM mention that original executable was made smaller by a separate postprocessor that removed zero bytes from the end of file. But we can include a couple of specially tailored macros to do this during assembly:
Code:

macro zerobeg def
    label zerobeg : word
    virtual
end macro
macro zeroend def
    label zeroend : word
    end virtual
end macro
rugxulo 07 Apr 2019, 20:26
Nice work.
(Ah, you amended your post. Well, I haven't tried that last bit yet, but I did notice the size issue for SP2.COM.) I'm not 100% sure how you personally verified that SP2.COM assembled correctly. At first I was confused and remembered your old FASM 1 port of SP2.ASM that is found on this forum. So I tried to match that. (I used BTTR's X util to trim the bloat.) Quote:
Yes, the original was 1993 bytes, but I'm comparing two FASM versions instead. Not sure why you enabled XORTEXTS in that version, it's not enabled in the original by default. Also, your offsets are different because of different alignment in the .bss uninitialized section due to lacking the second 0FCh byte marker. But otherwise it's literally ("byte for byte") identical, once those two changes are made. At least, that's the best attempt I could make for that old version. Then I tried with TASM since I rather assumed you tried comparing with that. Quote:
For PSR Invaders, I had used NDISASM's output but deleted the (differing) encoding column and only compared disassembled instruction text. That was 100% identical there, but here I ran into a slight text variation, even if the instructions are (roughly) the exact same. At least, they function the same and are the same size, but I'm sure you recognize that it's a tiny drop different (unsigned word vs. signed byte or whatever). So that made me nervous, but it's actually fine. |
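The "delete the encoding column, compare the text" check can be sketched with sed alone. This assumes NDISASM's usual listing layout (offset, hex bytes, then the instruction text) and uses two hand-written listing lines rather than real output; strip_encoding is a hypothetical helper name:

```shell
#!/bin/sh
# Drop the leading offset and hex-byte columns from NDISASM-style listing
# lines, keeping only the instruction text.
strip_encoding() {
    sed -e 's/^[0-9A-Fa-f][0-9A-Fa-f]*  *[0-9A-Fa-f][0-9A-Fa-f]*  *//'
}

# Two assemblers may pick different encodings for the same instruction:
# 89 D8 and 8B C3 both mean "mov ax,bx".
a='00000000  89D8              mov ax,bx'
b='00000000  8BC3              mov ax,bx'

printf '%s\n' "$a" | strip_encoding
printf '%s\n' "$b" | strip_encoding

# With real files this would look something like (not run here):
#   ndisasm -b 16 -o 0x100 one.com | strip_encoding > one.txt
#   ndisasm -b 16 -o 0x100 two.com | strip_encoding > two.txt
#   diff one.txt two.txt
```

A comparison like this hides harmless encoding choices, but as noted above it can still flag operands that the disassembler prints differently even though size and behaviour are the same.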
Tomasz Grysztar 07 Apr 2019, 21:20
rugxulo wrote: I'm not 100% sure how you personally verified that SP2.COM assembled correctly. rugxulo wrote: For PSR Invaders, I had used NDISASM's output but deleted the (differing) encoding column and only compared disassembled instruction text. That was 100% identical there, but here I ran into a slight text variation, even if the instructions are (roughly) the exact same. At least, they function the same and are the same size, but I'm sure you recognize that it's a tiny drop different (unsigned word vs. signed byte or whatever). So that made me nervous, but it's actually fine. |
Tomasz Grysztar 13 Apr 2019, 11:37
I thought it would be nice to also adapt MODES.ASM, which I recently recommended as a reference implementation of VGA mode switching without BIOS calls.
This one uses TASM's "ideal" mode, so it requires a separate package, as this is yet another variant of syntax. Less work is required in the operand parsing macros this time, because this variation has much more in common with fasm's syntax. Also, this time the source defines two different segments and produces an .EXE file, so I've used my fasm-compatible MZ.INC formatting macros and a modified "x86.parse_operand_value" that parses SEG and OFFSET prefixes to process these values properly (and generate relocations for SEG values). I also had to improve the local labels processor to handle the labels outside of procedures, too. The complete package that allows assembling the unaltered MODES.ASM looks like this:
Code:

include 'cpu/80386.inc'

macro IDEAL?
end macro

macro MODEL? selection
    format binary as 'exe'
    include 'format/mz.inc'
end macro

macro CODESEG?
    element CODE
    segment CODE.seg
    org CODE
end macro

macro DATASEG?
    element DATA
    segment DATA.seg
    org DATA
end macro

macro SEGES? instruction&
    db 26h
    instruction
end macro

macro EXITCODE? code
    mov ax,4C00h + (code and 0FFh)
    int 21h
end macro

; adapt MACRO syntax:
macro ENDM?!
    esc end macro
    macrofooter
end macro
macro macrofooter
end macro
macro REPT?! count
    local content
    macro macrofooter
        purge macrofooter?
        repeat count
            content
        end repeat
    end macro
    esc macro content
end macro

; a convoluted implementation of TASM's @@-prefixed local labels:
macro __CMD
end macro
struc (name) ? definition& ; this intercepts all labels (including unknown instructions)
    match , definition
        display 'ignoring unknown directive ',`name,13,10
    else if `name and 0FFFFh = '@@'
        REGISTER_LOCAL name ; this should cause the name to be defined in PROC as a symbolic link
        match true_symbol, name ; use MATCH to extract linked symbol
            macro __CMD
                purge __CMD
                true_symbol definition
            end macro
        end match
    else
        match :, definition
            macro END?.name? ; allow this label to be used as an entry point with END directive
                entry CODE.seg:name-CODE
            end macro
        end match
        macro __CMD
            purge __CMD
            name definition
        end macro
    end if
    __CMD
end struc

macro START_LOCALS namespace
    namespace.LOCALS
    ; the namespace.LOCALS macro is constructed to contain definitions like:
    ;   define @@1 namespace.@@1
    macro namespace.LOCALS_BUILDER
        esc macro namespace.LOCALS
    end macro
    macro namespace.LOCALS_CLEANUP
    end macro
    macro REGISTER_LOCAL label
        macro namespace.LOCALS_BUILDER
            namespace.LOCALS_BUILDER
            define label namespace.label
        end macro
        macro namespace.LOCALS_CLEANUP
            namespace.LOCALS_CLEANUP
            restore label
        end macro
    end macro
    macro END_LOCALS
        purge REGISTER_LOCAL,END_LOCALS
        namespace.LOCALS_BUILDER
        esc end macro
        namespace.LOCALS_CLEANUP
    end macro
end macro

macro PROC? name
    name:
    START_LOCALS name ; define symbolic links
    macro ENDP?
        purge ENDP?
        END_LOCALS
    end macro
end macro

define Global
START_LOCALS Global
postpone
    END_LOCALS
end postpone

; process segmented-aware addresses in instruction operands, including OFFSET and SEG prefixes
macro x86.parse_operand_value ns,op
    ns.segment_prefix = 0
    ns.prefix = 0
    ns.opcode_prefix = 0
    match =OFFSET? addr, op
        ns.type = 'imm'
        ns.imm = +addr
        if ns.imm relativeto CODE
            ns.imm = ns.imm - CODE
        else if ns.imm relativeto DATA
            ns.imm = ns.imm - DATA
        end if
        ns.displacement_size = 0
    else match =SEG? addr, op
        ns.type = 'imm'
        ns.imm = +addr
        if ns.imm relativeto CODE
            ns.imm = CODE.seg
        else if ns.imm relativeto DATA
            ns.imm = DATA.seg
        end if
        ns.displacement_size = 0
    else match [addr], op
        ns.type = 'mem'
        match :sz offs, x86.addr
            ns.size = sz
            x86.parse_address ns,offs
        else
            x86.parse_address ns,addr
        end match
        if ns.displacement relativeto CODE
            ns.address = ns.address - CODE
            ns.displacement = ns.displacement - CODE
        else if ns.displacement relativeto DATA
            ns.address = ns.address - DATA
            ns.displacement = ns.displacement - DATA
        end if
    else
        ns.type = 'imm'
        ns.imm = +op
        if defined op
            ns.unresolved = 0
        else
            ns.unresolved = 1
        end if
        ns.displacement_size = 0
        if ns.imm eq 1 elementof ns.imm
            if 1 metadataof (1 metadataof ns.imm) relativeto x86.reg
                ns.type = 'reg'
                ns.mode = x86.mode
                ns.mod = 11b
                ns.rm = 1 metadataof ns.imm - 1 elementof (1 metadataof ns.imm)
                if ns.size & ns.size <> 1 metadataof (1 metadataof ns.imm) - x86.reg
                    err 'operand sizes do not match'
                else
                    ns.size = 1 metadataof (1 metadataof ns.imm) - x86.reg
                end if
            else if 1 metadataof ns.imm relativeto x86.sreg
                ns.type = 'sreg'
                ns.rm = 1 metadataof ns.imm - x86.sreg
                if ns.size <> 0 & ns.size <> 2 & ns.size <> 4
                    err 'invalid operand size'
                end if
            end if
        end if
    end match
end macro

Code:

fasmg -iInclude('TASMEMU.ASH') MODES.ASM
rugxulo 14 Apr 2019, 04:59
Again, nice work.
Tomasz Grysztar wrote: I thought it would be nice to also adapt MODES.ASM, which I recently recommended as a reference implementation of VGA mode switching without BIOS calls.
I've heard of that file but never looked closely. Although I'm aware of the author (barely), a talented British bloke who wrote ALINK and is an expert in C++11 multithreading (wrote a book? bah, I never read it, don't grok C++). Hmmm, second edition just came out in February, now covering C++14 and -17.
Tomasz Grysztar wrote:
It's quite fascinating how many dialects there are. Of course, it's also a frustrating minefield! MODES.ASM does "smart" and "jumps", but aren't those default anyways?? The only thing I vaguely know is that "nosmart" will disable the LEA optimization! With A86, you have to use +G2 option, I think. And MASM stopped doing it long ago, preferring manual use of "opattr" inside a macro, ugh. But at least "LEA AX, MyData" is seven bytes shorter (EDIT: in source text) than "MOV AX, OFFSET MyData", so that's "good" (sarcasm, assembly is overly verbose anyways!).
If you're super bored and truly want other files to convert, take a look at XGREP (MASMv4 syntax?? "struc", yuck!) or Kaboom (weirdo MASMv6). And there are many other quirky dialects. (One guy wrote a DOS RAM disk driver, SHSURDRV, using his own eccentric NASM macros, but it only works in old 0.98.39 from 2005.)
For TASM Ideal syntax, normally I'd recommend LZASM (or incomplete support in OpenWatcom's WASM -zcm=tasm). For MASM, I'd recommend JWasm. (Just to state the obvious.)
Yes, it'd be cool if FASMG supported even a partial subset of some of these dialects.
redsock 14 Apr 2019, 05:25
rugxulo wrote: Again, nice work. rugxulo wrote: Yes, it'd be cool if FASMG supported even a partial subset of some of these dialects. |
Tomasz Grysztar 14 Apr 2019, 11:11
rugxulo wrote: If you're super bored and truly want other files to convert, take a look at XGREP (MASMv4 syntax?? "struc", yuck!) or Kaboom (weirdo MASMv6). And there are many other quirky dialects. (One guy wrote a DOS RAM disk driver, SHSURDRV, using his own eccentric NASM macros, but it only works in old 0.98.39 from 2005.)
Not that I'm bored, it is simply that I made fasmg a perfect toy for myself. I immensely enjoy writing every new convoluted macro set and then also trying to simplify it and make it elegant if possible. I still need to find some free time for it, though.
rugxulo wrote: Yes, it'd be cool if FASMG supported even a partial subset of some of these dialects.
Tomasz Grysztar 14 Apr 2019, 11:57
rugxulo wrote: Yes, the original was 1993 bytes, but I'm comparing two FASM versions instead.

Not sure why you enabled XORTEXTS in that version, it's not enabled in the original by default. I made a small fasmg script that simulates the tool they must have originally used to trim the zero bytes from the end of the file and at the same time XOR-encrypt the text:

Code:
    Original:: file 'SP2.COM'
    LEN = $
    restartout 0
    while LEN > 0
            load A : byte from Original:LEN-1
            LEN = LEN - 1
            if A
                    break
            end if
    end while
    load DATA : LEN from Original:0
    db DATA
    repeat LEN
            load B : byte from LEN-%
            if B = A
                    break
            end if
            B = B xor 17h
            store B : byte at LEN-%
    end repeat

Again, instead of post-processing, we can do it all in a single step, with specially tailored headers for the assembly. Here is my SP2.ASH that assembles SP2.ASM to a 1993 byte .COM file (using LEGACY.ASH from my post above):

Code:
    include 'LEGACY.ASH'

    XORTEXTS := 1

    macro endtext1 def
            label endtext1 : byte
            repeat $ - text0
                    load A : byte from $ - %
                    A = A xor 17h
                    store A : byte at $ - %
            end repeat
            macro db? arg&
                    match =0fch, arg
                            purge db?
                            db ?
                    else
                            db arg
                    end match
            end macro
    end macro

    macro zerobeg def
            label zerobeg : word
            virtual
    end macro

    macro zeroend def
            label zeroend : word
            end virtual
    end macro

Last edited by Tomasz Grysztar on 16 Apr 2019, 18:47; edited 1 time in total
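For readers who don't speak fasmg, the first (post-processing) script above can be sketched in Python roughly as follows. This is only a transliteration of that script's logic, not the original tool; the function name `trim_and_xor` and the sample bytes are made up for illustration.

```python
def trim_and_xor(image: bytes, key: int = 0x17) -> bytes:
    """Mimic the fasmg post-processing script on an in-memory image.

    1. Scan backwards for the last non-zero byte (the "marker") and
       keep only the bytes that precede it.
    2. Walk backwards over the kept bytes, XORing each with `key`,
       and stop at the first byte equal to the marker.
    """
    length = len(image)
    marker = 0
    while length > 0:
        marker = image[length - 1]
        length -= 1
        if marker:
            break

    out = bytearray(image[:length])
    for i in range(length - 1, -1, -1):
        if out[i] == marker:
            break  # the marker byte itself is left unencrypted
        out[i] ^= key
    return bytes(out)


if __name__ == "__main__":
    # Made-up sample: one code byte, a marker, two text bytes,
    # the closing marker, then zero padding.
    sample = bytes([0x90, 0xFC, 0x48, 0x49, 0xFC, 0x00, 0x00])
    print(trim_and_xor(sample).hex())
```

Because XOR with a constant is its own inverse, running the same loop over the encrypted region again would restore the plain text.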
rugxulo 18 Apr 2019, 05:57
Tomasz Grysztar wrote: I hope you don't mind that I keep hijacking your thread.

For the record ... it's your message board. It's your assembler. It's still related to DOS and programming. I wouldn't consider this a huge intrusion. Invade away!
rugxulo 18 Apr 2019, 06:05
Tomasz Grysztar wrote:
Yeah, it seems MASM (?), even JWasm, both encode "test al, ah" as "test ah, al". Other than that, it seemed to match.

Tomasz Grysztar wrote:
MODES as an .EXE is a bit tricky because the header varies due to linker (or, in your case, none). I'm not totally happy with my inability to replicate that, but it was released as "source only", so whatever. As long as it works, I guess it's fine. Still, that kind of verification strikes me as dangerous. It's not that I demand byte-for-byte, but it's much easier (obviously) to know you've done it correctly if you can 100% (or close enough) match to the original. (Certainly PSR Invaders can use some obvious optimizations, both space and speed. I vaguely remember it being too slow, even on my old 486. But who cares, my 486 is packed up and half dead anyways. Still, I dream ....)

Tomasz Grysztar wrote:
I'm not sure if you mean this literally or not. For which host OS? Which kind of shell? Which build tool? Sure, it's possible, but you have to know what you want to support. Also, verification is tricky for things that do graphics or anything interactive. Batch-oriented tools are easier to verify. (Also, .COM is easier than .EXE, obviously.)
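As an illustration of the "test al, ah" difference mentioned above, and of why it shows up in a byte-for-byte comparison, here is a small Python sketch. It assumes the usual 8086 encoding of TEST r/m8, r8 (opcode 84 plus a ModRM byte); the function names and the two-byte prefix in the demo are made up.

```python
# 8-bit register numbers as used in the ModRM byte.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def encode_test_r8(rm: str, reg: str) -> bytes:
    """Encode TEST r/m8, r8 with both operands as registers (mod=11)."""
    modrm = 0xC0 | (REG8[reg] << 3) | REG8[rm]
    return bytes([0x84, modrm])


def first_mismatch(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))


if __name__ == "__main__":
    # TEST is commutative, so both encodings behave the same at run time.
    as_written = encode_test_r8("al", "ah")  # 84 E0
    swapped = encode_test_r8("ah", "al")     # 84 C4
    build_a = bytes([0xB4, 0x09]) + as_written  # hypothetical surrounding code
    build_b = bytes([0xB4, 0x09]) + swapped
    print(as_written.hex(), swapped.hex(), first_mismatch(build_a, build_b))
```

The two builds are functionally equivalent, yet a straight binary compare (or a checksum) flags them as different, which is the kind of noise that makes byte-for-byte verification across assemblers awkward.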
rugxulo 18 Apr 2019, 06:28
rugxulo wrote: But I'm still thinking (too hard?) about other possible solutions. I'm still not fully satisfied.

Just for the record .... The TASM original is in (ancient, v4?) MASM syntax but has a bug (which TASM ignores). In "40:[01ah]", TASM32 ignores the segment while JWasm complains. (Programmer error, PEBKAC.) Otherwise, with a well-written include (pre-included via "-Fi=") you "almost" don't need any changes. While I didn't write BYTEFIX solely for that program, it is a simple way to fix that (by random seeking) without having to read the whole source in (sequentially, slowly) and output a redundant modified copy. (s/40:/ds:/) So there, no Sed is needed ... if you're willing to use JWasm!

Should I rely on Sed at all? Previously, my simplest Sed script was for A86. I rewrote it in AWK, QBASIC, C, and (Turbo) Pascal. That's what I mean: is there a universal solution? Probably not. MS-DOS 5 came with QBASIC, but PC-DOS 7 (allegedly? I don't have it!) came with REXX. But C is fairly common, and TP compatibles aren't hard to find either. AWK has several ports, even for DOS. Debug script? That would take some extra effort, so I haven't done it (yet).

But certainly it's redundant (i.e. "bad!") to need external tools when one tool is (potentially) good enough! "Ad hoc" solutions are still good, but generic / universal / reusable is even better! So I should just port Minised to Turbo Pascal, right??? Ugh. Sure, I'm vaguely interested, but it's still a bit difficult. Maybe a subset Sed-like tool would be just as good (or better??).

I know that doesn't directly use FASMG, which is certainly genius, but it's yet another tool that I don't really understand. I'm just trying to expand my limited brain power with what tools I'm vaguely familiar with. (Yes, I'm furiously rewriting all my old Sed scripts to be more standards-compatible. It's certainly interesting!)
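The "random seeking" idea above can be sketched in Python as an in-place patch at a known offset. This is not BYTEFIX and makes no claim about its interface; `patch_at`, the sample source line, and the offset are all hypothetical. It works only because "40:" and "ds:" have the same length, so nothing after the patch has to move.

```python
import os
import tempfile


def patch_at(path: str, offset: int, expected: bytes, replacement: bytes) -> bool:
    """Overwrite `expected` with `replacement` at `offset`, in place.

    Returns False (and changes nothing) if the bytes found at `offset`
    are not `expected`. Both byte strings must have the same length.
    """
    if len(expected) != len(replacement):
        raise ValueError("in-place patching needs equal-length strings")
    with open(path, "r+b") as f:
        f.seek(offset)
        if f.read(len(expected)) != expected:
            return False
        f.seek(offset)
        f.write(replacement)
    return True


if __name__ == "__main__":
    # Made-up source line standing in for the real one.
    line = b"        MOV     AX,40:[01ah]\r\n"
    fd, path = tempfile.mkstemp(suffix=".asm")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(line)
        offset = line.index(b"40:")  # in practice the offset is known up front
        print(patch_at(path, offset, b"40:", b"ds:"))
        with open(path, "rb") as f:
            print(f.read())
    finally:
        os.remove(path)
```

The expected-bytes check is a cheap guard against patching the wrong version of the file, which matters more here than with a pattern-based tool like Sed.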
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.