flat assembler
Message board for the users of flat assembler.

PSR Invaders 1.1

rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 08 Feb 2017, 09:30
Ever played PSR Invaders 1.1 (circa 1995)? Download: invadr11.zip (27 kb), mirror #2.

In 2004, I manually (but sloppily) converted it to work with free assemblers. But this new script is much simpler, smaller, and more accurate than the full quirky, modified source.

Tested with both GNU sed and Cheap sed.

EDIT: Minor .BAT cleanups, now builds in Windows, too.
EDIT#2: Minor simplifications.

Code:
@echo off

::#--- fix2.sed begins ---
:: 1i\
:: OFFSET equ\
:: Offset equ\
:: Ptr equ\
:: LEA equ MOV
:: /^;/b
:: /CODE_SEG/d
:: /END /d
:: / ENDP/d
:: s/40://
:: s/\]\[/+/
:: s/\[0\]//
:: s/\][+]BX/+BX/
:: s/ES:\[/[ES:/
:: s/[ ][ ]*PROC .*/:/
:: s/\[\[/[/
::#--- fix2.sed ends ---

::#--- fix3.sed begins ---
:: /RemoveNewInt9:/,/ RET/s/OldInt9Addr/cs:&/
:: /NewInt9Handler:/,/NotIntercept:/s/\[/[cs:/
:: /NotIntercept:/,/CLC/s/StoreAX/cs:&/
::#--- fix3.sed ends ---

::#--- vars.sed begins ---
:: /^;/b
:: / D[BWD] /b
:: /,OFFSET/b
:: /LEA /b
::#--- vars.sed ends ---

if not exist invaders.asm goto end
if not exist %0 %0.bat %1

if "%SED%"=="" set SED=sed
echo %%SED%% = '%SED%'

set B1=begins ---
set E1=ends ---
set S1=fix2.sed fix3.sed vars.sed
for %%a in (%S1%) do %SED% -n -e "/%%a %B1%/,/ %E1%$/s/^::[ ][ ]*//w %%a" %0
for %%z in (S1 E1 B1) do set %%z=

for %%a in (fix2 fix3 vars) do if not exist %%a.sed goto end

set I0=invaders.asm
%SED% -n -e "/ D[BWD] /s@^\([^ ][^ ]*\).*@s/\\<\1\\>/[\&]/@p" %I0%>>vars.sed
%SED% -f vars.sed %I0% | %SED% -f fix2.sed -f fix3.sed >inv-fasm.asm
set I0=

REM cwsdpmi
fasm inv-fasm.asm inv-fasm.com >NUL
if not exist inv-fasm.com goto end

echo.
echo INV-FASM.COM    FFF22EF9
crc32 inv-fasm.com
echo.

if "%1"=="notclean" goto end
del vars.sed >NUL
del fix?.sed >NUL
del inv-fasm.asm >NUL

:end
if "%SED%"=="sed" set SED=
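
The batch file above carries its sed scripts inside its own `::` comment lines and extracts them with a sed range plus the `w` command. A rough Python sketch of that extraction step follows, for readers who want to see the idea in isolation. The function name `extract_embedded` and the demo text are made up for illustration; the marker strings are the ones used in the batch file.

```python
import re


def extract_embedded(text, name, prefix="::"):
    """Collect the lines between '<name> begins ---' and ' ends ---'.

    Mirrors: sed -n -e "/NAME begins ---/,/ ends ---$/s/^::[ ][ ]*//w NAME"
    Only lines where the prefix substitution succeeds are written out, so the
    marker lines themselves ("::#--- ...") are dropped automatically.
    """
    pattern = re.compile(re.escape(prefix) + r" +(.*)")
    out, inside = [], False
    for line in text.splitlines():
        starting = False
        if not inside and f"{name} begins ---" in line:
            inside, starting = True, True
        if not inside:
            continue
        m = pattern.match(line)
        if m:
            out.append(m.group(1))
        # As in sed, the end pattern is only tested on lines after the start line.
        if not starting and line.endswith(" ends ---"):
            inside = False
    return out


demo = "\n".join([
    "@echo off",
    "::#--- fix.sed begins ---",
    ":: /^;/b",
    ":: s/40://",
    "::#--- fix.sed ends ---",
    "echo done",
])
print(extract_embedded(demo, "fix.sed"))
```

Keeping the helper scripts inside the one file means there is nothing extra to distribute, at the cost of a slightly cryptic extraction line.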
    


Since I'm also sometimes using antiX Linux, which comes with DOSBox, I also whipped up a quick makefile in order to cross-build.

EDIT: Very minor makefile cleanups.
EDIT#2: Minor simplifications.

Code:
# GNUmakefile
.RECIPEPREFIX := _

#=== fix2.sed begins ===
# 1i\
# OFFSET equ\
# Offset equ\
# Ptr equ\
# LEA equ MOV
# /^;/b
# /CODE_SEG/d
# /END /d
# / ENDP/d
# s/40://
# s/\]\[/+/
# s/\[0\]//
# s/\][+]BX/+BX/
# s/ES:\[/[ES:/
# s/[ ][ ]*PROC .*/:/
# s/\[\[/[/
#=== fix2.sed ends ===

#=== fix3.sed begins ===
# /RemoveNewInt9:/,/ RET/s/OldInt9Addr/cs:&/
# /NewInt9Handler:/,/NotIntercept:/s/\[/[cs:/
# /NotIntercept:/,/CLC/s/StoreAX/cs:&/
#=== fix3.sed ends ===

.PHONY: all check clean cleanall

PROG=inv-fasm
GAMEZIP=invadr11.zip
OLDASM=INVADERS.ASM
SED=sed
FASM=fasm
MD5SUM=md5sum
WGET=wget
WGETOPT=-q
UNZIPPER=unzip
UNZIPFLAGS=-qjan

#GAMEURL=ftp.lanet.lv/ftp/mirror/x2ftp/msdos/programming/gamesrc/
GAMEURL=www.ibiblio.org/pub/micro/pc-stuff/freedos/files/games/invaders/

all: $(PROG).com check

$(PROG).com: $(PROG).asm
_@$(FASM) $< $@

fix2.sed fix3.sed: $(lastword $(MAKEFILE_LIST))
_@$(SED) -n -e '/$@ begins ===/,/$@ ends ===/s/^#[ ][ ]*//w $@' $<

asmvars.sed: $(OLDASM)
_@$(SED) -n -e 's|^\([^ ][^ ]*\)[ ][ ]*D[BWD] ..*|s/\\<\1\\>/[\&]/|p' $^ >$@

$(PROG).asm: $(OLDASM) asmvars.sed fix2.sed fix3.sed
_@$(SED) -e '/^;/b' -e '/ D[BWD] /b' -e '/,OFFSET/b' -e '/LEA /b'\
 -f asmvars.sed $< | $(SED) -f fix2.sed -f fix3.sed >$@

$(GAMEZIP):
_@$(WGET) $(WGETOPT) $(GAMEURL)$(GAMEZIP)

$(OLDASM): $(GAMEZIP)
_@$(UNZIPPER) $(UNZIPFLAGS) $< INVADERS/$@ >/dev/null

check: $(PROG).com
_@$(MD5SUM) $<
_@echo 5d6fa26af01606feb90f17e014390139 \ $<

clean:
_@$(RM) $(PROG).asm fix?.sed asmvars.sed

cleanall: clean
_@$(RM) $(PROG).com $(OLDASM)

# EOF
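
Both build scripts finish by printing a checksum next to the expected value (CRC32 `FFF22EF9` in the batch file, an MD5 digest in the makefile) and leave the comparison to the reader. A small Python sketch of doing that comparison automatically is shown below. The helper names are hypothetical, and the file name in the usage comment is simply the output name used above.

```python
import hashlib
import zlib


def crc32_hex(data: bytes) -> str:
    """CRC32 as eight uppercase hex digits, the form the batch file echoes."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")


def md5_hex(data: bytes) -> str:
    """MD5 as lowercase hex, the form md5sum prints."""
    return hashlib.md5(data).hexdigest()


def verify(data: bytes, expected_crc32=None, expected_md5=None) -> bool:
    """Return True only if every checksum that was supplied matches."""
    ok = True
    if expected_crc32 is not None:
        ok = ok and crc32_hex(data) == expected_crc32.upper()
    if expected_md5 is not None:
        ok = ok and md5_hex(data) == expected_md5.lower()
    return ok


# Usage, assuming inv-fasm.com has been built in the current directory:
#   data = open("inv-fasm.com", "rb").read()
#   verify(data, "FFF22EF9", "5d6fa26af01606feb90f17e014390139")
```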
    


Last edited by rugxulo on 20 Feb 2018, 04:55; edited 3 times in total
rugxulo 07 Feb 2018, 21:34
I thought GNU sed and Cheap sed would be good enough for everyone (not that anyone complained). Both are GPL with ports to DOS, Windows, Linux, et al. But apparently "\<" "\>" is not truly portable nor standardized (yet??).

Cheap sed (2004) is a modified version of HHsed (1991), which was based upon Eric Raymond's sed. Another improved derivative of his by Rene Rebe is called minised (2014, BSD), which lacks the non-standard feature mentioned above.

So, in the interest of perfection, I toyed with the idea of making a working script that didn't need the non-standard kludge. (I already did the same for NASM but instead using its preprocessor.)
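
The idea behind dropping "\<" and "\>" is to anchor each variable name on the punctuation that surrounds an operand (a leading space or comma, a trailing comma or end of line) instead of on a word boundary. Here is a simplified Python sketch of that idea. It is not the exact rule set the script generates, and the function names and sample lines are invented for illustration.

```python
import re


def collect_names(asm_lines):
    """Names defined by data lines such as 'Score DW 0'."""
    names = []
    for line in asm_lines:
        m = re.match(r"(\S+)\s+D[BWD]\s", line)
        if m:
            names.append(m.group(1))
    return names


def bracket_operands(line, names):
    """Wrap known variable names in [] when they appear as an operand."""
    if line.startswith(";") or re.search(r"\sD[BWD]\s", line):
        return line  # leave comments and data definitions alone
    for name in names:
        n = re.escape(name)
        line = re.sub(rf",({n})$", r",[\1]", line)   # name as last operand
        line = re.sub(rf" ({n}),", r" [\1],", line)  # name as first operand
    return line


source = [
    "Score DW 0",
    "ScoreHi DW 0",
    " MOV AX,Score",
    " MOV ScoreHi,AX",
    " MOV AX,ScoreHi",
]
names = collect_names(source)
for text in source:
    print(bracket_operands(text, names))
```

Because the comma and end-of-line act as delimiters, the rule for `Score` never fires on `ScoreHi`, which is what the word-boundary anchors were doing in the first version.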

Just for completeness ....

P.S. Happy 40th anniversary, Space Invaders!

INV-FAS2.BAT
Code:
@echo off

::#--- fix2.sed begins ---
:: 1i\
:: OFFSET equ\
:: Offset equ\
:: Ptr equ\
:: LEA equ MOV
:: /^;/b
:: /CODE_SEG/d
:: /END /d
:: / ENDP/d
:: s/[ ][ ]*PROC .*/:/
:: s/ES:\[/[ES:/
:: s/ DD / DW 0,/
:: s/\[0\]$//
:: s/40://
:: s/,,,*/,/
:: s/,\[\([0-9]\)\],\(\[.*\)\]/,\2+\1]/
:: s/\],\[/+/
:: /INC /s/\[\([0-9]\)\]/+\1/
:: /[+]BX/!s/\([ID].C\)[ ][ ]*\([^ ][^ ][^ ][^ ]*\)/\1 [\2]/
::#--- fix2.sed ends ---

::#--- fix3.sed begins ---
:: /RemoveNewInt9:/,/ RET/s/OldInt9Addr/cs:&/
:: /NewInt9Handler:/,/NotIntercept:/s/\[/[cs:/
:: /NotIntercept:/,/CLC/s/StoreAX/cs:&/
::#--- fix3.sed ends ---

::#--- vars1.sed begins ---
:: /^;/b
:: / D[BWD] /b
:: /,OFFSET/b
:: /LEA /b
:: s/ *;.*$//
:: s/Word Ptr //
:: s/\[[0-9]\],/,&/
:: s/\(,.*\)\(\[[0-9]\]\)/,\2\1/
::#--- vars1.sed ends ---

set INV=invaders.asm
if not exist %INV% goto end
if not exist %0 %0.bat %1

if "%SED%"=="" set SED=minised
echo %%SED%% = '%SED%'

set B1=begins ---
set E1=ends ---
set S1=fix2.sed fix3.sed vars1.sed
for %%a in (%S1%) do %SED% -n -e "/%%a %B1%/,/ %E1%$/s/^::[ ][ ]*//w %%a" %0
for %%z in (S1 E1 B1) do set %%z=

for %%a in (fix2 fix3 vars1) do if not exist %%a.sed goto end

set V1=vars1.sed
set V2=vars2.sed
set F2=fix2.sed
set F3=fix3.sed
%SED% -n -e "/ D[BWD] /s|^\([^ ][^ ]*\).*|s/,\\(\1\\)$/,[\\1]/|p" %INV%>>%V1%
%SED% -e "s|^s/,|s/ |" -e "s|\$/,\[\\1\]/|,/[\\1],/|" %V1% >%V2%
%SED% -f %V1% %INV% | %SED% -f %V2% | %SED% -f %F2% -f %F3% >inv-fasm.asm
for %%z in (V1 V2 F2 F3) do set %%z=

REM cwsdpmi
fasm inv-fasm.asm inv-fasm.com >NUL
if not exist inv-fasm.com goto end

echo.
echo INV-FASM.COM    FFF22EF9
crc32 inv-fasm.com
echo.

if "%1"=="notclean" goto end
del vars?.sed >NUL
del fix?.sed >NUL
del inv-fasm.asm >NUL

:end
set INV=
if "%SED%"=="minised" set SED=
    


FASM2.MAK
Code:
# GNUmakefile
.NOTPARALLEL:
.RECIPEPREFIX := _

#=== fix2.sed begins ===
# 1i\
# OFFSET equ\
# Offset equ\
# Ptr equ\
# LEA equ MOV
# /^;/b
# /CODE_SEG/d
# /END /d
# / ENDP/d
# s/[ ][ ]*PROC .*/:/
# s/ES:\[/[ES:/
# s/ DD / DW 0,/
# s/\[0\]$//
# s/40://
# s/,,,*/,/
# s/,\[\([0-9]\)\],\(\[.*\)\]/,\2+\1]/
# s/\],\[/+/
# /INC /s/\[\([0-9]\)\]/+\1/
# /[+]BX/!s/\([ID].C\)[ ][ ]*\([^ ][^ ][^ ][^ ]*\)/\1 [\2]/
#=== fix2.sed ends ===

#=== fix3.sed begins ===
# /RemoveNewInt9:/,/ RET/s/OldInt9Addr/cs:&/
# /NewInt9Handler:/,/NotIntercept:/s/\[/[cs:/
# /NotIntercept:/,/CLC/s/StoreAX/cs:&/
#=== fix3.sed ends ===

#=== vars0.sed begins ===
# /^;/b
# / D[BWD] /b
# /,OFFSET/b
# /LEA /b
# s/ *;.*$//
# s/Word Ptr //
# s/\[[0-9]\],/,&/
# s/\(,.*\)\(\[[0-9]\]\)/,\2\1/
#=== vars0.sed ends ===

.PHONY: all check clean cleanall

#http://exactcode.com/opensource/minised/
#http://dl.exactcode.de/oss/minised/minised-1.15.tar.gz
SED=minised

PROG=inv-fasm
GAMEZIP=invadr11.zip
OLDASM=INVADERS.ASM
FASM=fasm
MD5SUM=md5sum
WGET=wget
WGETOPT=-q
UNZIPPER=unzip
UNZIPFLAGS=-qjan

#GAMEURL=ftp.lanet.lv/ftp/mirror/x2ftp/msdos/programming/gamesrc/
GAMEURL=www.ibiblio.org/pub/micro/pc-stuff/freedos/files/games/invaders/

all: $(PROG).com check

$(PROG).com: $(PROG).asm
_$(FASM) $< $@

fix2.sed fix3.sed vars0.sed: $(lastword $(MAKEFILE_LIST))
_@$(SED) -n -e '/$@ begins ===/,/$@ ends ===/s/^#[ ][ ]*//w $@' $<

vars1.sed: $(OLDASM)
_@$(SED) -n -e '/ D[BWD] /s|^\([^ ][^ ]*\).*|s/,\\(\1\\)$$/,[\\1]/|p' $< >$@

vars2.sed: vars1.sed
_@$(SED) -e 's|^s/,|s/ |' -e 's|\$$/,\[\\1\]/|,/[\\1],/|' $< >$@

$(PROG).asm: vars0.sed vars1.sed vars2.sed fix2.sed fix3.sed $(OLDASM)
_$(SED) -f $(word 1,$^) -f $(word 2,$^) $(lastword $^)\
 | $(SED) -f $(word 1,$^) -f $(word 3,$^)\
 | $(SED) -f $(word 4,$^) -f $(word 5,$^) >$@

$(GAMEZIP):
_@$(WGET) $(WGETOPT) $(GAMEURL)$(GAMEZIP)

$(OLDASM): $(GAMEZIP)
_@$(UNZIPPER) $(UNZIPFLAGS) $< INVADERS/$@ >/dev/null

check: $(PROG).com
_@$(MD5SUM) $<
_@echo 5d6fa26af01606feb90f17e014390139 \ $<

clean:
_@$(RM) $(PROG).asm vars?.sed fix?.sed

cleanall: clean
_@$(RM) $(PROG).com $(OLDASM)

# EOF
    


Last edited by rugxulo on 16 Feb 2018, 20:12; edited 1 time in total
rugxulo 12 Feb 2018, 21:38
rugxulo wrote:
But apparently "\<" "\>" is not truly portable nor standardized.


Ugh, apparently "\+" isn't standard either, so I amended my newer script to avoid that. I thought it was only misunderstood in really ancient implementations (e.g. hhsed [1991] or sedmod [1987]), but apparently various modern *BSD seds also get confused.

Sedcheck dislikes "\<" (which it reports as "\'" for some odd reason), "\]" (only, which I consider spurious ... should I prefer "[]]" ?? doubt it!), and also "\+" (although "\{1,\}" isn't necessary here, and sedmod hates it).

Due to legibility issues, I prefer the kludgy "[ ][ ]*" over (two-space) "  *".

Also, GNU sed "--posix" has no problems now.

I haven't retested *BSD yet again, but I should try old FreeBSD 6.4 (with doscmd and X11), just in case that works.
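
A quick way to catch the constructs discussed in this thread before trying a script on several seds is a naive text scan. The Python sketch below is only that: a hypothetical helper, not a reimplementation of Sedcheck, and its token list is limited to what the posts above mention.

```python
# Constructs this thread reports as troublesome, with a short reason each.
SUSPECT_TOKENS = {
    r"\<": "word-start anchor (not standard)",
    r"\>": "word-end anchor (not standard)",
    r"\+": "one-or-more (not standard in basic regexes)",
    r"\{": "interval (rejected by some small seds, per the thread)",
}


def lint(script_lines):
    """Return (line number, token, reason) for each suspect token found.

    This is a plain substring scan, so an escaped backslash followed by one
    of these characters would be a false positive.
    """
    findings = []
    for number, line in enumerate(script_lines, 1):
        for token, reason in SUSPECT_TOKENS.items():
            if token in line:
                findings.append((number, token, reason))
    return findings


script = [
    r"s/\<Foo\>/[&]/",
    r"s/[ ][ ]*PROC .*/:/",
    r"s/ \+/ /",
]
for finding in lint(script):
    print(finding)
```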
Furs



Joined: 04 Mar 2016
Posts: 2500
Furs 13 Feb 2018, 13:26
Reading about all these limitations of various sed versions makes me understand why autoconf is such a nightmare, I suppose. :) (I've only used GNU sed.)
rugxulo 16 Feb 2018, 19:37
It was all in the spirit of minimalism, keeping things small with few dependencies, doing so as simply and portably as possible.

I guess I could've used Devore's NOMYSO (Perl), but Perl is much heavier, and the DJGPP port is abandoned (stuck at old 5.8.8).

GNU sed still gets ports to DJGPP (barely), but it looks annoying to rebuild (I haven't tried). It's also bloated. Still, it does work, and so does Cheap sed.

Latest GNU sed 4.4 (32-bit, DJGPP 2.05) is 291 kb (or 140 kb UPX'd).
Old GNU sed 4.2.2 (32-bit, DJGPP 2.03p2) is 220 kb (or 105 kb UPX'd).
Cheap sed (16-bit, 2004) is 26 kb (or 15 kb UPX'd).

I was naive and forgot that "\<" wasn't standard. I guess that's not BRE, only ERE? Introduced in ex/vi, presumably (while sed originally came from ed).

It just seems silly, to me, to require GNU sed when *BSD sed is 90% compatible already. Nobody wants to require two different seds just because software is too stupid to be compatible. It's like relying on both Perl and Python, or multiple assemblers. Sure, some projects do it, but it's "bad". Redundancy is bad.
rugxulo 05 Apr 2019, 19:03
Okay, so I wrote a much-simplified sed script for this. It's much cleaner and smaller. FYI, neither Minised nor old SEDMOD liked the "\{2,\}" construct, so I avoided that here. (BTW, these scripts are all p.d. or MIT or whatever, I don't care. Just in case that wasn't obvious.)

EDIT: Very minor simplifications.
EDIT#2: A few more minor simplifications.

Code:
@echo off

::#--- fix.sed begins ---
:: 1i\
:: OFFSET equ\
:: Offset equ\
:: LEA equ MOV
:: /^;/b
:: / D[BW] /b
:: /LEA /b
:: s/\[0\]//
:: /,O[fF]/b
:: /CODE_SEG/d
:: /END /d
:: / ENDP/d
:: s/ DD / DW 0,/
:: s/Word Ptr //
:: s/ *PROC .*/:/
:: s/ES:\[/[ES:/
:: s/40://
:: s/\[\([0-9]\)\]/_\1/
:: s/\([A-Z][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9_]*\),/[\1],/
:: s/,\([A-Z][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9_]*\)/,[\1]/
:: s/\([ID].C\)[ ][ ]*\([A-Z][^ ][^ ][^ ]*\)/\1 [\2]/
:: /_[0-9]\]/s/_/+/
:: /RemoveNewInt9:/,/CLC/s/\[\([^0]\)/[cs:\1/
::#--- fix.sed ends ---

if not exist invaders.asm goto end
if not exist %0 %0.bat %1

if "%SED%"=="" set SED=minised
echo %%SED%% = '%SED%'

%SED% -n -e "/fix\.sed begins ---/,/ ends ---$/s/^::[ ][ ]*//w fix.sed" %0
if not exist fix.sed goto end
%SED% -f fix.sed invaders.asm >inv-fasm.asm

%DPMION%
fasm inv-fasm.asm inv-fasm.com >NUL
%DPMIOFF%
if not exist inv-fasm.com goto end

echo.
echo INV-FASM.COM    FFF22EF9
crc32 inv-fasm.com
echo.

if "%1"=="notclean" goto end
del fix.sed >NUL
del inv-fasm.asm >NUL

:end
if "%SED%"=="minised" set SED=
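
The simplified script no longer builds a rule per variable. Instead it relies on a naming convention: an operand that starts with a capital letter and is at least three characters long is taken to be a variable, which leaves two-letter registers alone. A Python sketch of just those two bracketing rules follows (the real script has more rules around them); the function name and sample lines are illustrative only.

```python
import re

# Capital letter, then at least two more alphanumerics: long enough to
# exclude register names such as AX or BX.
IDENT = r"[A-Z][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9_]*"


def bracket_by_convention(line):
    """Bracket an identifier next to the operand comma, first match only."""
    skip = (
        line.startswith(";")
        or re.search(r" D[BW] ", line)
        or "LEA " in line
        or re.search(r",O[fF]", line)  # ',OFFSET' / ',Offset' operands
    )
    if skip:
        return line
    line = re.sub(rf"({IDENT}),", r"[\1],", line, count=1)
    line = re.sub(rf",({IDENT})", r",[\1]", line, count=1)
    return line


for text in (
    "    MOV     AX,Score",
    "    MOV     Score,AX",
    "    MOV     AX,BX",
    "    MOV     DX,OFFSET Message",
):
    print(bracket_by_convention(text))
```

The trade-off is that this only works because of how this particular source names things; a source with three-letter uppercase constants as operands would need extra rules.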
    


But I'm still thinking (too hard?) about other possible solutions. I'm still not fully satisfied.


Last edited by rugxulo on 28 Jul 2019, 03:02; edited 2 times in total
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8353
Location: Kraków, Poland
Tomasz Grysztar 05 Apr 2019, 19:22
rugxulo wrote:
But I'm still thinking (too hard?) about other possible solutions. I'm still not fully satisfied.
What about a reverse approach? Instead of converting the source to other syntax, alter fasmg macros so that they support the legacy syntax, at least to an extent required to assemble this program.

I think I may attempt this in my spare time. This could be another possible niche for fasmg: to allow assembly of legacy sources with an adaptive tool instead of altering the original text.

Note that with the help of fasmg's linear polynomials and metadata features even things like ASSUME could be supported in full (with the right segment prefix being chosen automatically depending on where the label is defined).
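
To make the ASSUME idea concrete, here is a rough Python model of choosing a segment prefix from the segment a label was defined in and the current ASSUME settings. It is only a sketch: the function name, the data layout and the register search order are assumptions made for illustration, and it ignores base-register defaults such as BP implying SS.

```python
def pick_prefix(label_segment, assumes, search_order=("ES", "SS", "CS")):
    """Return the segment override needed to reach a label, or '' if none.

    assumes maps a segment register to a segment name; None models NOTHING.
    The search order is an arbitrary choice for this sketch.
    """
    if assumes.get("DS") == label_segment:
        return ""  # the default data segment already covers the label
    for reg in search_order:
        if assumes.get(reg) == label_segment:
            return reg + ":"
    raise ValueError(f"no segment register is assumed to point at {label_segment}")


# ASSUME CS:CODE_SEG, DS:NOTHING -> labels in CODE_SEG need a CS: override.
handler_assumes = {"CS": "CODE_SEG", "DS": None, "ES": None, "SS": None}
print(pick_prefix("CODE_SEG", handler_assumes))

# ASSUME CS:CODE_SEG, DS:CODE_SEG -> no override needed.
main_assumes = {"CS": "CODE_SEG", "DS": "CODE_SEG", "ES": None, "SS": None}
print(repr(pick_prefix("CODE_SEG", main_assumes)))
```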
Tomasz Grysztar 05 Apr 2019, 21:13
I managed to get INVADERS.ASM to assemble with fasmg in unmodified form. This is a quick and dirty set of headers that does the job (save it as LEGACY.ASH):
Code:
include 'cpu/80186.inc'

; use an added term to distinguish labels from absolute values:

element CODE
macro ORG? address
        org CODE + address
end macro

ORG 0

; ignore some constructions for now:

struc SEGMENT?
end struc

struc ENDS?
end struc

macro END?.BEGIN
end macro

; register ASSUMEs, though emulation is currently limited to insertion of CS prefix under DS:NOTHING setting

macro ASSUME? statement&
        iterate each, statement
                match reg:seg, each
                        define ASSUMED?.reg? seg
                end match
        end iterate
end macro

; very limited PROC emulation, just to get things going:

struc (name) PROC? type

        match =NEAR?, type
                name:
        else match =FAR?, type
                name:
        else
                err 'unsupported PROC type'
        end match

        struc (name2) ENDP?
                match =name, name2
                        restruc ENDP?
                else
                        err 'unexpected ENDP'
                end match
        end struc

end struc

; parse legacy operand syntax:

macro x86.parse_operand ns,op
        ns.size = 0
        match =OFFSET? value, op
                x86.parse_legacy_address ns,value
                if ns.type = 'mem'
                        ns.type = 'imm'
                        ns.imm = ns.address
                        ns.size = 0
                        ns.displacement_size = 0
                end if
        else match sz =PTR? value, op
                ns.size = sz
                x86.parse_legacy_address ns,value
        else
                x86.parse_legacy_address ns,op
        end match
        if ns.type = 'imm'
                ns.displacement_size = 0
                if ns.imm eq 1 elementof ns.imm
                        if 1 metadataof (1 metadataof ns.imm) relativeto x86.reg
                                ns.type = 'reg'
                                ns.mod = 11b
                                ns.rm = 1 metadataof ns.imm - 1 elementof (1 metadataof ns.imm)
                                if ns.size & ns.size <> 1 metadataof (1 metadataof ns.imm) - x86.reg
                                        err 'operand sizes do not match'
                                else
                                        ns.size = 1 metadataof (1 metadataof ns.imm) - x86.reg
                                end if
                        else if 1 metadataof ns.imm relativeto x86.sreg
                                ns.type = 'sreg'
                                ns.rm = 1 metadataof ns.imm - x86.sreg
                                if ns.size & ns.size <> 2
                                        err 'operand sizes do not match'
                                else
                                        ns.size = 2
                                end if
                        end if
                end if
        end if
end macro

macro x86.parse_legacy_address ns,op
        ns.segment_prefix = 0
        local buffer,prefix
        buffer equ op
        define prefix
        match seg:offs, buffer
                if ~ seg relativeto 0   ; ignore numeric segment prefix
                        redefine prefix seg:
                end if
                redefine buffer offs
        else match =Nothing?, ASSUMED?.DS
                redefine prefix CS:
        end match
        match base[add], buffer
                ns.type = 'mem'
                if elementsof(base+add) > elementsof(base+add-CODE)
                        x86.parse_address ns,prefix base+add-CODE
                else
                        x86.parse_address ns,prefix base+add
                end if
        else match [add], buffer
                ns.type = 'mem'
                if elementsof(add) > elementsof(add-CODE)
                        x86.parse_address ns,prefix add-CODE
                else
                        x86.parse_address ns,prefix add
                end if
        else
                if elementsof(buffer) > elementsof(buffer-CODE)
                        ns.type = 'mem'
                        x86.parse_address ns,prefix buffer-CODE
                else
                        ns.imm = buffer
                        ns.type = 'imm'
                        ns.displacement_size = 0
                end if
        end match
end macro

macro x86.parse_jump_operand ns,op
        match =far? dest, op
                x86.parse_operand_value ns,dest
                ns.jump_type = 'far'
        else match =near? dest, op
                x86.parse_operand_value ns,dest
                ns.jump_type = 'near'
        else match =short? dest, op
                x86.parse_operand_value ns,dest
                ns.jump_type = 'short'
        else
                x86.parse_operand_value ns,op
                ns.jump_type = ''
        end match
end macro    
The CPU package can be taken from my GitHub repository, though it is also included in the base package of fasmg, with x86 examples.

Then assemble on Windows with a command like:
Code:
fasmg -iInclude('legacy.ash') invaders.asm invaders.com    
Or on Linux:
Code:
fasmg -i include\ \'legacy.ash\' invaders.asm invaders.com    


Much more work would be needed to get segment definitions and the ASSUME to work correctly in general, but at least I managed to quickly demonstrate that this approach with fasmg is possible. And this prototype already hints at methods needed to provide a broader solution.
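
The central trick in the header is to make every label carry an extra symbolic term (`CODE`), so that a label can be told apart from a plain number by checking whether subtracting `CODE` reduces the number of symbolic terms. The toy Python model below illustrates that test. It is only an analogy for the `elementsof` comparison in the macros, not a description of how fasmg works internally, and all class and function names are invented.

```python
class Poly:
    """A constant plus integer multiples of named elements, e.g. CODE + 0x120."""

    def __init__(self, const=0, terms=None):
        self.const = const
        # Drop zero coefficients so a cancelled element no longer counts.
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    @staticmethod
    def of(value):
        return value if isinstance(value, Poly) else Poly(value)

    def __add__(self, other):
        other = Poly.of(other)
        terms = dict(self.terms)
        for name, coeff in other.terms.items():
            terms[name] = terms.get(name, 0) + coeff
        return Poly(self.const + other.const, terms)

    def __sub__(self, other):
        other = Poly.of(other)
        negated = Poly(-other.const, {k: -v for k, v in other.terms.items()})
        return self + negated

    def elementsof(self):
        return len(self.terms)


CODE = Poly(0, {"CODE": 1})
BX = Poly(0, {"BX": 1})


def is_label_based(value):
    """True when subtracting CODE removes a term, i.e. the value is a label."""
    value = Poly.of(value)
    return value.elementsof() > (value - CODE).elementsof()


def plain_offset(value):
    """Strip the CODE marker from label-based values; leave numbers alone."""
    value = Poly.of(value)
    return value - CODE if is_label_based(value) else value


label = CODE + 0x120           # what 'org CODE + address' gives a label
print(is_label_based(label))   # a label
print(is_label_based(0x40))    # a bare number
print(plain_offset(label + BX).terms, hex(plain_offset(label + BX).const))
```

This is why the legacy syntax can omit square brackets: the operand's value itself says whether it is a memory reference or an immediate.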
Tomasz Grysztar 06 Apr 2019, 08:45
For a little more compatibility we also need the LEA-to-MOV optimization that TASM performed (INVADERS.ASM seems to rely on it a lot). For the 16-bit instruction set, a modified version of the LEA macro that does this is:
Code:
macro lea? dest*,src*
        x86.parse_operand @dest,dest
        x86.parse_operand @src,src
        if @dest.size <> 0 & @dest.size <> 2
                err 'invalid operand size'
        end if
        if @src.type = 'mem' & @dest.type = 'reg'
                if @src.address_registers eq 0
                        db 0B8h + @dest.rm      ; optimize to MOV
                        dw @src.address
                else
                        x86.store_instruction 8Dh,@src,@dest.rm
                end if
        else
                err 'invalid combination of operands'
        end if
end macro    


But there is also a simpler option of redefining LEA on top of existing instruction handlers, which should be universal:
Code:
macro lea? dest*,src*
        x86.parse_operand @src,src
        if @src.type = 'mem' & @src.address_registers eq 0
                mov     dest,@src.address
        else
                lea     dest,src
        end if
end macro    
The only disadvantage of this approach is that "parse_operand" ends up being called again for the same argument.
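
To see why the optimization matters for matching the original binary, compare the two 16-bit encodings when the source operand is a plain address with no registers. The Python sketch below builds both byte sequences; the function names are illustrative, and the opcodes are the standard 8086 ones (`8D /r` for LEA, `B8+r` for MOV with a 16-bit immediate, the latter matching the `db 0B8h + @dest.rm` line in the first macro).

```python
# 16-bit register numbers as used in x86 instruction encodings.
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}


def encode_lea_direct(reg, address):
    """LEA reg,[address]: opcode 8D, ModRM with mod=00 and rm=110 (direct)."""
    modrm = (REG16[reg] << 3) | 0b110
    return bytes([0x8D, modrm]) + address.to_bytes(2, "little")


def encode_mov_imm16(reg, value):
    """MOV reg,value: opcode B8+reg followed by the 16-bit immediate."""
    return bytes([0xB8 + REG16[reg]]) + value.to_bytes(2, "little")


lea = encode_lea_direct("DX", 0x1234)
mov = encode_mov_imm16("DX", 0x1234)
print(lea.hex(), len(lea))
print(mov.hex(), len(mov))
```

The MOV form is one byte shorter, so an assembler that keeps the LEA encoding produces a larger file with shifted offsets, and the checksum no longer matches.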
Tomasz Grysztar 06 Apr 2019, 17:50
Encouraged by the first success, I decided to try it with a different source from the same era, one that holds more sentimental value for me - StarPort BBS Intro II (fcsp2src.zip) by Future Crew (1993).

This one required many more tricks, including emulation of @@-prefixed locals. But I finally got it to assemble with this updated LEGACY.ASH:
Code:
include 'cpu/80386.inc'

; use an added term to distinguish labels from absolute values:

element CODE
macro ORG? address
        org CODE + address
end macro

ORG 0

; register ASSUMEs, though emulation is currently limited to insertion of CS prefix under DS:NOTHING setting

macro ASSUME? statement&
        iterate each, statement
                match reg:seg, each
                        define ASSUMED?.reg? seg
                end match
        end iterate
end macro

; just an empty shell of SEGMENT:

struc (name) SEGMENT? attr
        struc (name2) ENDS?
                match =name, name2
                        restruc ENDS?
                else
                        err 'mismatched ENDS, expected one for ',`name
                end match
        end struc
end struc

; use ALIGN variant that ignores variable bases:

macro ALIGN? pow2*,value:?
        db  (-($ scale 0)) and (pow2-1)  dup value
end macro

; there is one DB statement that uses a wild nested DUP syntax, we need to catch it and convert to something fasm understands:

macro DB? definitions&
        match N =dup(M =dup(value)), definitions
                DB N*M dup (value)
        else
                DB definitions
        end match
end macro

; handle OFFSET in DW definitions with a simple trick:

macro DW? definitions&
        define OFFSET? -CODE+   ; could also be "0 scaleof"
        DW definitions
        restore OFFSET?
end macro

struc DW? definitions&
        label . : WORD
        DW definitions
end struc

; several constructions have a very straightforward emulation:

struc MACRO? definition&
        purge . ; disable forward-referencing
        macro . definition
end struc

macro ENDM?!
        esc end macro
end macro

macro IFDEF?! symbol
        if defined symbol
end macro

macro ENDIF?!
        end if
end macro

macro REPT? count
        macro ENDM?!
                end repeat
                purge ENDM?
        end macro
        repeat count
end macro

struc LABEL? size
        label . : size
end struc

; a convoluted implementation of TASM's @@-prefixed local labels:

macro __CMD
end macro

struc (name) ? definition&      ; this intercepts all labels (including unknown instructions)
        match , definition
                display 'ignoring unknown directive ',`name,13,10
        else if `name and 0FFFFh = '@@'
                REGISTER_LOCAL name     ; this should cause the name to be defined in PROC as a symbolic link
                match true_symbol, name ; use MATCH to extract linked symbol
                        macro __CMD
                                purge __CMD
                                true_symbol definition
                        end macro
                end match
        else
                if $ eq CODE + 100h
                        macro END?.name         ; allow this label to be used as an entry point with END directive
                                macro ?! line&
                                end macro
                        end macro
                end if
                macro __CMD
                        purge __CMD
                        name definition
                end macro
        end if
        __CMD
end struc

macro START_LOCALS namespace
        namespace.LOCALS
        ; the namespace.LOCALS macro is constructed to contain definitions like:
        ;       define @@1 namespace.@@1
        macro LOCALS_BUILDER
                esc macro namespace.LOCALS
        end macro
        macro REGISTER_LOCAL label
                macro LOCALS_BUILDER
                        LOCALS_BUILDER
                        define label namespace.label
                end macro
        end macro
        macro END_LOCALS
                LOCALS_BUILDER
                esc end macro
        end macro
end macro

struc (name) PROC? type
        match =NEAR?, type
                name:
        else match =FAR?, type
                name:
        else
                err 'unsupported PROC type'
        end match
        START_LOCALS name               ; define symbolic links
        struc (name2) ENDP?
                match =name, name2
                        restruc ENDP?
                        END_LOCALS
                else
                        err 'mismatched ENDP, expected one for ',`name
                end match
        end struc
end struc

; parse legacy operand syntax:

macro x86.parse_operand ns,op
        ns.size = 0
        ns.segment_prefix = 0
        ns.prefix = 0
        ns.opcode_prefix = 0
        match =OFFSET? value, op
                x86.parse_legacy_address ns,value
                if ns.type = 'mem'
                        ns.type = 'imm'
                        ns.imm = ns.address
                        ns.size = 0
                        ns.displacement_size = 0
                end if
        else match sz =PTR? value, op
                ns.size = sz
                x86.parse_legacy_address ns,value
        else
                x86.parse_legacy_address ns,op
        end match
        if ns.type = 'imm'
                ns.segment_prefix = 0
                ns.displacement_size = 0
                if ns.imm eq 1 elementof ns.imm
                        if 1 metadataof (1 metadataof ns.imm) relativeto x86.reg
                                ns.type = 'reg'
                                ns.mod = 11b
                                ns.rm = 1 metadataof ns.imm - 1 elementof (1 metadataof ns.imm)
                                if ns.size & ns.size <> 1 metadataof (1 metadataof ns.imm) - x86.reg
                                        err 'operand sizes do not match'
                                else
                                        ns.size = 1 metadataof (1 metadataof ns.imm) - x86.reg
                                end if
                        else if 1 metadataof ns.imm relativeto x86.sreg
                                ns.type = 'sreg'
                                ns.rm = 1 metadataof ns.imm - x86.sreg
                                if ns.size & ns.size <> 2
                                        err 'operand sizes do not match'
                                else
                                        ns.size = 2
                                end if
                        end if
                end if
        end if
end macro

macro x86.parse_legacy_address ns,op
        local buffer,prefix
        buffer equ op
        define prefix
        match seg:offs, buffer
                if ~ seg relativeto 0   ; ignore numeric segment prefix
                        redefine prefix seg:
                end if
                redefine buffer offs
        else match =Nothing?, ASSUMED?.DS
                redefine prefix CS:
        end match
        match seg:, prefix
                x86.parse_segment_prefix ns,seg
        end match
        match base[add], buffer
                ns.type = 'mem'
                if elementsof(base+add) > elementsof(base+add-CODE)
                        x86.parse_address ns,base+add-CODE
                else
                        x86.parse_address ns,base+add
                end if
        else match [add], buffer
                ns.type = 'mem'
                if elementsof(add) > elementsof(add-CODE)
                        x86.parse_address ns,add-CODE
                else
                        x86.parse_address ns,add
                end if
        else
                if elementsof(buffer) > elementsof(buffer-CODE)
                        ns.type = 'mem'
                        x86.parse_address ns,buffer-CODE
                else
                        ns.imm = buffer
                        ns.type = 'imm'
                        ns.displacement_size = 0
                end if
        end match
end macro

macro x86.parse_jump_operand ns,op
        ns.size = 0
        match =far? dest, op
                x86.parse_operand_value ns,dest
                ns.jump_type = 'far'
        else match =near? dest, op
                x86.parse_operand_value ns,dest
                ns.jump_type = 'near'
        else match =short? dest, op
                x86.parse_operand_value ns,dest
                ns.jump_type = 'short'
        else
                x86.parse_operand_value ns,op
                ns.jump_type = ''
        end match
        if ns.type = 'imm'
                if ns.size = 0
                        ns.size = x86.mode shr 3
                end if
                if ns.imm relativeto 0 & (ns.imm < 0 | ns.imm >= 1 shl (ns.size*8))
                        err 'value out of range'
                end if
        end if
end macro

; optimize LEA to MOV where possible:

macro lea? dest*,src*
        x86.parse_operand @src,src
        if @src.type = 'mem' & @src.address_registers eq 0
                mov     dest,@src.address
        else
                lea     dest,src
        end if
end macro    
It assembles SP2.ASM (again, without a single alteration to the original source text) and still works for INVADERS.ASM, too.
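
The @@-local emulation boils down to giving each PROC its own namespace, so that `@@1` inside one procedure is a different symbol from `@@1` inside another. A rough Python sketch of that renaming is given below as a text rewrite. The macros achieve it with symbolic links rather than by editing text, so this is only an illustration of the effect, and the function name and sample source are invented.

```python
import re


def qualify_locals(lines):
    """Rewrite @@name as ProcName.@@name between PROC and ENDP."""
    out, proc = [], None
    for line in lines:
        start = re.match(r"\s*(\w+)\s+PROC\b", line, re.I)
        if start:
            proc = start.group(1)
        elif re.match(r"\s*\w+\s+ENDP\b", line, re.I):
            out.append(line)
            proc = None
            continue
        if proc:
            owner = proc  # bind the current procedure name for the lambda
            line = re.sub(r"@@(\w+)", lambda m: f"{owner}.@@{m.group(1)}", line)
        out.append(line)
    return out


source = [
    "Draw PROC NEAR",
    "@@1: DEC CX",
    " JNZ @@1",
    "Draw ENDP",
    "Clear PROC NEAR",
    "@@1: RET",
    "Clear ENDP",
]
for text in qualify_locals(source):
    print(text)
```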

PS. The comments at the beginning of SP2.ASM mention that the original executable was made smaller by a separate postprocessor that removed zero bytes from the end of the file. But we can include a couple of specially tailored macros to do this during assembly:
Code:
macro zerobeg def
        label zerobeg : word
        virtual
end macro

macro zeroend def
        label zeroend : word
        end virtual
end macro    
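
For comparison, the separate postprocessing step that these macros replace can be pictured as a one-line trim of the finished image. The Python sketch below shows only that trimming step, under the assumption that the postprocessor did nothing more than drop the trailing zero bytes; the function name is hypothetical.

```python
def strip_trailing_zeros(image: bytes) -> bytes:
    """Drop the run of zero bytes at the end of a binary image, if any."""
    return image.rstrip(b"\x00")


# Zero bytes in the middle are kept; only the tail is removed.
print(strip_trailing_zeros(b"\x90\x00\xc3\x00\x00").hex())
```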
rugxulo 07 Apr 2019, 20:26
Nice work!

(Ah, you amended your post. Well, I haven't tried that last bit yet, but I did notice the size issue for SP2.COM.)

I'm not 100% sure how you personally verified that SP2.COM assembled correctly. At first I was confused and remembered your old FASM 1 port of SP2.ASM that is found on this forum. So I tried to match that. (I used BTTR's X util to trim the bloat.)

Quote:

@echo off

unzip -q c:\zips\fcsp2src.zip SP2.ASM
copy c:\zips\sp2.fas . >NUL

fasmg.exe -iInclude('legacy.ash') sp2.asm new.tmp
x.exe new.tmp new.com 0 1988 >NUL

echo.
gsed.exe -i~ -e "/XORTEXTS = /s/1/0/" -e "/LABEL endtext1/a\db 0fch" sp2.fas
cwsdpmi.exe
fasm.exe sp2.fas old.tmp
x.exe old.tmp old.com 0 1988 >NUL

del *.tmp >NUL

echo.
echo ****************************************
crc32 ???.com
comp old.com new.com
echo ****************************************


Yes, the original was 1993 bytes, but I'm comparing two FASM versions instead. Not sure why you enabled XORTEXTS in that version, it's not enabled in the original by default. Also, your offsets are different because of different alignment in the .bss uninitialized section due to lacking the second 0FCh byte marker. But otherwise it's literally ("byte for byte") identical, once those two changes are made. At least, that's the best attempt I could make for that old version.

Then I tried with TASM since I rather assumed you tried comparing with that.

Quote:

@echo off

tasm32 /ml /m2 /t sp2
warplink /c sp2,tas.tmp >NUL

fasmg -iinclude('legacy.ash') sp2.asm fas.tmp >NUL

ctty nul
for %%a in (tas fas) do x %%a.tmp %%a.com 0 1993
ctty con

ndisasm -b16 -o100h tas.com >tas.dis
ndisasm -b16 -o100h fas.com >fas.dis
awk "{$2=\"\";print}" tas.dis >tas.out
awk "{$2=\"\";print}" fas.dis >fas.out

goto summary

diff -U0 tas.out fas.out
REM --- tas.out 2019-04-07 14:37:22 -0500
REM +++ fas.out 2019-04-07 14:37:22 -0500
REM @@ -80 +80 @@
REM -000001B8 and ax,0xf
REM +000001B8 and ax,byte +0xf
REM @@ -352 +352 @@
REM -0000046F add ax,0x64
REM +0000046F add ax,byte +0x64

echo.
for %%a in (1B8 46F) do grep "^00000%%a " ?as.dis
REM TAS.DIS:000001B8 250F00 and ax,0xf
REM FAS.DIS:000001B8 83E00F and ax,byte +0xf
REM
REM TAS.DIS:0000046F 056400 add ax,0x64
REM FAS.DIS:0000046F 83C064 add ax,byte +0x64

:summary
REM ... it's fine, it roughly matches, "close enough" !! ...
ren fas.out *.ou~
sed -e "80s/byte [+]//" -e "352s/byte [+]//" fas.ou~ >fas.out
echo.
diff -s tas.out fas.out

if "%1"=="notclean" goto end
ctty nul
for %%z in (tmp dis out ou~ obj) do if exist *.%%z del *.%%z
del ?as.com
ctty con
:end


For PSR Invaders, I had used NDISASM's output but deleted the (differing) encoding column and only compared disassembled instruction text. That was 100% identical there, but here I ran into a slight text variation, even if the instructions are (roughly) the exact same. At least, they function the same and are the same size, but I'm sure you recognize that it's a tiny drop different (unsigned word vs. signed byte or whatever). So that made me nervous, but it's actually fine.
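
The comparison described above can be sketched in Python as well. This is only an approximation of the awk and sed steps in the batch file: it drops the encoding column from NDISASM-style lines, and (unlike the batch file, which patches just lines 80 and 352) it hides the "byte +" marker everywhere. The two sample lines are the ones quoted in the script's comments; the function names are invented for this example.

```python
def strip_encoding(line):
    """Drop the second whitespace-separated field (the hex encoding column)."""
    fields = line.split()
    return " ".join(fields[:1] + fields[2:])

def normalize(line):
    """Additionally hide the 'byte +' marker printed for sign-extended imm8 forms."""
    return strip_encoding(line).replace("byte +", "")

tas = "000001B8  250F00            and ax,0xf"
fas = "000001B8  83E00F            and ax,byte +0xf"

print(strip_encoding(tas))               # 000001B8 and ax,0xf
print(strip_encoding(fas))               # 000001B8 and ax,byte +0xf
print(normalize(tas) == normalize(fas))  # True
```

Hiding "byte +" globally is the looser choice; it would also mask a case where the two assemblers really picked different immediates sizes in a way that changes instruction length, so comparing offsets (the first column) still matters.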
Post 07 Apr 2019, 20:26
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8353
Location: Kraków, Poland
Tomasz Grysztar 07 Apr 2019, 21:20
rugxulo wrote:
I'm not 100% sure how you personally verified that SP2.COM assembled correctly.
To be honest, my main benchmark was that it ran as it should (and I took a screenshot as proof). The problems with assembly seemed exclusively syntax-related here.

rugxulo wrote:
For PSR Invaders, I had used NDISASM's output but deleted the (differing) encoding column and only compared disassembled instruction text. That was 100% identical there, but here I ran into a slight text variation, even if the instructions are (roughly) the exact same. At least, they function the same and are the same size, but I'm sure you recognize that it's a tiny drop different (unsigned word vs. signed byte or whatever). So that made me nervous, but it's actually fine.
The choice among equivalent encodings for some instructions is a kind of footprint that can even be used to determine which assembler was used to build the file. I wrote my x86 macros for fasmg to have the same footprint as fasm 1, but obviously you could modify them to imitate the footprint of another assembler; the fasmg approach is really very flexible in that regard.
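
To make the footprint idea concrete, here is a small Python sketch that decodes the four byte sequences from the disassembly above and shows that each pair means the same operation with the same 16-bit immediate. It is a toy decoder that handles only these forms (accumulator opcode with imm16, and opcode 83 with a sign-extended imm8 on AX); the function name is made up for the example.

```python
def decode_imm_form(code):
    """Return (operation, immediate as unsigned 16-bit) for the forms in the thread."""
    ops_acc = {0x05: "add", 0x25: "and"}  # short accumulator forms: op AX,imm16
    ops_83 = {0: "add", 4: "and"}         # /digit field of opcode 83: op r/m16,imm8
    if len(code) == 3 and code[0] in ops_acc:
        return ops_acc[code[0]], code[1] | (code[2] << 8)
    if len(code) == 3 and code[0] == 0x83:
        mod, digit, rm = code[1] >> 6, (code[1] >> 3) & 7, code[1] & 7
        if mod == 0b11 and rm == 0 and digit in ops_83:   # register operand AX
            imm8 = code[2]
            imm16 = (imm8 | 0xFF00) if (imm8 & 0x80) else imm8  # sign-extend
            return ops_83[digit], imm16
    raise ValueError("form not covered by this sketch")

print(decode_imm_form(bytes.fromhex("250F00")))  # ('and', 15)
print(decode_imm_form(bytes.fromhex("83E00F")))  # ('and', 15)
print(decode_imm_form(bytes.fromhex("056400")))  # ('add', 100)
print(decode_imm_form(bytes.fromhex("83C064")))  # ('add', 100)
```

Both encodings happen to be three bytes long for AX, which is why the offsets in the two disassemblies stay aligned even though the bytes differ.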
Post 07 Apr 2019, 21:20
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8353
Location: Kraków, Poland
Tomasz Grysztar 13 Apr 2019, 11:37
I thought it would be nice to also adapt MODES.ASM, which I recently recommended as a reference implementation of VGA mode switching without BIOS calls.

This one uses TASM's "ideal" mode, so it requires a separate package, as this is yet another variant of syntax. Less work was required in the operand parsing macros this time, because this variation has much more in common with fasm's syntax. Also, this time the source defines two different segments and produces an .EXE file, so I've used my fasm-compatible MZ.INC formatting macros and a modified "x86.parse_operand_value" that parses SEG and OFFSET prefixes to process these values properly (and generate relocations for SEG values). I also had to improve the local labels processor to handle labels outside of procedures, too.

The complete package that allows unaltered MODES.ASM to be assembled looks like this:
Code:
include 'cpu/80386.inc'

macro IDEAL?
end macro

macro MODEL? selection
        format binary as 'exe'
        include 'format/mz.inc'
end macro

macro CODESEG?
        element CODE
        segment CODE.seg
        org CODE
end macro

macro DATASEG?
        element DATA
        segment DATA.seg
        org DATA
end macro

macro SEGES? instruction&
        db 26h
        instruction
end macro

macro EXITCODE? code
        mov     ax,4C00h + (code and 0FFh)
        int     21h
end macro

; adapt MACRO syntax:

macro ENDM?!
        esc end macro
        macrofooter
end macro

macro macrofooter
end macro

macro REPT?! count
        local content
        macro macrofooter
                purge macrofooter?
                repeat count
                        content
                end repeat
        end macro
        esc macro content
end macro

; a convoluted implementation of TASM's @@-prefixed local labels:

macro __CMD
end macro

struc (name) ? definition&      ; this intercepts all labels (including unknown instructions)
        match , definition
                display 'ignoring unknown directive ',`name,13,10
        else if `name and 0FFFFh = '@@'
                REGISTER_LOCAL name     ; this should cause the name to be defined in PROC as a symbolic link
                match true_symbol, name ; use MATCH to extract linked symbol
                        macro __CMD
                                purge __CMD
                                true_symbol definition
                        end macro
                end match
        else
                match :, definition
                        macro END?.name?        ; allow this label to be used as an entry point with END directive
                                entry CODE.seg:name-CODE
                        end macro
                end match
                macro __CMD
                        purge __CMD
                        name definition
                end macro
        end if
        __CMD
end struc

macro START_LOCALS namespace
        namespace.LOCALS
        ; the namespace.LOCALS macro is constructed to contain definitions like:
        ;       define @@1 namespace.@@1
        macro namespace.LOCALS_BUILDER
                esc macro namespace.LOCALS
        end macro
        macro namespace.LOCALS_CLEANUP
        end macro
        macro REGISTER_LOCAL label
                macro namespace.LOCALS_BUILDER
                        namespace.LOCALS_BUILDER
                        define label namespace.label
                end macro
                macro namespace.LOCALS_CLEANUP
                        namespace.LOCALS_CLEANUP
                        restore label
                end macro
        end macro
        macro END_LOCALS
                purge REGISTER_LOCAL,END_LOCALS
                namespace.LOCALS_BUILDER
                esc end macro
                namespace.LOCALS_CLEANUP
        end macro
end macro

macro PROC? name
        name:
        START_LOCALS name               ; define symbolic links
        macro ENDP?
                purge ENDP?
                END_LOCALS
        end macro
end macro

define Global
START_LOCALS Global
postpone
        END_LOCALS
end postpone

; process segmented-aware addresses in instruction operands, including OFFSET and SEG prefixes

macro x86.parse_operand_value ns,op
        ns.segment_prefix = 0
        ns.prefix = 0
        ns.opcode_prefix = 0
        match =OFFSET? addr, op
                ns.type = 'imm'
                ns.imm = +addr
                if ns.imm relativeto CODE
                        ns.imm = ns.imm - CODE
                else if ns.imm relativeto DATA
                        ns.imm = ns.imm - DATA
                end if
                ns.displacement_size = 0
        else match =SEG? addr, op
                ns.type = 'imm'
                ns.imm = +addr
                if ns.imm relativeto CODE
                        ns.imm = CODE.seg
                else if ns.imm relativeto DATA
                        ns.imm = DATA.seg
                end if
                ns.displacement_size = 0
        else match [addr], op
                ns.type = 'mem'
                match :sz offs, x86.addr
                        ns.size = sz
                        x86.parse_address ns,offs
                else
                        x86.parse_address ns,addr
                end match
                if ns.displacement relativeto CODE
                        ns.address = ns.address - CODE
                        ns.displacement = ns.displacement - CODE
                else if ns.displacement relativeto DATA
                        ns.address = ns.address - DATA
                        ns.displacement = ns.displacement - DATA
                end if
        else
                ns.type = 'imm'
                ns.imm = +op
                if defined op
                        ns.unresolved = 0
                else
                        ns.unresolved = 1
                end if
                ns.displacement_size = 0
                if ns.imm eq 1 elementof ns.imm
                        if 1 metadataof (1 metadataof ns.imm) relativeto x86.reg
                                ns.type = 'reg'
                                ns.mode = x86.mode
                                ns.mod = 11b
                                ns.rm = 1 metadataof ns.imm - 1 elementof (1 metadataof ns.imm)
                                if ns.size & ns.size <> 1 metadataof (1 metadataof ns.imm) - x86.reg
                                        err 'operand sizes do not match'
                                else
                                        ns.size = 1 metadataof (1 metadataof ns.imm) - x86.reg
                                end if
                        else if 1 metadataof ns.imm relativeto x86.sreg
                                ns.type = 'sreg'
                                ns.rm = 1 metadataof ns.imm - x86.sreg
                                if ns.size <> 0 & ns.size <> 2 & ns.size <> 4
                                        err 'invalid operand size'
                                end if
                        end if
                end if
        end match
end macro    
The command to assemble should look like:
Code:
fasmg -iInclude('TASMEMU.ASH') MODES.ASM    
Where TASMEMU.ASH is a file containing the above set of macros.
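
For readers who find the local-label macros hard to follow, the scoping rule they implement (per the comment "define @@1 namespace.@@1") can be approximated textually. The Python sketch below is not how fasmg does it internally; it simply rewrites every @@-prefixed name to be qualified by the enclosing PROC name, falling back to "Global" outside procedures. It assumes ideal-mode "PROC name" / "ENDP" lines and does not try to skip comments or strings.

```python
import re

def qualify_locals(lines, outer="Global"):
    """Prefix each @@label with the name of the PROC it appears in."""
    scope = outer
    out = []
    for line in lines:
        proc = re.match(r"\s*PROC\s+(\w+)", line, re.IGNORECASE)
        if proc:
            scope = proc.group(1)          # entering a procedure
        elif re.match(r"\s*ENDP\b", line, re.IGNORECASE):
            scope = outer                  # leaving it again
        out.append(re.sub(r"@@\w+", lambda m: f"{scope}.{m.group(0)}", line))
    return out

source = [
    "@@top:",
    "PROC Foo",
    "@@loop: dec cx",
    "        jnz @@loop",
    "ENDP",
    "        jmp @@top",
]
for line in qualify_locals(source):
    print(line)
```

With this input the two "@@loop" references become "Foo.@@loop" while "@@top" becomes "Global.@@top", so the same local name can be reused in another procedure without clashing.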
Post 13 Apr 2019, 11:37
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 14 Apr 2019, 04:59
Again, nice work. Cool

Tomasz Grysztar wrote:
I thought it would be nice to also adapt MODES.ASM, which I recently recommended as a reference implementation of VGA mode switching without BIOS calls.


I've heard of that file but never looked closely. Although I'm aware of the author (barely), a talented British bloke who wrote ALINK and is an expert in C++11 multithreading (wrote a book? bah, I never read it, don't grok C++). Hmmm, second edition just came out in February, now covering C++14 and -17.

Tomasz Grysztar wrote:

This one uses TASM's "ideal" mode, so it requires a separate package, as this is yet another variant of syntax.


It's quite fascinating how many dialects there are. Of course, it's also a frustrating minefield!

MODES.ASM does "smart" and "jumps", but aren't those default anyways?? The only thing I vaguely know is that "nosmart" will disable the LEA optimization! With A86, you have to use +G2 option, I think. And MASM stopped doing it long ago, preferring manual use of "opattr" inside a macro, ugh. But at least "LEA AX, MyData" is seven bytes shorter (EDIT: in source text) than "MOV AX, OFFSET MyData", so that's "good" (sarcasm, assembly is overly verbose anyways!).

If you're super bored and truly want other files to convert, take a look at XGREP (MASMv4 syntax?? "struc", yuck!) or Kaboom (weirdo MASMv6). And there are many other quirky dialects. (One guy wrote a DOS RAM disk driver, SHSURDRV, using his own eccentric NASM macros, but it only works in old 0.98.39 from 2005.)

For TASM Ideal syntax, normally I'd recommend LZASM (or incomplete support in OpenWatcom's WASM -zcm=tasm). For MASM, I'd recommend JWasm. (Just to state the obvious.)

Yes, it'd be cool if FASMG supported even a partial subset of some of these dialects.
Post 14 Apr 2019, 04:59
redsock



Joined: 09 Oct 2009
Posts: 430
Location: Australia
redsock 14 Apr 2019, 05:25
rugxulo wrote:
Again, nice work. Cool
+1
rugxulo wrote:
Yes, it'd be cool if FASMG supported even a partial subset of some of these dialects.
+1

Smile

_________________
2 Ton Digital - https://2ton.com.au/
Post 14 Apr 2019, 05:25
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8353
Location: Kraków, Poland
Tomasz Grysztar 14 Apr 2019, 11:11
rugxulo wrote:
If you're super bored and truly want other files to convert, take a look at XGREP (MASMv4 syntax?? "struc", yuck!) or Kaboom (weirdo MASMv6). And there are many other quirky dialects. (One guy wrote a DOS RAM disk driver, SHSURDRV, using his own eccentric NASM macros, but it only works in old 0.98.39 from 2005.)
Thank you! These small challenges look like a very good excuse to showcase more of fasmg's tricks. I hope you don't mind that I keep hijacking your thread.

Not that I'm bored, it is simply that I made fasmg a perfect toy for myself. I immensely enjoy writing every new convoluted macro set and then also trying to simplify it and make it elegant if possible. I still need to find some free time for it, though.

rugxulo wrote:
Yes, it'd be cool if FASMG supported even a partial subset of some of these dialects.
I believe that my examples here already demonstrate to some extent that it is possible to emulate even quite quirky old dialects with not that much work. The new challenges should be a perfect opportunity to show several more tricks, though some of them (like back-and-forth switching between segments within a source) I already discussed in the Introduction to fasmg.
Post 14 Apr 2019, 11:11
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8353
Location: Kraków, Poland
Tomasz Grysztar 14 Apr 2019, 11:57
rugxulo wrote:
Yes, the original was 1993 bytes, but I'm comparing two FASM versions instead. Not sure why you enabled XORTEXTS in that version, it's not enabled in the original by default.
It is not enabled by default in the source that was made public, but the executable that was originally distributed used this option.

I made a small fasmg script that simulates the tool they must have originally used to trim the zero bytes from the end of file and at the same time XOR-encrypt the text:
Code:
Original:: file 'SP2.COM'

LEN = $

restartout 0

while LEN > 0
        load A : byte from Original:LEN-1
        LEN = LEN - 1
        if A
                break
        end if
end while

load DATA : LEN from Original:0
db DATA

repeat LEN
        load B : byte from LEN-%
        if B = A
                break
        end if
        B = B xor 17h
        store B : byte at LEN-%
end repeat    
Now if I enable the XORTEXTS option in the source and then use this post-processing script, I get an executable that has 1993 bytes and differs from the original only through assembler footprints in the instruction codes. By modifying x86 macros to emulate TASM footprints, we could make it re-create the original executable exactly.
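
For anyone who reads Python more easily than fasmg, the post-processing script above can be mirrored step by step. This is a sketch of the same logic, not the original tool: cut the image at its last non-zero byte (which is treated as a marker and dropped), then walk backwards from the new end, XOR-ing each byte with 17h until an earlier byte equal to the marker is found. The sample data is made up; it only imitates the layout of code, marker, text, marker, zero padding.

```python
def trim_and_xor(image, key=0x17):
    """Trim trailing zeros plus the final marker byte, then XOR the text before it."""
    length = len(image)
    marker = 0
    while length > 0:                 # find the last non-zero byte
        marker = image[length - 1]
        length -= 1
        if marker:
            break
    out = bytearray(image[:length])   # marker itself is not kept
    for i in range(length - 1, -1, -1):
        if out[i] == marker:          # earlier copy of the marker ends the text
            break
        out[i] ^= key
    return bytes(out)

sample = b"\x90\x90" + b"\xfc" + b"HI" + b"\xfc" + b"\x00" * 4
print(trim_and_xor(sample).hex())  # 9090fc5f5e
```

Like the fasmg version, this stops early if the text itself happens to contain a byte equal to the marker, so it relies on the marker value never occurring in the text.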

Again, instead of post-processing, we can do it all in a single step, with specially tailored headers for the assembly. Here is my SP2.ASH that assembles SP2.ASM to a 1993-byte .COM file (using LEGACY.ASH from my post above):
Code:
include 'LEGACY.ASH'

XORTEXTS := 1

macro endtext1 def
        label endtext1 : byte

        repeat $ - text0
                load A : byte from $ - %
                A = A xor 17h
                store A : byte at $ - %
        end repeat

        macro db? arg&
                match =0fch, arg
                        purge db?
                        db ?
                else
                        db arg
                end match
        end macro
end macro

macro zerobeg def
        label zerobeg : word
        virtual
end macro

macro zeroend def
        label zeroend : word
        end virtual
end macro    


Last edited by Tomasz Grysztar on 16 Apr 2019, 18:47; edited 1 time in total
Post 14 Apr 2019, 11:57
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 18 Apr 2019, 05:57
Tomasz Grysztar wrote:
I hope you don't mind that I keep hijacking your thread.


For the record ... it's your message board. It's your assembler. It's still related to DOS and programming. I wouldn't consider this a huge intrusion. Invade away! Laughing
Post 18 Apr 2019, 05:57
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 18 Apr 2019, 06:05
Tomasz Grysztar wrote:

The size of executable matches the original - again, there are just differences caused by assembler footprints.


Yeah, it seems MASM (?), even JWasm, both encode "test al, ah" as "test ah, al". Other than that, it seemed to match.

Tomasz Grysztar wrote:

I have also successfully merged these new tricks into a previous legacy header, there is now one file that allows to assemble INVADERS.ASM, SP2.ASM and XGREP.S. To make it easier to track this overall progress, I'm attaching a complete package with MAKE.BAT which assembles all four challenges solved so far.


MODES as an .EXE is a bit tricky because the header varies due to linker (or, in your case, none). I'm not totally happy with my inability to replicate that, but it was released as "source only", so whatever. As long as it works, I guess it's fine. Still, that kind of verification strikes me as dangerous. It's not that I demand byte-for-byte, but it's much easier (obviously) to know you've done it correctly if you can 100% (or close enough) match to the original.

(Certainly PSR Invaders can use some obvious optimizations, both space and speed. I vaguely remember it being too slow, even on my old 486. But who cares, my 486 is packed up and half dead anyways. Still, I dream ....)

Tomasz Grysztar wrote:

One could perhaps also make a nice script that would download all the pieces from various places in web and then do the assembly.


I'm not sure if you mean this literally or not. For which host OS? Which kind of shell? Which build tool? Sure, it's possible, but you have to know what you want to support. Also, verification is tricky for things that do graphics or anything interactive. Batch-oriented tools are easier to verify. (Also, .COM is easier than .EXE, obviously.)
Post 18 Apr 2019, 06:05
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 18 Apr 2019, 06:28
rugxulo wrote:
But I'm still thinking (too hard?) about other possible solutions. I'm still not fully satisfied.


Just for the record ....

The TASM original is in (ancient, v4?) MASM syntax but has a bug (which TASM ignores): given "40:[01ah]", TASM32 ignores the segment while JWasm complains. (Programmer error, PEBKAC.) Otherwise, with a well-written include (pre-included via "-Fi=") you "almost" don't need any changes. While I didn't write BYTEFIX only for that program, it is a simple way to fix that (random seeking) without having to read the whole source in (sequentially, slowly) and output a redundant modified copy. (s/40:/ds:/) So there, no Sed is needed ... if you're willing to use JWasm!
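
The "random seeking" idea can be illustrated in Python. This is not BYTEFIX and makes no claim about its actual interface; it is a hypothetical helper that seeks to a known offset, checks that the expected bytes are there, and overwrites them in place. Because "40:" and "ds:" are the same length, nothing after the patch moves and no second copy of the file is needed. The sample source line is invented for the demonstration.

```python
import os
import tempfile

def patch_at(path, offset, expected, replacement):
    """Overwrite `expected` with `replacement` at `offset`; lengths must match."""
    if len(expected) != len(replacement):
        raise ValueError("replacement must keep the length unchanged")
    with open(path, "r+b") as f:
        f.seek(offset)
        if f.read(len(expected)) != expected:
            return False              # file is not what we assumed, leave it alone
        f.seek(offset)
        f.write(replacement)
    return True

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.asm")
    text = b"        MOV     BX,40:[01ah]\r\n"   # made-up line for the demo
    with open(path, "wb") as f:
        f.write(text)
    print(patch_at(path, text.index(b"40:"), b"40:", b"ds:"))
    with open(path, "rb") as f:
        print(f.read())
```

The check before writing is the design point: a fixed-offset patch is only safe if it refuses to touch a file that differs from the one the offset was computed for.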

Should I rely on Sed at all? Previously, my simplest Sed script was for A86. I rewrote it in AWK, QBASIC, C, and (Turbo) Pascal. That's what I mean, is there a universal solution? Probably not. MS-DOS 5 came with QBASIC, but PC-DOS 7 (allegedly? I don't have it!) came with REXX. But C is fairly common, and TP compatibles aren't hard to find either. AWK has several ports, even for DOS. Debug script? That would take some extra effort, so I haven't done it (yet). But certainly it's redundant (i.e. "bad!") to need external tools when one tool is (potentially) good enough! "Ad hoc" solutions are still good, but generic / universal / reusable is even better!

So I should just port Minised to Turbo Pascal, right??? Ugh. Sure, I'm vaguely interested, but it's still a bit difficult. Maybe a subset Sed-like tool would be just as good (or better??).

I know that doesn't directly use FASMG, which is certainly genius, but it's yet another tool that I don't really understand. I'm just trying to expand my limited brain power with what tools I'm vaguely familiar with. (Yes, I'm furiously rewriting all my old Sed scripts to be more standard compatible. It's certainly interesting!)
Post 18 Apr 2019, 06:28
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.