flat assembler
Message board for the users of flat assembler.

Index > DOS > PSR Invaders 1.1

Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8359
Location: Kraków, Poland
Tomasz Grysztar 18 Apr 2019, 06:48
rugxulo wrote:
It's not that I demand byte-for-byte, but it's much easier (obviously) to know you've done it correctly if you can 100% (or close enough) match to the original.
That depends on how you define correctness. What if the old assembler had a bug and assembled something objectively wrong, but it just so happened that it did not break the program at the time? If we decided we wanted to match the output of the old assembler exactly, then we would also need to simulate the bug, but that might not necessarily be the right thing to do.

rugxulo wrote:
I'm not sure if you mean this literally or not. For which host OS? Which kind of shell? Which build tool? Sure, it's possible, but you have to know what you want to support.
We have the Linux subsystem on Windows now, so it all matters less and less nowadays. I was thinking about something simple, like using wget to download all the source packages and then a couple of commands that would do what my MAKE.BAT does, etc.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20454
Location: In your JS exploiting you and your system
revolution 18 Apr 2019, 07:18
Tomasz Grysztar wrote:
That depends on how you define correctness. What if the old assembler had a bug and assembled something objectively wrong, but it just so happened that it did not break the program at the time? If we decided we wanted to match the output of the old assembler exactly, then we would also need to simulate the bug, but that might not necessarily be the right thing to do.
It doesn't even need a bug to exist. The program could be making use of a constant it finds in the code that happens to be one of the instructions that assembles into a different binary form.

Different binary output, same instruction behaviour.
Code:
v: test al,ah ; test ah,al
mov bx,[v] ;what value do we get here?    
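To put numbers on that question: the two spellings differ only in the ModRM byte, so the word that `mov bx,[v]` loads depends on which one the assembler picks. A minimal Python sketch of the 8086 encoding rule (TEST r/m8, r8 is opcode 84h followed by ModRM). The helper names are made up for illustration and do not come from any assembler; which of the two forms a given assembler emits is not claimed here.

```python
# Hypothetical helpers; only the 8086 encoding rule itself is standard.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

def encode_test(rm: str, reg: str) -> bytes:
    """Encode TEST r/m8, r8 for two registers: opcode 84h, then ModRM with mod = 11b."""
    modrm = 0b11_000_000 | (REG8[reg] << 3) | REG8[rm]
    return bytes([0x84, modrm])

def word_at(code: bytes) -> int:
    """What 'mov bx,[v]' would load: the first two bytes, little-endian."""
    return int.from_bytes(code[:2], "little")

as_written = encode_test("al", "ah")  # test al,ah with operands kept in source order
swapped = encode_test("ah", "al")     # same behaviour, operands swapped

print(as_written.hex(), hex(word_at(as_written)))  # 84e0 0xe084
print(swapped.hex(), hex(word_at(swapped)))        # 84c4 0xc484
```

Both byte sequences execute identically, yet BX ends up as E084h in one case and C484h in the other.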
Tomasz Grysztar 18 Apr 2019, 08:47
revolution wrote:
It doesn't even need a bug to exist. The program could be making use of a constant it finds in the code that happens to be one of the instructions that assembles into a different binary form.

Different binary output, same instruction behaviour.
Code:
v: test al,ah ; test ah,al
mov bx,[v] ;what value do we get here?    
Yeah, but that we can correct by making our assembler (in this case, the instruction encoding macros) generate the same footprint as the old one (well, as long as we have access to either the original binary or the original assembler; otherwise we might have no point of reference), and there would be nothing controversial about it.

What I had in mind was a (perhaps only theoretically possible) program that originally worked correctly despite having some instructions assembled wrongly due to some bug in the tool. I mean: would simulating the bug for its own sake then be the right thing to do? That depends on what our priorities are.

rugxulo wrote:
The TASM original is in (ancient, v4?) MASM syntax but has a bug (which TASM ignores). "40:[01ah]", TASM32 ignores the segment while JWasm complains. (Programmer error, PEBKAC.)
Well, this is perhaps an example of something like that. Although in this case we have a bug in the source that got ignored because of a bug in the assembler, so if we want to be able to assemble this one at all, we have to mimic the bug. You may notice my "ignore numeric segment prefix" comment in the macros for fasmg; this is exactly the part that emulates the original bug.
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 28 Apr 2019, 03:22
(Sorry I'm late in replying.) I'm sure you two are already aware of most of these quirks:

Code:
salc ; undocumented
aad 16 ; undocumented, allegedly didn't work on NEC V20/V30 clones
shl ax,79 ; was it 386+ that first limited the shift value?
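push sp ; you know, that whole 286 cpu check: 8086/8088 push the already-decremented SP,
pop ax ; while the 286 and later cpus push the value SP had before the push
cmp sp,ax ; equal on 286+, not equal on 8086/8088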
loadall ; removed in later cpus
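On the shift question: Intel's manuals state that the 8086 does not mask the shift count, while processors starting with the 286 use only its low 5 bits. (Shifting by an immediate other than 1 is itself a 186+ encoding.) A minimal Python model of the two behaviours for a 16-bit register; `shl16` and its `mask_count` flag are illustrative names, not part of any tool:

```python
def shl16(value: int, count: int, mask_count: bool) -> int:
    """16-bit SHL. mask_count=True models CPUs that use only the low 5 bits of the count."""
    if mask_count:
        count &= 0x1F
    return (value << count) & 0xFFFF

ax = 0x0001
print(hex(shl16(ax, 79, mask_count=False)))  # 0x0: every bit is shifted out
print(hex(shl16(ax, 79, mask_count=True)))   # 0x8000: 79 & 31 == 15
```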
    


Not to mention obvious things like "-22 / 7" (-3? -4?) and "-22 % 7" (-1? 6?). Who's the boss? Who will complain (what will break?) if you get it wrong? What if you have to please two conflicting masters at the same time?
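Both pairs of answers are legitimate; they come from two different rounding conventions. A small Python sketch that computes each one (the function names are made up for illustration): truncation toward zero is what x86 IDIV and C99 do, while flooring is what Python's own `//` and `%` do.

```python
def div_trunc(a: int, b: int) -> tuple:
    """Quotient rounded toward zero; the remainder takes the sign of the dividend."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b

def div_floor(a: int, b: int) -> tuple:
    """Quotient rounded toward negative infinity; the remainder takes the sign of the divisor."""
    return a // b, a % b

print(div_trunc(-22, 7))  # (-3, -1)
print(div_floor(-22, 7))  # (-4, 6)
```

In both conventions the identity a = q * b + r holds, so an expression evaluator only has to pick one and document it.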

And yes, many demoscene programs (and other weird ones, like BEFI, that Befunge-93 interpreter I slightly modified) relied on specific assembler encodings, which is "bad". (I'm sure some people are smart enough to do it successfully, but most of us aren't! It should be avoided, IMHO.) Here it's 89h (FASM) vs. 8Bh (TASM), where the opcode is used by self-modifying code (ugh! very slow on modern CPUs) to switch between "MOV" (a no-op, in this particular case) and 0EBh "JMPS". (Again, I don't see the point. Simpler code can avoid that trickery. Maybe on some old machines that would be an ideal solution, but it seems dangerously obtuse. I don't recommend it, but some rare people do know what they're doing, even if obscure.)
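Why the 89h/8Bh choice matters for that kind of patching can be sketched in a few lines of Python. The register pair `ax,bx` is purely an assumption for illustration (the thread does not say which operands the original instruction uses), and the helper names are made up. The two MOV forms carry different ModRM bytes, so overwriting only the opcode byte with 0EBh yields a short jump with a different displacement in each case.

```python
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def mov_reg_reg(dst: str, src: str, opcode: int) -> bytes:
    """Encode 'mov dst,src' for 16-bit registers using opcode 89h or 8Bh."""
    if opcode == 0x89:    # MOV r/m16, r16: destination in r/m, source in reg
        reg, rm = src, dst
    elif opcode == 0x8B:  # MOV r16, r/m16: destination in reg, source in r/m
        reg, rm = dst, src
    else:
        raise ValueError("expected opcode 0x89 or 0x8B")
    return bytes([opcode, 0xC0 | (REG16[reg] << 3) | REG16[rm]])

def patch_opcode(code: bytes, new_opcode: int) -> bytes:
    """Model self-modifying code that overwrites only the first (opcode) byte."""
    return bytes([new_opcode]) + code[1:]

def short_jump_disp(code: bytes) -> int:
    """Signed 8-bit displacement that JMP SHORT (0EBh) would read from the second byte."""
    return int.from_bytes(code[1:2], "little", signed=True)

form_89 = mov_reg_reg("ax", "bx", 0x89)  # 89 D8
form_8b = mov_reg_reg("ax", "bx", 0x8B)  # 8B C3

print(form_89.hex(), short_jump_disp(patch_opcode(form_89, 0xEB)))  # 89d8 -40
print(form_8b.hex(), short_jump_disp(patch_opcode(form_8b, 0xEB)))  # 8bc3 -61
```

A program written against one encoding and assembled with the other would therefore jump somewhere else once patched, which is why the encoding has to be matched when reproducing such a binary.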

The simplest solution is often the best. It's best not to rely too heavily on specific APIs, OSes, arcane tricks, or tools, especially if they aren't free/libre. (Well, even "standard" code can be ignored, become obsolete, or disappear.) If you really, really, really know what you're doing, okay, but ... obvious is better than obscure.
rugxulo 28 Apr 2019, 03:33
Tomasz Grysztar wrote:

rugxulo wrote:
I'm not sure if you mean this literally or not. For which host OS? Which kind of shell? Which build tool? Sure, it's possible, but you have to know what you want to support.
We have the Linux subsystem on Windows now, so it all matters less and less nowadays. I was thinking about something simple, like using wget to download all the source packages and then a couple of commands that would do what my MAKE.BAT does, etc.


I've never had Windows 10 on any machine. I still don't have any AVX machines. (But I'm no pro, so I don't care.) Yes, WSL sounds great, but I meant in general. I think Cygwin has Wget by default, but I don't know if Windows does (even 10, yet?). macOS has Curl, maybe. (antiX Linux has both.)

For DOS, it's easy (especially under VM, see here or here) if you have a working packet driver. There you can indeed use Curl or Links2 or similar. At least, to me, that's the most familiar. For other big OSes, it's probably more than obvious.
rugxulo 28 Jul 2019, 02:57
rugxulo wrote:
rugxulo wrote:
But I'm still thinking (too hard?) about other possible solutions. I'm still not fully satisfied.


Should I rely on Sed at all? Previously, my simplest Sed script was for A86. I rewrote it in AWK, QBASIC, C, and (Turbo) Pascal. That's what I mean, is there a universal solution? Probably not. "Ad hoc" solutions are still good, but generic / universal / reusable is even better!


I have rewritten the FASM sed script into both Turbo Pascal and AWK. With the DOS version of TP 5.5, the .EXE is only 5 kb (or 3 kb UPX'd; FPC 3.1.1 snapshot outputs 18 kb or 10 kb UPX'd for Small model). The .PAS source is roughly 4 kb. But, again, it's very ad hoc, so not great overall. It needs to be (even) further simplified, modularized, generalized. The AWK script is roughly 1500 bytes, but I feel much less comfortable there. Still, it works.

I may just write a small asm version (obvious, right?). But it's mostly pointless. I'm not sure I'm smart enough to write my own reusable tool here. But it's something I like to think about, to barely pretend to understand.

Oh, BTW, guess I'll refresh my latest .BAT here (yet again) for more savings!

Quote:

04/18/2019 12:48 AM 1,263 inv-fasm.ba~
07/23/2019 01:19 PM 1,132 inv-fasm.bat

c:\rugxulo\tmp\fasm>wc -lc *.ba?
57 1263 inv-fasm.ba~
54 1132 inv-fasm.bat
111 2395 total

c:\rugxulo\tmp\fasm>wc -lc *.ba? | awk "/\.ba~/{oldline=$1;oldbyte=$2};/\.bat/{newline=$1;newbyte=$2};END{print \"saved \" oldline-newline \" lines, saved \" oldbyte-newbyte \" bytes\"}"

saved 3 lines, saved 131 bytes
Copyright © 1999-2025, Tomasz Grysztar.