flat assembler
Message board for the users of flat assembler.

invalid seg overrides accepted without warnings




rugxulo 05 Dec 2007, 15:17
Okay, this is a bit obscure, but shouldn't FASM (ideally) warn when invalid segment overrides are used? I'm aware of the jump hints, and yes, the programmer should probably know better, but I'd still find it more helpful if FASM didn't let these assemble without any warning or error:

Code:
ds stosb             ; invalid but accepted
es stosb             ; redundant but accepted
    


All basic string instructions only allow overrides on the source (DS:SI) but not on the destination (ES:DI), right?


LocoDelAssembly 05 Dec 2007, 15:36
rugxulo, this follows from something more general: fasm lets you use any prefix. You can even write "lock cmp eax, eax", but it will raise an exception at run time.

Note that fasm does not accept "stos byte [ds:di]" but does accept "stos byte [es:di]", so fasm does check for invalid segment overrides in operands. Prefixes, however, are always accepted, even things like "cs ds ss es rep cmp eax, eax" (which executes perfectly well but is very stupid).
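
The distinction described above can be sketched in fasm source. This is a minimal illustration assuming 16-bit mode (use16); the accept/reject behavior is taken from the post rather than checked against a particular fasm version, and the byte values in the comments are the standard x86 encodings (3E = DS prefix, 26 = ES prefix, AA = STOSB).

```asm
use16

stos byte [es:di]      ; operand form: accepted, ES:DI is the only valid destination
; stos byte [ds:di]    ; operand form: rejected by fasm (left commented out here)

ds stosb               ; prefix form: accepted, emits 3E AA, but the CPU still stores to ES:DI
es stosb               ; prefix form: accepted, emits 26 AA, redundant
```

In other words, the check lives in the operand parser, while a bare segment name in front of an instruction is just emitted as a prefix byte.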



MazeGen 06 Dec 2007, 14:19
Quote:
Code:
lock cmp eax, eax    

There is an inconsistency then. If this invalid instruction can be assembled, why can't "lea eax, ebx"?

Tomasz, answer this, please.



Tomasz Grysztar 06 Dec 2007, 14:27
It's all because I was too lazy to implement prefixes as anything other than separate instructions (note that you may also assemble just a prefix and nothing more) that simply allow one more instruction to be put on the same line (I followed NASM's convention in this one). See also this thread to see this behavior expanded even further.
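
A small sketch of what "prefixes as separate instructions" means in practice, assuming 16-bit mode. Going by the description above, both forms should produce the same two bytes, since the prefix and the instruction are simply assembled one after the other (F3 = REP, A4 = MOVSB):

```asm
use16

rep movsb       ; prefix and instruction on one line: F3 A4

rep             ; a prefix on a line of its own: F3
movsb           ; the instruction the CPU will apply it to: A4
```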


revolution 06 Dec 2007, 17:27
MazeGen wrote:
Quote:
Code:
lock cmp eax, eax    

There is an inconsistency then. If this invalid instruction can be assembled, why can't "lea eax, ebx"?

Tomasz, answer this, please.
Let me be bold and answer this instead.

In "lea eax,ebx", ebx is not an address, so how can eax get the effective address of something that is not an address?

Also, "lock cmp eax,eax" is a valid construction and has a valid encoding (it is a holdover from the 8086 days). Indeed, nowadays it can be used to deliberately test an exception handler and see that it performs as you expect.


MazeGen 06 Dec 2007, 19:33
Tomasz, laziness is a good reason. I can really understand this :) I'm still complaining about LEA only because I'd really like to get it assembled with FASM :)

revolution, I could say in the same spirit:

In "lock cmp eax, ebx", eax is not an address, so how can an access to eax be locked? And "lock cmp" can never be a valid construction (cmp does not write a result back to memory).

Also, "lea eax, ebx" has a valid encoding (8D,C3). Additionally, "lea eax, ebx" can be used to deliberately test an exception handler... ;)
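
Since fasm rejects the mnemonic form, one way to get these bytes into a test program is to emit them by hand. This is only a sketch, assuming 32-bit mode; the bytes are the ones quoted in the post, and on IA-32 processors executing LEA with a register source (mod=11) raises an invalid-opcode exception (#UD).

```asm
use32

; lea eax, ebx     ; not accepted by fasm: the source must be a memory operand
db 8Dh, 0C3h       ; 8D = LEA opcode, C3 = ModRM with mod=11, reg=000 (eax), r/m=011 (ebx)
```

The same trick works for any encoding the assembler refuses to generate, at the cost of losing the readable mnemonic.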


revolution 06 Dec 2007, 21:15
MazeGen wrote:
revolution, I could say in the same spirit:

In "lock cmp eax, ebx", eax is not an address, so how can an access to eax be locked? And "lock cmp" can never be a valid construction (cmp does not write a result back to memory).

Also, "lea eax, ebx" has a valid encoding (8D,C3). Additionally, "lea eax, ebx" can be used to deliberately test an exception handler... ;)

Good points, except for one small thing you overlooked. "lea eax, ebx" has never been a valid instruction, even on the old 8086[1], so I see no good reason to suddenly allow it now. The encoding is only there as a back formation from the mod=11 similarity in other instructions. Your other point about testing the exception handler is good.

Also, lock can be used with all instructions on the 8086 and 80286. BTW, I am still supporting code for a legacy device using an 80c286. The hardware wiring uses the lock signal as an external output to trigger a timer. Thus the code has a few places where lock is used without an associated memory instruction following. The original designers thought this was a clever way to make a fast I/O bit with minimal hardware and software requirements[2]. If lock were disallowed now, it would break my code.

[1] Of course the registers would only be 16 bits wide in this case. Also, I expect the old 8086 would probably happily execute it with some result in AX. That result might even be BX, similar to mov ax,bx. I can't test this though. Does anyone have an 8086 lying about that they can use?

[2] In actual practice this was a bad way of saving money: the software has to ensure no other bus activity is happening before this will work reliably. It is a major headache whenever this code needs revising.



MazeGen 07 Dec 2007, 08:53
revolution, thanks for this information! I didn't know it worked in this odd way on the 8086 and 80286. Now I can see that in the Intel 286 and 386 manuals.

It's too bad that I threw away my old 286 AT PC a long time ago :(



rugxulo 27 Jan 2008, 23:00
Tomasz Grysztar wrote:
It's all because I was too lazy to implement prefixes as anything other than separate instructions (note that you may also assemble just a prefix and nothing more) that simply allow one more instruction to be put on the same line (I followed NASM's convention in this one). See also this thread to see this behavior expanded even further.


Letting prefixes assemble by themselves is a good idea ("rep ret" for AMD64, "cs" and "ds" as jump hints on Intel's P4, "rep nop", which has the same encoding as SSE2's "pause"). It's only invalid destination overrides on string instructions (which assemble fine but have no effect) that might be a tad confusing without any warning or error. It's not truly important, though, just confusing (for me, anyway). Besides, I've since learned how slow string instructions are anyway (on anything newer than a 386)!
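
The useful prefix combinations mentioned above can be written out as follows. This is a sketch assuming 32-bit mode and relying on the prefix-as-instruction behavior described earlier in the thread; the label name is made up for the example, and the byte values in the comments are the standard encodings (F3 = REP, C3 = RET, 90 = NOP, 2E = CS, 3E = DS).

```asm
use32

rep ret             ; F3 C3: two-byte return recommended in AMD's optimization guides
rep nop             ; F3 90
pause               ; F3 90: same bytes as "rep nop"

ds jz target        ; 3E prefix on a conditional jump: branch-taken hint on the Pentium 4
cs jz target        ; 2E prefix on a conditional jump: branch-not-taken hint
target:
```

An assembler that rejected every "meaningless" prefix would also have to special-case each of these.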
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
