flat assembler
Message board for the users of flat assembler.

[solved] [Encoding problem] mov al, [42]

gtasm



Joined: 27 Jun 2019
Posts: 4
gtasm 27 Jun 2019, 19:45
Hello,

I'm new to Intel instruction encoding, so I may be mistaken. In that case, please feel free to tell me why I'm wrong.

During some tests, I found that the following basic assembly code:
use64
mov al, [42]
would produce this output using fasm:
8A 05 24 00 00 00

- 8A makes sense. The Intel documentation says that "8A /r" means "Move r/m8 to r8".
- 05 is logically the ModR/M byte, composed of:
  - mod 00 (address of operand will be provided)
  - reg 000 (al)
  - r/m 101 (disp32)
  => 00000101b = 05h
- then "24 00 00 00" should be the 32-bit operand, representing the address we want to copy the value from.
But 42d != 24 00 00 00 (=36d)

To double check my understanding, I ran the same test using nasm.
In that case, the encoding is a bit different:
8A 04 25 2A 00 00 00
which I understand as a more verbose way of saying the same thing, except that it goes through a dummy SIB byte.
But at least I can explain the operand value of "2A 00 00 00" since 2Ah = 42d
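
The field-by-field reading of both byte sequences can be checked mechanically. The following is a minimal Python sketch, not anything taken from fasm or nasm; the helper names `split_fields` and `disp32` are made up for illustration. It only splits the bytes into bit fields and reads the little-endian displacement; what the fields mean still depends on the CPU mode.

```python
def split_fields(byte):
    """Split a ModR/M or SIB byte into its 2-3-3 bit fields (top, middle, low)."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def disp32(raw):
    """Read 4 bytes as a little-endian signed 32-bit displacement."""
    return int.from_bytes(raw, "little", signed=True)


fasm_bytes = bytes.fromhex("8A 05 24 00 00 00")
nasm_bytes = bytes.fromhex("8A 04 25 2A 00 00 00")

# fasm output: opcode, ModR/M, disp32
mod, reg, rm = split_fields(fasm_bytes[1])
print("fasm ModR/M:", mod, reg, rm, "disp32 =", disp32(fasm_bytes[2:6]))

# nasm output: opcode, ModR/M (r/m = 100b means a SIB byte follows), SIB, disp32
mod, reg, rm = split_fields(nasm_bytes[1])
scale, index, base = split_fields(nasm_bytes[2])
print("nasm ModR/M:", mod, reg, rm, "SIB:", scale, index, base,
      "disp32 =", disp32(nasm_bytes[3:7]))
```

In the SIB byte 25h, index 100b means "no index" and base 101b with mod 00 means "no base, disp32 follows", which is why the nasm form carries the plain value 42.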

What am I missing? I don't want to believe that this can be a bug in fasm.

Thanks a lot for your help,

Tom
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8359
Location: Kraków, Poland
Tomasz Grysztar 27 Jun 2019, 20:01
In long mode, fasm generates RIP-based addressing by default, unless you enforce the use of absolute addressing (see section 2.1.19 of the manual). Therefore the instruction that gets generated by
Code:
mov al, [42]    
is in this case the same as:
Code:
mov al, [rip+36]    
Note that the offset is going to change if the position of instruction in memory is different, for example:
Code:
use64
org 100h
mov al, [42]    
generates the same instruction as:
Code:
use64
mov al, [rip-220]    

PS. In long mode, the r/m field value 101b (with mod 00) does not have the same meaning as in legacy mode: it selects RIP+disp32 rather than a plain disp32.
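
The offsets 36 and -220 quoted above follow from the fact that a RIP-relative displacement is measured from the address of the next instruction. Here is a minimal Python sketch of that arithmetic; it is not fasm's code, and the function names `rip_disp32` and `encode_mov_al_rip` are hypothetical. It assumes the 6-byte encoding `8A 05 disp32` shown earlier in the thread.

```python
import struct


def rip_disp32(target, insn_addr, insn_len):
    """Displacement stored in the instruction: target minus next-instruction address."""
    disp = target - (insn_addr + insn_len)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of range for a 32-bit displacement")
    return disp


def encode_mov_al_rip(target, insn_addr):
    """Bytes of `mov al, [rip+disp32]` for an instruction placed at insn_addr."""
    insn_len = 6  # opcode 8A + ModR/M 05 + 4 displacement bytes
    disp = rip_disp32(target, insn_addr, insn_len)
    return bytes([0x8A, 0x05]) + struct.pack("<i", disp)


print(rip_disp32(42, 0, 6), encode_mov_al_rip(42, 0).hex(" "))
print(rip_disp32(42, 0x100, 6), encode_mov_al_rip(42, 0x100).hex(" "))
```

With the instruction at address 0 the displacement is 42 - 6 = 36 (24h); with `org 100h` it is 42 - 262 = -220, stored as the bytes 24 FF FF FF.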
gtasm 27 Jun 2019, 20:19
Of course, I took it for granted that the address was an absolute one!
I have to dig and learn more about this new (for me) RIP-relative addressing.

Thank you very much for your precise, structured and very quick answer, Tomasz!

Tom
gtasm 27 Jun 2019, 20:41
Ah, if I may, I notice that
Code:
mov eax, [42]    

is not encoded using RIP-relative addressing, even though that would be shorter than the SIB-based addressing.

But there may be a technical reason for that (maybe a shorter latency with this kind of encoding of the instruction)...

Tom
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20454
Location: In your JS exploiting you and your system
revolution 27 Jun 2019, 22:35
Instruction latency and throughput are not a consideration inside fasm. Each CPU and code stream is different, so there isn't really a way for the assembler to know.

For your code above, which I assume is using 32-bit mode, there is no EIP addressing mode. RIP addressing is only possible in 64-bit mode.
gtasm 27 Jun 2019, 23:08
Thank you revolution,

It is valuable to know that latency has no impact on FASM output. I share your point of view.

Please forgive me for two mistakes I made:
- not giving the context of my code (prefixed by use64, as previously)
- confusing the output of FASM with that of NASM, which I used to double check my results.

Using FASM, the output of:
Code:
use64
mov eax, [42]
is the shortest possible encoding, based on RIP-relative addressing:
Code:
8B 05 24 00 00 00

Only NASM produces the longer encoding based on SIB and absolute addressing.

Sorry for the confusion, and again, thank you both very much for the responsiveness and the quality of your answers.

Tom
bitRAKE



Joined: 21 Jul 2003
Posts: 4075
Location: vpcmpistri
bitRAKE 20 Oct 2020, 08:38
Wasn't there some syntactical sugar to make fasmg generate absolute addressing?
Code:
mov rax,[dword $7FFE0014] ; 48:8B 04 25 $7FFE0014    
...I couldn't find anything in the manual, searched the board, and I can't seem to get around the @src.auto_relative feature of x86.parse_operand. What am I missing?
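
For reference, the bytes in the comment above can be rebuilt by hand. This is a minimal Python sketch of the absolute (SIB, no base, no index) form of `mov rax, [disp32]` in 64-bit mode; it does not reflect how fasmg builds the instruction, and the function name `encode_mov_rax_abs32` is hypothetical.

```python
import struct


def encode_mov_rax_abs32(addr):
    """Bytes of `mov rax, [addr]` using absolute disp32 addressing in 64-bit mode."""
    if not 0 <= addr < 2**31:
        # Assumption of this sketch: only addresses that fit a positive
        # 32-bit displacement are handled (disp32 is sign-extended to 64 bits).
        raise ValueError("address does not fit a positive 32-bit displacement")
    rex_w = 0x48                                   # REX prefix with W=1: 64-bit operand
    opcode = 0x8B                                  # mov r64, r/m64
    modrm = (0b00 << 6) | (0b000 << 3) | 0b100     # mod=00, reg=rax, r/m=100: SIB follows
    sib = (0b00 << 6) | (0b100 << 3) | 0b101       # no index, no base: disp32 only
    return bytes([rex_w, opcode, modrm, sib]) + struct.pack("<I", addr)


print(encode_mov_rax_abs32(0x7FFE0014).hex(" "))
```

The result is 48 8B 04 25 followed by the address in little-endian order, one byte longer than the RIP-relative form because of the SIB byte.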

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
bitRAKE 20 Oct 2020, 09:12
Code:
include 'cpu/x64.inc'
use64
mov rax,[dword $7FFE0014]    
It does work. Somehow I broke it.

Code:
include 'format/format.inc'
format MS64 COFF
mov rax,[dword $7FFE0014]    
It looks like it doesn't work in COFF format, perhaps because of the way relocations are done.

Tomasz Grysztar 20 Oct 2020, 09:31
bitRAKE wrote:
I can't seem to get around the @src.auto_relative feature of x86.parse_operand. What am I missing?
Oh, there is a bug that was introduced during the rewrite to CALM:
Code:
        compute mode, pre shl 3
      no_address_size_prefix:

        compute mode, 0    
The "mode" value is overwritten as soon as it has been set up. That last line should be moved up, before the size prefix parsing block.
bitRAKE 20 Oct 2020, 09:44
Awesome! I was staring right at that too.
(Time to get some sleep.)

Thank you!

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.