FASM: `mov rax, [gs:0x30]` generates address relative to RIP

VEG

Joined: 06 Feb 2013
Posts: 80

VEG 12 Nov 2019, 13:26

`mov rax, [gs:0x30]` is assembled with a RIP-relative address, and as a result it produces broken code. I had to rewrite it as `mov rax, [gs: dword 0x30]`, but I think this should be the default whenever a segment register is specified.

Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8548
Location: Kraków, Poland

Tomasz Grysztar 12 Nov 2019, 13:53

With fasmg's implementations this is really easy to tweak. For fasm 1 it might be a bit harder.

ProMiNick

Joined: 24 Mar 2012
Posts: 826
Location: Russian Federation, Sochi

ProMiNick 12 Nov 2019, 14:55

I like this variant: placing the segment register as a prefix.

Code:

; Template program using process environment block

format PE64 GUI 5.0
entry start

include 'win64w.inc'

section '.text' code readable executable

proc start
        ;sub     rsp,8           ; Make stack dqword-aligned (the proc macro already does this for us)
     gs mov     rbx,[TEB.ProcessEnvironmentBlock]
        lea     rsi,[rbx+PEB.OSMajorVersion]
        lea     rdi,[_Version]
        lodsd
        add     byte[rdi],al
        lodsd
        add     byte[rdi+2*sizeof.TCHAR],al
        invoke  MessageBox,NULL,rdi,_winVerTitle,MB_OK
        ;add     rsp,8         ; Restore stack (the proc macro already does this for us)
        ret
endp

section '.data' data readable writeable

  _Version TCHAR '0.0',0
  _winVerTitle TCHAR 'Windows version',0

section '.idata' import data readable writeable

  library user32,'USER32.DLL'

  include 'os specific/windows/api/x86/user32.inc'

Tomasz Grysztar 12 Nov 2019, 15:38

I made another update of fasmg macros, because this change actually opens up new optimization routes that need to be taken into account:

Code:

00000000: 8B 05 2A 00 00 00        mov eax,[30h]                  
00000006: 65 67 A1 30 00 00 00     mov eax,[gs:30h]               
0000000D: 65 67 A1 FF FF FF FF     mov eax,[gs:000000000FFFFFFFFh]
00000014: 65 8B 04 25 FF FF FF FF  mov eax,[gs:0FFFFFFFFFFFFFFFFh]
0000001C: 65 A1 CC CC CC CC CC CC  mov eax,[gs:0CCCCCCCCCCCCCCCCh]
          CC CC

This makes it a little more work to back-port to fasm 1.

ProMiNick wrote:

I like this variant: placing the segment register as a prefix.

In fasm's syntax this is equivalent to manually adding a prefix to an instruction, without affecting the instruction itself. This way you would therefore not be able to affect the choice of addressing mode within the instruction. I would argue that this is the whole point of having these two different syntax variants.

Tomasz Grysztar 12 Nov 2019, 20:59

VEG wrote:

`mov rax, [gs:0x30]` generates address relative to RIP, as the result it generates broken code.

I should add an important note: that code only becomes broken when you assume that it is position-independent and load it at a different location than the one it was assumed to be at. But you can only assemble this instruction by letting fasm assume that this code is to be loaded at a given absolute address. If you try to assemble it as relocatable code (for example when you select object output, or with PE output when you include relocations), fasm is going to signal an error.

In fact, I put a lot of effort into ensuring that fasm generates correct code for a given set of assumptions. But this correctness goes only as far as the assumptions (in this case about the base address of the code) are fulfilled.
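To make the base-address assumption concrete, here is a minimal Python sketch of the arithmetic behind a RIP-relative operand. It only models the displacement calculation, not fasm itself. The function names, the two base addresses and the 8-byte instruction length are illustrative assumptions; the first check reproduces the `2A` displacement from the first line of the listing posted earlier (`8B 05 2A 00 00 00` at address 0).

```python
MASK64 = (1 << 64) - 1

def rip_disp32(target, insn_addr, insn_len):
    """disp32 an assembler would store for a RIP-relative operand.

    RIP points at the *next* instruction, so the displacement is measured
    from insn_addr + insn_len, not from insn_addr.
    """
    disp = target - (insn_addr + insn_len)
    if not -(1 << 31) <= disp < (1 << 31):
        raise ValueError("target is out of RIP-relative range")
    return disp & 0xFFFFFFFF

def rip_effective(disp32, load_addr, insn_len):
    """Offset the CPU computes when the instruction really runs at load_addr."""
    signed = disp32 - (1 << 32) if disp32 & 0x80000000 else disp32
    return (load_addr + insn_len + signed) & MASK64

# First line of the listing: mov eax,[30h] at address 0, 6 bytes long.
listing_disp = rip_disp32(0x30, insn_addr=0, insn_len=6)    # 0x2A

# Hypothetical scenario: code assembled for one base, loaded at another.
ASSUMED_BASE = 0x401000   # address the assembler was told to assume
ACTUAL_BASE = 0x500000    # address the code is really loaded at
INSN_LEN = 8              # assumed length of the prefixed 64-bit mov

disp = rip_disp32(0x30, ASSUMED_BASE, INSN_LEN)
at_assumed = rip_effective(disp, ASSUMED_BASE, INSN_LEN)    # 0x30
at_other = rip_effective(disp, ACTUAL_BASE, INSN_LEN)       # 0xFF030
```

At the assumed base the operand still resolves to offset 30h, so the generated code is correct; moved elsewhere, the same bytes address a different offset, which is the breakage described in the first post.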

If you need fasm to ensure that your code is position-independent, use a relocatable format and keep an eye on the relocation table - there should be no entries there. For example, if you use PE format, you can use this to check your code:

Code:

reloc: data fixups
    assert $ = reloc
end data

With that being said, I think your suggestion is a worthy improvement of the encoding choices, and I'm working to get it into fasm 1 too.

ProMiNick 12 Nov 2019, 21:07

Tomasz, can you explain how to apply such optimizations in practice?
The role of segment registers in the x64 world is minimal: their only job is to point to the TEB structure, which fits entirely within 1000h bytes; the other segment registers are zeroed.
But maybe the initially zeroed segment registers DS, SS, CS and ES are more interesting for tricks.
Thanks in advance.

Tomasz Grysztar 12 Nov 2019, 21:18

ProMiNick wrote:

Tomasz, can you explain how to apply such optimizations in practice?
The role of segment registers in the x64 world is minimal: their only job is to point to the TEB structure, which fits entirely within 1000h bytes; the other segment registers are zeroed.
But maybe the initially zeroed segment registers DS, SS, CS and ES are more interesting for tricks.
Thanks in advance.

Keep in mind that fasm can be used for things like OS development. There you may end up setting up FS and GS in a different way (note that these two are the only ones that you can tweak in long mode).

VEG 13 Nov 2019, 16:08

Seems like the new version of FASM generates working code now for `mov eax, [gs:0x30]`.

A small question, just out of curiosity. I see that getting TIB in the kernelbase.dll is done this way:

Code:

65 48 8B 04 25 30 00 00 00    IDA disassembly: mov rax, gs:30h

FASM generates this:

Code:

65 67 48 A1 30 00 00 00       IDA disassembly: mov rax, large gs:30h

Is it basically the same thing, just one byte shorter?

Tomasz Grysztar 13 Nov 2019, 16:18

VEG wrote:

Seems like the new version of FASM generates working code now for `mov eax, [gs:0x30]`.

As I explained above, this is not really a bug fix, the code generated previously was also correct and working. Please pay attention to the assumptions about base address that fasm has and use the generated code accordingly. Because if you keep using your code differently from what the assembly was aimed at, you may keep encountering such problems that are not really the assembler's fault.

VEG wrote:

A small question, just out of curiosity. I see that getting TIB in the kernelbase.dll is done this way:

Code:

65 48 8B 04 25 30 00 00 00    IDA disassembly: mov rax, gs:30h

FASM generates this:

Code:

65 67 48 A1 30 00 00 00       IDA disassembly: mov rax, large gs:30h

Is it basically the same thing, just one byte shorter?

Yes, this instruction can be encoded in many different ways, even when not counting the RIP-relative variants. See the optimization examples I posted above: the A1 opcode is generated for addresses that can be zero-extended from 32 bits, while 8B is used for ones that require sign-extension (like 0FFFFFFFFFFFFFFFFh).
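The selection rule visible in that listing can be modelled in a few lines of Python. This is only a sketch that reproduces the four GS-prefixed byte sequences from the listing; it is not taken from fasm's source, the function name is made up, and preferring the shorter `67 A1` form when both 32-bit forms would fit is an assumption inferred from the `[gs:30h]` line.

```python
MASK64 = (1 << 64) - 1

def encode_mov_eax_gs(addr):
    """Bytes for `mov eax,[gs:addr]`, following the pattern in the listing."""
    addr &= MASK64
    if addr < (1 << 32):
        # Zero-extendable from 32 bits: GS prefix, address-size prefix,
        # then A1 with a 32-bit memory offset.
        return bytes([0x65, 0x67, 0xA1]) + addr.to_bytes(4, "little")
    if addr >= (1 << 64) - (1 << 31):
        # Sign-extendable from 32 bits: 8B with ModRM 04 and SIB 25
        # (no base, no index), followed by disp32.
        low32 = addr & 0xFFFFFFFF
        return bytes([0x65, 0x8B, 0x04, 0x25]) + low32.to_bytes(4, "little")
    # Everything else needs the full 64-bit memory offset form of A1.
    return bytes([0x65, 0xA1]) + addr.to_bytes(8, "little")

for a in (0x30, 0xFFFFFFFF, 0xFFFFFFFFFFFFFFFF, 0xCCCCCCCCCCCCCCCC):
    print(f"{a:016X}: {encode_mov_eax_gs(a).hex(' ').upper()}")
```

The loop prints one line per address with the same bytes as the corresponding GS lines of the listing.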

VEG 13 Nov 2019, 16:25

Thanks =)

ProMiNick 13 Nov 2019, 23:02

A side effect:

Code:

use64

Code:

mov     eax,[ds:$30]

is encoded as just

Code:

mov     eax,[$30]; 67 A1 30 00 00 00

with RIP-relative addressing disabled.

and one more variant:

Code:

gs mov     eax,[ds:$30]

This looks like nonsense, but now it produces valid code too.

Tomasz Grysztar 14 Nov 2019, 06:48

Oh, you found a bug there, this was only supposed to be affected by FS/GS, not DS. Apparently I should have used JA in place of JAE.

The problem with absolute addressing in long mode is that in general (with the exception of these MOV variants) it cannot cover most of the addressing range, so it is not a good automatic choice for addressing the main segment (you can still enforce it with a size operator, though).

PS. I updated the 1.73.18 packages. I generally try to avoid "silent updates" nowadays, but this is such a minor correction and I believe the packages have not been propagated yet.

Tomasz Grysztar 14 Nov 2019, 09:54

BTW, did you know that you can use FS/GS to access the entire addressing space nonetheless?

Code:

        rdgsbase rbx
        neg     rbx
        mov     [gs:rbx+hwnd],rax       ; access variable in program image

revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 21011
Location: In your JS exploiting you and your system

revolution 14 Nov 2019, 10:18

Why does the code have "neg rbx"?

Tomasz Grysztar 14 Nov 2019, 12:12

revolution wrote:

Why does the code have "neg rbx"?

Because GS:0 is at GSBASE, so if X is the linear address of something, its address within the GS segment is X-GSBASE.
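The same arithmetic as a minimal Python sketch, with 64-bit wraparound made explicit. The GSBASE value and the address of `hwnd` are made-up numbers for illustration, and the helper names are not part of any API.

```python
MASK64 = (1 << 64) - 1

def gs_offset(linear, gsbase):
    """Offset within the GS segment that reaches a given linear address."""
    return (linear - gsbase) & MASK64

def gs_linear(offset, gsbase):
    """Linear address the CPU forms for [gs:offset] (wraps at 2^64)."""
    return (gsbase + offset) & MASK64

gsbase = 0x000007FFFFFDE000   # hypothetical GS base
hwnd = 0x0000000000403000     # hypothetical address of the variable

rbx = (-gsbase) & MASK64             # rdgsbase rbx / neg rbx
effective = (rbx + hwnd) & MASK64    # address computed for [gs:rbx+hwnd]

assert effective == gs_offset(hwnd, gsbase)
assert gs_linear(effective, gsbase) == hwnd   # lands on the variable itself
```

Negating the base and adding it back through the segment cancels out, so `[gs:rbx+hwnd]` ends up at the plain linear address of `hwnd`.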

Feryno

Joined: 23 Mar 2005
Posts: 519
Location: Czech republic, Slovak republic

Feryno 15 Nov 2019, 14:53

As Tomasz pointed out, FS/GS have 64-bit bases (an MSR for the FS base, an MSR for the GS base, an MSR for the kernel GS base); the others (CS, DS, ES, SS) have 32-bit bases in the GDT.
But be careful with the rdgsbase instruction: old processors do not support it. Use CPUID to determine whether the CPU supports it: set eax=7 and ecx=0, execute cpuid, and check bit 0 of the ebx output.
When bit 0 of ebx is set, you must also check CR4, as its bit 16 enables or disables rdfsbase/rdgsbase/wr.... Unfortunately, you can't easily obtain CR4 in user mode, as mov gpr64,cr4 generates #GP(0) there and requires CPL=0 (kernel mode).
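The two bit tests can be written down as a small Python sketch. Python cannot execute cpuid or read CR4, so the register values are assumed to have been captured elsewhere (for example by native or kernel-mode code); the constant and function names are illustrative only.

```python
CPUID7_EBX_FSGSBASE = 1 << 0   # CPUID with eax=7, ecx=0: ebx bit 0
CR4_FSGSBASE = 1 << 16         # CR4 bit 16

def cpu_has_fsgsbase(cpuid7_ebx):
    """True if the CPU implements rdfsbase/rdgsbase/wrfsbase/wrgsbase."""
    return bool(cpuid7_ebx & CPUID7_EBX_FSGSBASE)

def os_enabled_fsgsbase(cr4):
    """True if the OS has enabled those instructions via CR4."""
    return bool(cr4 & CR4_FSGSBASE)

def rdgsbase_usable(cpuid7_ebx, cr4):
    """Both conditions must hold before rdgsbase can run without faulting."""
    return cpu_has_fsgsbase(cpuid7_ebx) and os_enabled_fsgsbase(cr4)

# Hypothetical captured values: the feature bit is set, CR4 bit 16 is clear.
supported_but_disabled = rdgsbase_usable(0x00000001, 0x00000000)   # False
```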

revolution 15 Nov 2019, 14:59

Feryno wrote:

As Tomasz pointed out, FS/GS have 64-bit bases (an MSR for the FS base, an MSR for the GS base, an MSR for the kernel GS base); the others (CS, DS, ES, SS) have 32-bit bases in the GDT.
But be careful with the rdgsbase instruction: old processors do not support it. Use CPUID to determine whether the CPU supports it: set eax=7 and ecx=0, execute cpuid, and check bit 0 of the ebx output.
When bit 0 of ebx is set, you must also check CR4, as its bit 16 enables or disables rdfsbase/rdgsbase/wr.... Unfortunately, you can't easily obtain CR4 in user mode, as mov gpr64,cr4 generates #GP(0) there and requires CPL=0 (kernel mode).

I think this shows the difficulty of determining the existence of instructions in general, not just rdgsbase.

Another approach is to simply execute the instruction and catch any exceptions that might occur. In the exception handler have alternate code to read the value in another way.

If one needs to execute rdgsbase often then do one test execution at program start-up and set a flag to select which sequence to use later.
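The probe-once idea can be sketched in Python as a pattern only. In native code the fault would be caught with the platform's exception mechanism; here a Python exception stands in for it, and every name below (the exception class, the two reader functions, the returned value) is hypothetical.

```python
class UnsupportedInstruction(Exception):
    """Stand-in for the invalid-opcode fault a native probe would catch."""

def make_gsbase_reader(fast_read, fallback_read):
    """Run the fast path once at start-up and remember which path works."""
    try:
        fast_read()
    except UnsupportedInstruction:
        return fallback_read    # the probe faulted: always use the slow path
    return fast_read            # the probe succeeded: use the fast path

# Hypothetical readers, for illustration only.
def read_via_rdgsbase():
    raise UnsupportedInstruction("rdgsbase is not available here")

def read_via_other_means():
    return 0x000007FFFFFDE000   # made-up GS base value

read_gsbase = make_gsbase_reader(read_via_rdgsbase, read_via_other_means)
value = read_gsbase()           # later calls never probe again
```

The decision is made once, so later calls pay no exception-handling cost.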

Tomasz Grysztar 15 Nov 2019, 15:04

Feryno wrote:

As Tomasz pointed out, FS/GS have 64-bit bases (an MSR for the FS base, an MSR for the GS base, an MSR for the kernel GS base); the others (CS, DS, ES, SS) have 32-bit bases in the GDT.

Note that bases in GDT are ignored in long mode, so all segments except for FS/GS have base zero, no matter what you put in descriptor.

Similarly, limit in descriptor is also ignored in long mode - this is why FS/GS can always be used to access entire linear memory, the addresses are simply shifted (according to MSR value) and wrapped around.

Feryno wrote:

But be careful with the instruction rdgsbase - old processors do not support it (...)

Yes, I only used it to make this a simple demonstration. But even if you have no way of reading this MSR value, you can still access the entire addressing space through FS/GS. It could be an interesting challenge to determine the base in a different way, like comparing contents of memory to find what the "shift" is.

Feryno 17 Nov 2019, 21:02

Tomasz, if only things were as clear as they were years ago...
AMD introduced bit 13 in the EFER MSR for Long Mode Segment Limit Enable.
But GS is still excluded from segment limit checks even if bit 13 is enabled.
This caused a difference between AMD and Intel, so only AMD allowed running CPU emulators that used segmentation (e.g. VMware 6). Using segmentation to run emulators was abandoned when emulators switched to virtualization (SVM on AMD platforms, VMX on Intel).
Scanning memory to find the GS base is a very interesting idea. It reminds me of the old days, when scanning memory was the only way to determine whether the A20 gate was enabled or disabled.
