flat assembler
Message board for the users of flat assembler.
FASM: `mov rax, [gs:0x30]` generates address relative to RIP
Tomasz Grysztar 12 Nov 2019, 13:53
With fasmg's implementations this is really easy to tweak. For fasm 1 it might be a bit harder.
ProMiNick 12 Nov 2019, 14:55
I like this variant - placing segment register as prefix
Code:
    ; Template program using process environment block
    format PE64 GUI 5.0
    entry start

    include 'win64w.inc'

    section '.text' code readable executable

    proc start
            ;sub rsp,8 ; Make stack dqword aligned (proc macro already done for us)
            gs mov rbx,[TEB.ProcessEnvironmentBlock]
            lea rsi,[rbx+PEB.OSMajorVersion]
            lea rdi,[_Version]
            lodsd
            add byte[rdi],al
            lodsd
            add byte[rdi+2*sizeof.TCHAR],al
            invoke MessageBox,NULL,rdi,_winVerTitle,MB_OK
            ;add rsp,8 ; restore stack (proc macro already done for us)
            ret
    endp

    section '.data' data readable writeable
            _Version TCHAR '0.0',0
            _winVerTitle TCHAR 'Windows version',0

    section '.idata' import data readable writeable
            library user32,'USER32.DLL'
            include 'os specific/windows/api/x86/user32.inc'
Tomasz Grysztar 12 Nov 2019, 15:38
I made another update of fasmg macros, because this change actually opens up new optimization routes that need to be taken into account:
Code:
    00000000: 8B 05 2A 00 00 00          mov eax,[30h]
    00000006: 65 67 A1 30 00 00 00       mov eax,[gs:30h]
    0000000D: 65 67 A1 FF FF FF FF       mov eax,[gs:000000000FFFFFFFFh]
    00000014: 65 8B 04 25 FF FF FF FF    mov eax,[gs:0FFFFFFFFFFFFFFFFh]
    0000001C: 65 A1 CC CC CC CC CC CC    mov eax,[gs:0CCCCCCCCCCCCCCCCh]
              CC CC
ProMiNick wrote: I like this variant - placing segment register as prefix
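As a rough check of the listing in this post, the following Python sketch models two things: the displacement stored by the RIP-relative form in the first line, and which absolute form seems to be picked for a GS-relative address. The function names and the three-way rule are assumptions read off the five listed encodings, not fasm's actual implementation.

```python
MASK64 = (1 << 64) - 1


def rip_relative_disp(target, insn_addr, insn_len):
    """Return the 32-bit displacement of a RIP-relative operand.

    The CPU adds the displacement to the address of the NEXT instruction,
    so the stored value is target - (insn_addr + insn_len).
    """
    return (target - (insn_addr + insn_len)) & 0xFFFFFFFF


def absolute_form(addr):
    """Guess the form used for `mov eax,[gs:addr]`, judging by the listing.

    Returns (description, total length in bytes including the 65h prefix).
    This rule is an assumption inferred from the listing only.
    """
    addr &= MASK64
    if addr <= 0xFFFFFFFF:
        # 65 67 A1 + 4-byte offset; the 67h prefix makes it a 32-bit,
        # zero-extended address.
        return ("addr32 moffs32", 7)
    if addr >= MASK64 - 0x7FFFFFFF:
        # 65 8B 04 25 + 4-byte displacement, sign-extended to 64 bits,
        # so it reaches only the top 2 GB of the address space.
        return ("SIB disp32", 8)
    # 65 A1 + 8-byte offset; this form exists only for the accumulator.
    return ("moffs64", 10)


if __name__ == "__main__":
    print(hex(rip_relative_disp(0x30, 0x00, 6)))
    for a in (0x30, 0xFFFFFFFF, MASK64, 0xCCCCCCCCCCCCCCCC):
        print(hex(a), absolute_form(a))
```

With the values from the listing, `rip_relative_disp(0x30, 0, 6)` gives 2Ah, which is the displacement seen in `8B 05 2A 00 00 00`.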
Tomasz Grysztar 12 Nov 2019, 20:59
VEG wrote: `mov rax, [gs:0x30]` generates address relative to RIP, as the result it generates broken code.
In fact, I put a lot of effort into ensuring that fasm generates correct code for a given set of assumptions. But this correctness goes only as far as the assumptions (in this case about the base address of code) are fulfilled. If you need fasm to ensure that your code is position-independent, use a relocatable format and keep an eye on the relocation table - there should be no entries there. For example, if you use PE format, you can use this to check your code:
Code:
    reloc:
    data fixups
            assert $ = reloc
    end data
With that being said, I think your suggestion is a worthy improvement of the encoding choices, and I'm working to get it into fasm 1 too.
ProMiNick 12 Nov 2019, 21:07
Tomasz, can thou explain how to apply such optimizations to practice?
The role of segment registers in the x64 world is minimal: their only job is to point to the TEB structure, which fits entirely within 1000h bytes, and the other segment registers are zeroed. But maybe the initially zeroed segment registers (DS, SS, CS, ES) are more interesting for tricks. Thanks in advance.
Tomasz Grysztar 12 Nov 2019, 21:18
ProMiNick wrote: Tomasz, can thou explain how to apply such optimizations to practice?
VEG 13 Nov 2019, 16:08
Seems like the new version of FASM generates working code now for `mov eax, [gs:0x30]`.
A small question, just out of curiosity. I see that getting TIB in the kernelbase.dll is done this way:
Code:
    65 48 8B 04 25 30 00 00 00
IDA disassembly: mov rax, gs:30h
FASM generates this:
Code:
    65 67 48 A1 30 00 00 00
IDA disassembly: mov rax, large gs:30h
Is it basically the same thing, just one byte shorter?
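The two byte sequences can be pulled apart with a few lines of Python. This is only a sketch: `split_instruction` is a hypothetical helper that knows just the two prefixes and the REX range needed here, not a general x86 decoder.

```python
# Only the two legacy prefixes that occur in the sequences above.
LEGACY_PREFIXES = {
    0x65: "GS segment override",
    0x67: "address-size override",
}


def split_instruction(hex_bytes):
    """Split an instruction into prefixes, REX byte, opcode and operand bytes."""
    code = bytes.fromhex(hex_bytes)
    i = 0
    prefixes = []
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        prefixes.append(LEGACY_PREFIXES[code[i]])
        i += 1
    rex = None
    if i < len(code) and 0x40 <= code[i] <= 0x4F:
        rex = code[i]  # 48h means REX.W, i.e. a 64-bit operand
        i += 1
    return {
        "prefixes": prefixes,
        "rex": rex,
        "opcode": code[i],
        "operand_bytes": code[i + 1:],
        "length": len(code),
    }


if __name__ == "__main__":
    for seq in ("65 48 8B 04 25 30 00 00 00", "65 67 48 A1 30 00 00 00"):
        print(split_instruction(seq))
```

Both sequences carry the GS override, REX.W (48h) and the offset 30h. The first uses opcode 8Bh with a ModRM/SIB pair (04 25) that selects an absolute 32-bit displacement; the second uses opcode A1h with a 67h address-size prefix, so the offset takes 4 bytes instead of 8. Both load RAX from gs:30h, and the second is one byte shorter.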
Tomasz Grysztar 13 Nov 2019, 16:18
VEG wrote: Seems like the new version of FASM generates working code now for `mov eax, [gs:0x30]`.
VEG wrote: A small question, just out of curiosity. I see that getting TIB in the kernelbase.dll is done this way:
VEG 13 Nov 2019, 16:25
Thanks =)
ProMiNick 13 Nov 2019, 23:02
side effect:
Code:
    use64
Code:
    mov eax,[ds:$30]
encoded as just a
Code:
    mov eax,[$30]; 67 A1 30 00 00 00
and one more variant:
Code:
    gs mov eax,[ds:$30]
Tomasz Grysztar 14 Nov 2019, 06:48
Oh, you found a bug there, this was only supposed to be affected by FS/GS, not DS. Apparently I should have used JA in place of JAE.
The problem with absolute addressing in long mode is that in general (with the exception of these MOV variants) it cannot cover most of the addressing range, thus it is not a good automatic choice for the main segment addressing (you can still enforce it with a size operator, though).
PS. I updated the 1.73.18 packages. I generally try to avoid "silent updates" nowadays, but this is such a minor correction and I believe the packages have not been propagated yet.
Tomasz Grysztar 14 Nov 2019, 09:54
BTW, did you know that you can use FS/GS to access the entire addressing space nonetheless?
Code:
    rdgsbase rbx
    neg rbx
    mov [gs:rbx+hwnd],rax ; access variable in program image
revolution 14 Nov 2019, 10:18
Why does the code have "neg rbx"?
Tomasz Grysztar 14 Nov 2019, 12:12
revolution wrote: Why does the code have "neg rbx"?
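One way to see the effect of `neg rbx`, going by the remark elsewhere in this thread that FS/GS addresses are simply shifted by the base and wrapped around: negating the base makes the later addition of the base cancel out. The Python model below is a sketch under that assumption; `linear_address`, `access_via_gs` and the sample values are hypothetical.

```python
MASK64 = (1 << 64) - 1


def linear_address(seg_base, effective_address):
    """Model of long-mode FS/GS translation: add the base, wrap to 64 bits."""
    return (seg_base + effective_address) & MASK64


def access_via_gs(gs_base, variable_address):
    """Follow the three-instruction sequence step by step."""
    rbx = gs_base                            # rdgsbase rbx
    rbx = (-rbx) & MASK64                    # neg rbx
    ea = (rbx + variable_address) & MASK64   # effective address of [gs:rbx+hwnd]
    # The CPU then adds the GS base, which cancels the negated copy in rbx.
    return linear_address(gs_base, ea)


if __name__ == "__main__":
    gs_base = 0x000000E35F2A1000  # made-up GS base
    hwnd = 0x0000000000403000     # made-up address of a variable in the image
    print(hex(access_via_gs(gs_base, hwnd)))
```

In this model the result equals the plain address of the variable for any base value, because (base + (-base + hwnd)) mod 2^64 = hwnd.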
Feryno 15 Nov 2019, 14:53
As Tomasz pointed out, FS/GS have 64 bit bases (MSR for FS base, MSR for GS base, MSR for kernel GS base), the others (CS, DS, ES, SS) have 32 bit bases in GDT.
But be careful with the instruction rdgsbase - old processors do not support it (use CPUID to determine whether the CPU supports it: input eax=7, input ecx=0, execute cpuid, check output ebx bit 0). When the cpuid output ebx bit 0 is set to 1, you must then check CR4, as its bit 16 enables/disables rdfsbase/rdgsbase/wr.... Unluckily, you can't easily obtain CR4 in user mode, as mov gpr64,cr4 generates #GP(0) there and requires CPL=0 (kernel mode).
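The two bit tests described above can be written down as a small Python sketch. It only models the bit positions named in the post (CPUID leaf 7, subleaf 0, EBX bit 0, and CR4 bit 16); the register values are assumed to have been obtained elsewhere, since neither CPUID nor CR4 can be read from Python, and the function names are hypothetical.

```python
CPUID7_EBX_FSGSBASE_BIT = 0   # CPUID with eax=7, ecx=0: output ebx bit 0
CR4_FSGSBASE_BIT = 16         # CR4 bit 16 enables the instructions


def cpu_has_fsgsbase(cpuid7_ebx):
    """True if the CPU reports support for rdfsbase/rdgsbase."""
    return bool((cpuid7_ebx >> CPUID7_EBX_FSGSBASE_BIT) & 1)


def os_enabled_fsgsbase(cr4):
    """True if the OS has enabled the instructions in CR4."""
    return bool((cr4 >> CR4_FSGSBASE_BIT) & 1)


def can_execute_rdgsbase(cpuid7_ebx, cr4):
    """Both conditions must hold before rdgsbase can be executed."""
    return cpu_has_fsgsbase(cpuid7_ebx) and os_enabled_fsgsbase(cr4)


if __name__ == "__main__":
    # Made-up register values, for illustration only.
    print(can_execute_rdgsbase(0x00000001, 0x00010000))
    print(can_execute_rdgsbase(0x00000001, 0x00000000))
```

Since CR4 is not readable in user mode, only the first of the two checks is directly available to an ordinary program.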
revolution 15 Nov 2019, 14:59
Feryno wrote: As Tomasz pointed out, FS/GS have 64 bit bases (MSR for FS base, MSR for GS base, MSR for kernel GS base), the others (CS, DS, ES, SS) have 32 bit bases in GDT.
Another approach is to simply execute the instruction and catch any exceptions that might occur. In the exception handler have alternate code to read the value in another way. If one needs to execute rdgsbase often then do one test execution at program start-up and set a flag to select which sequence to use later.
Tomasz Grysztar 15 Nov 2019, 15:04
Feryno wrote: As Tomasz pointed out, FS/GS have 64 bit bases (MSR for FS base, MSR for GS base, MSR for kernel GS base), the others (CS, DS, ES, SS) have 32 bit bases in GDT.
Similarly, limit in descriptor is also ignored in long mode - this is why FS/GS can always be used to access entire linear memory, the addresses are simply shifted (according to MSR value) and wrapped around.
Feryno wrote: But be careful with the instruction rdgsbase - old processors do not support it (...)
Feryno 17 Nov 2019, 21:02
Tomasz, if only things were as clear as they were years ago...
AMD introduced bit 13 in MSR EFER for Long Mode Segment Limit Enable, but GS is still excluded from segment limit checks even when bit 13 is set. This created a difference between AMD and Intel, so only AMD allowed running CPU emulators that use segmentation (e.g. vmware 6). Using segmentation to run emulators was abandoned when emulators switched to virtualization (SVM on the AMD platform / VMX on Intel).
Scanning memory to find the GS base is a very interesting idea. It reminds me of the old days, when scanning memory was the only way to determine whether gate A20 was enabled or disabled.