flat assembler
Message board for the users of flat assembler.

Segment override prefixes in PE64

TrDr.Charlie



Joined: 07 Dec 2010
Posts: 11
TrDr.Charlie 13 May 2012, 09:44
Hello all,

With this syntax, the segment override prefixes (es, cs, ss, ds) are accepted and emitted:

Code:

    es  add                     byte [rax], al                          ; 26 00 00
    cs  add                     byte [rax], al                          ; 2E 00 00
    ss  add                     byte [rax], al                          ; 36 00 00
    ds  add                     byte [rax], al                          ; 3E 00 00
    fs  add                     byte [rax], al                          ; 64 00 00
    gs  add                     byte [rax], al                          ; 65 00 00

    




and with this syntax they are ignored:

Code:

        add                     byte [es: rax], al                      ; 00 00
        add                     byte [cs: rax], al                      ; 00 00
        add                     byte [ss: rax], al                      ; 00 00
        add                     byte [ds: rax], al                      ; 00 00
        add                     byte [fs: rax], al                      ; 64 00 00
        add                     byte [gs: rax], al                      ; 65 00 00

        add                     byte [es: rax + 8*rax + 100h], al       ; 00 84 C0 00 01 00 00
        add                     byte [cs: rax + 8*rax + 100h], al       ; 00 84 C0 00 01 00 00
        add                     byte [ss: rax + 8*rax + 100h], al       ; 00 84 C0 00 01 00 00
        add                     byte [ds: rax + 8*rax + 100h], al       ; 00 84 C0 00 01 00 00
        add                     byte [fs: rax + 8*rax + 100h], al       ; 64 00 84 C0 00 01 00 00
        add                     byte [gs: rax + 8*rax + 100h], al       ; 65 00 84 C0 00 01 00 00

        add                     byte [es: (rax)], al                    ; 00 00
        add                     byte [cs: (rax)], al                    ; 00 00
        add                     byte [ss: (rax)], al                    ; 00 00
        add                     byte [ds: (rax)], al                    ; 00 00
        add                     byte [fs: (rax)], al                    ; 64 00 00
        add                     byte [gs: (rax)], al                    ; 65 00 00

    


Nice day
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 13 May 2012, 09:53
When you use it as an instruction prefix, it is not part of the address; it is a prefix for the instruction that you specify manually (like LOCK, REP, etc.). See the thread on branch hint prefixes for an example of when this feature might be useful. Initially it was added just for compatibility reasons. You can even generate a prefix without any instruction following it, like an isolated DS or REP on a line. You can also put multiple prefixes on one line and all will be generated just as requested.

When you specify the segment inside the address of an instruction, on the other hand, fasm feels free to optimize the encoding of that instruction.
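To make the distinction above concrete, here is a minimal Python sketch of the two paths in 64-bit mode. It is only an illustrative model, not fasm's implementation: the function names and the `SEGMENT_PREFIX` table are assumptions made up for this example, while the prefix byte values are the standard x86 ones and `00 00` is the encoding of `add byte [rax], al` from the first post.

```python
# Hypothetical model of the two syntaxes in 64-bit mode; not fasm's actual code.
SEGMENT_PREFIX = {"es": 0x26, "cs": 0x2E, "ss": 0x36,
                  "ds": 0x3E, "fs": 0x64, "gs": 0x65}

ADD_RAX_AL = bytes([0x00, 0x00])  # add byte [rax], al


def with_manual_prefixes(prefixes, instruction):
    """Prefixes written before the mnemonic are emitted exactly as requested."""
    return bytes(SEGMENT_PREFIX[p] for p in prefixes) + instruction


def with_address_segment(segment, instruction):
    """A segment named inside [...] is part of the address, so in 64-bit mode
    only fs and gs (which still change the address) produce a prefix byte."""
    if segment in ("fs", "gs"):
        return bytes([SEGMENT_PREFIX[segment]]) + instruction
    return instruction


def show(code):
    """Format bytes the way the listings in this thread do."""
    return " ".join(f"{b:02X}" for b in code)


print(show(with_manual_prefixes(["ds"], ADD_RAX_AL)))        # ds add byte [rax], al
print(show(with_address_segment("ds", ADD_RAX_AL)))          # add byte [ds: rax], al
print(show(with_address_segment("fs", ADD_RAX_AL)))          # add byte [fs: rax], al
print(show(with_manual_prefixes(["cs", "ds"], ADD_RAX_AL)))  # cs ds add byte [rax], al
```

In this model the first call keeps the `3E` byte because the prefix was requested manually, while the second drops it because the segment is only part of the address.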



TrDr.Charlie 13 May 2012, 10:29
Many Thanks, Tomasz



TrDr.Charlie 19 May 2012, 16:38
Hi Tomasz and all,

For the CMPS, LODS, MOVS and OUTS instructions, SO (segment override) prefixes are still emitted in PE64.
Only the "default" SO prefix (ds) is ever ignored.

Example (e.g. LODS ):
Code:
        irps seg, es cs ss ds fs gs { irps reg, rsi esi \{ lods byte [seg: reg] \} }
    

Created opcodes:
Code:
        lods                    byte [es: rsi]          ; 00000001000105C0  26 AC
        lods                    byte [es: esi]          ; 00000001000105C2  67 26 AC
        lods                    byte [cs: rsi]          ; 00000001000105C5  2E AC
        lods                    byte [cs: esi]          ; 00000001000105C7  67 2E AC
        lods                    byte [ss: rsi]          ; 00000001000105CA  36 AC
        lods                    byte [ss: esi]          ; 00000001000105CC  67 36 AC
        lods                    byte [rsi]              ; 00000001000105CF  AC
        lods                    byte [esi]              ; 00000001000105D0  67 AC
        lods                    byte [fs: rsi]          ; 00000001000105D2  64 AC
        lods                    byte [fs: esi]          ; 00000001000105D4  67 64 AC
        lods                    byte [gs: rsi]          ; 00000001000105D7  65 AC
        lods                    byte [gs: esi]          ; 00000001000105D9  67 65 AC
    


Solution:
Instead of these 5 lines in SOURCE\X86_64.INC
Code:
      xxxx_store:
        cmp     [segment_register],4 ; 4 = ds SO prefix
        je      xxxx_segment_ok
        call    store_segment_prefix
      xxxx_segment_ok:
    

where xxxx = cmps, lods, movs and outs, use only these 2 lines:
Code:
      xxxx_store:
        call    store_segment_prefix_if_necessary
    

Now only the fs and gs SO prefixes are emitted in PE64.
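The effect of the change can be pictured with a small Python model. This is a sketch under assumptions, not a translation of fasm's source: `lods_old` mimics the "skip only ds" check quoted above, `lods_new` mimics what the post says the replacement routine achieves in 64-bit mode (keep only fs and gs), and both function names are invented for this example.

```python
# Hypothetical model of the LODS byte encodings before and after the change.
SEGMENT_PREFIX = {"es": 0x26, "cs": 0x2E, "ss": 0x36,
                  "ds": 0x3E, "fs": 0x64, "gs": 0x65}
ADDRESS_SIZE_PREFIX = 0x67  # emitted when esi is used instead of rsi
LODSB = 0xAC


def lods_old(segment, reg):
    """Before the change: only the default segment (ds) is skipped."""
    out = [ADDRESS_SIZE_PREFIX] if reg == "esi" else []
    if segment != "ds":
        out.append(SEGMENT_PREFIX[segment])
    return bytes(out + [LODSB])


def lods_new(segment, reg):
    """After the change: only fs and gs prefixes are kept in 64-bit mode."""
    out = [ADDRESS_SIZE_PREFIX] if reg == "esi" else []
    if segment in ("fs", "gs"):
        out.append(SEGMENT_PREFIX[segment])
    return bytes(out + [LODSB])


# Print both encodings for every segment/register pair from the listing above.
for seg in SEGMENT_PREFIX:
    for reg in ("rsi", "esi"):
        print(f"lods byte [{seg}: {reg}]",
              lods_old(seg, reg).hex(), "->", lods_new(seg, reg).hex())
```

For es, cs and ss the old column matches the opcodes listed above, and the new column drops the segment byte; the fs and gs rows are the same in both.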

Nice day and many thanks



Tomasz Grysztar 20 May 2012, 09:32
You are right, the handling of these four instructions was inconsistent. Thank you for noticing.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20423
Location: In your JS exploiting you and your system
revolution 20 May 2012, 09:49
Tomasz: With the automatic removal of segment prefixes I wonder if in the future it can cause a problem. It is not entirely impossible for a "redundant" prefix to signal some other operation like non-temporal loads/stores. One never can be sure what future CPUs will implement.



Tomasz Grysztar 20 May 2012, 10:37
For this purpose you have the manual prefixes (see my first post above). The segment name inside the square brackets is a part of the address and is not supposed to mean anything else.
l_inc



Joined: 23 Oct 2009
Posts: 881
l_inc 22 May 2012, 13:49
Tomasz Grysztar
Quote:
The segment name inside the square brackets is a part of the address and is not supposed to mean anything else.

Yes. And by removing the prefix, you're changing the address, because from the processor's point of view a program always works with logical addresses (not with linear addresses), which consist of an offset and a segment selector. Even though the linear address for 64-bit mode would be the same no matter what segment register I use (disregarding fs and gs), the logical address would still be different.

Assume I've loaded a selector for a non-writable segment into es, and now a program tries to write with the es prefix. It makes a big difference whether the "optimization" is performed or not, even though the linear address is the same. So if you want fasm to be operation-oriented, I think you should preserve the prefix.



Tomasz Grysztar 22 May 2012, 14:34
l_inc wrote:
Yes. And by removing the prefix, you're changing the address, because from the processor's point of view a program always works with logical addresses (not with linear addresses), which consist of an offset and a segment selector. Even though the linear address for 64-bit mode would be the same no matter what segment register I use (disregarding fs and gs), the logical address would still be different.

See what AMD manuals say about it.
AMD64 Architecture Programmer’s Manual Volume 3, appendix B, section B7 wrote:
In 64-bit mode, the CS, DS, ES, SS segment-override prefixes have no effect. These four prefixes are no longer treated as segment-override prefixes in the context of multiple-prefix rules. Instead, they are treated as null prefixes.

The Intel manuals contain no such directly stated passage, even though in many places the phrasing suggests that this behavior is assumed. It looks almost as if the Intel manuals were written under the assumption that the reader already knows the AMD64 specification. Only some specific cases are described clearly:
Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, section 3.3.7.1 wrote:
If an instruction uses base registers RSP/RBP and uses a segment override prefix to specify a non-SS segment, a canonical fault generates a #GP (instead of an #SF). In 64-bit mode, only FS and GS segment-overrides are applicable in this situation. Other segment override prefixes (CS, DS, ES and SS) are ignored.
etc.

l_inc wrote:
Assume I've loaded a selector for a non-writable segment into es, and now a program tries to write with the es prefix.
Even if we didn't know yet that the ES prefix is ignored in long mode, this case could still be clarified by another excerpt from the manuals:
AMD64 Architecture Programmer’s Manual Volume 1, section 2.1.2 wrote:
For references to the DS, ES, or SS segments in 64-bit mode, the processor assumes that the base for each of these segments is zero, neither their segment limit nor attributes are checked, and the processor simply checks that all such addresses are in canonical form.
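As a rough illustration of the quoted passages, the following Python sketch models how the linear address comes out in 64-bit mode. It is a simplification under stated assumptions: 48-bit virtual addresses are assumed for the canonical check, the FS and GS base values are made-up numbers, the `ValueError` merely stands in for the fault the processor would raise, and the function names are invented for this example.

```python
# Simplified model of 64-bit address formation; not a description of any real API.
MASK64 = (1 << 64) - 1


def is_canonical(addr, va_bits=48):
    """True when bits 63..va_bits-1 are all copies of bit va_bits-1."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1


def linear_address(segment, offset, bases):
    """CS, DS, ES and SS are treated as having base zero with no limit or
    attribute checks; only FS and GS contribute their base."""
    base = bases.get(segment, 0) if segment in ("fs", "gs") else 0
    addr = (base + offset) & MASK64
    if not is_canonical(addr):
        raise ValueError("non-canonical address")
    return addr


# Hypothetical base values, chosen only to make the difference visible.
example_bases = {"es": 0x5000, "fs": 0x1000, "gs": 0x2000}

for seg in ("es", "cs", "ss", "ds", "fs", "gs"):
    print(seg, hex(linear_address(seg, 0x100, example_bases)))
```

Even though `example_bases` carries a value for es, the model ignores it, which is the point of the AMD excerpt: an es, cs, ss or ds override cannot change where the access lands.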



l_inc 22 May 2012, 15:49
Tomasz Grysztar
Oh. Sorry. I normally read only the Intel manuals. So according to "5.4 Type Checking" I assumed that type checking also applies to 64-bit mode, because this section does not contain a separate subsection for 64-bit mode. However, I've just checked the second volume, and mov, for example, does not generate a #GP based on type checking.

Thank you for the clarification.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
