flat assembler
Message board for the users of flat assembler.
Compiler Internals > Segment override prefixes in PE64
Tomasz Grysztar 13 May 2012, 09:53
When you use it as an instruction prefix, it is not part of the address; it is a prefix for the instruction that you specify manually (like LOCK, REP, etc.). See the thread on branch hint prefixes for an example of when this feature might be useful. Initially it was added just for compatibility reasons. You can even generate a prefix without any instruction following it, like an isolated DS or REP in a line. You can also put multiple prefixes in one line and all will be generated just as requested.
When you specify the segment inside the address in an instruction, on the other hand, fasm feels free to optimize the encoding of that instruction.
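A minimal sketch of the distinction described above, assuming 64-bit code (`use64`). The label name `target` is made up for illustration, and the byte values in the comments are the standard x86 prefix opcodes that one would expect to see, not verified assembler output:

```asm
use64

        ; Segment written as a manual prefix: generated exactly as requested.
        ds jz   target          ; 3E prefix is kept (the branch-hint use case)
        es lodsb                ; 26 prefix is kept, even if the CPU treats it as a null prefix
        ds                      ; an isolated prefix on its own line is generated too
        nop

        ; Segment written inside the address: fasm may optimize the encoding.
        mov     eax,[ds:rbx]    ; DS is the default segment here, so the prefix may be dropped
        mov     eax,[fs:rbx]    ; FS is meaningful in 64-bit mode, so a 64 prefix is expected
target:
```

Assembling this and looking at the output in a hex viewer should show whether a given fasm version keeps or drops each prefix.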
TrDr.Charlie 13 May 2012, 10:29
Many Thanks, Tomasz
TrDr.Charlie 19 May 2012, 16:38
Hi Tomasz and all,
For the CMPS, LODS, MOVS and OUTS instructions, SO (segment override) prefixes are still allowed in PE64. Only the "default" SO prefix is always ignored. Example (e.g. LODS):
Code:
irps seg, es cs ss ds fs gs { irps reg, rsi esi \{ lods byte [seg: reg] \} }
Created opcodes:
Code:
lods byte [es: rsi] ; 00000001000105C0 26 AC
lods byte [es: esi] ; 00000001000105C2 67 26 AC
lods byte [cs: rsi] ; 00000001000105C5 2E AC
lods byte [cs: esi] ; 00000001000105C7 67 2E AC
lods byte [ss: rsi] ; 00000001000105CA 36 AC
lods byte [ss: esi] ; 00000001000105CC 67 36 AC
lods byte [rsi]     ; 00000001000105CF AC
lods byte [esi]     ; 00000001000105D0 67 AC
lods byte [fs: rsi] ; 00000001000105D2 64 AC
lods byte [fs: esi] ; 00000001000105D4 67 64 AC
lods byte [gs: rsi] ; 00000001000105D7 65 AC
lods byte [gs: esi] ; 00000001000105D9 67 65 AC
Solution: in SOURCE\X86_64.INC, instead of these 5 lines
Code:
xxxx_store:
    cmp [segment_register],4 ; 4 = ds SO prefix
    je xxxx_segment_ok
    call store_segment_prefix
xxxx_segment_ok:
where xxxx = cmps, lods, movs and outs, use only 2 lines:
Code:
xxxx_store:
    call store_segment_prefix_if_necessary
With this change, only the fs and gs SO prefixes are allowed in PE64. Have a nice day, and many thanks.
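A small test file along these lines could be used to check the effect of the change. It assumes flat binary output with `use64` rather than a full PE64 file, so that the bytes can be read directly in a hex viewer. The "before" bytes are taken from the listing in this post, and the "after" bytes are what the post's conclusion implies, not verified output:

```asm
use64

        lods    byte [es:rsi]   ; before the change: 26 AC; afterwards AC is expected
        lods    byte [ds:rsi]   ; AC in both cases (default segment)
        lods    byte [fs:rsi]   ; 64 AC in both cases
        lods    byte [gs:esi]   ; 67 65 AC in both cases
```

The same check can be repeated for CMPS, MOVS and OUTS, since all four instructions shared the same code pattern.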
Tomasz Grysztar 20 May 2012, 09:32
You are right, this was inconsistent in the handling of these four instructions. Thank you for noticing.
revolution 20 May 2012, 09:49
Tomasz: I wonder whether the automatic removal of segment prefixes could cause a problem in the future. It is not entirely impossible for a "redundant" prefix to signal some other operation, like non-temporal loads/stores. One can never be sure what future CPUs will implement.
Tomasz Grysztar 20 May 2012, 10:37
For this purpose you have the manual prefixes (see my first post above). The segment name inside the square brackets is a part of the address and is not supposed to mean anything else.
l_inc 22 May 2012, 13:49
Tomasz Grysztar
Quote: The segment name inside the square brackets is a part of the address and is not supposed to mean anything else.
Yes. And by removing the prefix, you're changing the address, because from the processor's point of view a program always works with logical addresses (not with linear addresses), which consist of an offset and a segment selector. Even though the linear address in 64-bit mode would be the same no matter what segment register I use (disregarding fs and gs), the logical address would still be different.
Assume I've loaded a selector for a non-writable segment into es, and now a program tries to write with the es prefix. It makes a big difference whether the "optimization" is performed or not, even though the linear address is the same. So if you want to make fasm operation-oriented, I think you should preserve the prefix.
Tomasz Grysztar 22 May 2012, 14:34
l_inc wrote: Yes. And by removing the prefix, you're changing the address, because from the processor's point of view a program always works with logical addresses (not with linear addresses), which consist of an offset and a segment selector. Even though the linear address in 64-bit mode would be the same no matter what segment register I use (disregarding fs and gs), the logical address would still be different.
See what the AMD manuals say about it.
AMD64 Architecture Programmer’s Manual Volume 3, appendix B, section B7 wrote: In 64-bit mode, the CS, DS, ES, SS segment-override prefixes have no effect. These four prefixes are no longer treated as segment-override prefixes in the context of multiple-prefix rules. Instead, they are treated as null prefixes.
The Intel manuals contain no such directly stated passage, even though in many places the phrasing suggests that this behavior is assumed. It looks almost as if the Intel manuals were written under the assumption that the reader already knows the AMD64 specification. Only some specific cases are described clearly:
Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, section 3.3.7.1 wrote: If an instruction uses base registers RSP/RBP and uses a segment override prefix to specify a non-SS segment, a canonical fault generates a #GP (instead of an #SF). In 64-bit mode, only FS and GS segment-overrides are applicable in this situation. Other segment override prefixes (CS, DS, ES and SS) are ignored.
l_inc wrote: Assume, I've loaded a selector for a non-writable segment into es. And now a program tries to write with the es-prefix.
AMD64 Architecture Programmer’s Manual Volume 1, section 2.1.2 wrote: For references to the DS, ES, or SS segments in 64-bit mode, the processor assumes that the base for each of these segments is zero, neither their segment limit nor attributes are checked, and the processor simply checks that all such addresses are in canonical form.
l_inc 22 May 2012, 15:49
Tomasz Grysztar
Oh. Sorry. I normally read only Intel manuals. So, according to "5.4 Type Checking", I assumed that type checking also applies to 64-bit mode, because this section does not contain a separate subsection for 64-bit mode. However, I've just checked the second volume, and e.g. mov does not generate a #GP based on type checking. Thank you for the clarification.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.