flat assembler
Message board for the users of flat assembler.
Instruction space
bitRAKE 01 Jun 2023, 17:16
The world of instruction encoding in the x86 architecture is indeed multi-layered and intricate.
Terms such as "ignored" and "NOP" (No Operation) may seem similar at first glance, as both imply no effect on execution. However, there's a subtle distinction. A NOP is a complete, defined instruction whose effect is to do nothing. "Ignored" refers to certain bits, bytes, or prefixes that the processor overlooks during decoding or execution. A common example is the 2-bit 'mode' (mod) field in the ModR/M byte of certain instructions, but there are myriad other instances. The F2 prefix with CALL, RET, JMP, Jcc is another case where the prefix is ignored. In essence, "ignoring" parts of an encoding suggests that Intel has yet to finalize the usage of these aspects, so multiple encodings yield identical results.

On the other hand, the terms "reserved" and "#UD" (the invalid-opcode exception) both mark encodings that should be avoided, but they have different implications. "Reserved" often pertains to bits or prefixes that are earmarked for future use, with the documentation frequently stating they "must be zero". "#UD" is what the processor raises for encodings that don't map to any valid or defined instruction.

Adding to the complexity, the documentation can be inconsistent. For example, the 2-bit 'mode' field of the ModR/M byte may be described as "reserved" in some instances and "ignored" in others, which can certainly be perplexing.

Reusing "#UD" opcodes and "ignored" bits is a way of extending the instruction set while maintaining backward compatibility. Old software can still run on new processors because the previously undefined or ignored encodings weren't used by that software. However, redefining these encodings can make it harder to write new software that also runs on old processors, because the new instructions won't be recognized by those processors. This tension between backward compatibility and forward progress is a recurring theme in the evolution of the x86 architecture.
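To make the ModR/M example above a little more concrete, here is a small Python sketch that splits a ModR/M byte into its mod, reg and r/m fields and lists the four bytes that differ only in mod. It assumes the case the Intel manual describes for MOV to/from control registers, where the mod field is ignored, and uses `mov eax, cr0` (0F 20 C0) as the sample. The helper names `split_modrm` and `with_mod` are made up for this sketch.

```python
def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("ModR/M must fit in one byte")
    mod = (byte >> 6) & 0b11    # bits 7..6
    reg = (byte >> 3) & 0b111   # bits 5..3
    rm = byte & 0b111           # bits 2..0
    return mod, reg, rm


def with_mod(byte, mod):
    """Return the same ModR/M byte with its 2-bit mod field replaced."""
    return (byte & 0x3F) | ((mod & 0b11) << 6)


# mov eax, cr0 is normally assembled as 0F 20 C0: mod=11, reg=0 (CR0), rm=0 (EAX).
CANONICAL = 0xC0

# Assumption: for this opcode the mod bits are ignored, so these four
# ModR/M bytes would all name the same operands.
variants = [with_mod(CANONICAL, mod) for mod in range(4)]

for v in variants:
    mod, reg, rm = split_modrm(v)
    print(f"0F 20 {v:02X}  mod={mod:02b} reg={reg} rm={rm}")
```

Only the mod column changes between the printed lines; reg and r/m, which select the registers, stay the same. Whether a given processor really treats all four bytes alike is exactly the kind of thing the manuals have to be consulted for.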
In theory, a previously defined instruction encoding could be made invalid (#UD, the invalid-opcode exception) in newer processor models and then repurposed for something else later. However, this practice is generally avoided because it can cause significant compatibility issues. Once an encoding is made #UD in a certain processor model, redefining it for newer models creates a "compatibility gap": software using the newly repurposed instruction would work on both very old processors (where it's ignored) and very new ones (where it's defined), but not on the intermediate ones (where it's #UD).

Over time, CPU manufacturers, including Intel and AMD, have been somewhat vague about the specifics and invariants of their systems. This ambiguity has led various organizations to compile their own x86 specifications for programmatic reasoning about software. The intricacies of the x86 architecture and its instruction set demand careful interpretation for accurate software analysis and development.
_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
revolution 01 Jun 2023, 18:11
I would interpret "ignored" to mean I can freely choose any bit combination I want. They are ignored, so they have no consequences.
I would interpret "reserved" to mean I must only use the stated values. They might be allocated for new functions in the future, so consequences are to be anticipated.
They seem quite opposite to me.
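One way to put that distinction into code, as a rough sketch: an encoder-side check that lets ignored bits take any value but insists that reserved bits hold their documented value. The 8-bit field layout and the names `FieldRules`, `is_acceptable` and `same_meaning` are hypothetical, invented for illustration, and not taken from any manual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRules:
    ignored_mask: int     # bits documented as ignored: any value is fine
    reserved_mask: int    # bits documented as reserved
    reserved_value: int   # the only value the documentation permits for them


def is_acceptable(byte, rules):
    """True if every reserved bit holds its documented value."""
    return (byte & rules.reserved_mask) == rules.reserved_value


def same_meaning(a, b, rules):
    """True if two bytes differ only in ignored bits."""
    keep = ~rules.ignored_mask & 0xFF
    return (a & keep) == (b & keep)


# Hypothetical layout: bits 7..6 ignored, bit 0 reserved and must be zero.
RULES = FieldRules(ignored_mask=0b1100_0000,
                   reserved_mask=0b0000_0001,
                   reserved_value=0)

print(is_acceptable(0b1100_0100, RULES))              # ignored bits set: still acceptable
print(is_acceptable(0b0000_0101, RULES))              # reserved bit set: rejected
print(same_meaning(0b1100_0100, 0b0000_0100, RULES))  # differ only in ignored bits
```

The check never looks at the ignored bits, which is the "no consequences" half of the interpretation. The reserved bits are compared against a single permitted value, which is the "consequences are to be anticipated" half.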
bitRAKE 02 Jun 2023, 03:29
Your interpretation seems very rational. If we look historically - where do new instructions come from? The rational interpretation would suggest they should all come from a "reserved" space.
revolution 02 Jun 2023, 04:01
Current CPUs might choose to ignore values documented as reserved. So someone might think it is okay to use other values because testing showed the bits are not used. Which, of course, would be folly for anyone wanting to make the code future-compatible.
It might be fine if you absolutely know the code will only ever run on a restricted set of machines fully under the control of the author. But things like that tend to get an expanded scope sometime later, and decisions of old tend to get forgotten, and suddenly the code is expected to run on different machines. Then much hair-pulling ensues because it keeps randomly erroring even though the newer machines are stated to be 100% compatible with the older machines.
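The failure mode described here can be mimicked with a toy model. Below, two made-up decoders stand in for an older and a newer CPU: the older one happens to ignore a reserved bit, the newer one gives it a meaning. Everything in it (`decode_v1`, `decode_v2`, the one-byte "instruction" and its bit layout) is hypothetical and only meant to show why a test on today's hardware proves nothing about tomorrow's.

```python
RESERVED_BIT = 0b1000_0000   # hypothetical: documented as reserved, must be zero
VALUE_MASK = 0b0111_1111     # hypothetical: the operand bits


def decode_v1(byte):
    """Older model: does not look at the reserved bit at all."""
    return ("load", byte & VALUE_MASK)


def decode_v2(byte):
    """Newer model: the formerly reserved bit now selects a new operation."""
    op = "load_negated" if byte & RESERVED_BIT else "load"
    return (op, byte & VALUE_MASK)


compliant = 0b0000_0101      # reserved bit left at zero, as documented
sloppy = 0b1000_0101         # reserved bit set because "testing showed it is unused"

for name, decode in (("v1", decode_v1), ("v2", decode_v2)):
    print(name, decode(compliant), decode(sloppy))
```

On the v1 model both bytes decode identically, so any amount of testing there passes. On the v2 model only the compliant byte keeps its old meaning, even though v2 is fully compatible with everything v1 documented.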
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.