flat assembler
Message board for the users of flat assembler.
xor ax, 1 and xor ax,1
uri 28 Jul 2005, 10:28
Sorry, I already know.
For the first I must write "xor ax,1", for the second "xor ax, word 1".
uri 28 Jul 2005, 14:50
No, this question is not closed.
Another situation: I want to assemble the instruction "xor bh, al". But there are two possible encodings:
Code:
    32 F8    xor bh, al
    30 C7    xor bh, al
Here I can't specify any types, because both instructions have the same operand types:
Code:
    32 /r    XOR r8,r/m8
    30 /r    XOR r/m8,r8
How can I select the opcode? Only via macros?
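A minimal sketch of where the two byte sequences come from, assuming only the standard ModRM layout (mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0). The helper names `modrm_reg_reg` and `encode_xor_r8` are made up for illustration; this is not how fasm itself is implemented.

```python
# Register numbers of the 8-bit registers as they appear in the ModRM byte.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def modrm_reg_reg(reg, rm):
    """Build a ModRM byte with mod=11 (both operands are registers)."""
    return 0xC0 | (reg << 3) | rm


def encode_xor_r8(dst, src, form):
    """Encode 'xor dst, src' for two 8-bit registers in the requested form."""
    d, s = REG8[dst], REG8[src]
    if form == "r8,r/m8":
        # Opcode 32 /r: the destination sits in the reg field.
        return bytes([0x32, modrm_reg_reg(d, s)])
    if form == "r/m8,r8":
        # Opcode 30 /r: the destination sits in the r/m field.
        return bytes([0x30, modrm_reg_reg(s, d)])
    raise ValueError("unknown form: " + form)


print(encode_xor_r8("bh", "al", "r8,r/m8").hex())  # 32f8
print(encode_xor_r8("bh", "al", "r/m8,r8").hex())  # 30c7
```

Both results perform the same operation; only the roles of the reg and r/m fields are swapped, which is why the source text alone cannot distinguish them.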
Tomasz Grysztar 28 Jul 2005, 14:52
In this case (as opposed to the previous one) these two encodings are fully functionally equivalent, so the assembler always chooses only one of them. It can even be considered a kind of footprint for the assembler.
|
uri 28 Jul 2005, 15:06
Yes, fully functionally equivalent, but for polymorphic code it's very important.
OK, can you say which opcode fasm will select when one instruction has two different opcodes? Is there some rule, or does it depend only on the instruction?
THEWizardGenius 28 Jul 2005, 19:00
Well I would assume that the opcodes chosen are those that use the smallest code. However, when they are the same size, same speed, and do EXACTLY the same thing, it is up to the compiler. You can ask which is used, or check the source code, or maybe it is documented somewhere else (maybe FASM internals documentation? I don't know).
Matrix 28 Jul 2005, 21:55
uri wrote: No, this question not closed.
hi
what do you say on
Code:
    32 F8    xor bh,< al
    30 C7    xor bh,> al
    32 F8    xor bh, al
my only idea is to make something like an r8 direction
Code:
    32 /r    XOR r8 , r/m8    so <
    30 /r    XOR r/m8 , r8    so >
    32 F8    xor bh,< al
    30 C7    xor bh,> al
and if the direction '<' '>' is not specified, then use
    32 F8    xor bh,< al
    32 /r    XOR r8 , r/m8    so <
because the first parameter is r8.
This would mean that ',<' and ',>' would define which one to use, in addition to ','.
Tomasz Grysztar 28 Jul 2005, 22:13
Isn't it better to write just machine code for such applications? The assembly language is there to provide you abstraction from encodings and focuses on functionality of instructions (at least, fasm's syntax is based on such a vision of assembly language).
|
LocoDelAssembly 28 Jul 2005, 22:39
166C:0100 2D0010      SUB AX,1000
166C:0103 81E80010    SUB AX,1000
The first one is sub accum, immediate and the second is sub reg, immediate.
Is there a way to force "sub" to be assembled with the reg operand instead of accum? I can't remember where I read that using 2D instead of 81 E8 does not pair in some situations on the Pentium 1.
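For reference, the two listings above can be reproduced with a short sketch, assuming 16-bit code and that the 1000 in the listing is hexadecimal (as in DEBUG output). The function names are hypothetical and only illustrate the two opcode forms, 2D iw and 81 /5 iw.

```python
import struct


def sub_ax_imm16_accumulator(imm):
    """2D iw: SUB AX, imm16, the short form that exists only for the accumulator."""
    return bytes([0x2D]) + struct.pack("<H", imm & 0xFFFF)


def sub_r16_imm16_general(reg, imm):
    """81 /5 iw: SUB r/m16, imm16, the general form (reg is the register number, AX = 0)."""
    # mod=11 (register operand), reg field=5 (the /5 opcode extension), r/m=register
    modrm = 0xC0 | (5 << 3) | reg
    return bytes([0x81, modrm]) + struct.pack("<H", imm & 0xFFFF)


print(sub_ax_imm16_accumulator(0x1000).hex())   # 2d0010
print(sub_r16_imm16_general(0, 0x1000).hex())   # 81e80010
```

The general form is one byte longer because it needs a ModRM byte to name the register; the accumulator form implies AX in the opcode itself.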
LocoDelAssembly 28 Jul 2005, 23:16
Well I remember now but it wasn't "sub"
http://www.codingnow.com/2000/download/pentopt.htm#26_13
Quote:
[edit]
166C:0100 A33412      MOV [1234],AX
166C:0103 89063412    MOV [1234],AX
[/edit]
Last edited by LocoDelAssembly on 29 Jul 2005, 01:47; edited 1 time in total
comrade 29 Jul 2005, 01:39
there are also two ways to encode:
Code:
    33C0    xor eax,eax
    31C0    xor eax,eax
The first one is common across MS products and others I have seen (Borland). Actually I have only seen the second one in FASM.
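The relationship between such pairs can be sketched in a few lines. This assumes a two-byte register-to-register instruction from the classic ALU group (ADD, OR, ADC, SBB, AND, SUB, XOR, CMP), where bit 1 of the opcode selects the direction; `flip_direction` is a hypothetical helper, not part of any assembler.

```python
def flip_direction(code):
    """Return the equivalent encoding of a 2-byte reg,reg ALU instruction.

    Toggling bit 1 of the opcode swaps the roles of the ModRM reg and r/m
    fields, so the two fields are swapped as well to keep the same operation.
    """
    if len(code) != 2:
        raise ValueError("expected exactly two bytes")
    opcode, modrm = code
    # Accept only opcodes 00-03, 08-0B, 10-13, ..., 38-3B (the /r ALU forms).
    if opcode & 0xC4:
        raise ValueError("not a reg/rm ALU opcode")
    if modrm >> 6 != 0b11:
        raise ValueError("only register-to-register forms have a twin")
    reg = (modrm >> 3) & 7
    rm = modrm & 7
    return bytes([opcode ^ 0x02, 0xC0 | (rm << 3) | reg])


print(flip_direction(bytes([0x33, 0xC0])).hex())  # 31c0
print(flip_direction(bytes([0x32, 0xF8])).hex())  # 30c7
```

Applying the function twice returns the original bytes, which is what makes the choice between the two usable as a footprint or as a source of variation in polymorphic code.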
comrade 29 Jul 2005, 02:24
and yes, these different encodings are used by assemblers as a footprint
I heard somewhere that unregistered copies of a86 or a386 made such special encodings, so the author could track down those who distributed programs that were compiled with unregistered a86... damn capitalists
MCD 29 Jul 2005, 09:01
This is just a completely useless discussion for the overwhelming majority of programs. It's only important for polymorphic code...but anyway, which sensible software uses that?
Matrix 29 Jul 2005, 09:33
well yes, it is only a difference in the machine code,
but anyway it could be added to the feature requests, at a lower priority, at least for completeness.
another option is to define a typecast (r8 r16 r32 r64 m8 m16 m32 m64), so the syntax wouldn't have to be changed with other character combinations.
example
Code:
    32 F8    xor bh,< al
    30 C7    xor bh,> al
    32 F8    xor bh, al
instead this could be written (advanced typecast)
Code:
    32 /r    XOR r8 , r/m8    so <
    30 /r    XOR r/m8 , r8    so >
    32 F8    xor r8 bh, al
    30 C7    xor bh, r8 al
    32 F8    xor bh, al
Madis731 02 Aug 2005, 13:40
r8 is reserved in 64-bit systems, or was it r08?
vid 02 Aug 2005, 15:59
when speaking about it - there is a bigger problem. you cannot force size of constant operand when there are multiple possibilities, like:
    cmp eax,0
0 here gets encoded as a byte (instruction cmp r32,c8), where c8 gets sign-extended. But if you aren't sure whether the constant fits in a byte, you cannot force it, like
    cmp eax,byte 0
where the assembler says it cannot assemble it, even though it assembles it to a byte. This forces us to hardcode the instruction in self-overwriting code, which is against "The assembly language is there to provide you abstraction from encodings and focuses on functionality of instructions".
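The choice described here can be sketched as follows, assuming 32-bit code and the two general forms 83 /7 ib and 81 /7 id (the accumulator-only short form 3D id is left out for simplicity). `encode_cmp_eax_imm` and its `force_long` flag are hypothetical names used only to show the decision, not fasm's actual logic.

```python
import struct


def to_signed32(value):
    """Interpret a value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def fits_in_signed_byte(value):
    """True if the 32-bit value can be stored as a sign-extended imm8."""
    return -128 <= to_signed32(value) <= 127


def encode_cmp_eax_imm(value, force_long=False):
    """Encode 'cmp eax, value', preferring the short imm8 form when possible."""
    modrm = 0xC0 | (7 << 3) | 0  # mod=11, /7 selects CMP, r/m=EAX -> F8
    if fits_in_signed_byte(value) and not force_long:
        # 83 /7 ib: the byte is sign-extended to 32 bits by the processor.
        return bytes([0x83, modrm]) + struct.pack("<b", to_signed32(value))
    # 81 /7 id: full 32-bit immediate.
    return bytes([0x81, modrm]) + struct.pack("<I", value & 0xFFFFFFFF)


print(encode_cmp_eax_imm(0).hex())                   # 83f800
print(encode_cmp_eax_imm(0, force_long=True).hex())  # 81f800000000
print(encode_cmp_eax_imm(0x1234).hex())              # 81f834120000
```

For self-modifying code the point is the instruction length: the short form is 3 bytes and the long form 6, so code that patches the immediate later needs to know which one was emitted.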
Tomasz Grysztar 02 Aug 2005, 16:17
Because it "focuses on functionality" it cannot allow "cmp eax,byte 0", because clearly the operand sizes don't match here, and this is likely to mean a mistake by the programmer (like if you do "cmp eax,[var]" with a byte variable, where fasm also detects an error). If you aren't sure whether the constant fits in a byte and you need to be sure that it does, just use the "if" directive to check it (or something like "cmp eax,value and 07Fh" if you care only about the instruction code, not overflows).
The other case is direct jumps and calls, as the size operator in this case doesn't apply to the size of the target address (as this actually depends on the processor's mode rather than on the instruction itself), but to the size of the relative address displacement. Thus the size operators for all possible values of displacements are allowed there.
And as for "cmp eax,byte 0" not being allowed, consider also this kind of code:
Code:
    var db ?
    cmp [var],al
If we change the "db" to "dd" in the "var" definition, we will get an error, for obvious reasons (see Design Principles). And thus it was clear to me that "cmp [var],byte 0" should cause the same kind of error when we change "db" to "dd", since it's just analogous. And then, to allow choosing between shorter and longer encodings, I had to make the scheme explained in section 1.2.6 of the manual.
You know, arguments like the ones you used were all considered by me when I was designing fasm's syntax, and it actually took me more time to think over the syntax than to implement it. I did not write fasm in a "let's get it working and we will worry about details later" way.
Edit from the future: The actual operation the processor performs is on full-sized values, even if the opcode has the immediate in a compressed form (which is sign-extended for the actual operation). With things like AVX-512 displacement compression this understanding becomes even more apparent. And the separation of abstraction layers is something that I have been pushing further and further during fasm's development. I wanted the assembly syntax to define precisely the operation to be performed, but leave the choice of the encoding to the assembler, and only allow influencing the choice through additional options, ideally completely separate syntactically (and they should only be needed for unusual purposes, as normally it's what the instruction does that matters).
For this reason I've been moving away even from the old fasm scheme of enforcing the large immediate opcode by placing a size operator before the value. In fasmg's implementation of x86 instructions it's no longer there, the justification being that it can easily be done through additional tweaks to the encoder, separate from the main content of the assembly code, which should be focused solely on defining the operations to perform.
Last edited by Tomasz Grysztar on 19 Dec 2021, 13:04; edited 1 time in total
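The point about compressed immediates can be illustrated with a small sketch of the sign extension step, assuming a 32-bit operand size. `sign_extend_imm8` is a hypothetical helper that mimics what the processor does with an imm8 before the operation; it also shows why masking with 07Fh always yields a value that survives the short encoding unchanged.

```python
def sign_extend_imm8(byte, bits=32):
    """Widen an 8-bit immediate to the full operand size, as the CPU does."""
    byte &= 0xFF
    if byte & 0x80:
        byte -= 0x100                 # reinterpret as a negative number
    return byte & ((1 << bits) - 1)   # wrap into the operand width


print(hex(sign_extend_imm8(0x7F)))  # 0x7f
print(hex(sign_extend_imm8(0x80)))  # 0xffffff80
print(hex(sign_extend_imm8(0xFF)))  # 0xffffffff

# Any value masked with 07Fh has bit 7 clear, so the short form keeps it intact.
value = 0x12345
masked = value & 0x7F
print(sign_extend_imm8(masked) == masked)  # True
```

So "83 F8 FF" compares EAX with 0FFFFFFFFh, not with 0FFh: the byte in the opcode is only a compact way of storing the full-sized operand.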
vid 02 Aug 2005, 16:57
Quote: Because it "focuses on functionality" it cannot allow "cmp eax,byte 0" because clearly the operand sizes doesn't match here, and this is likely to mean a mistake by programmer
I still don't agree, operand sizes don't ALWAYS have to match. Just like with movzx, if there is an instruction which allows different sizes of operands, then the assembler should be able to interpret it.
Quote: And thus it was clear for me that "cmp [var],byte 0" should cause the same kind of error when we change "db" to "dd", since it's just analoguous.
It just isn't. The processor doesn't have an instruction cmp-r16-r8, so an error here is all right, but it does have the instruction cmp-r16-c8, so I don't see a reason to throw an error for a perfectly encodable instruction. It isn't a general rule that all instruction arguments have to have the same size.
Your reasoning seems to me more like an "internal problem" of FASM. I think if a coder writes "cmp eax, byte 8" he can expect the instruction to be generated, and doesn't have to remember whether FASM creates the shortened form or not. It's like writing "public" before things which would be public anyway in C: if you want to be sure the cmp-r16-c8 opcode is generated, you can make sure this way. And it's much clearer for a reader of the code who doesn't know FASM so well. Also I don't see any problem this could cause, I can only see benefits.
rugxulo 09 Sep 2005, 05:31
comrade wrote: and yes, these different encodings are used by assemblers as a footprint
He hasn't gone after anyone yet, and he says he'll only do it if their program is commercial (i.e., you make money, you should register) although you aren't supposed to distribute any kind of program assembled by A86 without registering.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.