xor ax, 1 and xor ax,1

uri

Joined: 09 Apr 2004
Posts: 43
Location: L'viv, Ukraine

uri 28 Jul 2005, 10:25

There are two different encodings for one instruction:

Code:

83F001  xor  ax,0001
350100  xor  ax,0001

Can I tell fasm which one to generate, or do I have to write a macro?
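A minimal sketch (in Python, not fasm) of how the two byte sequences above are built in 16-bit mode. The helper names are hypothetical and exist only for illustration; the opcode forms assumed are 83 /6 ib (byte immediate, sign-extended) and 35 iw (accumulator form with a full word immediate).

```python
import struct

def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm

def xor_ax_imm8(value: int) -> bytes:
    """83 /6 ib: XOR r/m16, imm8 (sign-extended). AX is rm=0, mod=11."""
    return bytes([0x83, modrm(0b11, 6, 0)]) + struct.pack("<b", value)

def xor_ax_imm16(value: int) -> bytes:
    """35 iw: XOR AX, imm16 (accumulator form, full-size immediate)."""
    return bytes([0x35]) + struct.pack("<H", value & 0xFFFF)

print(xor_ax_imm8(1).hex())   # 83f001
print(xor_ax_imm16(1).hex())  # 350100
```

Both sequences perform the same operation; only the size of the stored immediate differs.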

uri 28 Jul 2005, 10:28

Sorry, I already found out.

For the first I must write xor ax,1
and for the second, xor ax, word 1

uri 28 Jul 2005, 14:50

No, this question is not closed.

Another situation:
I want to assemble the instruction "xor bh, al".
But there are two possible encodings:

Code:

32 F8    xor bh, al
30 C7    xor bh, al

Here I can't specify any size, because both forms take operands of the same size:

Code:

32 /r     XOR r8,r/m8
30 /r     XOR r/m8,r8

How can I select the opcode? Only via macros? :(
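The two ModRM layouts in the opcode table above can be sketched as follows. This is an illustration in Python, not fasm syntax; the function name and the `load_form` flag are hypothetical, introduced only to show where the destination and source registers land in each form.

```python
# Register numbers for the 8-bit registers, as used in ModRM fields.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

def xor_r8_r8(dst: str, src: str, load_form: bool) -> bytes:
    """Encode XOR dst, src for two 8-bit registers.

    load_form=True  -> 32 /r (XOR r8, r/m8): dst goes in reg, src in rm.
    load_form=False -> 30 /r (XOR r/m8, r8): dst goes in rm, src in reg.
    """
    d, s = REG8[dst], REG8[src]
    if load_form:
        opcode, reg, rm = 0x32, d, s
    else:
        opcode, reg, rm = 0x30, s, d
    # mod=11 (0xC0) marks the r/m operand as a register.
    return bytes([opcode, 0xC0 | (reg << 3) | rm])

print(xor_r8_r8("bh", "al", True).hex())   # 32f8
print(xor_r8_r8("bh", "al", False).hex())  # 30c7
```

The same instruction results either way; the flag only swaps which ModRM field holds which register.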

Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8434
Location: Kraków, Poland

Tomasz Grysztar 28 Jul 2005, 14:52

In this case (as opposed to the previous one) these two encodings are fully functionally equivalent, so the assembler always chooses only one of them. It can even be considered a kind of footprint of the assembler.

uri 28 Jul 2005, 15:06

Yes, they are fully functionally equivalent, but for polymorphic code the choice is very important.

OK, can you say which opcode fasm selects when one instruction has two different encodings? Is there some regularity, or does it depend only on the instruction?

THEWizardGenius

Joined: 14 Jan 2005
Posts: 382
Location: California, USA

THEWizardGenius 28 Jul 2005, 19:00

Well I would assume that the opcodes chosen are those that use the smallest code. However, when they are the same size, same speed, and do EXACTLY the same thing, it is up to the compiler. You can ask which is used, or check the source code, or maybe it is documented somewhere else (maybe FASM internals documentation? I don't know).

Matrix

Joined: 04 Sep 2004
Posts: 1164
Location: Overflow

Matrix 28 Jul 2005, 21:55

uri wrote:

No, this question is not closed.

Another situation:
I want to assemble the instruction "xor bh, al".
But there are two possible encodings:
Code:
32 F8    xor bh, al
30 C7    xor bh, al

Here I can't specify any size, because both forms take operands of the same size:
Code:
32 /r     XOR r8,r/m8
30 /r     XOR r/m8,r8

How can I select the opcode? Only via macros?

Hi,
what do you say to this:

Code:

32 F8    xor bh,< al
30 C7    xor bh,> al
32 F8    xor bh, al

My only idea is
to make something like a direction marker for r8:

Code:

32 /r     XOR r8 , r/m8 so <
30 /r     XOR r/m8 , r8 so >

32 F8    xor bh,< al
30 C7    xor bh,> al

and if no direction ('<' or '>') is specified, then
use 32 F8    xor bh,< al
32 /r     XOR r8 , r/m8 so <
because the first parameter is r8.

This would mean that ',<' and ',>' select which encoding to use, in addition to the plain ','.

Tomasz Grysztar 28 Jul 2005, 22:13

Isn't it better to write just machine code for such applications? The assembly language is there to provide you abstraction from encodings and focuses on functionality of instructions (at least, that is the vision of assembly language on which fasm's syntax is based).

LocoDelAssembly
Your code has a bug

Joined: 06 May 2005
Posts: 4623
Location: Argentina

LocoDelAssembly 28 Jul 2005, 22:39

166C:0100 2D0010 SUB AX,1000
166C:0103 81E80010 SUB AX,1000

The first one is sub accum, immediate and the second is sub reg, immediate.

Is there a way to force "sub" to be assembled with the reg operand form instead of the accum form? I can't remember where I read that using 2D instead of 81 E8 does not pair in some situations on the Pentium 1.
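For reference, a small Python sketch of how the two SUB encodings in the listing above are put together in 16-bit mode. The function names are hypothetical; the forms assumed are 2D iw (accumulator only) and 81 /5 iw (any 16-bit register or memory operand).

```python
import struct

def sub_ax_imm16_accum(value: int) -> bytes:
    """2D iw: SUB AX, imm16 (accumulator-only short form)."""
    return bytes([0x2D]) + struct.pack("<H", value)

def sub_r16_imm16(reg: int, value: int) -> bytes:
    """81 /5 iw: SUB r/m16, imm16 (general form; reg is the register number, AX=0)."""
    # mod=11 selects a register operand, /5 in the reg field selects SUB.
    return bytes([0x81, 0xC0 | (5 << 3) | reg]) + struct.pack("<H", value)

print(sub_ax_imm16_accum(0x1000).hex())  # 2d0010
print(sub_r16_imm16(0, 0x1000).hex())    # 81e80010
```

The general form costs one extra byte, which is why an assembler that optimizes for size would normally pick the accumulator form.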

LocoDelAssembly 28 Jul 2005, 23:16

Well, I remember now, but it wasn't "sub" :P

http://www.codingnow.com/2000/download/pentopt.htm#26_13

Quote:

26.13 MOV [MEM], ACCUM (PPlain and PMMX)

The instructions MOV [mem],AL / MOV [mem],AX / MOV [mem],EAX are treated by the pairing circuitry as if they were writing to the accumulator. Thus the following instructions do not pair:

MOV [mydata], EAX
MOV EBX, EAX

This problem occurs only with the short form of the MOV instruction which can not have a base or index register, and which can only have the accumulator as source. You can avoid the problem by using another register, by reordering your instructions, by using a pointer, or by hard-coding the general form of the MOV instruction.

In 32 bit mode you can write the general form of MOV [mem],EAX:

DB 89H, 05H
DD OFFSET DS:mem

In 16 bit mode you can write the general form of MOV [mem],AX:

DB 89H, 06H
DW OFFSET DS:mem

To use AL instead of (E)AX, you replace 89H with 88H

This flaw has not been fixed in the PMMX.

[edit]
166C:0100 A33412 MOV [1234],AX
166C:0103 89063412 MOV [1234],AX
[/edit]
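A minimal Python sketch of the two 16-bit encodings shown in the listing above, to make the difference between the short and general form concrete. The function names are hypothetical; the forms assumed are A3 moffs16 (accumulator only) and 89 /r with mod=00, rm=110 (direct 16-bit address).

```python
import struct

def mov_mem_ax_short(addr: int) -> bytes:
    """A3 moffs16: MOV [addr], AX (accumulator-only form, 16-bit mode)."""
    return bytes([0xA3]) + struct.pack("<H", addr)

def mov_mem_r16_general(addr: int, reg: int = 0) -> bytes:
    """89 /r: MOV r/m16, r16 with mod=00, rm=110 (direct address, no base or index)."""
    return bytes([0x89, (reg << 3) | 0b110]) + struct.pack("<H", addr)

print(mov_mem_ax_short(0x1234).hex())     # a33412
print(mov_mem_r16_general(0x1234).hex())  # 89063412
```

With reg=0 (AX) the ModRM byte is 06h, which matches the DB 89H, 06H pair from the quoted text.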

Last edited by LocoDelAssembly on 29 Jul 2005, 01:47; edited 1 time in total

comrade

Joined: 16 Jun 2003
Posts: 1150
Location: Russian Federation

comrade 29 Jul 2005, 01:39

there are also two ways to encode:

Code:

33C0    xor eax,eax
31C0    xor eax,eax

The first one is common across MS products and others I have seen (Borland). Actually I have only seen the second one in FASM.
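A hedged sketch of how one might tell the two forms apart when inspecting existing code: bit 1 of the opcode (the direction bit) distinguishes 30/31 from 32/33. This is an illustration in Python with a hypothetical function name, and it only handles the register-to-register case (mod=11) with no prefixes.

```python
def xor_direction(code: bytes) -> str:
    """Classify a register-to-register XOR by opcode bit 1 (the direction bit)."""
    opcode, modrm = code[0], code[1]
    # Opcodes 30h..33h share the upper six bits; mod=11 means both operands are registers.
    if opcode & 0xFC != 0x30 or modrm >> 6 != 0b11:
        raise ValueError("not a register-to-register XOR of the 30..33 family")
    return "r, r/m (32/33)" if opcode & 0x02 else "r/m, r (30/31)"

print(xor_direction(bytes.fromhex("33C0")))  # r, r/m (32/33)
print(xor_direction(bytes.fromhex("31C0")))  # r/m, r (30/31)
```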

comrade 29 Jul 2005, 02:24

And yes, these different encodings are used by assemblers as a footprint.

I heard somewhere that unregistered copies of a86 or a386 produced such special encodings, so the author could track down those who distributed programs that were assembled with unregistered a86... damn capitalists

MCD

Joined: 21 Aug 2004
Posts: 602
Location: Germany

MCD 29 Jul 2005, 09:01

This is just a completely useless discussion for the overwhelming majority of programs. It's only important for polymorphic code... but anyway, which sensible software uses that?

Matrix 29 Jul 2005, 09:33

Well yes, it is only a difference in the machine code,
but it could still be added to the feature requests, at a lower priority, at least for completeness.

Another option is to define typecasts (r8 r16 r32 r64 m8 m16 m32 m64), so the syntax wouldn't have to be extended with other character combinations. For example,

Code:

32 F8    xor bh,< al
30 C7    xor bh,> al
32 F8    xor bh, al

Instead, this could be written as (advanced typecast):

Code:

32 /r     XOR r8 , r/m8 so <
30 /r     XOR r/m8 , r8 so >

32 F8    xor r8 bh, al
30 C7    xor bh, r8 al
32 F8    xor bh, al

Madis731

Joined: 25 Sep 2003
Posts: 2138
Location: Estonia

Madis731 02 Aug 2005, 13:40

r8 is reserved in 64-bit systems, or was it r08?

vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7103
Location: Slovakia

vid 02 Aug 2005, 15:59

Speaking of which, there is a bigger problem: you cannot force the size of a constant operand when there are multiple possibilities, like:
cmp eax,0
The 0 here gets encoded as a byte (instruction cmp r32,c8), where c8 gets sign-extended. But if you aren't sure whether the constant fits in a byte, you cannot force it, like
cmp eax,byte 0
where the assembler says it cannot assemble this, even though it assembles the plain form to a byte.
This forces us to hardcode the instruction in self-modifying code, which goes against "The assembly language is there to provide you abstraction from encodings and focuses on functionality of instructions".

Tomasz Grysztar 02 Aug 2005, 16:17

Because it "focuses on functionality" it cannot allow "cmp eax,byte 0", because the operand sizes clearly don't match here, and this is likely to mean a mistake by the programmer (as when you write "cmp eax,[var]" with a byte variable, where fasm also detects an error). If you aren't sure whether the constant fits in a byte and you need to be sure that it does, just use the "if" directive to check it (or something like "cmp eax,value and 07Fh" if you care only about the instruction code, not about overflows).
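The range check described here can be sketched in Python (not fasm syntax). The function names are hypothetical, values are treated as signed Python integers, and the long form used below (3D id) is just one valid choice for EAX in 32-bit mode; 81 /7 id would be another.

```python
import struct

def fits_simm8(value: int) -> bool:
    """True if the value survives sign extension from a single byte."""
    return -128 <= value <= 127

def cmp_eax_imm(value: int) -> bytes:
    """Pick 83 /7 ib when the constant fits a signed byte, else 3D id (32-bit mode)."""
    if fits_simm8(value):
        # mod=11, /7 selects CMP, rm=0 is EAX.
        return bytes([0x83, 0xC0 | (7 << 3) | 0]) + struct.pack("<b", value)
    return bytes([0x3D]) + struct.pack("<I", value & 0xFFFFFFFF)

print(cmp_eax_imm(0).hex())           # 83f800
print(cmp_eax_imm(0x1234).hex())      # 3d34120000
print(cmp_eax_imm(200 & 0x7F).hex())  # 83f848
```

Masking with 07Fh limits the value to 0..127, so the short form always applies, at the cost of silently changing constants that were out of range.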

The other case is direct jumps and calls, where the size operator doesn't apply to the size of the target address (which actually depends on the processor's mode rather than on the instruction itself), but to the size of the relative address displacement. Thus size operators for all possible displacement sizes are allowed there.
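To illustrate the point about displacements, here is a small Python sketch of a direct jump in 32-bit mode. The function name and the `short` flag are hypothetical; the forms assumed are EB rel8 and E9 rel32, with the displacement counted from the end of the instruction.

```python
import struct

def jmp_rel(addr: int, target: int, short: bool) -> bytes:
    """Encode a direct JMP located at addr that transfers control to target."""
    if short:
        rel = target - (addr + 2)  # EB rel8 is 2 bytes long
        if not -128 <= rel <= 127:
            raise ValueError("target out of range for a short jump")
        return bytes([0xEB]) + struct.pack("<b", rel)
    rel = target - (addr + 5)      # E9 rel32 is 5 bytes long
    return bytes([0xE9]) + struct.pack("<i", rel)

print(jmp_rel(0x100, 0x110, True).hex())   # eb0e
print(jmp_rel(0x100, 0x110, False).hex())  # e90b000000
```

The target address is the same in both cases; only the size of the stored displacement changes.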

And as for the "cmp eax,byte 0" not being allowed, consider also this kind of code:

Code:

var db ?
cmp [var],al

If we change the "db" to "dd" in the definition of "var", we will get an error, for obvious reasons (see Design Principles). And thus it was clear to me that "cmp [var],byte 0" should cause the same kind of error when we change "db" to "dd", since it's just analogous. And then, to allow choosing between shorter and longer encodings, I had to devise the scheme explained in section 1.2.6 of the manual. You know, arguments like the ones you used were all considered by me when I was designing fasm's syntax, and it actually took me more time to think over the syntax than to implement it. I did not write fasm in a "let's get it working and we will worry about details later" way.

Edit from the future: The actual operation processor performs is on full-sized values, even if the opcode has the immediate in a compressed form (which is sign-extended for the actual operation). With things like AVX-512 displacement compression this understanding becomes even more apparent. And the separation of abstraction layers is something that I have been pushing further and further during fasm's development. I wanted the assembly syntax to define precisely the operation to be performed, but leave the choice of the encoding to the assembler, and only allow to influence the choice through additional options, ideally completely separate syntactically (and they should only be needed for unusual purposes, as normally it's what the instruction does that matters). For this reason I've been moving away even from the old fasm's scheme of enforcing large immediate opcode by placing size operator before the value - in fasmg's implementation of x86 instructions it's no longer there, with justification being that it can easily be done through additional tweaks to the encoder, separate from the main content of the assembly code, which should be focused solely on defining the operations to perform.
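The sign extension mentioned above can be shown with a short Python sketch. The function name is hypothetical; it models only the arithmetic, showing that a one-byte immediate stands for a full-width operand value.

```python
def sign_extend8(byte: int, bits: int = 32) -> int:
    """Sign-extend an 8-bit immediate to the operand width, returned as an unsigned value."""
    byte &= 0xFF
    if byte & 0x80:      # top bit set: the byte encodes a negative number
        byte -= 0x100
    return byte & ((1 << bits) - 1)

print(hex(sign_extend8(0x7F)))      # 0x7f
print(hex(sign_extend8(0xFF)))      # 0xffffffff
print(hex(sign_extend8(0x80, 16)))  # 0xff80
```

So "cmp eax,-1" stored with a single FFh byte still compares against 0FFFFFFFFh.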

Last edited by Tomasz Grysztar on 19 Dec 2021, 13:04; edited 1 time in total

vid 02 Aug 2005, 16:57

Quote:

Because it "focuses on functionality" it cannot allow "cmp eax,byte 0", because the operand sizes clearly don't match here, and this is likely to mean a mistake by the programmer

I still don't agree; operand sizes don't ALWAYS have to match. Just like with movzx, if there is an instruction which allows operands of different sizes, then the assembler should be able to interpret it.

Quote:

And thus it was clear to me that "cmp [var],byte 0" should cause the same kind of error when we change "db" to "dd", since it's just analogous.

It just isn't. The processor doesn't have an instruction cmp-r16-r8, so an error there is all right, but it does have an instruction cmp-r16-c8, so I don't see a reason to throw an error for a perfectly encodable instruction.
It isn't a general rule that all instruction arguments have to be the same size.
Your reasoning seems to me more like an "internal problem" of FASM. I think that if a coder writes "cmp eax, byte 8", he can expect the instruction to be generated and doesn't have to remember whether FASM creates the shortened form or not. It's like writing "public" before things which would be public anyway in C: if you want to be sure the cmp-r16-c8 opcode is generated, you can make sure this way. And it's much clearer for a reader of the code who doesn't know FASM so well.
Also, I don't see any problem this could cause; I can only see benefits.

rugxulo

Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)

rugxulo 09 Sep 2005, 05:31

He hasn't gone after anyone yet, and he says he'll only do it if their program is commercial (i.e., you make money, you should register) although you aren't supposed to distribute any kind of program assembled by A86 without registering.

comrade wrote:

And yes, these different encodings are used by assemblers as a footprint.

I heard somewhere that unregistered copies of a86 or a386 produced such special encodings, so the author could track down those who distributed programs that were assembled with unregistered a86... damn capitalists
