flat assembler
Message board for the users of flat assembler.

Rules for immediate operand sizes

Plue



Joined: 15 Dec 2005
Posts: 151
Plue 18 Jan 2010, 15:12
I'm writing an in-memory assembler for JIT compiling purposes. But I have one big problem: I can't find any rules regarding the signedness of immediate operands.

Take for example this opcode:
Code:
83    /0 ib  ADD r/m32,imm8    
Fasm treats imm8 as signed, and doesn't generate this opcode if imm is 129 (because it's outside signed byte range). Fasm instead generates this opcode:
Code:
81    /0 id  ADD r/m32,imm32    
Which is longer, but will work correctly.

So far so good: imm8 are signed. But wait! Look at this opcode:
Code:
C8     iw ib ENTER imm16,imm8    
Here fasm accepts 129 for imm8 without complaining!

I don't know how to handle this in my assembler! Surely I don't have to know for each opcode whether it accepts signed or unsigned operands? How do I solve this delicate problem without making my assembler behave unpredictably?

_________________
Roses are red
Violets are blue
Some poems rhyme
And some don't.
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 18 Jan 2010, 16:15
You'll have to follow the opcode descriptions too, because "imm8" only says that you should put in an 8-bit immediate (i.e. a number encoded as the instruction's operand); it doesn't say whether it is signed or unsigned.

83 /0 ib ADD r/m32, imm8 < Add sign-extended imm8 to r/m32.

81 /0 id ADD r/m32, imm32 < Add imm32 to r/m32.

REX.W + 81 /0 id ADD r/m64, imm32 < Add imm32 sign-extended to 64-bits to r/m64.
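
The choice between the two ADD encodings above can be sketched as follows. This is only a minimal illustration, not fasm's actual logic: the function names are made up for this example, only the register-direct form (ModR/M mod=11) is handled, and the immediate is first normalised to the signed view of its 32-bit pattern.

```python
import struct

def fits_signed8(value):
    """True if value can be stored in a byte that the CPU will sign-extend."""
    return -128 <= value <= 127

def encode_add_r32_imm(reg, imm):
    """Encode ADD r32, imm for a register-direct operand (ModR/M mod=11, /0).

    reg is the 3-bit register code (EAX=0, ECX=1, ...). imm may be written
    as a signed or as an unsigned 32-bit number. Hypothetical helper.
    """
    if not -0x80000000 <= imm <= 0xFFFFFFFF:
        raise ValueError("immediate does not fit in 32 bits")
    # Normalise to the signed view of the 32-bit pattern (0xFFFFFFFF -> -1).
    signed = imm - 0x100000000 if imm > 0x7FFFFFFF else imm
    modrm = 0xC0 | (0 << 3) | reg
    if fits_signed8(signed):
        return bytes([0x83, modrm]) + struct.pack("<b", signed)  # 83 /0 ib
    return bytes([0x81, modrm]) + struct.pack("<i", signed)      # 81 /0 id

print(encode_add_r32_imm(1, 127).hex())  # short form for add ecx, 127
print(encode_add_r32_imm(1, 129).hex())  # long form for add ecx, 129
```

With this rule 129 falls outside the sign-extended byte range and therefore gets the longer 81 /0 id form, which matches the behaviour described in the first post.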
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen 18 Jan 2010, 19:20
As for 8-bit immediates that are sign-extended, search for "Ibs" here:

http://ref.x86asm.net/geek.html

I means immediate, bs means byte sign-extended to the size of the other operand. You can see that there are only a few such opcodes.

(Skip Ibss, it means that the immediate is sign-extended to size of stack pointer).

And as Loco said, some 32-bit immediates are sign-extended to 64 bits in 64-bit mode. Search for Ivds.
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 18 Jan 2010, 23:01
Plue wrote:
So far so good: imm8 are signed. But wait! Look at this opcode:
Code:
C8     iw ib ENTER imm16,imm8    
Here fasm accepts 129 for imm8 without complaining!

Just in time... look here (inside the table of flde)
http://board.flatassembler.net/topic.php?p=108432#108432
to see how I solved the problem.

EDIT: I hope it will be useful, because I answered in too much of a rush (I had only read the first two lines of the thread).

Regards,
hopcode
Plue



Joined: 15 Dec 2005
Posts: 151
Plue 19 Jan 2010, 11:22
Aww, annoying.

hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 19 Jan 2010, 14:01
Plue wrote:
Aww, annoying.
What exactly is annoying you?
Quote:
that fasm accepts 129 for imm8 without complaining!
You should try to write a "poem" like that to state whether
complaining is the right thing to do or not.

What...what is annoying ?
That i edit my post before or after other posts?
I can do it, i am hopcode.

Yes, reading the manuals is always annoying, especially for an
ignorant one like you are. Please next time,

1) do read the manuals accurately
2) ask if you don't know how to solve
Quote:
this delicate problem

3) accept please what others tell you
4) write your JIT stuff
5) share it with us, because we need your "JIT engine"
and then I will show you personally how

the roses could be red
and the violets blue
...
without complaining.

Do you understand ?
It is simple, not annoying...
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen 19 Jan 2010, 14:25
hopcode, are you drunk or what? If you did read the manual accurately, why haven't you mentioned the s bit, located at bit index 1 in those few primary opcodes, that specifies the sign extension of the 8-bit immediate?
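
As a rough illustration of the s bit mentioned here: for the immediate group that ADD belongs to, the opcodes 81 and 83 differ only in bit index 1. The sketch below assumes the usual reading of bit 0 as the operand-width bit (w); the constant and function names are invented for the example.

```python
S_BIT = 0x02  # bit index 1: imm8 is sign-extended to the operand size
W_BIT = 0x01  # bit index 0: full-size operand instead of a byte operand

def group1_opcode(wide, sign_extended_imm8):
    """Primary opcode for the ADD-style immediate group (hypothetical helper).

    wide=False, s=False -> 80 (r/m8,  imm8)
    wide=True,  s=False -> 81 (r/m32, imm32)
    wide=True,  s=True  -> 83 (r/m32, sign-extended imm8)
    """
    if sign_extended_imm8 and not wide:
        # A byte operand has nothing to extend to; this sketch never emits 82.
        raise ValueError("s bit is only meaningful for wide operands")
    opcode = 0x80
    if wide:
        opcode |= W_BIT
    if sign_extended_imm8:
        opcode |= S_BIT
    return opcode

print(hex(group1_opcode(True, False)))  # long immediate form
print(hex(group1_opcode(True, True)))   # sign-extended byte form
```

The same bit distinguishes PUSH imm32 (68) from PUSH imm8 (6A) and IMUL with imm32 (69) from IMUL with imm8 (6B), which is why only a few primary opcodes are affected.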
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 19 Jan 2010, 15:19
MazeGen, please...
I am not drunk.

On the contrary, the s bit has nothing to do with
the ENTER instruction.

In ENTER, the nesting level should be restricted to 0..31,
but that is actually what the ENTER operation itself does.


Intel AM reference
Code:
    



If you set it to -129 or +129 the first operation is always
MOD 32 on the index.
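
A small sketch of the point about ENTER, assuming the operation described in the manual begins by reducing the nesting level modulo 32. The helper names are hypothetical; the encoding follows the "C8 iw ib" line quoted at the start of the thread, and the imm8 is treated as a plain unsigned byte.

```python
import struct

def encode_enter(alloc_size, nesting):
    """Encode ENTER imm16, imm8 as C8 iw ib with both operands unsigned."""
    # struct.pack raises struct.error if either value is out of range.
    return bytes([0xC8]) + struct.pack("<HB", alloc_size, nesting)

def effective_nesting_level(imm8_byte):
    """Nesting level actually used once the byte has been reduced MOD 32."""
    if not 0 <= imm8_byte <= 0xFF:
        raise ValueError("expected the encoded byte, 0..255")
    return imm8_byte % 32

code = encode_enter(16, 129)
print(code.hex())                         # c8100081
print(effective_nesting_level(code[3]))   # 1
```

So 129 is encodable because nothing sign-extends the byte, and the processor ends up with nesting level 1 either way.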
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 21 Jan 2010, 17:07
Plue, someone asked me to pay attention to this thread. Would you mind explaining the meaning of "Aww, annoying."? Does it mean that programming the imm8-related parts is annoying/tedious, or is it something else, like "your answers were really unhelpful"?
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 22 Jan 2010, 01:13
I do not / did not demand an explanation from you, Plue.
Whether my words were right or wrong, I am very sorry. I apologize.
I like friendships; I am a very simple (sometimes almost stupid) person.

Regards,
hopcode
Plue



Joined: 15 Dec 2005
Posts: 151
Plue 27 Jan 2010, 20:39
I am very grateful for the help, which gave me the exact information I asked for.
What's annoying is that I have written my assembler without support for signed/unsigned 8-bit operands, and now I have to change it just for a few opcodes.
Plue



Joined: 15 Dec 2005
Posts: 151
Plue 28 Jan 2010, 13:53
It turns out there are still some inconsistencies between the opcode table and fasm.

LocoDelAssembly wrote:
You'll have to follow the opcode descriptions too, because "imm8" only says that you should put in an 8-bit immediate (i.e. a number encoded as the instruction's operand); it doesn't say whether it is signed or unsigned.
No. The "Intel Instruction Set Reference" clearly states that imm8 is signed.

Still fasm accepts this:
Code:
mov cl, 233    
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 28 Jan 2010, 15:24
Plue wrote:
Still fasm accepts this:
Code:
mov cl, 233    

Plue, in a few words, it is
B1 ib
and it belongs to the series B0..B7:
B0+ rb MOV r8, imm8 E Valid Valid Move imm8 to r8. (B0 itself is for AL)
and there ib means:
Quote:

• ib, iw, id, io — A 1-byte (ib), 2-byte (iw), 4-byte (id) or 8-byte (io) immediate
operand to the instruction that follows the opcode, ModR/M bytes or scale-indexing
bytes. The opcode determines if the operand is a signed value. All
words, doublewords and quadwords are given with the low-order byte first.

while rb
Quote:

• +rb, +rw, +rd, +ro — A register code, from 0 through 7, added to the
hexadecimal byte given at the left of the plus sign to form a single opcode byte.
See Table 3-1 for the codes. The +ro columns in the table are applicable only in
64-bit mode.


Regards,
hopcode
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 28 Jan 2010, 16:23
AMD manuals are EDIT:NOT straightforward:
Quote:

rb The byte register operand is specified in the Opcode byte. To determine
the Opcode byte for a particular register, add the hexadecimal value on the
left of the plus sign to the value of rb for that register, as follows:
AL=0, CL=1, DL=2, BL= 3, AH=4, CH=5, DH=6, and BH=7. So for example,
the opcode for moving an immediate byte to a register (MOV) is "B0+rb".
So B0–B7 are valid opcodes, and B0 is "MOV AL,imm8".


EDIT: not clear enough!!! (or as clear as in Intel, at least)
Plue



Joined: 15 Dec 2005
Posts: 151
Plue 28 Jan 2010, 17:04
Intel manual:
Quote:
imm8 — An immediate byte value. The imm8 symbol is a signed number
between –128 and +127 inclusive. For instructions in which imm8 is combined
with a word or doubleword operand, the immediate value is sign-extended to
form a word or doubleword. The upper byte of the word is filled with the topmost
bit of the immediate value.

It says imm8 is signed. But fasm accepts unsigned values for some of these.

hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 28 Jan 2010, 17:20
Yes, you are completely right... and if I have understood the song of these last days correctly, fasm should change here too (especially when used as a backend input/output producer). So basic opcode encoding will totally change the mentality acquired in years of asm programming... (not mine, fortunately)

Regards,
hopcode
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 28 Jan 2010, 17:34
Plue wrote:
Still fasm accepts this:
Code:
mov cl, 233    
Why wouldn't it? You, the programmer, can use cl in an unsigned manner.

_________________
Previously known as The_Grey_Beast
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 28 Jan 2010, 18:37
[-128..-1] overlaps with [128..255] so you need to know what the instruction will really do with an imm. For instance imm16 is described this way:
Quote:

imm16 — An immediate word value used for instructions whose operand-size
attribute is 16 bits. This is a number between –32,768 and +32,767 inclusive.


Then RET this way:
Quote:
Opcode  Instruction  64-Bit Mode  Compat/Leg Mode  Description
C3      RET          Valid        Valid            Near return to calling procedure.
CB      RET          Valid        Valid            Far return to calling procedure.
C2 iw   RET imm16    Valid        Valid            Near return to calling procedure and pop imm16 bytes from stack.
CA iw   RET imm16    Valid        Valid            Far return to calling procedure and pop imm16 bytes from stack.
.
.
.
IF instruction has immediate operand
    THEN IF StackAddressSize = 32
        THEN
            ESP ← ESP + SRC; (* Release parameters from stack *)
        ELSE
            IF StackAddressSize = 64
                THEN
                    RSP ← RSP + SRC; (* Release parameters from stack *)
                ELSE (* StackAddressSize = 16 *)
                    SP ← SP + SRC; (* Release parameters from stack *)
            FI;
    FI;
FI;


So, would you expect that "RET -4" subtracts four from ESP in 32-bit mode instead of adding 65532? (Thanks revolution)

You need to know the context of the instruction to know how the imm will be interpreted. The only fixed rule is that if the dest operand has the same width as the imm, then you can use signed and unsigned values interchangeably; otherwise, you need to know whether the CPU will do sign-extension or not.
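
The overlap of [-128..-1] with [128..255], and the "RET -4" case, can be checked with a few lines. This is only an illustrative sketch with made-up helper names: it shows that the encoded bytes are identical and that the difference appears only when the value is widened.

```python
def imm_bytes(value, width):
    """Little-endian encoding of value, accepting signed and unsigned ranges."""
    bits = 8 * width
    if not -(1 << (bits - 1)) <= value < (1 << bits):
        raise ValueError("value does not fit in %d bits" % bits)
    return (value & ((1 << bits) - 1)).to_bytes(width, "little")

def zero_extend(raw):
    """Value seen by an instruction that does NOT sign-extend its immediate."""
    return int.from_bytes(raw, "little", signed=False)

def sign_extend(raw):
    """Value seen by an instruction that sign-extends its immediate."""
    return int.from_bytes(raw, "little", signed=True)

# Same width as the destination: -23 and 233 give the same byte for mov cl.
print(imm_bytes(-23, 1) == imm_bytes(233, 1))  # True

# RET imm16: the word for -4 is added to ESP without sign extension.
raw = imm_bytes(-4, 2)
print(zero_extend(raw), sign_extend(raw))      # 65532 -4
```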
Plue



Joined: 15 Dec 2005
Posts: 151
Plue 28 Jan 2010, 20:59
Borsuc wrote:
Plue wrote:
Still fasm accepts this:
Code:
mov cl, 233    
Why wouldn't it? You, the programmer, can use cl in an unsigned manner.
Because this becomes a problem when the destination is a 32-bit register, as the value is then sign extended. You see, there are two ways to assemble "add ecx, 155": one is to treat 155 as a byte and the other is to treat it as a dword. Treating it as a byte gives smaller code. But if you treat 155 as a byte, then the value will be wrong when it is sign extended.
So there is an inherent inconsistency in the way byte values are treated. Still, the opcode reference from Intel describes both the byte in "mov cl, 155" and the byte in "add ecx, 155" as imm8. So we have two different behaviors for the same operand type; this contradicts common sense and also a different section of the opcode reference. Which made me confused.

Quote:
You need to know the context of the instruction to know how the imm will be interpreted, the only fixed rule is that if the dest operand has the same width than the imm then you can use signed and unsigned values interchangeably, otherwise, you need to know if the CPU will do sign-extension or not.
It seems like you're right on this account, even though Intel's opcode reference says otherwise. If the operand sizes are different, then you really have to know this per opcode. Intel says that if the operand sizes are different, then sign extension is used. But this is obviously wrong for certain instructions, like "ROL r/m32, imm8", since rotate left can't rotate right, and in any case only 5 of the bits in imm8 are used.
Also, according to Intel's rules, in "ENTER imm16,imm8" imm8 should be sign extended, but it's not.
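
One way to keep an assembler predictable despite this is a small per-form table that records how each immediate is interpreted. The sketch below is only an assumption about how such a table might look: the names are invented, the classification follows what was said in this thread, the ENTER entry describes only its imm8 operand, and rejecting negative values for unsigned immediates is a policy choice rather than anything fasm or the manuals mandate.

```python
from enum import Enum

class ImmKind(Enum):
    SIGN_EXTENDED = "sign-extended to the destination size"
    SAME_WIDTH = "same width as the destination"
    UNSIGNED = "zero-extended or used as a raw count"

# Hypothetical table: instruction form -> (immediate width in bits, kind).
IMM_RULES = {
    "ADD r/m32, imm8":   (8, ImmKind.SIGN_EXTENDED),
    "MOV r8, imm8":      (8, ImmKind.SAME_WIDTH),
    "ROL r/m32, imm8":   (8, ImmKind.UNSIGNED),
    "ENTER imm16, imm8": (8, ImmKind.UNSIGNED),   # the imm8 operand only
    "RET imm16":         (16, ImmKind.UNSIGNED),
}

def accepts(form, value):
    """True if value is a sensible immediate for the given instruction form."""
    bits, kind = IMM_RULES[form]
    low_signed = -(1 << (bits - 1))
    if kind is ImmKind.SIGN_EXTENDED:
        return low_signed <= value < (1 << (bits - 1))
    if kind is ImmKind.UNSIGNED:
        return 0 <= value < (1 << bits)
    # SAME_WIDTH: signed and unsigned spellings are interchangeable.
    return low_signed <= value < (1 << bits)

print(accepts("ADD r/m32, imm8", 129))    # False, needs the imm32 form
print(accepts("MOV r8, imm8", 233))       # True
print(accepts("ENTER imm16, imm8", 129))  # True
```

When the sign-extended form rejects a value, the assembler would fall back to the wider encoding (81 /0 id in the ADD case) instead of reporting an error.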

hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 28 Jan 2010, 21:07
Following the Intel and AMD docs... mmh... there's nothing to do; they both agree on what fasm doesn't do. But... that is not the point.

Note that i am the person who said "no fasm change" here
http://board.flatassembler.net/topic.php?p=108563#108563
for those cases, and I repeat it for
Code:
mov cl,233    
OK.
There,
Tomasz Grysztar wrote:
Fix will be coming soon.

mainly because
bitRAKE wrote:
Hiding coding errors is never desirable.
But I have never seen a situation where fasm hides my errors. OK,
I have little experience. But so far that does not persuade me that fasm necessarily has to be fixed.
On the contrary, I think (imho) that the user's consistency can visibly
override that "hiding" etc., simply, as stated here:
Borsuc wrote:
...You, the programmer, can use cl in an unsigned manner...

Maybe fasm (I think, personal opinion) is planned to be used as an improved backend compiler too. In fact, you, that is to say, the user:
LocoDelAssembly wrote:
You need to know the context of the instruction to know how the imm will be interpreted.

We know it, but a frontend is perhaps not so capable, in this and other situations.


Regards,
hopcode
.
.
.
Copyright © 1999-2023, Tomasz Grysztar.