flat assembler
Message board for the users of flat assembler.

Index > Main > Registers?

rhyno_dagreat



Joined: 31 Jul 2006
Posts: 487
Location: Maryland, Unol Daleithiau
rhyno_dagreat 11 Jan 2007, 04:02
Hey. I was wondering if any of you all know what the opcodes are for the different registers and how I could represent them in machine code. Thanks!

-Ryan "Rhyno" Lloyd
DOS386



Joined: 08 Dec 2006
Posts: 1900
DOS386 11 Jan 2007, 05:29
Quote:
if any of you all know what the opcodes are for the different registers and how I could represent them in machine code


There are tables:

http://www.online.ee/~andre/i80386/Tabs.html

Tomasz had to deal with them when developing FASM ;)

_________________
Bug Nr.: 12345

Title: Hello World program compiles to 100 KB !!!

Status: Closed: NOT a Bug
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 11 Jan 2007, 05:41
Registers have no opcodes; they are encoded in the instruction (the instructions have opcodes).

For example, "push EAX" is opcode $50 (note that in this case the instruction has the form PUSH EAX, not PUSH REG). There are some other cases in which the opcode also specifies the register (like SUB accum, imm), but most of the time the registers are referred to through the Mod R/M byte, which can additionally be followed by a SIB byte. Take a look at http://www.sandpile.org/ia32/opc_rm32.htm and http://www.sandpile.org/ia32/opc_sib.htm to see the encodings.

http://www.sandpile.org/ia32/opc_enc.htm <- Opcode encoding.

[edit]Sorry NTOSKRNL_VXE, my internet connection (or the sandpile site) is so slow right now that the time I spent looking for those links was enough to let you post before me, but of course your post didn't exist yet when I was typing this reply.[/edit]
rhyno_dagreat



Joined: 31 Jul 2006
Posts: 487
Location: Maryland, Unol Daleithiau
rhyno_dagreat 11 Jan 2007, 06:00
Thanks!
tantrikwizard



Joined: 13 Dec 2006
Posts: 142
tantrikwizard 11 Jan 2007, 20:15
rhyno_dagreat wrote:
Hey. I was wondering if any of you all know what the opcodes are for the different registers and how I could represent them in machine code. Thanks!

-Ryan "Rhyno" Lloyd




From: mark@omnifest.uwm.edu (Mark Hopkins)

Newsgroups: alt.lang.asm

Subject: A Summary of the 80486 Opcodes and Instructions



(1) The 80x86 is an Octal Machine

This is a follow-up and revision of an article posted in alt.lang.asm on

7-5-92 concerning the 80x86 instruction encoding.

Some bugs were corrected June, the 20th, 1997 by S.Klose (sven@devcon.net)

(minor bugs in 32bit effective addresses and opcode typos)



The only proper way to understand 80x86 coding is to realize that ALL 80x86

OPCODES ARE CODED IN OCTAL. A byte has 3 octal digits, ranging from 000 to

377. In fact, each octal group (000-077, 100-177, etc.) tends to encode a

specific variety of operation. All of these are features inherited from the

8080/8085/Z80.

For some reason absolutely everybody misses all of this, even the Intel

people who wrote the reference on the 8086 (and even the 8080). The opcode

scheme outlined briefly below is expanded starting in the 80386, but

consistently with the overall scheme here.
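A minimal sketch of the octal view, in Python (not part of the original article; the helper name octal_digits is made up for illustration). It splits a byte into the 2-bit field and the two 3-bit fields that the three octal digits correspond to:

```python
def octal_digits(byte):
    """Split a byte into its three octal digits: 2 bits, 3 bits, 3 bits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    return (byte >> 6) & 0o3, (byte >> 3) & 0o7, byte & 0o7


# 0x89 and 0o211 are the same byte; the octal spelling shows the fields directly.
print("%03o" % 0x89, octal_digits(0x89))
```

The hexadecimal spelling hides the field boundaries because a hex digit is 4 bits wide, while the fields are 2, 3 and 3 bits wide.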



As an example to see how this works, the mov instructions in octal are:



210 xrm mov Eb, Rb

211 xrm mov Ew, Rw

212 xrm mov Rb, Eb

213 xrm mov Rw, Ew

214 xsm mov Ew, SR

216 xsm mov SR, Ew



The meanings of the octal digits (x, m, r, s) and their correspondence to the

operands (Eb, Ew, Rb, Rw, SR) are the following:



The digit r (0-7) encodes the register operand as follows:

REGISTER (r): 0 1 2 3 4 5 6 7

Rb = Byte-sized register AL CL DL BL AH CH DH BH

Rw = Word-sized register AX CX DX BX SP BP SI DI



The segment register digit s (0-7) encodes the segment register as follows:

SEGMENT REGISTER (s): 0 1 2 3 4 5 6 7

SR = Segment register ES CS SS DS



The digits x (0-3), and m (0-7) encode the address mode according to

the following scheme. One or more bytes (labeled: Disp) may immediately

follow xrm as described below.



TABLE 1: 16-BIT ADDRESSING MODE (x, m):

Eb = Address of byte-sized object in memory or register

Ew = Address of word-sized object in memory or register

Dw = Unsigned word

Dc = Signed byte ("character"), range: -128 to +127 (decimal).

Db = Unsigned byte



x m Disp Eb Ew

------------------

3 r Rb Rw

0 6 Dw DS:[Dw]

0 m Base:[0] (except for xm = 06).

1 m Dc Base:[Dc]

2 m Dw Base:[Dw]



x 0 Disp DS:[BX + SI + Disp]

x 1 Disp DS:[BX + DI + Disp]

x 2 Disp SS:[BP + SI + Disp]

x 3 Disp SS:[BP + DI + Disp]

x 4 Disp DS:[SI + Disp]

x 5 Disp DS:[DI + Disp]

x 6 Disp SS:[BP + Disp] (except for xm = 06)

x 7 Disp DS:[BX + Disp]



This expands into the following table:



TABLE 1a: 16-BIT ADDRESSING MODE (x, m) for the expansion impaired. :)

xm Eb/Ew xm Eb/Ew xm Eb/Ew xm Eb/Ew

00 DS:[BX + SI] 10 Dc DS:[BX + SI + Dc] 20 Dw DS:[BX + SI + Dw] 30 AL/AX

01 DS:[BX + DI] 11 Dc DS:[BX + DI + Dc] 21 Dw DS:[BX + DI + Dw] 31 CL/CX

02 SS:[BP + SI] 12 Dc SS:[BP + SI + Dc] 22 Dw SS:[BP + SI + Dw] 32 DL/DX

03 SS:[BP + DI] 13 Dc SS:[BP + DI + Dc] 23 Dw SS:[BP + DI + Dw] 33 BL/BX

04 DS:[SI] 14 Dc DS:[SI + Dc] 24 Dw DS:[SI + Dw] 34 AH/SP

05 DS:[DI] 15 Dc DS:[DI + Dc] 25 Dw DS:[DI + Dw] 35 CH/BP

06 Dw DS:[Dw] 16 Dc SS:[BP + Dc] 26 Dw SS:[BP + Dw] 36 DH/SI

07 DS:[BX] 17 Dc DS:[BX + Dc] 27 Dw DS:[BX + Dw] 37 BH/DI



Operands where x is 0, 1, or 2 are all pointers. If the instruction is a WORD

instruction (211, 213, 214, 216 are), then this pointer addresses a

word-sized object. The format of the object at the indicated address will

always be low-order byte first, and high-order byte second. Otherwise the

instruction is a BYTE instruction (210, 212) and the pointer addresses a

byte-sized object at the indicated address.



The default segments (DS:, SS:) can be overridden with a segment prefix. In

all cases it's understood that everything has the default segment DS, except

for the two stack/frame pointers (BP and SP) whose default segment is SS.

That will be explained below.



Modes where x = 1, or 2 will require displacement bytes (Dc or Dw) to follow

the opcode as explained above.



When x = 3, WORD sized instructions address the word registers (AX, CX, ...)

and the BYTE size instructions the byte registers (AL, CL, ...).



EXAMPLE 1: The instruction opcode: 210 135 375

Here, xm = 15, and r = 3, so the operands are:



mov Eb, Rb

=>

mov byte ptr DS:[DI + Dc], BL



The displacement, Dc, is 375 (or fd in hexadecimal), which is the signed byte

-3. So the instruction reads:



mov byte ptr DS:[DI - 3], BL



or just:

mov [DI - 3], BL



In C-like notation, the meaning of this operation would be:



((byte *)DS) [DI - 3] = BL;
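As a rough sketch (not from the original article), the decoding steps of EXAMPLE 1 can be written out in Python. The names decode_e16 and decode_mov are hypothetical, only the four mov forms 210-213 are handled, and default segments are left implicit:

```python
RB = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
RW = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
BASE16 = ["BX + SI", "BX + DI", "BP + SI", "BP + DI", "SI", "DI", "BP", "BX"]


def decode_e16(code, wide):
    """Decode xrm (+ Disp) from a list of byte values.

    Returns (text of the E operand, the r digit).
    """
    x, r, m = code[0] >> 6, (code[0] >> 3) & 7, code[0] & 7
    if x == 3:                      # register, not memory
        return (RW if wide else RB)[m], r
    if x == 0 and m == 6:           # the one exception: plain [Dw]
        return "[%04Xh]" % (code[1] | code[2] << 8), r
    if x == 0:                      # no displacement
        return "[%s]" % BASE16[m], r
    if x == 1:                      # Dc: signed byte
        disp = code[1] - 256 if code[1] >= 128 else code[1]
        sign = "-" if disp < 0 else "+"
        return "[%s %s %d]" % (BASE16[m], sign, abs(disp)), r
    # x == 2, Dw: unsigned word, low byte first
    return "[%s + %04Xh]" % (BASE16[m], code[1] | code[2] << 8), r


def decode_mov(code):
    """Decode one of the mov forms 210, 211, 212, 213 (octal)."""
    op = code[0]
    if op not in (0o210, 0o211, 0o212, 0o213):
        raise ValueError("not one of the four mov forms handled here")
    wide = bool(op & 1)             # last octal digit odd: word form
    e, r = decode_e16(code[1:], wide)
    reg = (RW if wide else RB)[r]
    # Bit 1 of the opcode selects the direction: 0 = "E, R", 1 = "R, E".
    return "mov %s, %s" % ((reg, e) if op & 2 else (e, reg))


print(decode_mov([0o210, 0o135, 0o375]))
```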



EXAMPLE 2: The instruction opcode: 216 332

Here, xm = 32, and s = 3, so the operands are:



mov SR, Ew

=>

mov DS, DX



A move to CS is not possible (because the far jump instruction already does

that) so that the opcode sequence:



216 x1m



is free to be used for encoding something else.



EXAMPLE 3: As an illustration of why it's better to think in octal, just look

at the opcodes for the binary arithmetic instructions:



0P0 xrm Op Eb, Rb

0P1 xrm Op Ew, Rw

0P2 xrm Op Rb, Eb

0P3 xrm Op Rw, Ew

0P4 Db Op AL, Db

0P5 Dw Op AX, Dw



They all have the same form, with a single digit encoding the operator as

follows:

P Op P Op

0 add 1 or

2 adc 3 sbb

4 and 5 sub

6 xor 7 cmp



That's a good fraction of your reference table right there.



EXAMPLE 4: The same mapping is used in the immediate to memory/register form

of these operations:



200 xPm Db Op Eb, Db

201 xPm Dw Op Ew, Dw

203 xPm Dc Op Ew, Dc
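A small Python sketch of EXAMPLES 3 and 4 (an illustration only; classify_alu is a made-up name). It reads the operator digit P either from the opcode itself or, for the immediate forms 200, 201 and 203, from the middle digit of the xPm byte:

```python
OPS = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]
FORMS = ["Eb, Rb", "Ew, Rw", "Rb, Eb", "Rw, Ew", "AL, Db", "AX, Dw"]
IMM_FORMS = {0o200: "Eb, Db", 0o201: "Ew, Dw", 0o203: "Ew, Dc"}


def classify_alu(opcode, xpm=None):
    """Name the binary arithmetic operation and its operand form."""
    hi, p, lo = opcode >> 6, (opcode >> 3) & 7, opcode & 7
    if hi == 0 and lo <= 5:         # the 0P0 .. 0P5 group
        return OPS[p] + " " + FORMS[lo]
    if opcode in IMM_FORMS:         # 200 / 201 / 203: P sits in xPm
        if xpm is None:
            raise ValueError("the xPm byte is needed for this opcode")
        return OPS[(xpm >> 3) & 7] + " " + IMM_FORMS[opcode]
    raise ValueError("not a binary arithmetic opcode")


print(classify_alu(0o051))          # octal 051 is hex 29
```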



(2) An Outline of 80x86 Instructions and Encoding

The authors of 8080 and 8086 references (including Intel's own references)

are apparently not aware of the octal nature of their own machines, and the

result is an almost grotesque complication and bungling up in the presentation

of something that is actually fairly simple. Thus, people claim that it's

almost impossible to know 8086 binary by heart, whereas in fact I know most of

it by memory. I'll straighten out the mess for you here.



As alluded to above, instructions are encoded as follows:



op xrm Const



where * op is a 1 or 2 byte opcode,

* xrm (if present) constitutes 3 octal digits whose normal uses are:

r = Register operand, xm = Memory or Register operand.

It may be followed immediately by a "displacement" byte or word,

depending solely on the digits x and m.

* Const (if present) denotes a byte or word value whose presence and

format depends solely on what op (and sometimes xrm) is.



In some cases, the opcode itself may be separated out into octal digits, e.g.



0s6 = push (Segment Register #s).



The one major exception to the coding scheme are all the conditional code

operations. Since there are 16 distinct conditional codes, they are

represented as a hexadecimal digit. The conditional jump in octal ranges

from 160 to 177, which is 7x in hexadecimal, where x is a hex digit encoding

the jump's condition. I'll represent them by the format: 160+CC.
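To make the 160+CC format concrete, here is a short Python sketch (an illustration, not from the article; jcc_mnemonic is a hypothetical helper, and only one of the synonymous mnemonics is listed per condition):

```python
# Index = CC (0..15), following the condition table given later in the article.
CC_NAMES = ["o", "no", "b", "ae", "e", "ne", "be", "a",
            "s", "ns", "p", "np", "l", "ge", "le", "g"]


def jcc_mnemonic(opcode):
    """Map a short conditional jump opcode (octal 160..177) to a mnemonic."""
    if not 0o160 <= opcode <= 0o177:
        raise ValueError("not a short conditional jump")
    return "j" + CC_NAMES[opcode - 0o160]


print(jcc_mnemonic(0x74))
```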



The register and address encoding was described above. The '386 expands on

this a little with the addition of two segment registers:



SEGMENT REGISTER (s): 0 1 2 3 4 5 6 7

SR = Segment register ES CS SS DS FS GS



In TABLE 1, note that the addresses encoded on modes 0m, 1m, 2m are the same

regardless of whether you're referring to Eb or Ew. What distinguishes them

is the size of the object being pointed to and this can be explicitly

indicated in traditional '86 assemblers like the following examples:



byte ptr [BP]

word ptr [BX + DI]



As explained before, all addresses, except those involving BP refer to the

data segment, DS. All the BP's refer to the stack segment, SS. This

is about to be explained.



(3) Segmentation and Registers

The 80x86 was designed with more or less specific uses for its registers.

In fact, the names are supposed to reflect their main uses:



AX (AH:AL) = Accumulator

BX (BH:BL) = Base Register

CX (CH:CL) = Counting Register

DX (DH:DL) = Data Register



CS = Code Segment -- where constants and programs lie.

DS = Data Segment -- where static variables lie.

SS = Stack Segment -- where auto variables and function parameters lie.

SP, BP = Stack and Frame Pointers, used to segment out the

local variables and function parameters.

ES = Extra Segment -- used in combination with the index registers for

string operations as follows:

DS:[SI] -- points to the Source of the string operation.

ES:[DI] -- points to the Destination of the string operation.



The typical setup for the stack is as follows:



High Addresses           FUNCTION DEFINITION:     FUNCTION CALL:
      ...                  push BP                  push Parameters
      Parameters           mov BP, SP               call Function
      Return Address       sub SP, Locals
BP -> Old BP               ... function ...
      Local Variables      mov SP, BP
SP -> ...                  pop BP
Low Addresses              ret Parameters



This dictates a certain protocol in calling functions with parameters and

returning from them, as shown above. In fact, this is so much so that the

opening and closing sequences above have all been defined as single operations

starting with the 80286 so that the function definition above can be rewritten

as:

FUNCTION DEFINITION:

enter Locals, 0

... function ...

leave

ret Parameters



(4) Word and Address Size on the 80386 and Above

Starting with the 80386, operations can be done with not just 16-bit words

but also 32 bit words. Generally the same operation is defined for both sets

and context is used to determine which is which in the following two ways:



* Which mode the machine is running in

Protected mode -- both word sizes and address sizes are 32-bits

Real & Virtual modes -- 16-bits.

* The presence of certain prefixes to override either the default

word size, address size or both on an instruction-by-instruction

basis.



(a) Word Size

When the word size for the current operation is 32-bits, everything listed

above as "word" is interpreted as 32-bits, including registers. The register

numbering corresponding to this word size is:



REGISTER (r): 0 1 2 3 4 5 6 7

Rb = Byte-sized register AL CL DL BL AH CH DH BH

Rd = Dword-sized register EAX ECX EDX EBX ESP EBP ESI EDI



(b) Address Size

When the address size is switched to 32-bits, the address scheme listed in

TABLE 1 is altered in its entirety.



TABLE 2: 32-BIT ADDRESSING MODE (x, m):

x m      Disp  Eb/Ew
---------------------------------------------
0 5      Dw    DS:[Dw]
0 4 sir        [Rd + SI + 0]
1 4 sir  Dc    [Rd + SI + Dc]
2 4 sir  Dw    [Rd + SI + Dw]
0 r            [Rd + 0]       (except r = 4, 5)
1 r      Dc    [Rd + Dc]      (except r = 4)
2 r      Dw    [Rd + Dw]      (except r = 4)
3 r            Rb / Rw

Encoding of scaled index SI:

si    SI
---------------
s0    EAX * 2^s
s1    ECX * 2^s
s2    EDX * 2^s
s3    EBX * 2^s
04    0
s5    EBP * 2^s
s6    ESI * 2^s
s7    EDI * 2^s

In the form 0 4 sir, r = 5 inhibits Rd: there is no base register and a
Dw displacement follows instead.



The encodings si = 14, 24 and 34 remain undefined.



This alteration is INDEPENDENT of the word size setting. That means that

even the "Dw"'s, "Rw"'s in the chart above will vary in interpretation as

16-bit or 32-bit objects depending on the word size setting. That leads to

4 possible combinations, not just 2.



EXAMPLE 5: The opcode sequence 211 135 375

This is the operation

mov Ew, Rw

where xm = 15, r = 3 and Disp = -3. The 4 combinations are:



Addr-Size Word-Size Operation

16 16 mov word ptr [DI - 3], BX

16 32 mov dword ptr [DI - 3], EBX

32 16 mov word ptr [EBP - 3], BX

32 32 mov dword ptr [EBP - 3], EBX



EXAMPLE 6: The opcode sequence 211 134 302 375 with 32-bit addressing.

This is the move instruction where xm = 14 and r = 3.



mov Ew, [E]BX ([E]BX since r = 3)



It uses the indexed register addressing. The address, Ew, may be derived

as follows:

x m sir Disp Ew Comments

1 4 sir Dc [EDX + SI + Dc]

1 4 si2 375 [EDX + SI - 3] (Rd = EDX for r = 2)

1 4 302 375 [EDX + 8*EAX - 3] (SI = 8*EAX for si = 30)



Therefore, this instruction represents one of the following:



Word-Size Operation: 211 134 302 375

16 mov word ptr [EDX + 8*EAX - 3], BX

32 mov dword ptr [EDX + 8*EAX - 3], EBX
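The derivation in EXAMPLES 5 and 6 can be sketched in Python as follows. This is only an illustration under the usual 32-bit addressing rules; decode_ea32 is a made-up name, segments are left implicit, and a register operand (x = 3) is assumed to be dword sized:

```python
RD = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_ea32(code):
    """Decode xrm [sir] [Disp] under 32-bit addressing.

    code is a list of byte values starting at xrm.
    Returns (operand text, r digit, number of bytes consumed).
    """
    x, r, m = code[0] >> 6, (code[0] >> 3) & 7, code[0] & 7
    if x == 3:
        return RD[m], r, 1          # register operand (dword size assumed)
    pos, terms = 1, []
    base = m
    if m == 4:                      # a sir byte follows xrm
        s, i, base = code[1] >> 6, (code[1] >> 3) & 7, code[1] & 7
        pos = 2
    no_base = x == 0 and base == 5  # x = 0 with base 5: displacement only
    if not no_base:
        terms.append(RD[base])
    if m == 4 and i != 4:           # i = 4 means "no index"
        terms.append(RD[i] if s == 0 else "%d*%s" % (1 << s, RD[i]))
    disp = 0
    if x == 1:                      # signed byte displacement
        disp = int.from_bytes(bytes(code[pos:pos + 1]), "little", signed=True)
        pos += 1
    elif x == 2 or no_base:         # 4-byte displacement, low byte first
        disp = int.from_bytes(bytes(code[pos:pos + 4]), "little", signed=True)
        pos += 4
    text = " + ".join(terms)
    if not terms:
        text = "%08Xh" % (disp & 0xFFFFFFFF)
    elif disp:
        text += " %s %d" % ("-" if disp < 0 else "+", abs(disp))
    return "[" + text + "]", r, pos


print(decode_ea32([0o134, 0o302, 0o375]))
```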



(5) The Opcode Summary

The chart below summarises all the machine instructions. The following

abbreviations are used:



Registers: Immediate Data Constant:

Rb (byte sized) Db (byte sized)

Rw (word sized) Dw (word sized)

Rd (dword sized) Dc (signed byte)



Register/Memory Address: Relative Code Address:

Eb (byte sized) Cb (byte sized)

Ew (word sized) Cw (word sized)



Memory Address: Code Address:

Es (16 bit selector) Af (32/48 bit absolute far code address)

En (near 16/32 bit pointer)

Ef (far i32/48 bit pointer)

Ep (pointer to 6-byte object)

Ea (generic address)



Processor Extensions:

* = 80186 extension

$ = 80286 extension

# = 80386 extension

@ = 80486 extension



The switch between 16 and 32 bit word size affects all operands labeled

Rw, Ew, Dw, Cw, En and even Af and Ef. The latter two objects refer to

far code addresses which are 4 bytes when the word size is 16 bits, and

6 bytes else.



The only such operands not actually affected by the word-size switch are

those whose size is a consequence of the operation's meaning. These include

the following: RET, BOUND, ARPL, SMSW, LMSW, LAR and LSL.



The switch between 16 and 32 bit address size affects all the operands

labeled Eb, Ew, Es, En, Ef, Ep, and Ea. Each of these is interpreted

according to the xm digits in the opcode, using either the 16-bit

address table described near the start of the article or the 32-bit address

table just described above.



NOTE: In the following presentation everything is in octal.



ARITHMETIC & LOGIC

------------------

Comments:

* All of these operations affect all 6 arithmetic flags, except NOT (which

affects no flags), and INC and DEC (which don't affect CF).

* IMUL and MUL only affect CF and OF predictably.

* IDIV and DIV affect no flags predictably.

* AND, OR, XOR, and TEST all set CF and OF to 0 and alter AF unpredictably.

* CMP and TEST have no effect on any operands. They're used for setting

flags. CMP is used for doing relational operators (< > <= >= == !=), and

TEST for doing bit-testing.

* CMP and TEST can have their operands listed in either order.

P Op Description

0 ADD L, E L += E

2 ADC L, E L += E + CF

5 SUB L, E L -= E

3 SBB L, E L -= E + CF

7 CMP L, E (void)(L - E)

1 OR L, E L |= E

4 AND L, E L &= E

6 XOR L, E L ^= E

0P0 xrm Op Eb, Rb

0P1 xrm Op Ew, Rw

0P2 xrm Op Rb, Eb

0P3 xrm Op Rw, Ew

0P4 Db Op AL, Db

0P5 Dw Op AX, Dw

200 xPm Db Op Eb, Db

201 xPm Dw Op Ew, Dw

203 xPm Dc Op Ew, Dc



NOT L L = ~L

366 x2m not Eb

367 x2m not Ew

NEG L L = -L

366 x3m neg Eb

367 x3m neg Ew



INC L L++

10r inc Rw

376 x0m inc Eb

377 x0m inc Ew

DEC L L--

11r dec Rw

376 x1m dec Eb

377 x1m dec Ew



TEST L, E (void)(L&E)

204 xrm test Rb, Eb

205 xrm test Rw, Ew

250 Db test AL, Db

251 Dw test AX, Dw

366 x0m Db test Eb, Db

367 x0m Dw test Ew, Dw



IMUL L, E, D L = (signed)E*D

IMUL L, E L = (signed)L*E

# 017 257 xrm imul Rw, Ew

* 151 xrm Dw imul Rw, Ew, Dw

* 153 xrm Dc imul Rw, Ew, Dc



In the following operations:

Operand Size ACC' ACC

1 AX AL

2 DX:AX AX

4 EDX:EAX EAX

P Op Description

4 MUL E ACC' = (unsigned) ACC*E

5 IMUL E ACC' = (signed) ACC*E

6 DIV E ACC' = (unsigned) ACC%E : ACC/E

7 IDIV E ACC' = (signed) ACC%E : ACC/E

366 xPm Op Eb

367 xPm Op Ew



SHIFTS & ROTATIONS

------------------

Comments:

* Where applicable, N is masked off by 0x1f.

* For Rxx and Sxx, OF is predictably affected only when N is 1.

* SHLD and SHRD affect all 6 arithmetic flags, but OF and AF unpredictably.

* RxL: OF = (CF != high order bit of L) before shift

* RxR: OF = (high order bit of L != next high order bit of L) before shift

* SxL: OF = (CF != sign bit of L) after shift

* SxR: OF = (sign bit of L) after shift

P Op Description

0 ROL CF <- [<-<-<-] <- high order bit Rotate

1 ROR low order bit -> [->->->] -> CF

2 RCL CF <- [<-<-<-] <- CF Rotate Through CF

3 RCR CF -> [->->->] -> CF

4 SHL CF <- [<-<-<-] <- 0 Shift (unsigned)

5 SHR 0 -> [->->->] -> CF

4 SAL CF <- [<-<-<-] <- 0 Shift (signed)

7 SAR sign bit -> [->->->] -> CF

* 300 xPm Db Op Eb, Db

* 301 xPm Db Op Ew, Db

320 xPm Op Eb, 1

321 xPm Op Ew, 1

322 xPm Op Eb, CL

323 xPm Op Ew, CL
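The arrow diagrams above can be read as one-bit steps repeated N times. A Python sketch of three of them (an illustration only; the function names and the default 8-bit width are assumptions, and OF is not modelled):

```python
def rol(value, count, bits=8):
    """ROL: the high bit goes to CF and re-enters at the low end."""
    mask, cf = (1 << bits) - 1, None    # cf stays None if count is 0
    for _ in range(count & 0x1F):       # N is masked off by 0x1f
        cf = (value >> (bits - 1)) & 1
        value = ((value << 1) | cf) & mask
    return value, cf


def rcl(value, count, cf, bits=8):
    """RCL: rotate through CF, so CF acts as an extra bit of the operand."""
    mask = (1 << bits) - 1
    for _ in range(count & 0x1F):
        out = (value >> (bits - 1)) & 1
        value = ((value << 1) | cf) & mask
        cf = out
    return value, cf


def sar(value, count, bits=8):
    """SAR: the sign bit is replicated, the low bit falls into CF."""
    sign, cf = (value >> (bits - 1)) & 1, None
    for _ in range(count & 0x1F):
        cf = value & 1
        value = (value >> 1) | (sign << (bits - 1))
    return value, cf


print(rol(0x81, 1), rcl(0x80, 1, 0), sar(0x80, 1))
```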



SHLD L, E, N CF:L = L:E << N

SHRD L, E, N L:CF = E:L >> N

# 017 244 xrm Db shld Ew, Rw, Db

# 017 245 xrm shld Ew, Rw, CL

# 017 254 xrm Db shrd Ew, Rw, Db

# 017 255 xrm shrd Ew, Rw, CL



TYPE CONVERSIONS

----------------

[] Decimal Conversions

Comments:

* DAA and DAS are used for adjusting the results of addition and subtraction

respectively back to packed BCD format. They will alter all 6 of the

arithmetic flags, OF unpredictably.

* AAA, AAS, AAD, and AAM are used for adjusting the results of the four

basic arithmetic operations back to unpacked BCD format or ASCII format.

However, AAD is used *before* a divide operation. They too affect all

6 of the arithmetic flags, but only AF and CF predictably (for AAA and

AAS) or SF, ZF and PF (for AAD and AAM).

* In the following, A0 stands for the lower 4 bits of AL and A1 the upper

4 bits of AL.

* The binary codes for AAM and AAD each consist of an opcode followed by

a constant 10 (012 in octal). It has been said that this "10" is

actually a hidden parameter to a more general AAD and AAM operator,

which can actually be used for any base other than 10. Some processors

will not allow AAD to be generalized in this way, however. The reason it

was left out in the open like this was supposedly because the original

8086 design literally ran out of space to pack in the opcode.

DAA if (A0 > 9) AF = 1; if (AF) AL += (0x10 - 10);

if (A1 > 9) CF = 1; if (CF) AL += (0x10 - 10)*0x10;

DAS if (A0 > 9) AF = 1; if (AF) AL -= (0x10 - 10);

if (A1 > 9) CF = 1; if (CF) AL -= (0x10 - 10)*0x10;

AAA if (A0 > 9) AF = 1; CF = AF; if (CF) A0 += (0x10 - 10), AH++;

AAS if (A0 > 9) AF = 1; CF = AF; if (CF) A0 -= (0x10 - 10), AH--;

AAM AX = AL/10 : AL%10

AAD AX = (10*AH + AL)%0x100

047 daa

057 das

067 aaa

077 aas

324 012 aam

325 012 aad
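A Python sketch of AAM and AAD with the "hidden" base made an explicit parameter, as the comment above suggests. This is an illustration, not a statement about any particular processor; the function names are made up, flags are not modelled, and a base of 0 would raise a division error here:

```python
def aam(al, base=10):
    """AAM: split AL into two digits. Returns (AH, AL)."""
    return al // base, al % base


def aad(ah, al, base=10):
    """AAD: combine two digits into AL before a divide. Returns (AH, AL)."""
    return 0, (ah * base + al) & 0xFF


print(aam(47), aad(4, 7))
```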



[] Sign Conversions

Comments:

* In converting from a shorter to longer operand size, sign conversion

involves either taking the leading (sign) bit and replicating it leftward

(conversion to signed), or placing zero's on the left (for conversion

to unsigned).

MOVSX L, E L = (signed)E

MOVZX L, E L = (unsigned)E

# 017 266 xrm movzx Rw, Eb

# 017 267 xrm movzx Rd, Ew

# 017 276 xrm movsx Rw, Eb

# 017 277 xrm movsx Rd, Ew



CBW AX = (signed)AL

CWDE EAX = (signed)AX

CWD DX:AX = (signed)AX

CDQ EDX:EAX = (signed)EAX

230 cbw / (#) cwde

231 cwd / (#) cdq
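The two kinds of widening described above, as a Python sketch (illustrative only; the explicit bit-width parameters are an assumption, since the real instructions take the widths from the operand sizes):

```python
def movzx(value, src_bits):
    """Zero extension: keep the source bits, everything above is 0."""
    return value & ((1 << src_bits) - 1)


def movsx(value, src_bits, dst_bits):
    """Sign extension: replicate the source's top bit leftward."""
    value &= (1 << src_bits) - 1
    if value >> (src_bits - 1):         # sign bit set
        value |= ((1 << dst_bits) - 1) & ~((1 << src_bits) - 1)
    return value


# CBW is movsx from AL to AX; CWDE is movsx from AX to EAX.
print(hex(movsx(0x80, 8, 16)), hex(movzx(0x80, 8)))
```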



[] Byte Ordering

* Used to convert between "little Endian" (Intel byte ordering) and "big

Endian" (Motorola byte ordering). Typical use: networking applications.

BSWAP L L[0]:L[1]:L[2]:L[3] = L[3]:L[2]:L[1]:L[0]

@ 017 31r bswap Rd
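In Python the same byte reversal can be sketched like this (bswap32 is a made-up helper name):

```python
def bswap32(value):
    """Reverse the four bytes of a 32-bit value."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")


print(hex(bswap32(0x12345678)))
```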



[] Table Lookup

XLATB AL = [BX + AL]

327 xlatb



SEMAPHORES & SYNCHRONIZATION

----------------------------

Comments:

* All these operations affect all 6 arithmetic flags. BT, BTS, BTR, BTC

affect only CF predictably; and BSF and BSR affect only ZF predictably.

* ACC is either AL, AX or EAX in CMPXCHG, depending on the operand size.

* WAIT is used in the '486 to force a pending unmasked interrupt from the

internal floating point processing unit.

* LOCK is a prefix used in multi-CPU contexts to assure exclusive access to

memory for the following two-step read & modify operations:

(INC, DEC, NEG, NOT) Mem (ADD, ADC, SUB, SBB) Mem, Src

(BT, BTS, BTR, BTC) Mem, Src (AND, XOR, OR) Mem, Src

XCHG Reg, Mem XCHG Mem, Reg

But XCHG automatically does its own LOCK so does not need to be prefixed.

P Op Description

4 BT L, N CF = L.N;

5 BTS L, N CF = L.N; L.N = 1;

6 BTR L, N CF = L.N; L.N = 0;

7 BTC L, N CF = L.N; L.N = !L.N;

# 017 2P3 xrm Op Ew, Rw

# 017 272 xPm Db Op Ew, Db



BSF L, E ZF = !E; if (!ZF) L = First 1-bit position in E; else L = ???

BSR L, E ZF = !E; if (!ZF) L = Last 1-bit position in E; else L = ???

# 017 274 xrm bsf Rw, Ew

# 017 275 xrm bsr Rw, Ew
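A Python sketch of the two bit scans (illustrative; the names are made up). ZF is set when E is zero, and the result is only defined when E is nonzero, so None stands in for the "???" case:

```python
def bsf(e):
    """Bit scan forward. Returns (ZF, index of the lowest 1 bit or None)."""
    if e == 0:
        return True, None
    return False, (e & -e).bit_length() - 1


def bsr(e):
    """Bit scan reverse. Returns (ZF, index of the highest 1 bit or None)."""
    if e == 0:
        return True, None
    return False, e.bit_length() - 1


print(bsf(0b101000), bsr(0b101000))
```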



CMPXCHG L, E ZF = (ACC == L); if (ZF) L = E; else ACC = L;

@ 017 260 xrm cmpxchg Eb, Rb

@ 017 261 xrm cmpxchg Ew, Rw

XADD L, L' =

@ 017 300 xrm xadd Eb, Rb

@ 017 301 xrm xadd Ew, Rw
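A Python sketch of what CMPXCHG and XADD compute (an illustration; the functions return the new values instead of modifying registers, and only ZF is modelled). XADD is taken here to exchange its two operands and leave their sum in the first one:

```python
def cmpxchg(acc, dest, src):
    """Returns (ZF, new ACC, new dest)."""
    if acc == dest:
        return True, acc, src       # match: dest takes the source value
    return False, dest, dest        # mismatch: ACC takes the dest value


def xadd(dest, src, bits=32):
    """Returns (new dest, new src): dest gets the sum, src the old dest."""
    return (dest + src) & ((1 << bits) - 1), dest


print(cmpxchg(5, 5, 9), cmpxchg(5, 7, 9), xadd(10, 3))
```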



NOP Delay 1 cycle.

WAIT Wait for coprocessor unit.

LOCK Hardware memory bus semaphore.

HLT Wait for a reset or interrupt.

220 nop

233 wait

360 lock

364 hlt



INT N push [E]FLAGS, CS, [E]IP; TF = 0;

if (the Nth entry in the IDT is an Interrupt Gate) IF = 0;

jmp to the far address listed under the Nth entry in the IDT

INTO if (OF) INT 4

IRET if (NT) return to task listed under TSS.BackLink;

else pop [E]IP, CS, [E]FLAGS;

314 int 3

315 Db int Db

316 into

317 iret



FLAGS

-----

Comments:

* No flags are affected except the explicit moves to the FLAGS register:

POPF[D] and SAHF, but SAHF only sets the arithmetic flags (except OF).

POPF pop FLAGS

POPFD pop EFLAGS

PUSHF push FLAGS

PUSHFD push EFLAGS

SAHF FLAGS |= (AH & 0xd5)

LAHF AH = FLAGS;

234 pushf / (#) pushfd

235 popf / (#) popfd

236 sahf

237 lahf

CMC CF = !CF

CLC CF = 0

STC CF = 1

CLI IF = 0 (Interrupts off)

STI IF = 1 (Interrupts on)

CLD DF = 0 (Set string ops to increment)

STD DF = 1 (Set string ops to decrement)

365 cmc

370 clc

371 stc

372 cli

373 sti

374 cld

375 std



CONDITIONAL OPERATIONS

----------------------

(NOTE: The values listed for CC are in octal).



CC Condition(s) Definition Descriptions

07 A NBE !CF && !ZF x > y x > 0 (unsigned)

03 AE NB !CF x >= y x >= 0 (unsigned)

02 B NAE CF x < y x < 0 (unsigned)

06 BE NA CF || ZF x <= y x <= 0 (unsigned)

17 G NLE SF == OF && !ZF x > y x > 0 (signed)

15 GE NL SF == OF x >= y x >= 0 (signed)

14 L NGE SF != OF x < y x < 0 (signed)

16 LE NG SF != OF || ZF x <= y x <= 0 (signed)

04 E Z ZF x == y x == 0

05 NE NZ !ZF x != y x != 0

00 O OF Overflow (signed overflow)

01 NO !OF No overflow (signed overflow)

02 C CF Carry (unsigned overflow)

03 NC !CF No carry (unsigned overflow)

10 S SF (Negative) sign

11 NS !SF No (negative) sign

12 P PE PF Parity [even]

13 NP PO !PF No parity (parity odd)

CC cc Cond.



Jcc Rel if (Cond) EIP += Rel;

SETcc L L = (Cond)? 1: 0;

# 017 200+CC Cw jcc Cw

# 017 220+CC x0m setcc Eb

160+CC jcc Cb
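The condition table has a regular shape: the upper three bits of CC pick a flag expression and the low bit negates it. A Python sketch (illustrative; condition is a made-up name and the flags are passed in as booleans):

```python
def condition(cc, cf=False, zf=False, sf=False, of=False, pf=False):
    """Evaluate condition code cc (0..15) against the given flags."""
    base = [of,                     # 00 O
            cf,                     # 02 B / C
            zf,                     # 04 E / Z
            cf or zf,               # 06 BE
            sf,                     # 10 S
            pf,                     # 12 P
            sf != of,               # 14 L
            (sf != of) or zf,       # 16 LE
            ][cc >> 1]
    return bool(base) != bool(cc & 1)   # odd codes are the negations


# SETcc stores 1 or 0 depending on this value; Jcc jumps when it is true.
print(condition(0o04, zf=True), condition(0o17, sf=True, of=True))
```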



STACK OPERATIONS

----------------

Comments:

* PUSHA[D] uses the value SP had before the operation started.

* POPA[D] doesn't actually affect [E]SP, which is why it's bracketed out.

* POP CS is not allowed because it's already subsumed by the RET (far)

operation. Instead, 017 is used as a 2-byte operation prefix.

* POP SS inhibits interrupts in order to allow [E]SP to be altered in the

following operation -- for what should be obvious reasons.

PUSH E SP -= sizeof E; SS:[SP] = E;

PUSHA push AX, CX, DX, BX, SP, BP, SI, DI

PUSHAD push EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI

# 017 240 push FS

# 017 250 push GS

0s6 push SR (s = 0-3)

12r push Rw

* 140 pusha / (#) pushad

150 Dw push Dw

152 Dc push Dc

377 x6m push Ew

POP L L = SS:[SP]; SP += sizeof L;

POPA pop DI, SI, BP, (SP), BX, DX, CX, AX

POPAD pop EDI, ESI, EBP, (ESP), EBX, EDX, ECX, EAX

# 017 241 pop FS

# 017 251 pop GS

0s7 pop SR (s = 0, 2-3)

13r pop Rw

* 141 popa / (#) popad

217 x0m pop Ew



TRANSFER OPERATIONS

-------------------

Comments:

* XCHG can have its operands listed in either order.

* MOV CS, ... is not allowed, since this is already subsumed by JMPs.

* LCS ... is not allowed either for the same reason.

* XCHG AX, AX is one and the same as NOP.

XCHG L, E =

206 xrm xchg Rb, Eb

207 xrm xchg Rw, Ew

22r xchg AX, Rw (r != 0)

MOV L, E L = E;

210 xrm mov Eb, Rb

211 xrm mov Ew, Rw

212 xrm mov Rb, Eb

213 xrm mov Rw, Ew

214 xsm mov Es, SR (s = 0-3, (#) 4-5)

216 xsm mov SR, Es (s = 0,2-3, (#) 4-5)

240 Dw mov AL, [Dw]

241 Dw mov AX, [Dw]

242 Dw mov [Dw], AL

243 Dw mov [Dw], AX

26r Db mov Rb, Db

27r Dw mov Rw, Dw

306 x0m Db mov Eb, Db

307 x0m Dw mov Ew, Dw

LEA L, An L = &An;

215 xrm lea Rw, En (x != 3)

LSeg L, Af Seg:L = &Af;

# 017 262 xrm lss Rw, Ef (x != 3)

# 017 264 xrm lfs Rw, Ef (x != 3)

# 017 265 xrm lgs Rw, Ef (x != 3)

304 xrm les Rw, Ef (x != 3)

305 xrm lds Rw, Ef (x != 3)



ADDRESSING

----------

Comments:

* The current mode of the machine determines its default mode (16 or 32

bits).

* RAND: and ADDR: (not standard names, since Intel has none) are

prefixes that alter the default for the next instruction only.

* RAND: changes the word size between 16 and 32 bits.

* ADDR: changes the address size between 16 and 32 bits.

* seg: cannot override the implied ES:[DI] operand in any string op,

but can override the DS in the implied DS:[SI] operands there.

seg: Segment override prefix

ADDR: Address size toggle

RAND: Operand size toggle


046 ES:

056 CS:

066 SS:

076 DS:

# 144 FS:

# 145 GS:

# 146 RAND:

# 147 ADDR:



PORT I/O

--------

Comments:

* In protected mode the user of these operations must pass the I/O

Privilege Level (IOPL) else they are blocked by an interrupt.

This allows the Operating System to spool I/O devices in a

multitasking system (since the OS handles interrupts) to avoid having

processes all trying to use the same device at once.

IN ACC, Port ACC = IO[Port]

344 Db in AL, Db

345 Db in AX, Db

354 in AL, DX

355 in AX, DX

OUT Port, ACC IO[Port] = ACC

346 Db out Db, AL

347 Db out Db, AX

356 out DX, AL

357 out DX, AX



STRING OPERATIONS

-----------------

Comments:

* In all these operations below, Src denotes DS:[ESI] and Dest ES:[EDI].

* Dest cannot be overridden by a segment prefix, only Src.

* The pointers (ESI, EDI) are bumped up (DF = 0) or down (DF = 1) after

the operation by sizeof Operand.

* ACC is either AL, AX or EAX depending on the operand size.

* The flags altered are exactly those altered by the corresponding

MOV, IN, OUT, or CMP operation (namely: only SCAS and CMPS alter the

flags and in the same way as CMP) and these are therefore the only ones

that can be prefixed by REP[N]E/REP[N]Z.

* REP works with all string ops, but REP LODS doesn't do anything sensible.

INS in Dest, DX

OUTS out DX, Src

MOVS mov Dest, Src

CMPS cmp Dest, Src

STOS mov Dest, ACC

LODS mov ACC, Src

SCAS cmp ACC, Dest

* 154 insb

* 155 insw / (#) insd

* 156 outsb

* 157 outsw / (#) outsd

244 movsb

245 movsw / (#) movsd

246 cmpsb

247 cmpsw / (#) cmpsd

252 stosb

253 stosw / (#) stosd

254 lodsb

255 lodsw / (#) lodsd

256 scasb

257 scasw / (#) scasd

REP Op while (CX-- > 0) Op

REPE /REPZ Op while (CX-- > 0 && ZF) Op

REPNE/REPNZ Op while (CX-- > 0 && !ZF) Op

362 repne / repnz

363 rep / repe / repz
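A Python sketch of two repeated string operations (an illustration only; memory is a plain bytearray or bytes object, the pointers are indexes into it, and the function names are made up):

```python
def rep_movsb(mem, si, di, cx, df=False):
    """REP MOVSB: copy cx bytes. Returns the final (SI, DI, CX)."""
    step = -1 if df else 1          # DF = 1 walks downward
    while cx != 0:
        mem[di] = mem[si]
        si, di, cx = si + step, di + step, cx - 1
    return si, di, cx


def repne_scasb(mem, di, cx, al):
    """REPNE SCASB: scan for al. Returns the final (DI, CX, ZF).

    ZF is reported as False if no comparison ran (cx == 0 on entry);
    on the real machine it would simply be left unchanged.
    """
    zf = False
    while cx != 0:
        zf = mem[di] == al          # the compare sets ZF
        di, cx = di + 1, cx - 1
        if zf:                      # REPNE stops once ZF is set
            break
    return di, cx, zf


buf = bytearray(b"abcd\0\0\0\0")
print(rep_movsb(buf, 0, 4, 4), bytes(buf))
```

Note that the pointer and the count are updated before the ZF test, so after a successful scan DI points one past the matching byte.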



CONTROL FLOW

------------

Comments:

* The distinction between near and far jumps/calls/returns is built right

into the 8086 language, which pretty much forces you to explicitly

declare a routine as "near" or "far" and be consistent about it. The

intended usage runs pretty much like C's static vs. global functions,

with each C file being analogous to an 8086 segment.

* The 8086 was specifically designed to be a Pascal (and PL/I) machine,

though. Intel wrongly assumed that one of these languages would become

like C is now. So the ENTER and LEAVE operators were added (and BOUND

to do array bounds-checking). The segmentation structure was intended

to support these types of languages.

JCXZ Rel if (!CX) IP += Rel;

JECXZ Rel if (!ECX) IP += Rel;

LOOPcc Rel if (--CX && cc) IP += Rel;

340 Cb loopnz Cb / loopne Cb

341 Cb loopz Cb / loope Cb

342 Cb loop Cb

343 Cb jcxz Cb / (#) jecxz Cb

JMP Rel IP += Rel;

JMP FAR Af CS:IP = Af;

CALL Rel push IP; IP += Rel;

CALL FAR Af push CS, IP; CS:IP = Af;

232 Af call Af

350 Cw call Cw

351 Cw jmp Cw

352 Af jmp far Af

353 Cb jmp Cb

377 x2m call En

377 x3m call far Ef

377 x4m jmp En

377 x5m jmp far Ef

RET Params pop IP; SP += Params (default: Params = 0)

RET FAR Params pop IP, CS; SP += Params (default: Params = 0)

302 Dw ret Dw

303 ret

312 Dw ret far Dw

313 ret far

ENTER Locs, N push EBP;

(sub EBP, 4; push [EBP]) N-1 times, if N > 0

mov EBP, ESP

(add EBP, 4*(N-1); push EBP), if N > 0

sub ESP, Locs

LEAVE mov ESP, EBP; pop EBP

* 310 Dw Db enter Dw, Db

* 311 leave



SYSTEM CONTROL & MEMORY PROTECTION

----------------------------------

BOUND A, AA if (A not in range AA[0]..AA[1]) INT 5

ARPL L, E ZF = (L.RPL < E.RPL);

if (ZF) L.RPL = E.RPL;

* 142 xrm bound Rw, Ed

$ 143 xrm arpl Es, Rw



SLDT Sel Sel = LDTR

STR Sel Sel = TR

LLDT Sel LDTR = Sel

LTR Sel TR = Sel

VERR Sel ZF = (Sel is accessible and has read-access)

VERW Sel ZF = (Sel is accessible and has write-access)

LAR L, Sel ZF = (Sel is accessible);

if (ZF) L = the access rights of Sel's descriptor.

LSL L, Sel ZF = (Sel is accessible);

if (ZF) L = the segment limit of Sel's descriptor.

$ 017 000 x0m sldt Ew

$ 017 000 x1m str Ew

$ 017 000 x2m lldt Ew

$ 017 000 x3m ltr Ew

$ 017 000 x4m verr Ew

$ 017 000 x5m verw Ew

$ 017 002 xrm lar Rw, Ew

$ 017 003 xrm lsl Rw, Ew



SGDT Desc Desc = GDTR

SIDT Desc Desc = IDTR

LGDT Desc GDTR = Desc

LIDT Desc IDTR = Desc

$ 017 001 x0m sgdt Ep

$ 017 001 x1m sidt Ep

$ 017 001 x2m lgdt Ep

$ 017 001 x3m lidt Ep



SMSW L L = MSW ... note that MSW is CR0 bits 0-15.

LMSW E MSW = E

CLTS MSW.3 = 0 ... clears the Task Switched flag.

$ 017 001 x4m smsw Ew

$ 017 001 x6m lmsw Ew

$ 017 006 clts



INVD Invalidate internal cache.

WBINVD Invalidate internal cache, after writing it back.

INVLPG Ea Invalidate Ea's page.

@ 017 010 invd

@ 017 011 wbinvd

@ 017 001 x7m invlpg Ea



MOV Reg, SysReg

MOV SysReg, Reg

# 017 040 3nr mov Rd, CRn (n = 0-3)

# 017 041 3nr mov Rd, DRn (n = 0-3, 6-7)

# 017 042 3nr mov CRn, Rd (n = 0, 2-3)

# 017 043 3nr mov DRn, Rd (n = 0-3, 6-7)

# 017 044 3nr mov Rd, TRn (n = 6-7)

# 017 046 3nr mov TRn, Rd (n = 6-7)
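
In the "3nr" rows the ModRM byte always has mod = 3, the system register number n sits in the middle digit, and the general register r in the last digit. A short Python sketch under that reading; the function names are hypothetical and only the control-register rows are covered:

```python
# 32-bit general registers in their encoding order (index = register number)
GPR32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def mov_from_cr(dst: str, n: int) -> bytes:
    """mov dst, CRn  ->  017 040 3nr"""
    return bytes([0o017, 0o040, 0o300 | (n << 3) | GPR32.index(dst)])

def mov_to_cr(n: int, src: str) -> bytes:
    """mov CRn, src  ->  017 042 3nr"""
    return bytes([0o017, 0o042, 0o300 | (n << 3) | GPR32.index(src)])

print(mov_from_cr("eax", 0).hex(" "))   # 0f 20 c0
print(mov_to_cr(3, "eax").hex(" "))     # 0f 22 d8
```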



CO-PROCESSOR ESCAPE SEQUENCE

----------------------------

Comments:

* This escape sequence is intended to be used with an external co-processor

with the most common application being the 80x87 floating point unit.

* Starting in the 80486, the floating point unit was made internal to the

processor.



ESC TL, Ea Escape, operation TL, address mode Ea.

33T xLm esc TL Ea



(6) Floating Point Operations

The Floating Point unit consists of 8 internal registers arranged in a

circular stack, and the Control Word (CW), Status Word (SW) and Tag Word (TW)

registers. The floating point stack registers all store data in Real80

format (described below).

Operations are carried out on data in the following formats (low-order bits

on right):



INTEGER: 16/32/64 bits (Int16, Int32, Int64)

BCD: (BCD80)

S 0000000 D D D D D D D D D D D D D D D D D D

S = 1-bit sign (1 = negative, 0 = positive)

D = 4-bit digit (encodes digits 0-9).

FLOATING POINT: 32/64/80 bits (Real32, Real64, Real80)

S Exponent Mantissa

S = 1-bit sign (1 = negative, 0 = positive)

Exponent = 8/11/15 bit biased exponent

Mantissa = 23/52/64 bit binary fraction.

The values of floating point numbers in each format are as follows:

Real32: (-1)^S x (1 + Mantissa/2^23) x 2^(Exponent - 127)

Real64: (-1)^S x (1 + Mantissa/2^52) x 2^(Exponent - 1023)

Real80: (-1)^S x Mantissa/2^63 x 2^(Exponent - 16383)
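
To sanity-check the Real32 case, the sketch below evaluates the value as (-1)^S x (1 + Mantissa/2^23) x 2^(Exponent - 127) and compares it with Python's own decoding of the same bits. It only covers normal numbers (Exponent between 1 and 254); the function names are invented for the example.

```python
import struct

def real32_value(bits: int) -> float:
    """Value of a normal Real32 bit pattern, computed from its three fields."""
    s = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return (-1) ** s * (1 + mantissa / 2**23) * 2.0 ** (exponent - 127)

def real32_reference(bits: int) -> float:
    """Same bit pattern decoded by the struct module, for comparison."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

for bits in (0x3FC00000, 0xC0490FDB):
    print(hex(bits), real32_value(bits), real32_reference(bits))
```

The first pattern has Exponent = 127 and a mantissa of one half, so it decodes to 1.5.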



The floating point formats do not cover all the logical combinations of binary

0's and 1's, and the remaining combinations are defined for special purposes:



Sign  Exponent       Mantissa       Meaning

S     0 0 0 ... 0    0 0 0 ... 0    0

S     0 0 0 ... 0    ... 1 ...      DENORMAL (Infinitesimal)

S     1 1 1 ... 1    0 0 0 ... 0    INFINITY

S     1 1 1 ... 1    0 ... 1 ...    Signalling NaN (Not a Number)

S     1 1 1 ... 1    1 ...          Quiet NaN



This is all IEEE standard format. Quiet NaN's are set by the FP Unit to

indicate invalid operations.
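
The special-value table can be turned into a small classifier. This is only a sketch for the Real32 layout (8-bit exponent, 23-bit mantissa), taking the top mantissa bit as the quiet/signalling distinction as the table suggests; classify_real32 is a made-up name.

```python
def classify_real32(bits: int) -> str:
    """Classify a Real32 bit pattern by its exponent and mantissa fields."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0:
        # All-zero exponent: zero or denormal, depending on the mantissa
        return "zero" if mantissa == 0 else "denormal"
    if exponent == 0xFF:
        # All-one exponent: infinity or one of the NaN kinds
        if mantissa == 0:
            return "infinity"
        return "quiet NaN" if mantissa & 0x400000 else "signalling NaN"
    return "normal"

for bits in (0x00000000, 0x00000001, 0x7F800000, 0x7F800001, 0x7FC00000):
    print(f"{bits:08x}", classify_real32(bits))
```

The sign bit is ignored on purpose, since every row of the table applies to both signs.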



Notation:

ST(n) -- the nth item below the stack top.

ST ----- ST(0), the stack top.

Int*, BCD*, Real* -- described above.



All Int*, BCD*, and Real* operands are stored in memory and are encoded in

the 80x86's current addressing mode (16 or 32 bit). All opcodes are

listed in the format:



T L xm for 8086 escape code 33T xLm



Since only memory addresses are used in the operations, that frees up all

the combinations xm where x = 3. These are generally used to encode the

operations that do not involve memory addresses. In the following

presentation where "xm" is listed generally, it is understood that x is not 3.
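
To read a table row back from raw bytes, the two bytes of an escape instruction can be split into the T, L, x and m digits described above. A minimal Python sketch, with decode_esc as a hypothetical helper name:

```python
def decode_esc(b0: int, b1: int):
    """Split an ESC instruction (octal 33T xLm) into its digits (T, L, x, m)."""
    if b0 & 0o370 != 0o330:
        raise ValueError("first byte is not an ESC opcode (octal 330-337)")
    t = b0 & 7            # low digit of the opcode byte
    x = b1 >> 6           # mod field: 3 means no memory operand
    l = (b1 >> 3) & 7     # middle digit of the second byte
    m = b1 & 7            # r/m field, or the ST(m) index when x = 3
    return t, l, x, m

print(decode_esc(0xD9, 0xE8))   # (1, 5, 3, 0), i.e. row "1 5 30"
print(decode_esc(0xDD, 0x1E))   # (5, 3, 0, 6), i.e. a "5 P xm" row with P = 3
```

The first pair is listed in the tables as fld1; the second falls under the Real64 row of the data transfer table with P = 3 (fstp).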



The operations FENI, FDISI are specific to the 8087; FSETPM to the 80287

and FUCOM*, FPREM1, and the trig. operations FSIN, FCOS, FSINCOS are all

present only in the 80387 and after.



DATA TRANSFER

-------------

Comments:

* The following table is used:

P 0 2 3

F-OP fld fst fstp

I-OP fild fist fistp

FLD Arg ST = (Real80)Arg

FST Arg Arg = (typeof Arg)ST

FSTP Arg Arg = (typeof Arg)ST; pop();

FXCH Arg Arg <--> ST, with appropriate type conversions.

1 P xm F-OP Real32

3 P xm I-OP Int32

5 P xm F-OP Real64

7 P xm I-OP Int16

3 5 xm fld Real80

3 7 xm fstp Real80

7 4 xm fbld BCD80

7 5 xm fild Int64

7 6 xm fbstp BCD80

7 7 xm fistp Int64

1 0 3m fld ST(m)

1 1 3m fxch ST(m)

5 2 3m fst ST(m)

5 3 3m fstp ST(m)



COMPARISON

----------

Comments:

* The following table is used:

P 2 3

F-OP fcom fcomp

I-OP ficom ficomp

FCOM Arg cmp ST, Arg

FCOMP Arg cmp ST, Arg; pop();

0 P xm F-OP Real32

2 P xm I-OP Int32

4 P xm F-OP Real64

6 P xm I-OP Int16

0 P 3m F-OP ST(m)

FCOMPP cmp ST, ST(1); pop(); pop();

6 3 31 fcompp

FTST cmp ST, 0.0

1 4 34 ftst

FXAM examine ST

1 4 35 fxam

FUCOM Arg unordered compare ST, Arg

FUCOMP Arg unordered compare ST, Arg; pop();

FUCOMPP unordered compare ST, ST(1); pop(); pop();

5 4 3m fucom ST(m)

5 5 3m fucomp ST(m)

2 5 31 fucompp



ARITHMETIC OPERATIONS

---------------------

Comments:

* The following table is used:

P 0 1 4 5 6 7

F-OP fadd fmul fsub fsubr fdiv fdivr

I-OP fiadd fimul fisub fisubr fidiv fidivr

P-OP faddp fmulp fsubp fsubrp fdivp fdivrp

* Dest is ST and Src the listed operand except where noted below.

FADD Arg Dest += Src

FSUB Arg Dest -= Src

FSUBR Arg Dest = Src - Dest

FMUL Arg Dest *= Src

FDIV Arg Dest /= Src

FDIVR Arg Dest = Src/Dest

0 P xm F-OP Real32

2 P xm I-OP Int32

4 P xm F-OP Real64

6 P xm I-OP Int16

0 P 3m F-OP ST(m)

4 P 3m F-OP ST(m) (Dest = ST(m), Src = ST)

6 P 3m P-OP ST(m) (Dest = ST(m), Src = ST)
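
Going the other way, a row such as "0 P 3m" can be assembled by placing T in the opcode byte and L, x, m in the second byte. The Python sketch below does this for the register forms only and is deliberately limited to fadd and fmul; the subtract and divide register forms in the "4 P 3m" and "6 P 3m" rows should be checked against a processor manual before relying on them. All function names are hypothetical.

```python
# P column values taken from the table above (fadd and fmul only)
P_FIELD = {"fadd": 0, "fmul": 1}

def esc(t: int, l: int, x: int, m: int) -> bytes:
    """Assemble the two bytes of an escape instruction, octal 33T xLm."""
    return bytes([0o330 | t, (x << 6) | (l << 3) | m])

def arith_st_stm(op: str, m: int) -> bytes:
    """Row "0 P 3m": Dest = ST, Src = ST(m)."""
    return esc(0, P_FIELD[op], 3, m)

def arith_pop_stm(op: str, m: int) -> bytes:
    """Row "6 P 3m": Dest = ST(m), Src = ST, then pop. op is given without the trailing p."""
    return esc(6, P_FIELD[op], 3, m)

print(arith_st_stm("fmul", 2).hex(" "))    # d8 ca  (fmul st, st(2))
print(arith_pop_stm("fadd", 3).hex(" "))   # de c3  (faddp st(3), st)
```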



CONSTANTS

---------

FLD1 ST = 1.0

FLDL2T ST = log_2(10)

FLDL2E ST = log_2(e)

FLDPI ST = pi

FLDLG2 ST = log_10(2)

FLDLN2 ST = ln(2)

FLDZ ST = 0.0

1 5 30 fld1

1 5 31 fldl2t

1 5 32 fldl2e

1 5 33 fldpi

1 5 34 fldlg2

1 5 35 fldln2

1 5 36 fldz



BUILT-IN FUNCTIONS

------------------

Comments:

* The stack replacements entail pop()'s.

FCHS ST = -ST

FABS ST = |ST|

F2XM1 ST = 2^ST - 1

FYL2X Replace the stack: ST(1), ST -> ST(1)*log_2(ST)

FPTAN Replace the stack: ST -> tan(ST), 1.0

FPATAN Replace the stack: ST(1), ST -> atan(ST(1)/ST)

FXTRACT Replace the stack: ST -> exponent(ST), mantissa(ST)

FPREM1 ST = remainder(ST/ST(1)), IEEE consistent

FPREM ST = remainder(ST/ST(1))

FYL2XP1 Replace the stack: ST(1), ST -> ST(1)*log_2(ST + 1)

FSQRT ST = sqrt(ST)

FSINCOS Replace the stack: ST -> sin(ST), cos(ST)

FRNDINT ST = round(ST)

FSCALE ST *= 2^(int)ST(1)

FSIN ST = sin(ST)

FCOS ST = cos(ST)

1 4 30 fchs

1 4 31 fabs

1 6 30 f2xm1

1 6 31 fyl2x

1 6 32 fptan

1 6 33 fpatan

1 6 34 fxtract

1 6 35 fprem1

1 7 30 fprem

1 7 31 fyl2xp1

1 7 32 fsqrt

1 7 33 fsincos

1 7 34 frndint

1 7 35 fscale

1 7 36 fsin

1 7 37 fcos



CONTROL

-------

Comments:

* The save and load operations for the environment and state are used

primarily for multitasking applications where 2 or more processes are

using the FP unit concurrently.

FNOP Delay 1 cycle.

FLDENV Arg Load FP environment from [Arg]

FLDCW Arg CW = Arg

FSTENV Arg Save FP environment to [Arg]

FSTCW Arg Arg = CW

FDECSTP TOP = (TOP - 1) mod 8

FINCSTP TOP = (TOP + 1) mod 8

FENI Enable interrupts (8087 only)

FDISI Disable interrupts (8087 only)

FCLEX Clear out FP exception flags

FINIT Initialize FP registers

FSETPM Enter Protected Mode (80287 only)

FFREE ST(m) Mark register m as unused.

FRSTOR Arg Restore FP state from [Arg]

FSAVE Arg Save FP state to [Arg]

FSTSW Arg Arg = SW

1 2 30 fnop

1 4 xm fldenv Ea

1 5 xm fldcw Ea

1 6 xm fstenv Ea

1 7 xm fstcw Ea

1 6 36 fdecstp

1 6 37 fincstp

3 4 30 feni

3 4 31 fdisi

3 4 32 fclex

3 4 33 finit

3 4 34 fsetpm

5 0 3m ffree ST(m)

5 4 xm frstor Ea

5 6 xm fsave Ea

5 7 xm fstsw Ea

7 4 30 fstsw AX
Post 11 Jan 2007, 20:15
rhyno_dagreat



Joined: 31 Jul 2006
Posts: 487
Location: Maryland, Unol Daleithiau
rhyno_dagreat 12 Jan 2007, 03:28
Thanks Tantrikwizard! This is some very good info!
Post 12 Jan 2007, 03:28
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
