flat assembler
Message board for the users of flat assembler.


vbVeryBeginner



Joined: 15 Aug 2004
Posts: 884
Location: \\world\asia\malaysia
vbVeryBeginner 01 Nov 2004, 03:23
Assembly Instruction Set

Typical 8086 Instruction Format

A machine instruction for the 8086 occupies from one to six bytes. For most instructions, the first byte contains the opcode, the second byte contains the addressing modes of the operands, and the other bytes contain either address information or immediate data. A typical two-operand instruction has the format given in the figure below.

[Image: format of a typical two-operand 8086 instruction]

In the first byte, we see a six-bit opcode that identifies the operation. The same opcode is used for both 8 and 16 bit operations. The size of the operands is given by the W bit: W=0 means 8 bit data and W=1 means 16 bit data.

For register-to-register, register-to-memory, and memory-to-register operations, the REG field in the second byte contains a register number, and the D bit specifies whether the register in the REG field is a source or destination operand: D=0 means source and D=1 means destination. For other types of operations, the REG field contains a three-bit extension of the opcode.
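
As a rough illustration of the bit layout described above, here is a minimal Python sketch that pulls the opcode, D, W, MOD, REG and R/M fields out of the first two bytes. Python is used only for illustration, and the name split_fields is made up for this example; it does not come from the book. The sample bytes 01 D8 are the ADD AX,BX encoding worked through later in this post.

```python
def split_fields(first_byte, second_byte):
    """Split the first two bytes of a two-operand 8086 instruction into bit fields."""
    return {
        "opcode": first_byte >> 2,           # top six bits of byte 1
        "d": (first_byte >> 1) & 1,          # direction bit
        "w": first_byte & 1,                 # width bit
        "mod": second_byte >> 6,             # top two bits of byte 2
        "reg": (second_byte >> 3) & 0b111,   # middle three bits of byte 2
        "rm": second_byte & 0b111,           # low three bits of byte 2
    }


fields = split_fields(0x01, 0xD8)
print(fields)
```

The dictionary keys are just labels chosen for this sketch; a real assembler or disassembler would probably keep these fields in whatever structure suits its own tables.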

Code:
+-----------------+-----------------------------------------------------+
|                 |                                                     |
|     MOD = 11    |             Effective Address Calculation           |
|                 |                                                     |
+-----+-----+-----+-----+----------------+--------------+---------------+
| R/M | W=0 | W=1 | R/M | MOD=00         | MOD=01       | MOD=10        |
+-----+-----+-----+-----+----------------+--------------+---------------+
| 000 | AL  | AX  | 000 | (BX)+(SI)      | (BX)+(SI)+D8 | (BX)+(SI)+D16 |
+-----+-----+-----+-----+----------------+--------------+---------------+
| 001 | CL  | CX  | 001 | (BX)+(DI)      | (BX)+(DI)+D8 | (BX)+(DI)+D16 |
+-----+-----+-----+-----+----------------+--------------+---------------+
| 010 | DL  | DX  | 010 | (BP)+(SI)      | (BP)+(SI)+D8 | (BP)+(SI)+D16 |
+-----+-----+-----+-----+----------------+--------------+---------------+
| 011 | BL  | BX  | 011 | (BP)+(DI)      | (BP)+(DI)+D8 | (BP)+(DI)+D16 |
+-----+-----+-----+-----+----------------+--------------+---------------+
| 100 | AH  | SP  | 100 | (SI)           | (SI)+D8      | (SI)+D16      |
+-----+-----+-----+-----+----------------+--------------+---------------+
| 101 | CH  | BP  | 101 | (DI)           | (DI)+D8      | (DI)+D16      |
+-----+-----+-----+-----+----------------+--------------+---------------+
| 110 | DH  | SI  | 110 | DIRECT ADDRESS | (BP)+D8      | (BP)+D16      |
+-----+-----+-----+-----+----------------+--------------+---------------+
| 111 | BH  | DI  | 111 | (BX)           | (BX)+D8      | (BX)+D16      |
+-----+-----+-----+-----+----------------+--------------+---------------+
    


MOD=11 means register mode
MOD=00 means memory mode with no displacement, except when R/M=110, in which case a 16-bit displacement follows
MOD=01 means memory mode, with an 8-bit displacement following (D8)
MOD=10 means memory mode, with a 16-bit displacement following (D16)
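
The MOD rules and the effective-address table above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the table; the names EA_BASE, displacement_size and describe_operand are invented for this example.

```python
# Base expressions for R/M = 000..111 in the memory modes (MOD = 00, 01, 10).
EA_BASE = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def displacement_size(mod, rm):
    """Number of displacement bytes that follow the second byte."""
    if mod == 0b00:
        # R/M = 110 is the special case: a direct 16-bit address follows.
        return 2 if rm == 0b110 else 0
    if mod == 0b01:
        return 1
    if mod == 0b10:
        return 2
    return 0  # MOD = 11: register operand, no displacement


def describe_operand(mod, rm):
    """Return a readable form of the operand selected by MOD and R/M."""
    if mod == 0b11:
        return "register"
    if mod == 0b00 and rm == 0b110:
        return "[D16]"  # direct address
    suffix = {0b00: "", 0b01: "+D8", 0b10: "+D16"}[mod]
    return "[" + EA_BASE[rm] + suffix + "]"


print(describe_operand(0b01, 0b110), displacement_size(0b01, 0b110))
```

Knowing the displacement size is what lets a decoder work out where the instruction ends, which is why the R/M=110 exception matters in practice.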

The combination of the W bit and the REG field can specify a total of 16 registers.
The second operand is specified by the MOD and R/M fields.
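
To make the "16 registers" count concrete, here is a small Python sketch of the lookup, using the register columns from the MOD = 11 part of the table. The names REG8, REG16 and register_name are assumptions made for this example only.

```python
# Register names indexed by the 3-bit register number, taken from the table above.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]    # W = 0
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]   # W = 1


def register_name(w, reg):
    """Pick the register selected by the W bit and a 3-bit register number."""
    return (REG16 if w else REG8)[reg]


# Enumerate every (W, REG) combination: 2 widths x 8 numbers = 16 registers.
all_registers = [register_name(w, reg) for w in (0, 1) for reg in range(8)]
print(all_registers)
```

The same lookup applies to the R/M field when MOD = 11, since in that mode R/M holds a register number too.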
Code:
+-------------------+
| Bits for          |
| Segment Registers |
+-------------------+
| 000     ES        |
| 001     CS        |
| 010     SS        |
| 011     DS        |
| 100     FS (386+) |
| 101     GS (386+) |
+-------------------+
    

e.g.
Code:
ADD AX,BX    01D8    0000 0001 1101 1000
ADD BX,AX    01C3    0000 0001 1100 0011
    

ours is r/r (register to register),
so it uses the encoding 0000 00dw mod reg r/m
Code:
ADD AX,BX
opcode = 0000 00 <- for add instruction
d      = 0
w      = 1
mod    = 11
reg    = 011
r/m    = 000
    

reg 011 is BX because w = 1 (it would be BL if w = 0)
so we know reg is BX, but is BX the source or the destination? for that we refer to d
d = 0 means reg is the source, so r/m is automatically the destination
so ADD dest,source = ADD 000(AX),011(BX)
Code:
ADD BX,AX
opcode = 0000 00 <- for add instruction
d      = 0
w      = 1
mod    = 11
reg    = 000
r/m    = 011
    


reg 000 is AX because w = 1 (it would be AL if w = 0)
so we know reg is AX, but is AX the source or the destination? for that we refer to d
d = 0 means reg is the source, so r/m is automatically the destination
so ADD dest,source = ADD 011(BX),000(AX)
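
The two walkthroughs above follow the same steps, so they can be written as one small routine. This is only a sketch in Python and it handles just the register-to-register case (mod = 11) of the 0000 00dw form; the name decode_add_rr is made up for this example.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]    # w = 0
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]   # w = 1


def decode_add_rr(first_byte, second_byte):
    """Decode a two-byte ADD reg,reg instruction (0000 00dw mod reg r/m)."""
    if first_byte >> 2 != 0b000000:
        raise ValueError("not the 0000 00dw form of ADD")
    d = (first_byte >> 1) & 1
    w = first_byte & 1
    mod = second_byte >> 6
    reg = (second_byte >> 3) & 0b111
    rm = second_byte & 0b111
    if mod != 0b11:
        raise ValueError("memory operand: this sketch only handles mod = 11")
    names = REG16 if w else REG8
    # d = 0: reg is the source and r/m the destination; d = 1: the other way round.
    dest, src = (names[reg], names[rm]) if d else (names[rm], names[reg])
    return "ADD " + dest + "," + src


print(decode_add_rr(0x01, 0xD8))
print(decode_add_rr(0x01, 0xC3))
```

Note that with d = 1 the bytes 03 C3 decode to ADD AX,BX as well, so the same register-to-register instruction has two valid encodings; which one an assembler emits is its own choice.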

Code:
opcode = 0000 00
d      = 0   0 means reg is source and 1 means reg is destination
w      = 1   the width is a word
mod    = 11  r/m names a register; e.g. 011 is BX when w = 1 and BL when w = 0
reg    = register
r/m    = register / memory
    


Code:
ADD: Addition
Format:     ADD destination,source
Operation:  (dest) = (source) + (dest)
Encoding:   Memory or register with register (m/r or r/r)
            0000 00dw mod reg r/m
            Immediate to accumulator
            0000 010w data
            Immediate to memory or register
            1000 00sw mod 000 r/m data
            (s is set if a byte of data is sign-extended and added to a 16-bit memory or register operand)
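
The two immediate forms listed above can be tried out with a short Python sketch. It covers only 16-bit register destinations, and it assumes two things that the summary does not spell out: 16-bit immediates are stored low byte first, and the s = 1 form is chosen whenever the value fits in a signed byte. The function names are invented for this example.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def encode_add_ax_imm16(value):
    """Immediate to accumulator: 0000 010w data, with w = 1 (AX)."""
    return bytes([0b00000101, value & 0xFF, (value >> 8) & 0xFF])


def encode_add_reg16_imm(reg, value):
    """Immediate to register: 1000 00sw mod 000 r/m data, with mod = 11 and w = 1."""
    if not -0x8000 <= value <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    modrm = 0b11000000 | (0b000 << 3) | REG16.index(reg)
    if -128 <= value <= 127:
        # s = 1: a single data byte follows and is sign-extended to 16 bits.
        return bytes([0b10000011, modrm, value & 0xFF])
    # s = 0: a full 16-bit immediate follows, low byte first.
    return bytes([0b10000001, modrm, value & 0xFF, (value >> 8) & 0xFF])


print(encode_add_ax_imm16(0x1234).hex())
print(encode_add_reg16_imm("BX", 5).hex())
print(encode_add_reg16_imm("BX", 0x1234).hex())
```

The s bit is a size optimisation: ADD BX,5 takes three bytes with s = 1 instead of four with s = 0.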
    


reference
---------
1. Assembly Language Programming and Organization of the IBM PC, Ytha Yu, Charles Marut, ISBN 0-07-072692-2
(this book is better in terms of explanation)


Last edited by vbVeryBeginner on 01 Nov 2004, 17:03; edited 4 times in total
Matrix



Joined: 04 Sep 2004
Posts: 1166
Location: Overflow
Matrix 01 Nov 2004, 03:32
Hello,
nice, but I don't know if Privalov will like your post, because the forum description says:
Here you can ask questions about the fasm source code, report bugs, submit modifications.
vbVeryBeginner



Joined: 15 Aug 2004
Posts: 884
Location: \\world\asia\malaysia
vbVeryBeginner 01 Nov 2004, 10:44
hi, matrix
yup, I thought about this too before I posted, because I found no suitable group for it.

in the General group, perhaps only the Main forum is suitable, but its description reads "General discussion about fasm".

I posted it in Compiler Internals because I think this information is a must-read for anyone who wants to build an assembler or compiler.

one question: why is it called Compiler Internals and not Assembler Internals? isn't fasm the flat assembler, not a flat compiler?
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
scientica 01 Nov 2004, 11:13
I just saw the "ASCII Image", and I wonder: aren't you missing the first (optional) byte? Maybe it's just me confusing things, but doesn't the 8086 have prefixes? (Or is this a feature of its descendants, the 80[23]86?)
Matrix



Joined: 04 Sep 2004
Posts: 1166
Location: Overflow
Matrix 01 Nov 2004, 11:34
vbVeryBeginner wrote:
hi, matrix
yup, I thought about this too before I posted, because I found no suitable group for it.

in the General group, perhaps only the Main forum is suitable, but its description reads "General discussion about fasm".

I posted it in Compiler Internals because I think this information is a must-read for anyone who wants to build an assembler or compiler.

one question: why is it called Compiler Internals and not Assembler Internals? isn't fasm the flat assembler, not a flat compiler?


:)
COMPILER IS A COMPILER,
AN ASSEMBLER CAN COMPILE TOO :)
Vortex



Joined: 17 Jun 2003
Posts: 318
Vortex 01 Nov 2004, 16:37
No, an assembler doesn't compile, it assembles.

_________________
Code it... That's all...
vbVeryBeginner



Joined: 15 Aug 2004
Posts: 884
Location: \\world\asia\malaysia
vbVeryBeginner 01 Nov 2004, 17:04
it is soooo damn hard to draw an ASCII image here :(
vbVeryBeginner



Joined: 15 Aug 2004
Posts: 884
Location: \\world\asia\malaysia
vbVeryBeginner 01 Nov 2004, 17:07
scientica wrote:
I just saw the "ASCII Image", and I wonder: aren't you missing the first (optional) byte? Maybe it's just me confusing things, but doesn't the 8086 have prefixes? (Or is this a feature of its descendants, the 80[23]86?)


first optional byte? care to tell me more? I just translated the image from the book.
roticv



Joined: 19 Jun 2003
Posts: 374
Location: Singapore
roticv 01 Nov 2004, 17:20
Prefixes... stuff like 67h, 66h, etc.
vbVeryBeginner



Joined: 15 Aug 2004
Posts: 884
Location: \\world\asia\malaysia
vbVeryBeginner 01 Nov 2004, 23:21
I just got this from the Intel 386 manual:
Code:
+---------------+---------------+---------------+---------------+
|  INSTRUCTION  |   ADDRESS-    |    OPERAND-   |   SEGMENT     |
|    PREFIX     |  SIZE PREFIX  |  SIZE PREFIX  |   OVERRIDE    |
+---------------+---------------+---------------+---------------+
|     0 OR 1         0 OR 1           0 OR 1         0 OR 1     |
+---------------------------------------------------------------+
|                        NUMBER OF BYTES                        |
+---------------------------------------------------------------+

+----------+-----------+-------+------------------+-------------+
|  OPCODE  |  MODR/M   |  SIB  |   DISPLACEMENT   |  IMMEDIATE  |
|          |           |       |                  |             |
+----------+-----------+-------+------------------+-------------+
|  1 OR 2     0 OR 1    0 OR 1      0,1,2 OR 4       0,1,2 OR 4 |
+---------------------------------------------------------------+
|                        NUMBER OF BYTES                        |
+---------------------------------------------------------------+
    


scientica wrote:

but doesn't the 8086 have prefixes? (Or is this a feature of its descendants, the 80[23]86?)


I am confused already, because the book I read didn't mention things like instruction prefixes :( maybe old x86 CPUs don't have those prefixes :?
S.T.A.S.



Joined: 09 Jan 2004
Posts: 173
Location: Ru#27
S.T.A.S. 02 Nov 2004, 05:49
You might want to take a look at Opcodes Book by The Svin. (79.23 kb, rar archive)
roticv



Joined: 19 Jun 2003
Posts: 374
Location: Singapore
roticv 02 Nov 2004, 06:45
Actually, to tell the truth, I learnt opcodes from reading The Svin's tutorial. The original tutorial was posted on the win32asmcommunity board...

I am not very sure about the 8086 instruction set, but please enlighten me on whether segment override is present in the 8086. If it exists, that means prefixes must have existed in the 8086.
S.T.A.S.



Joined: 09 Jan 2004
Posts: 173
Location: Ru#27
S.T.A.S. 02 Nov 2004, 07:47
Yeah, The Svin's tutorial is very nice; the link above contains the topics from win32asmcommunity collected in one doc file. (maybe it's also available for download from somewhere else, I don't know...)
vbVeryBeginner



Joined: 15 Aug 2004
Posts: 884
Location: \\world\asia\malaysia
vbVeryBeginner 02 Nov 2004, 14:53
this is from the svin .doc file

Introduction to opcode logical blocks

there are 6 logical blocks that might be used in opcode.
Important thing is not only names and meaning of them, but also the order of them.
Here they are:
1. Prefixes
2. Code
3. byte mod r/m
4. byte sib
5. offset in command
6. imm. operand

Not all of them are necessarily used.
But one block is always used:
it is block 2 -> CODE.
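
The ordering rule can be shown with a tiny Python sketch that simply concatenates the six blocks in the listed order and insists that the code block is present. The function name assemble_blocks and its keyword arguments are made up for this example; they are not taken from The Svin's document.

```python
def assemble_blocks(code, prefixes=b"", modrm=b"", sib=b"", offset=b"", imm=b""):
    """Join the six logical blocks of an instruction in their fixed order."""
    if not code:
        raise ValueError("block 2 (code) is the only block that is always required")
    # Order matters: prefixes, code, mod r/m, sib, offset, immediate operand.
    return prefixes + code + modrm + sib + offset + imm


# ADD ES:[BX],AX: segment override prefix 26h, code 01h, mod r/m 07h.
print(assemble_blocks(b"\x01", prefixes=b"\x26", modrm=b"\x07").hex())

# ADD BX,1234h: code 81h, mod r/m C3h, 16-bit immediate stored low byte first.
print(assemble_blocks(b"\x81", modrm=b"\xC3", imm=b"\x34\x12").hex())
```

The sketch does no validation beyond the required code block, so it is up to the caller to supply blocks that actually belong together.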
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen 05 Nov 2004, 19:55
roticv wrote:
I am not very sure about the 8086 instruction set, but please enlighten me on whether segment override is present in the 8086. If it exists, that means prefixes must have existed in the 8086.


The following prefixes are valid on the 8086:

REP/REPcc
LOCK
Segment override, except FS: and GS: (they are 386+)
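
For anyone writing a decoder, the list above might be sketched in Python like this. The byte values are the usual ones from Intel's documentation, not something stated in this thread, and the names PREFIXES_8086 and split_prefixes are invented for this example.

```python
# 8086 prefix bytes (values from Intel's opcode tables; FS: and GS: are left out
# because they only exist on the 386 and later).
PREFIXES_8086 = {
    0xF0: "LOCK",
    0xF2: "REPNE/REPNZ",
    0xF3: "REP/REPE/REPZ",
    0x26: "ES:",
    0x2E: "CS:",
    0x36: "SS:",
    0x3E: "DS:",
}


def split_prefixes(code):
    """Separate leading prefix bytes from the rest of an instruction."""
    i = 0
    while i < len(code) and code[i] in PREFIXES_8086:
        i += 1
    return [PREFIXES_8086[b] for b in code[:i]], bytes(code[i:])


# F3 A4 is REP MOVSB: one prefix byte followed by a one-byte opcode.
print(split_prefixes(bytes([0xF3, 0xA4])))
```

An instruction without prefixes comes back with an empty list, so the same call works as the first step of decoding any instruction.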
______________________________________________

BTW, according to official references, there are also 3-byte opcodes...

_________________
x86asm.net
simpaticool



Joined: 12 Jan 2005
Posts: 16
Location: Romania
simpaticool 03 Feb 2005, 08:10
Hi!
My name is Teo, aka Simpaticool (SIMPle And TIny
COdes On Line).
Now I'm going to create an OS, and I was wondering if I can
make an asm language for it.
But I didn't know what asm instructions look like in hex.
This tutorial is a good, a very good idea.

Thanks! :D
