flat assembler
Message board for the users of flat assembler.

assembly code to machine code

wisepenguin



Joined: 30 Mar 2005
Posts: 129
wisepenguin
I was experimenting with fasm by putting assembly statements into an empty file, assembling it and viewing the output in a hex editor.
I was doing this to see how it works, as I'm new to assembler.
Anyway, I was also checking out the Intel documentation and googling for x86 machine opcodes and the like.

test.asm
--------
push eax

when assembled, this produces a 2-byte file with these 2 bytes:
0x66 0x50

I don't really understand how this is produced. I have checked the help file that comes with masm32, as it has an opcode and instruction table like the one in the Intel docs.

First I looked up the push instruction, and it doesn't mention 0x66.

I'm off to try and find some more info on it; I haven't found any yet.
Post 23 Jul 2005, 13:07
decard



Joined: 11 Sep 2003
Posts: 1092
Location: Poland
decard
0x66 is not an instruction, but a prefix (the operand-size override). By default fasm assumes that it should generate 16-bit code (use16). In this mode 0x50 is "push ax". The prefix before the instruction makes the CPU interpret it as the 32-bit instruction, "push eax".
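
A minimal sketch of the rule above, written in Python so the arithmetic is easy to follow. This is not fasm's actual logic; the helper name `encode_push_reg` and the register table are made up for the illustration. It only covers "push" with a 16-bit or 32-bit general-purpose register.

```python
# Hypothetical helper, not part of fasm: encodes "push reg" for use16 or use32 code.
REG_NUMBERS = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def encode_push_reg(reg, mode_bits):
    """Return the machine code bytes for `push reg` in 16-bit or 32-bit mode."""
    if mode_bits not in (16, 32):
        raise ValueError("mode_bits must be 16 or 32")
    reg = reg.lower()
    operand_bits = 32 if reg.startswith("e") else 16
    number = REG_NUMBERS[reg[-2:]]  # "eax" and "ax" share register number 0
    code = bytearray()
    if operand_bits != mode_bits:
        code.append(0x66)  # operand-size prefix: use the non-default size
    code.append(0x50 + number)
    return bytes(code)

print(encode_push_reg("eax", 16).hex(" "))  # 66 50
```

The same function gives a single byte 0x50 for "push eax" in 32-bit mode, and 66 50 again for "push ax" in 32-bit mode, because the prefix simply selects whichever operand size is not the default.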
Post 23 Jul 2005, 13:14
wisepenguin



Joined: 30 Mar 2005
Posts: 129
wisepenguin
Thank you very much, decard!

And if I change the source file to

use32
push eax

the output file is 0x50.

Is there a number specified for each register? In the masm opcode table it says 50 +r,
and if I change
push eax
to push ebx or push ecx

it changes to 53 or 51, etc.

I have been looking for a table with the register values. I have looked at the fasm sources, but they are quite confusing to me.
Hopefully as I get better I will understand.
Post 23 Jul 2005, 13:24
decard



Joined: 11 Sep 2003
Posts: 1092
Location: Poland
decard
Instruction encoding seems quite complicated at first look, but everything becomes clear once you realise that instructions are really laid out in octal: the bytes split into 2-3-3 bit fields, so each octal digit carries its own meaning. I have a great article that explains how to encode instructions, but I only have a printed version, and I can't find it on the internet (actually I can't find the papers either ;)).

The Intel manuals don't mention octal and don't give any clue about the instructions' real nature.

Here's a topic regarding the same problem: http://board.flatassembler.net/topic.php?t=1356
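
As a rough illustration of the octal view (a sketch only; the names `REG32` and `push_opcode` are invented here): the "+r" in "50 +r" is the register number added to the base opcode, and in octal that number is simply the last digit.

```python
# Standard x86 numbering of the 32-bit general-purpose registers.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def push_opcode(reg):
    """Opcode byte for `push r32` in use32 mode: 0x50 plus the register number."""
    return 0x50 + REG32.index(reg)

for reg in ("eax", "ecx", "ebx"):
    op = push_opcode(reg)
    # The last octal digit of the opcode is the register number.
    print(f"push {reg}: hex {op:02X}, octal {op:03o}")
```

Note that the register order is eax, ecx, edx, ebx, not alphabetical, which is why "push ebx" gives 53 and "push ecx" gives 51.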
Post 23 Jul 2005, 13:50
decard



Joined: 11 Sep 2003
Posts: 1092
Location: Poland
decard
Wow, I found the paper, typed a few words from it into Google and found the document :D

http://www.dabo.de/ccc99/www.camp.ccc.de/radio/help.txt
Post 23 Jul 2005, 14:05
wisepenguin



Joined: 30 Mar 2005
Posts: 129
wisepenguin
Thanks for finding the link and posting it here.

Also, in the masm32 opcodes help file I can't understand the use of the "/" character.

Pasted from the file:

89 / r MOV r/m32,r32 Move r32 to r/m32

It's the "89 / r" part I don't get.

The "/" in r/m32 I understand, since r/m32 means a 32-bit register or memory location.
Post 23 Jul 2005, 16:46
Raedwulf



Joined: 13 Jul 2005
Posts: 375
Location: United Kingdom
Raedwulf
Thanks, decard, for the document :) It may be useful in my DLL2INC project :)
Post 23 Jul 2005, 16:53
wisepenguin



Joined: 30 Mar 2005
Posts: 129
wisepenguin
The latest one I am trying to figure out is

use32
mov edx, [0xCD]

which produces

0x8B 0x15 0xCD 0x00 0x00 0x00

I can work out the 0x8B as the mov instruction for memory location to register, from the masm help file, and the 0xCD as the address I specified.

But the 0x15 byte I cannot work out. If I change the register, this value changes, but the values I got for registers from "$; CodeX 1.0 opcode map 1.0" are different.
I think this has to do with the encoding: some other bits are modified too, which makes this value change more dramatically.
Post 23 Jul 2005, 17:18
tom tobias



Joined: 09 Sep 2003
Posts: 1320
Location: usa
tom tobias
Decard, thank you very much, this is an excellent reference. Note to the FASM architect:
in this article, Rb does not mean "RESERVE BYTE":

"As an example to see how this works, the mov instructions in octal are:
210 xrm mov Eb, Rb
211 xrm mov Ew, Rw
212 xrm mov Rb, Eb
...."
(and perhaps some other acronym would be more suitable than rb to reserve memory for a variable in FASM...)
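
For anyone comparing the article's octal opcodes with hex tables, here is a small Python sketch. The function names are invented for the example, and 213 is not in the excerpt quoted above; it is included only because 0x8B came up earlier in the thread.

```python
def octal_to_hex(octal_text):
    """Convert an octal opcode such as '211' to the hex form used in most opcode tables."""
    return f"{int(octal_text, 8):02X}"

def xrm_digits(byte):
    """Split a byte into its three octal digits: x (2 bits), r (3 bits), m (3 bits)."""
    return byte >> 6, (byte >> 3) & 7, byte & 7

for octal in ("210", "211", "212", "213"):
    print(octal, "->", octal_to_hex(octal))

print(xrm_digits(0x15))  # (0, 2, 5)
```

So octal 211 is the 0x89 from the masm table, and the 0x15 byte asked about earlier reads as 025 in octal: x = 0, r = 2, m = 5.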
Post 23 Jul 2005, 19:03
Llama Boy



Joined: 24 Jul 2005
Posts: 7
Location: Texas, USA
Llama Boy
I am working on a compiler that goes directly to machine code. I am working things out on paper, by assembling snippets and writing down the machine code for the different things my compiler will need at first. I translated this into machine code, and I'm lost as to what exactly it transfers to dx:
Code:
jmp start
_out db 'output',10,13,'$'
start:
mov dx,_out
ret    


It does all the storage for the string, but when it moves it into dx it only transfers 2 bytes (I know that is all dx holds). What is it sending?

The hex for the move is "BA 02 01". 02 is the start of the string in the code, but what is the 01?
Post 25 Jul 2005, 00:33
decard



Joined: 11 Sep 2003
Posts: 1092
Location: Poland
decard
Check it again, on my machine this instruction is assembled to "BA 02 00".
Post 25 Jul 2005, 07:24
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen
wisepenguin wrote:
The latest one I am trying to figure out is

use32
mov edx, [0xCD]

which produces

0x8B 0x15 0xCD 0x00 0x00 0x00

I can work out the 0x8B as the mov instruction for memory location to register, from the masm help file, and the 0xCD as the address I specified.

But the 0x15 byte I cannot work out. If I change the register, this value changes, but the values I got for registers from "$; CodeX 1.0 opcode map 1.0" are different.
I think this has to do with the encoding: some other bits are modified too, which makes this value change more dramatically.

What exactly is unclear to you? The second byte, 0x15, is the ModR/M byte:
Code:
15 hex = 00010101 bin

00  - Mod, with r/m field 101 means [disp32]
010 - register EDX
101 - r/m
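
The breakdown above can also be run in the other direction. Below is a minimal Python sketch that packs the three fields and rebuilds the whole instruction; the function names are made up for the example, and it only handles the "mov r32, [disp32]" form via opcode 8B /r. Assemblers typically choose a shorter special encoding when the register is eax, which this sketch ignores.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def modrm(mod, reg, rm):
    """Pack the three ModR/M fields (2 + 3 + 3 bits) into one byte."""
    return (mod << 6) | (reg << 3) | rm

def encode_mov_reg_from_addr(reg, address):
    """Encode `mov r32, [address]` for use32 code via opcode 8B /r."""
    opcode = 0x8B
    # mod=00 with r/m=101 selects a bare 32-bit displacement, no base register.
    second = modrm(0b00, REG32.index(reg), 0b101)
    # The displacement follows as 4 bytes, low byte first.
    return bytes([opcode, second]) + address.to_bytes(4, "little")

print(encode_mov_reg_from_addr("edx", 0xCD).hex(" "))  # 8b 15 cd 00 00 00
```

Changing the register only changes the middle three bits, which is why the second byte jumps in steps of 8 (0x0D, 0x15, 0x1D, ...) instead of by 1.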
    
Post 25 Jul 2005, 11:41
wisepenguin



Joined: 30 Mar 2005
Posts: 129
wisepenguin
MazeGen, thank you.
I have been searching for how to encode the instructions,
and you have just explained it for my example.

Now I have to find a table of values so I can fill in the "bits" myself.
Once again, cheers :)
Post 25 Jul 2005, 13:12
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17715
Location: In your JS exploiting you and your system
revolution
wisepenguin: Visit the AMD or Intel website and download the processor software developer manuals. It is all explained in superb detail. Sorry I don't have the URL, but with a little navigating I think it should be easy to find. Good luck.

If you already have the manuals, then look for the opcode map and the instruction format sections.
Post 25 Jul 2005, 13:57
Llama Boy



Joined: 24 Jul 2005
Posts: 7
Location: Texas, USA
Llama Boy
> Check it again, on my machine this instruction is assembled to "BA 02 00".

I forgot to include the header; it is:

org 100h
use16

And I have assembled and checked it many times, and it is still "BA 02 01".

_________________
donut < llama < az
http://llama.computerboffin.com/ <-signup
http://llama.computerboffin.com/dotvs/
Post 25 Jul 2005, 22:50
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen
BA 02 01 means

Code:
BYTE 0xBA - MOV DX,
WORD 0x0102 - 102h
    


org 100h + sizeof (jmp short) = 102h, and the word is stored low byte first, so 0102h appears as 02 01.

The assembled code is correct.
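
A short Python sketch of that arithmetic, covering both results reported in the thread. The names are invented for the example, and it assumes "jmp start" is assembled in its 2-byte short form so that _out sits 2 bytes after the origin.

```python
def encode_mov_dx_imm16(value):
    """Encode `mov dx, imm16` in use16 mode: opcode BA, then the word low byte first."""
    return bytes([0xBA]) + value.to_bytes(2, "little")

JMP_SHORT_SIZE = 2  # assumed: `jmp start` fits the 2-byte short form (EB rel8)

for origin in (0x000, 0x100):
    label_address = origin + JMP_SHORT_SIZE  # address of _out
    print(f"org {origin:03X}h:", encode_mov_dx_imm16(label_address).hex(" ").upper())
```

Without an org directive the label is at 0002h, giving "BA 02 00"; with org 100h it is at 0102h, giving "BA 02 01".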
Post 25 Jul 2005, 22:56

Copyright © 1999-2020, Tomasz Grysztar.