flat assembler
Message board for the users of flat assembler.

paradox ?

roboticmehdi



Joined: 20 Apr 2011
Posts: 22
Hello everybody. I am a student at university and I like programming, although my major is mechanical engineering. I have a question: I have been surfing around the internet and noticed something that seemed very strange to me. Most of the assembly compilers around have been written in C. This is so weird: how can a compiler for a low-level programming language be written in a higher-level programming language? I'm not talking about the IDE, I'm talking about the compiler itself. After all, compilers are just executables, right? How is it possible? :o
Post 21 Apr 2011, 10:25
mindcooler



Joined: 01 Dec 2009
Posts: 423
Location: Västerås, Sweden
Are you asking how it is possible that executables can be written in a high level programming language?
Post 21 Apr 2011, 10:54
roboticmehdi



Joined: 20 Apr 2011
Posts: 22
No, I'm asking how it is possible that a compiler for a low-level programming language is created using a compiler for a high-level programming language. For example, most assembly compilers are created using a C compiler. How is it possible?
Post 21 Apr 2011, 11:06
JohnFound



Joined: 16 Jun 2003
Posts: 3500
Location: Bulgaria
The compiler reads a text file, processes it, and writes another file. These are operations every language can do, HLL or not. So it is possible to write a compiler in an HLL.

Of course, it is a shame for programmers who prefer to use a high-level language to write an assembler.
They excuse their ignorance with nice words about portability, easy support, etc., but the truth is that they are assembly language illiterates and thus simply can't write such a complex project in assembly language. :D

Regards
Post 21 Apr 2011, 11:20
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17278
Location: In your JS exploiting you and your system
A brief history of software development:
  • Front panel switches to input code. Manually assembled.
  • Front panel switches to input an assembler. Manually assembled.
  • Assembler to write C. Machine assembled.
  • C to write assembler. C compiled.


Last edited by revolution on 08 Jun 2011, 12:42; edited 1 time in total
Post 21 Apr 2011, 11:22
roboticmehdi



Joined: 20 Apr 2011
Posts: 22
Yes, I understand what a compiler does. It reads a text file and creates another file which contains the machine code and can be executed by the CPU. When we open that executable file, the operating system loader loads it into RAM and execution begins. Correct me if I am wrong so far. But here is what I cannot understand:

Let's say we want to write a simple assembly compiler using a C compiler. Our mini compiler will read the text file and create an executable, and it will only be able to compile these lines: "mov ax,5", "mov bx,5" or "add ax,bx". (This is just an example. I don't know assembly very well, so I may be wrong here.) So here is the C source code of our mini compiler.

(Note: I will write the program in a very simple language. Don't regard it as pure C, but you know what I mean, right?)

=========================================
miniassemblycompiler.c
declare i as integer variable, assign it to 0
declare source as text file
declare destination as executable file
repeat
{
i=i+1;
read the i-th line of source
if the ith line is equal to "mov ax,5" then write "?1" to destination
if the ith line is equal to "mov bx,5" then write "?2" to destination
if the ith line is equal to "add ax,bx" then write "?3" to destination
}
until end of text file
=========================================

So does anybody know what should be in the place of ?1, ?2 and ?3?
I mean, this is the way compilers work, right?
So what should I write instead of those question marks?
So my question is: how does a compiler actually create an executable file?
Post 22 Apr 2011, 14:18
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17278
Location: In your JS exploiting you and your system
Just download the manual from either Intel or AMD. All the opcodes are listed there.

For your particular example:
Code:
mov ax,5  ; B8 05 00
mov bx,5  ; BB 05 00
add ax,bx ; 01 D8    
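
As a minimal sketch of how those bytes answer the ?1, ?2 and ?3 question: a mini assembler for exactly these three lines needs little more than a lookup table from source text to byte sequence. The sketch is in Python rather than C for brevity, the names `ENCODINGS` and `assemble` are made up for illustration, and the result is raw 16-bit machine code without any executable file header.

```python
# Hypothetical table-driven "mini assembler" for exactly three instructions.
# The byte values are the 16-bit encodings listed in the post above.
ENCODINGS = {
    "mov ax,5": bytes([0xB8, 0x05, 0x00]),
    "mov bx,5": bytes([0xBB, 0x05, 0x00]),
    "add ax,bx": bytes([0x01, 0xD8]),
}


def assemble(source: str) -> bytes:
    """Translate each non-empty source line into its machine code bytes."""
    output = bytearray()
    for line_number, line in enumerate(source.splitlines(), start=1):
        text = line.split(";", 1)[0].strip()  # drop comments and padding
        if not text:
            continue
        if text not in ENCODINGS:
            raise ValueError(f"line {line_number}: cannot assemble {text!r}")
        output += ENCODINGS[text]
    return bytes(output)


if __name__ == "__main__":
    program = "mov ax,5\nmov bx,5\nadd ax,bx\n"
    print(assemble(program).hex(" "))  # prints: b8 05 00 bb 05 00 01 d8
```

A real assembler does not store whole lines in a table, of course; it computes the bytes from the instruction and its operands. The table only shows that "writing ?1" means writing a few fixed byte values to the output file.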
Post 22 Apr 2011, 14:24
JohnFound



Joined: 16 Jun 2003
Posts: 3500
Location: Bulgaria
Instead of ?1, ?2 and ?3, the compiler should write several bytes of instruction machine code to the executable file. These bytes are computed by the assembler depending on the instruction opcode ("mov" in your case) and the addressing mode of the operands.
The machine code of an instruction may be one or several bytes long, depending on the instruction being assembled.
Usually the machine codes are not random, but are constructed according to rules that the assembler has to follow.
In your case, you should first check for "mov", then call a function that analyzes the operands and computes the proper bytes of the instruction.
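
A rough sketch of that "check the mnemonic, analyze the operands, compute the bytes" flow, written in Python for brevity. It assumes 16-bit code and handles only two forms, `mov reg,imm` and `add reg,reg`; the function name `encode` and the table `REG16` are invented for this example, while the register numbers and opcode bytes follow the x86 encoding rules documented in the processor manuals.

```python
import struct

# Register numbers used by the x86 encoding for 16-bit registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def encode(line: str) -> bytes:
    """Encode 'mov reg,imm' or 'add reg,reg' for 16-bit registers only."""
    mnemonic, _, operand_text = line.strip().partition(" ")
    operands = [op.strip() for op in operand_text.split(",")]
    if len(operands) != 2 or operands[0] not in REG16:
        raise ValueError(f"unsupported instruction: {line!r}")
    dst, src = operands

    if mnemonic == "mov" and src not in REG16:
        # mov r16, imm16: opcode B8 + register number, then the
        # immediate as two bytes, low byte first (little-endian).
        immediate = int(src, 0) & 0xFFFF
        return bytes([0xB8 + REG16[dst]]) + struct.pack("<H", immediate)

    if mnemonic == "add" and src in REG16:
        # add r/m16, r16: opcode 01, then a ModRM byte laid out as
        # mod=11 (register direct), reg=source, r/m=destination.
        modrm = 0xC0 | (REG16[src] << 3) | REG16[dst]
        return bytes([0x01, modrm])

    raise ValueError(f"unsupported instruction: {line!r}")
```

With these rules, `encode("mov bx,5")` gives BB 05 00 because bx is register number 3 and B8 + 3 = BB, and `encode("add ax,bx")` gives 01 D8 because the ModRM byte is C0 + (3 << 3) + 0 = D8.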

In order to understand exactly how the instructions are related to the machine code, you have to read the manuals for the particular processor. For x86 there are manuals on the Intel web site.
But this knowledge is useful only if you want to write your own assembler.

The HLL compilers usually just generate assembler source and then pass it to the back-end assembler.
Post 22 Apr 2011, 14:34
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17278
Location: In your JS exploiting you and your system
JohnFound wrote:
The HLL compilers usually just generate assembler source and then pass it to the back-end assembler.
That doesn't work when the HLL program is the assembler. Infinite regression. roboticmehdi was correct, the C code (of the C based assembler) must write the binary output directly. It can't simply take the assembly source and forward it to the assembler; it is the assembler.
Post 22 Apr 2011, 15:04
roboticmehdi



Joined: 20 Apr 2011
Posts: 22
Is the compiler output called an "object file"? Is it possible to create an executable without a compiler, just by writing the machine instructions into a file? (I know it would be hard, but I don't want to write a 3D Crysis game using this method :D, I just want something that would print "Hello World". I just want to learn how this stuff works at a basic level.) I searched Google a lot but hardly found anything helpful. If you know a good source, I would be thankful for the link. (My operating system is Ubuntu 10.10, but I sometimes use MS-DOS 6.22 as a live CD.)
Post 22 Apr 2011, 15:10
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17278
Location: In your JS exploiting you and your system
There is no technical reason why any assembler cannot create binary executable files directly. Just that most don't seem to be programmed for it. fasm, however, can generate binary executable files directly, as well as object files.

If you want to drive yourself crazy and make an executable file manually, then just start up your favourite hex editor and start typing.
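
For a taste of what the hex editor route looks like, here is a small sketch that writes hand-assembled bytes to a file instead of typing them. It assumes the simplest target available, a DOS .COM program, which is just raw 16-bit machine code with no header; the names `build_com` and `write_com` are invented for the example, and the script itself is Python only because it is short.

```python
# Hand-assembled 16-bit code for a DOS .COM program. DOS loads a .COM
# file at offset 0x100 and starts executing at its first byte.
LOAD_OFFSET = 0x100

CODE = bytes([
    0xB4, 0x09,        # mov ah,9      ; DOS function 9: print string
    0xBA, 0x08, 0x01,  # mov dx,0x108  ; address of the message below
    0xCD, 0x21,        # int 0x21      ; call DOS
    0xC3,              # ret           ; return to DOS
])
MESSAGE = b"Hello World$"  # function 9 prints up to the '$' terminator


def build_com() -> bytes:
    """Return the complete contents of the .COM file."""
    # The message follows the code, so its address must match the
    # operand that was written by hand into the mov dx instruction.
    assert LOAD_OFFSET + len(CODE) == 0x0108
    return CODE + MESSAGE


def write_com(path: str) -> None:
    """Write the program to disk in binary mode."""
    with open(path, "wb") as f:
        f.write(build_com())
```

Calling `write_com("hello.com")` produces a 20-byte file that should run under MS-DOS. Typing the same 20 byte values into a hex editor and saving the file gives an identical program.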
Post 22 Apr 2011, 15:19
roboticmehdi



Joined: 20 Apr 2011
Posts: 22
I wrote a simple "Hello World" C source file and compiled it using GCC, which created an executable file. I ran that executable and everything is fine: it works as expected, just prints "Hello World" and stops. Then I opened it with notepad and here is what I saw:

==========================================================

ELF    ƒ4 0 4  (    4 4€4€      4 44     € €        ŸŸ      ( (Ÿ(ŸÈ È    H HHD D   Qåtd   Råtd ŸŸì ì   /lib/ld-linux.so.2    GNU       GNU ƒ¢Hoÿ£”·â]±·«Õ¢óñn     ­KãÀ  .  )   Œ„   __gmon_start__ libc.so.6 _IO_stdin_used puts __libc_start_main GLIBC_2.0        ii
 @ 🠠     U‰åSƒìè [ÃX ‹“üÿÿÿ…Òtè èÙ è„ X[ÉÃÿ5øŸÿ%üŸ ÿ%  h éàÿÿÿÿ% h éÐÿÿÿÿ% h éÀÿÿÿ1í^‰áƒäðPTRhЃhàƒQVh´ƒè¿ÿÿÿôU‰åSƒì€=  u?¡ » ŸëŸÁûƒë9Øs¶ ƒÀ£ ÿ…Ÿ¡ 9ØrèÆ ƒÄ[]Ít& ¼' U‰åƒì¡$Ÿ…Àt¸ …Àt Ç$$ŸÿÐÉÐU‰åƒäðƒìÇ$„è'ÿÿÿ¸ ÉÃU‰å]Ít& ¼' U‰åWVSèO Ã  ƒìè—þÿÿ» ÿÿÿƒ ÿÿÿ)ÇÁÿ…ÿt$1ö‹E‰D$‹E ‰D$‹E‰$ÿ”³ ÿÿÿƒÆ9þrÞƒÄ[^_]Ë$АU‰åSƒì¡Ÿƒøÿt»ŸfƒëÿЋƒøÿuôƒÄ[]АU‰åSƒìè [Ã| è¬þÿÿY[Éà   Hello World ÿÿÿÿ ÿÿÿÿ   ‚
l„õþÿoŒ ü ¬
J    ôŸ     x‚ p‚    þÿÿoP‚ÿÿÿo ðÿÿoF‚ (Ÿ Ö‚æ‚ö‚ GCC: (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5 .symtab .strtab .shstrtab .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .text .fini .rodata .eh_frame .ctors .dtors .jcr .dynamic .got .got.plt .data .bss .comment    44   #   HH  1   hh $  D öÿÿo ŒŒ    N  ¬¬ P     V   üü J  ^ ÿÿÿo F‚F
   k þÿÿo P‚P    z  p‚p     ƒ  x‚x     Œ   ‚ 0  ‡   À‚À @   ’   ƒ  l  ˜   l„l   ž   ˆ„ˆ   ¦   œ„œ   °   Ÿ   ·   Ÿ   ¾   $Ÿ$   à   (Ÿ( È    Ì   ðŸð    Ñ   ôŸô    Ú        à       å  0  +     ? î    ¸   ,    ¸   4   H   h   Œ   ¬   ü   F‚   P‚   p‚  x‚ 
‚  À‚  ƒ 
l„   ˆ„   œ„   Ÿ   Ÿ   $Ÿ   (Ÿ   ðŸ   ôŸ               ñÿ Ÿ    Ÿ   ( $Ÿ   5 0ƒ 
K     Z     h ƒ 
  ñÿt Ÿ    œ„    $Ÿ   › @„ 
±  ñÿ¾ ôŸ   Ô Ÿ  å Ÿ  ø (Ÿ        Ѓ 
 ƒ 
# 2 F ˆ„   M l„   S  p Œ„        Œ    ™ Ÿ  ¦ àƒZ 
¶    ñÿÂ    ñÿÇ  ×    ñÿÞ :„ 
õ ´ƒ 
ú ‚  crtstuff.c __CTOR_LIST__ __DTOR_LIST__ __JCR_LIST__ __do_global_dtors_aux completed.7065 dtor_idx.7067 frame_dummy __CTOR_END__ __FRAME_END__ __JCR_END__ __do_global_ctors_aux helloworld.c _GLOBAL_OFFSET_TABLE_ __init_array_end __init_array_start _DYNAMIC data_start __libc_csu_fini _start __gmon_start__ _Jv_RegisterClasses _fp_hw _fini __libc_start_main@@GLIBC_2.0 _IO_stdin_used __data_start __dso_handle __DTOR_END__ __libc_csu_init __bss_start _end puts@@GLIBC_2.0 _edata __i686.get_pc_thunk.bx main _init

=============================================================

What the heck is going on?
Post 22 Apr 2011, 15:40
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17278
Location: In your JS exploiting you and your system
Read up on the ELF (executable and linkable format). It is a binary format so your text editor won't make much sense of it. Use a disassembler to see the code.
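
To give an idea of what "binary format" means here, the following sketch decodes a few fields from the start of an ELF file. The field offsets and numeric codes come from the ELF specification; the function name `describe_elf` and the small lookup tables are made up for this example and cover only a handful of values.

```python
import struct

ELF_CLASSES = {1: "32-bit", 2: "64-bit"}
ELF_TYPES = {1: "relocatable object", 2: "executable", 3: "shared object"}
ELF_MACHINES = {3: "Intel 80386", 62: "x86-64"}


def describe_elf(header: bytes) -> dict:
    """Decode a few fields from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    byte_order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    # e_type and e_machine are 16-bit fields at offsets 16 and 18.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "class": ELF_CLASSES.get(header[4], "unknown"),
        "type": ELF_TYPES.get(e_type, "other"),
        "machine": ELF_MACHINES.get(e_machine, "other"),
    }
```

Reading the first 20 bytes of the compiled program with `open(path, "rb")` and passing them to `describe_elf` turns the "ELF" seen at the top of the notepad dump into readable fields. The remaining bytes of the file follow the same idea: numbers at fixed offsets, not characters.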
Post 22 Apr 2011, 16:05
JohnFound



Joined: 16 Jun 2003
Posts: 3500
Location: Bulgaria
Hm, roboticmehdi, I have the impression that you don't distinguish between binary and text files. What is your background in programming?
Post 22 Apr 2011, 16:30
roboticmehdi



Joined: 20 Apr 2011
Posts: 22
Not very good. I know several programming languages like C, BASIC and Pascal, and very little assembly (mostly copy-paste stuff when I need it). That's why I'm here, trying to understand this stuff. I'm good at programming, but I don't know much about the technical details. Seriously, what's the difference between a binary file and a text file? Why can't a text editor display a binary file?
Post 22 Apr 2011, 19:47
roboticmehdi



Joined: 20 Apr 2011
Posts: 22
Hey, I think I got it. A text editor interprets all those 1s and 0s as letters, but a hex editor translates them into hex format and then displays them. Correct me please if I'm wrong.
Post 22 Apr 2011, 19:51
roboticmehdi



Joined: 20 Apr 2011
Posts: 22
Actually, all files are binary files, but text editors display those binary values as letters, numbers, symbols, etc. A hex editor, on the other hand, displays them as hexadecimal numbers.
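
That idea can be tried out in a few lines. The sketch below shows the same bytes in both views; the function names `as_text` and `as_hex` are invented for the example, and Latin-1 is an assumption chosen only because it maps every byte value to exactly one character (real editors may use UTF-8 or another encoding).

```python
def as_text(data: bytes) -> str:
    """Roughly what a text editor does: treat every byte as a character."""
    return data.decode("latin-1")


def as_hex(data: bytes) -> str:
    """What a hex editor does: show every byte as a two-digit hex number."""
    return " ".join(f"{byte:02X}" for byte in data)


text_file = b"mov ax,5"                  # bytes of an assembly source line
binary_file = bytes([0xB8, 0x05, 0x00])  # bytes of its machine code

if __name__ == "__main__":
    print(as_hex(text_file))    # prints: 6D 6F 76 20 61 78 2C 35
    print(as_hex(binary_file))  # prints: B8 05 00
    print(repr(as_text(binary_file)))
```

The source line is eight bytes that happen to be letter codes, so both views are readable. The machine code is three bytes that are not letter codes, so the text view shows a stray symbol and two unprintable control characters.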
Post 22 Apr 2011, 20:04
JohnFound



Joined: 16 Jun 2003
Posts: 3500
Location: Bulgaria
Yes, you are advancing pretty fast. Keep it up. :)
Post 23 Apr 2011, 06:19
roboticmehdi



Joined: 20 Apr 2011
Posts: 22
What is the difference between opcode and machine code? But don't copy-paste from Wikipedia; I read it and did not understand much. :)
Post 23 Apr 2011, 10:24
JohnFound



Joined: 16 Jun 2003
Posts: 3500
Location: Bulgaria
For me, the opcode is the textual representation of the machine instruction, with or without operands, for example "mov" or "inc". Machine code is the actual bytes of the instruction. Note that one opcode can have different machine codes depending on the operands and the current context (for example, 32-bit or 16-bit mode).
Opcodes are for human reading; machine code is for execution by the processor.
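
A small sketch of that last point, showing how the same "mov" text yields different bytes depending on the operand and the mode. It assumes only the `mov ax,imm` and `mov eax,imm` forms, and `encode_mov_imm` is a hypothetical helper written for this example. (Intel's manuals call the textual name a mnemonic and use "opcode" for bytes such as B8; the wording here follows the post above.)

```python
import struct


def encode_mov_imm(register: str, value: int, mode_bits: int) -> bytes:
    """Encode 'mov register, value' for ax or eax in 16- or 32-bit mode."""
    if register not in ("ax", "eax") or mode_bits not in (16, 32):
        raise ValueError("this sketch only handles ax/eax in 16/32-bit mode")
    operand_bits = 16 if register == "ax" else 32
    # The 0x66 prefix switches the operand size away from the mode's default.
    prefix = b"" if operand_bits == mode_bits else b"\x66"
    fmt = "<H" if operand_bits == 16 else "<I"
    mask = (1 << operand_bits) - 1
    return prefix + b"\xB8" + struct.pack(fmt, value & mask)
```

Here `encode_mov_imm("ax", 5, 16)` gives B8 05 00, while `encode_mov_imm("ax", 5, 32)` gives 66 B8 05 00 and `encode_mov_imm("eax", 5, 32)` gives B8 05 00 00 00. The text a human reads barely changes, but the bytes the processor executes do.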
Post 23 Apr 2011, 10:43

Copyright © 1999-2020, Tomasz Grysztar.

Powered by rwasa.