flat assembler
Message board for the users of flat assembler.

Linking object files: do you use ld or gcc?

buzzkill



Joined: 15 Mar 2009
Posts: 111
Location: the nether lands
buzzkill 15 Mar 2009, 22:46
When I assembled and linked listing.asm (in fasm/tools), using ld like I always do, I got a segfault when running the program. When debugging, I found the stack was 'off', ie the program assumed argc at [esp+4] and argv[] at [esp+8] instead of argc at [esp] and argv[] at [esp+4]. I then used gcc (instead of ld) to create a binary from the object file, and then the program worked as expected.

This surprised me because I always thought asm coders are control freaks ;) who don't like stuff in their executables that they didn't put there themselves. Using gcc instead of ld almost doubles the size of the program, puts a lot of startup/exit code in there, and also gives you a (slightly) different stack upon program entry. Now, in my own programs (just a few simple ones so far :) ) I assume a standard program entry, ie no gcc stuff (like ctors/dtors etc) in there, but that means that if someone else uses gcc instead of ld to create a binary from my source, they will have similar problems with eg the stack.

So my question to all of you is: do you use either ld or gcc on your fasm-generated object files, and why that one and not the other? What, if anything, do you assume about eg the stack (or anything else) upon program entry?

Any insights are appreciated :)
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid 16 Mar 2009, 00:45
Quote:
So my question to all of you is: do you use either ld or gcc on your fasm-generated object files, and why that one and not the other? What, if anything, do you assume about eg the stack (or anything else) upon program entry?

Any insights are appreciated

Insight is that control freaks usually use 'format ELF executable', i.e. they use FASM to directly produce an executable file, not an object that needs to be linked. :)

Does this mean that ld adds some extra initialization code, which you didn't want there? I haven't used it with asm programs for quite some time, so I don't remember how exactly I linked my all-asm apps.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20343
Location: In your JS exploiting you and your system
revolution 16 Mar 2009, 01:20
vid wrote:
Insight is that control freaks usually use 'format ELF executable', i.e. they use FASM to directly produce an executable file, not an object that needs to be linked.
And real control freaks use format binary.
buzzkill 16 Mar 2009, 01:21
When you use 'format ELF executable', fasm generates an executable without sections, so none of the standard gnu/linux tools can deal with it :(

Here's what happens after you assemble the hello.asm (from examples/elfexe/ directory) with "fasm hello.asm" :

Code:
$ objdump -d ./hello

./hello:     file format elf32-i386

    


-> So no disassembly...

Code:
$ readelf -a ./hello
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048074
  Start of program headers:          52 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2
  Size of section headers:           40 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no sections in this file.

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000074 0x08048074 0x08048074 0x0001f 0x0001f R E 0x1000
  LOAD           0x000093 0x08049093 0x08049093 0x0000d 0x0000d RW  0x1000

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

No version information found in this file.
    


-> Note the "There are no sections in this file.", also no symbols...

Code:
$ agdb ./hello
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) disas start
No symbol table is loaded.  Use the "file" command.
(gdb) disas 0x8048074
No function contains specified address.

(gdb) q
    


-> Note how we can't disassemble here either, not by using the symbolic entry point from the program (start), and not by using the entry point address either (see "Entry point address:" line in readelf output).

So it appears that the way fasm generates its executables is not very compatible with the way things are normally done in linux, and all your linux tools don't work (well) with fasm-generated executables :( That's the reason I define the needed sections myself and use fasm to generate an object file, which I then link with ld, the linux linker, to produce a "normal" executable (no offense :) )

For the record, this is what a hello.asm looks like when I do it:
Code:
; assemble/link:
; $ fasm hello2.asm hello2.o
; $ ld hello2.o -o hello2

format ELF

section '.text' executable

public _start
_start:
    mov     eax, 4          ; sys_write
    mov     ebx, 1          ; fd 1 = stdout
    mov     ecx, msg        ; buffer to write
    mov     edx, msg.len    ; number of bytes
    int     80h
    mov     eax, 1          ; sys_exit
    mov     ebx, 0          ; exit status 0
    int     80h

section '.data' writeable

msg         db "It works!", 0Ah
.len        = $-msg
    


This defines two standard sections and a standard entry point and is usable in gdb and all other linux tools.

I have to say, I'm surprised that other linux asm coders don't do it this way. Can I ask you what tools you use for debugging etc your fasm programs?
buzzkill 16 Mar 2009, 01:26
vid wrote:

Does this mean that ld adds some extra initialization code, which you didn't want there?


Forgot to answer that one :)
No, I mean the opposite: linking with gcc adds (a lot of) stuff I don't want or need, and linking with ld doesn't. The stack in a gcc-generated executable seems "off" to me precisely because gcc adds startup code.
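
To make the difference concrete, here is a minimal sketch (the file name argc_start.asm is hypothetical, and a 32-bit x86 Linux target is assumed). Linked with plain ld, execution begins at _start with argc on top of the stack, so the program can simply exit with argc as its status. On an x86-64 host, ld would most likely also need -m elf_i386.

Code:
; assemble/link:
; $ fasm argc_start.asm argc_start.o
; $ ld argc_start.o -o argc_start

format ELF

section '.text' executable

public _start
_start:
    mov     ebx, [esp]      ; argc is at [esp] on process entry (no return address)
    mov     eax, 1          ; sys_exit
    int     80h             ; exit status = argc

Running it with two arguments and then checking $? in the shell should show 3 (the program name plus the two arguments).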
buzzkill 16 Mar 2009, 01:30
revolution wrote:
And real control freaks use format binary.


:lol: Yep, that'll take care of all that unwanted fluff in your binaries :) It's just that getting your OS to load your programs gets to be such a hassle :)
revolution 16 Mar 2009, 01:35
And hyper control-freaks write their own OS (with format binary of course) so loading is not a problem.
vid 16 Mar 2009, 11:55
buzzkill: you can create executables with sections, just as you create objects with sections.

Code:
format ELF executable
entry start

segment executable

start:
...

segment readable writeable
    
vid 16 Mar 2009, 11:57
You should read TFM more carefully next time:
Quote:
To create executable file, follow the format choice directive with the executable keyword. It allows to use entry directive followed by the value to set as entry point of program. On the other hand it makes extrn and public directives unavailable, and instead of section there should be the segment directive used, followed only by one or more segment permission flags. The origin of segment is aligned to page (4096 bytes), and available flags for are: readable, writeable and executable.
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8354
Location: Kraków, Poland
Tomasz Grysztar 16 Mar 2009, 12:32
vid: The ELF segments are a different entity than sections. See ELF specification, section "Program Loading".
vid 16 Mar 2009, 13:17
Hmm, I never noticed this. So, since buzzkill says standard linux tools expect executables to have sections, obvious question pops out: how do segments and sections mix up in ELF executables?
Tomasz Grysztar 16 Mar 2009, 13:35
vid wrote:
obvious question pops out: how do segments and sections mix up in ELF executables?
ELF specification wrote:
Sections hold the bulk of object file information for the linking view (...). Files used during linking must have a section header table; other object files may or may not have one.
Thus the section table is only used for linking, and the program loader just ignores it.

And, as you can see in the attached figure, the run-time segments are usually a "larger unit" than sections: they group the sections that share the same attributes (like writeable).


[Attached figure: Figure 1-1 from the ELF specification (elf_views.jpg)]


vid 16 Mar 2009, 13:59
I see. Soooo, why doesn't FASM then support 'section' in ELF executable? Internal problem?
Tomasz Grysztar 16 Mar 2009, 14:04
vid wrote:
I see. Soooo, why doesn't FASM then support 'section' in ELF executable? Internal problem?

Just lack of necessity. The Unix/Linux systems are generally equipped with linker, so it was object format that was my main concern here. The executable variant is just a toy for generating really simple executables, maximally stripped (that is, containing only the obligatory parts).
Tomasz Grysztar 16 Mar 2009, 14:42
buzzkill wrote:
Code:
$ agdb ./hello
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) disas start
No symbol table is loaded.  Use the "file" command.
(gdb) disas 0x8048074
No function contains specified address.

(gdb) q
    


-> Note how we can't disassemble here either, not by using the symbolic entry point from the program (start), and not by using the entry point address either (see "Entry point address:" line in readelf output).

So it appears that the way fasm generates its executables is not very compatible with the way things are normally done in linux (...)

Note that a very similar thing will happen with your ld-generated executable when you run "strip" (which is a "standard" tool) on it first. So it's not that these files are "non-standard", they are just not so suitable for debugging purposes, etc.
buzzkill 16 Mar 2009, 15:46
Tomasz, first of all let me say thank you for creating fasm, it's a great assembler. And it's nice to see that you yourself get involved in these forums too.

Quote:
Note that a very similar thing will happen with your ld-generated executable when you run "strip" (which is a "standard" tool) on it first.


Yes, that's correct, but you would only run strip on an executable that's ready "to go out into the world", ie while developing/debugging/etc you wouldn't strip your binaries. (BTW, even "strip --strip-all" leaves some sections intact :) ) This is like when programming C, you would always put debugging info into your executable while developing, and only when you're ready to ship the program, you'd leave debugging info out.

Quote:
Thus the section table is only used for linking, and the program loader just ignores it.


So this would mean that a fasm-generated executable can't be linked to anything else? For a noob (such as myself) this can be a real hassle, because something like a printf() is really useful when you don't have your own library of output functions yet. I've seen tutorials where they just have you use printf() to communicate with the outside world.


Anyway, I'm not trying to tell you how to do your job of course, but maybe some explanation of this in the manual, and maybe an included example program linking to libc, would make things a little clearer for newbies?


BTW Tomasz, would you have a look at my first question (in my original post)? I'd like to know why you assume what you do about the stack at program startup. Do you expect your users to use gcc to generate the executables? What, according to you, should be at [esp] (before argc) at startup?
Because even though you use the "section way" instead of the "format ELF executable" way in listing.asm (hope that makes sense to you :) ), it will segfault for users who use ld to generate the listing executable. (Maybe a few words in the readme.txt about what we should do to your src to create an executable might help?)
Endre



Joined: 29 Dec 2003
Posts: 215
Location: Budapest, Hungary
Endre 16 Mar 2009, 16:10
On linux you may want to give the gnu assembler a chance. With intel syntax it's no longer so terrible, although its syntax differs a bit from that of fasm. Lately I have been using it rather than fasm. You can debug your code with gdb, and in addition you can reduce its final size with elfkickers (sstrip). I appreciate that fasm's macro capabilities are far better, but I practically never use them. On the other hand, with the gnu assembler you can apply cpp (the C preprocessor) to get the C constants defined in any of the header files. With fasm that is almost impossible. Here is the hello world application. The header inclusions are between .if/.endif because the assembler cannot process C headers, and this way you avoid errors from the assembler. The preprocessor, however, works fine, so your C constants will be replaced by the appropriate values.

Code:
/**
 * For debugging compile with
 * gcc -g -nostdlib hello_gcc.S -o hello_gcc
 *
 * or for releasing with
 *
 * gcc -s -nostdlib hello_gcc.S -o hello_gcc
 * sstrip ./hello_gcc
 */

        .if 0
#include <asm/unistd.h>
#include <unistd.h>
        .endif

        .line __LINE__
        .intel_syntax noprefix
        .globl _start
        .text
_start:
        nop
        mov     rax, __NR_write
        mov     rdi, STDOUT_FILENO
        mov     rsi, offset msg
        mov     rdx, offset msg_size
        syscall
        xor     rdi, rdi
        mov     rax, __NR_exit
        syscall

msg:
        .ascii "Hello 64-bit world!\n"
msg_size = . - msg
    


The nop at _start is there because gdb cannot stop at the first instruction, so you have to set your breakpoint at the second instruction. Here is my gdb session:

Code:
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu"...
(gdb) b 22
Breakpoint 1 at 0x4000b1: file hello_gcc.S, line 22.
(gdb) set radix 16
Input and output radices now set to decimal 16, hex 10, octal 20.
(gdb) r
Starting program: /home/kozmae/projects/asm/termios/hello_gcc 

Breakpoint 1, _start () at hello_gcc.S:22
22           mov     rax, __NR_write
Current language:  auto; currently asm
(gdb) disp $rax
1: $rax = 0x0
(gdb) disp $rdi
2: $rdi = 0x0
(gdb) disp $rsi
3: $rsi = 0x0
(gdb) disp $rdx
4: $rdx = 0x0
(gdb) s
23          mov     rdi, STDOUT_FILENO
4: $rdx = 0x0
3: $rsi = 0x0
2: $rdi = 0x0
1: $rax = 0x1
(gdb) 
24          mov     rsi, offset msg
4: $rdx = 0x0
3: $rsi = 0x0
2: $rdi = 0x1
1: $rax = 0x1
(gdb) 
    
Endre 16 Mar 2009, 16:19
Oh sorry, now I see that you're using 32bit, so
Code:
/**
 * For debugging compile with
 * gcc -g -m32 -nostdlib hello_gcc.S -o hello_gcc
 *
 * or for releasing with
 *
 * gcc -s -m32 -nostdlib hello_gcc.S -o hello_gcc
 * sstrip32 ./hello_gcc
 */

        .if 0
#include <asm/unistd.h>
#include <unistd.h>
        .endif

        .line __LINE__
        .intel_syntax noprefix
        .globl _start
        .text
_start:
        nop
        mov     eax, __NR_write
        mov     ebx, STDOUT_FILENO
        mov     ecx, offset msg
        mov     edx, offset msg_size
        int     0x80
        xor     ebx, ebx
        mov     eax, __NR_exit
        int     0x80

msg:
        .ascii "Hello 32-bit world!\n"
msg_size = . - msg    
Tomasz Grysztar 16 Mar 2009, 16:37
buzzkill wrote:
So this would mean that a fasm-generated executable can't be linked to anything else?

Yes, and that's why you should use object output for any serious project. As I said above, the executable output is just for the case, when you want to create a very simple, completely stripped executable containing just a bare code. Right, I should make it more clear in manual, that this format option is just a toy, really.
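
As a rough sketch of what linking fasm object output against libc could look like (this is not taken from the fasm package; the file name hello_libc.asm is hypothetical, and 32-bit x86 Linux with the cdecl calling convention is assumed), an object file can declare main, import printf with extrn, and leave the startup code to gcc:

Code:
; assemble/link:
; $ fasm hello_libc.asm hello_libc.o
; $ gcc hello_libc.o -o hello_libc
; (on an x86-64 host -m32 is probably needed, and toolchains that default
;  to PIE may also want -no-pie because of the absolute address of msg)

format ELF

section '.text' executable

public main
extrn printf

main:
    sub     esp, 8          ; padding so that esp is 16-byte aligned at the call
    push    msg             ; cdecl: the single argument, the format string
    call    printf
    add     esp, 12         ; drop the argument and the padding
    xor     eax, eax        ; return 0 from main
    ret

section '.data' writeable

msg db "Hello from fasm via printf!", 0Ah, 0

The padding is only there because current 32-bit Linux toolchains generally expect a 16-byte aligned stack at call sites; the rest is an ordinary cdecl call.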

buzzkill wrote:
BTW Tomasz, would you have a look at my first question (in my original post)? I'd like to know why you assume what you do about the stack at program startup. Do you expect your users to use gcc to generate the executables? What, according to you, should be at [esp] (before argc) at startup? Because even though you use the "section way" instead of the "format ELF executable" way in listing.asm (hope that makes sense to you :) ), it will segfault for users who use ld to generate the listing executable. (Maybe a few words in the readme.txt about what we should do to your src to create an executable might help?)

With libc package comes the README.TXT file...
readme from libc package wrote:
The fasm.o is an object file in ELF format. To get the final executable for
your system, you need to use the appropriate tool to link it with the C
library available on your platform. With the GNU tools it is enough to use
this command:

gcc fasm.o -o fasm

And listing.asm doesn't use "format ELF executable" - it uses the object format, and should be linked in the same way as fasm.o (I perhaps need to add this info to the tools readme).
Note that all those programs declare the function "main", which implies that they should be linked just as a C program would be. This is done to ensure maximum portability.
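
For comparison with a bare _start entry, a minimal sketch of the main convention (hypothetical file name argc_main.asm, 32-bit x86 Linux assumed): the C startup code calls main like any other cdecl function, so the return address occupies [esp] and the arguments follow it.

Code:
; assemble/link:
; $ fasm argc_main.asm argc_main.o
; $ gcc argc_main.o -o argc_main     (probably with -m32 on an x86-64 host)

format ELF

section '.text' executable

public main
main:
    mov     eax, [esp+4]    ; argc: first argument, just above the return address
    mov     edx, [esp+8]    ; argv: pointer to the array of argument strings (unused here)
    ret                     ; returning from main exits with status = argc

Linking the same object with plain ld instead would leave no _start symbol (ld typically warns and falls back to a default entry address), and nothing would have placed a return address at [esp], which is consistent with the crash described at the start of the thread.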
buzzkill 16 Mar 2009, 16:55
Endre,

I have used gas (a little) before, Programming From The Ground Up uses it too. I've also used nasm before, but now that I'm once again starting to dabble with assembly, I thought I'd give fasm a try :)

I'm planning to have a look at fasm's macro capabilities, because I usually find that sort of stuff handy to "abstract away" some things like function entry/exit, locals and arguments on the stack etc, I used those with nasm too. As for using constants from std include files, for nasm I just created a .inc myself with some handy constants I used often, and I'm sure somebody who's good with eg Perl could whip up a script to convert the include files to something fasm understands. I haven't looked into all that with fasm, though.

In principle you can use any assembler with any OS of course, and a lot of things (eg macros vs preprocessor) are down to personal preference I think. But gas is the de facto assembler on linux of course, because it's the backend for gcc (incidentally, people used to say that gas wasn't that suitable as a standalone assembler for precisely that reason).

Totally off-topic: I see you use int 80h syscalls. Any reason you don't use the more 'modern' sysenter way (through VDSO)?