flat assembler
Message board for the users of flat assembler.

The ELF bug strikes back...

JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 07 Nov 2013, 14:23
Well, I have run into another problem with ELF files that fail to load in Linux at certain code sizes. The layout of the problem file is the following:
Code:
        format ELF executable 3
        entry start

        segment readable writeable executable

start:
        times $ed0 db 'C'

        segment readable writeable

        times $48 db 'D'
        rb $0c

        segment interpreter readable

        db '******************', 0

        segment dynamic readable

        db $50 dup('.')

        segment readable writeable

        db $9f dup '?'    

The readelf utility gives the following headers:
Code:
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x0000d4 0x080480d4 0x080480d4 0x00ed0 0x00ed0 RWE 0x1000
  LOAD           0x000fa4 0x08049fa4 0x08049fa4 0x00048 0x00054 RW  0x1000
  INTERP         0x000fec 0x0804afec 0x0804afec 0x00013 0x00013 R   0x1
      [Requesting program interpreter: ******************]
  DYNAMIC        0x000fff 0x0804afff 0x0804afff 0x00050 0x00050 R   0x1
  LOAD           0x00104f 0x0804a04f 0x0804a04f 0x0009f 0x0009f RW  0x1000    

What worries me is that the last segment starts at address $0804a04f, whereas adding the size of the previous segment ($50) to its address gives $0804afff + $50 = $0804b04f.
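The arithmetic can be checked with a few lines of Python. This is only a sketch: the values are copied from the readelf output above, and the variable names are made up for illustration.

```python
# Values copied from the readelf output above (DYNAMIC and the last LOAD entry).
dynamic_vaddr = 0x0804AFFF
dynamic_memsz = 0x50
last_load_vaddr = 0x0804A04F  # address reported for the last LOAD segment

# First address past the end of the DYNAMIC segment.
dynamic_end = dynamic_vaddr + dynamic_memsz

print(hex(dynamic_end))                    # 0x804b04f
print(hex(dynamic_end - last_load_vaddr))  # 0x1000
```

The reported address is exactly one 4 KiB page lower than the end of the DYNAMIC segment.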

_________________
Tox ID: 48C0321ADDB2FE5F644BB5E3D58B0D58C35E5BCBC81D7CD333633FEDF1047914A534256478D9
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8353
Location: Kraków, Poland
Tomasz Grysztar 07 Nov 2013, 18:40
Edit: initially I thought that this address overlapping was a bug, but then I recalled that it was intentional and valid, and the problem lies elsewhere. See below.

Please try the following fix and let me know if this is enough to solve your problems.

In FORMATS.INC at line 3943 there is this block of instructions:
Code:
        mov     eax,[ebx+8]
        cmp     byte [ebx],1
        jne     elf_segment_position_ok
        add     eax,[ebx+14h]
        add     eax,0FFFh    
Please move the CMP+JNE pair one instruction further, so that this fragment becomes:
Code:
        mov     eax,[ebx+8]
        add     eax,[ebx+14h]
        cmp     byte [ebx],1
        jne     elf_segment_position_ok
        add     eax,0FFFh    
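To see what the change does to the numbers, here is a rough Python model. It rests on assumptions that the thread does not confirm: that ebx points at an ELF32 program header entry (where offset 0 is p_type, 8 is p_vaddr and 14h is p_memsz, and type 1 is PT_LOAD), and that the code after elf_segment_position_ok rounds eax down to a page boundary and adds the low 12 bits of the next segment's file offset. The function name and its parameters are hypothetical.

```python
PT_LOAD = 1
PT_DYNAMIC = 2
PAGE_MASK = 0xFFF

def next_vaddr(prev_type, prev_vaddr, prev_memsz, next_offset, fixed):
    """Model of where the next segment is placed (assumed behaviour, see above)."""
    eax = prev_vaddr                      # mov eax,[ebx+8]
    if fixed:
        eax += prev_memsz                 # add eax,[ebx+14h] runs for every type
        if prev_type == PT_LOAD:
            eax += PAGE_MASK              # add eax,0FFFh only after a LOAD segment
    elif prev_type == PT_LOAD:
        eax += prev_memsz + PAGE_MASK     # original: both adds only after a LOAD
    # Assumed continuation: keep the page, take the in-page part from the file offset.
    return (eax & ~PAGE_MASK) + (next_offset & PAGE_MASK)

# Last LOAD (file offset 0x104f) placed after DYNAMIC at 0x0804afff, size 0x50.
before = next_vaddr(PT_DYNAMIC, 0x0804AFFF, 0x50, 0x104F, fixed=False)
after = next_vaddr(PT_DYNAMIC, 0x0804AFFF, 0x50, 0x104F, fixed=True)
print(hex(before), hex(after))  # 0x804a04f 0x804b04f
```

Under these assumptions the model reproduces the $0804a04f that readelf reported without the change, and gives $0804b04f with it, which is the address JohnFound expected.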


Last edited by Tomasz Grysztar on 07 Nov 2013, 21:32; edited 1 time in total
JohnFound 07 Nov 2013, 19:42
The good news is that readelf now reports more reasonable information:
Code:
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x0000d4 0x080480d4 0x080480d4 0x00ed0 0x00ed0 RWE 0x1000
  LOAD           0x000fa4 0x08049fa4 0x08049fa4 0x00048 0x00054 RW  0x1000
  INTERP         0x000fec 0x0804afec 0x0804afec 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  DYNAMIC        0x000fff 0x0804afff 0x0804afff 0x00050 0x00050 R   0x1
  LOAD           0x00104f 0x0804b04f 0x0804b04f 0x0009f 0x0009f RW  0x1000    


The bad news is that the binary still cannot be loaded: the loader gives a segmentation fault on an attempt to read from address $0804afec, i.e. from the interpreter segment. It says there is no memory allocated there.

I managed to run it, but only after adding an arbitrary amount of filler bytes ($f00, actually) to the code segment. The readelf report in this case looks as follows:
Code:
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x0000d4 0x080480d4 0x080480d4 0x01dd0 0x01dd0 RWE 0x1000
  LOAD           0x001ea4 0x0804aea4 0x0804aea4 0x00048 0x00054 RW  0x1000
  INTERP         0x001eec 0x0804beec 0x0804beec 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  DYNAMIC        0x001eff 0x0804beff 0x0804beff 0x00050 0x00050 R   0x1
  LOAD           0x001f4f 0x0804bf4f 0x0804bf4f 0x0009f 0x0009f RW  0x1000    


If this information does not help, I may try tomorrow to create a small working example that reveals this behavior...

Tomasz Grysztar 07 Nov 2013, 20:14
It appears that this problem manifests itself when the interpreter segment lies inside a page that does not overlap with any segment of LOAD type. Since only the segments of LOAD type actually need to be mapped into memory, this is not a complete surprise - but it does not look right either.
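That observation can be checked against the two readelf listings above with a short Python sketch. It assumes 4 KiB pages (matching the 0x1000 alignment shown) and that a LOAD segment makes every page it touches accessible; the helper names are hypothetical.

```python
PAGE = 0x1000

def pages(vaddr, memsz):
    """Set of page base addresses touched by [vaddr, vaddr + memsz)."""
    first = vaddr & ~(PAGE - 1)
    last = (vaddr + memsz - 1) & ~(PAGE - 1)
    return set(range(first, last + PAGE, PAGE))

def unmapped_pages(segments, name):
    """Pages of the named segment that no LOAD segment touches."""
    loaded = set()
    wanted = set()
    for kind, vaddr, memsz in segments:
        if kind == "LOAD":
            loaded |= pages(vaddr, memsz)
        elif kind == name:
            wanted |= pages(vaddr, memsz)
    return sorted(wanted - loaded)

# (type, VirtAddr, MemSiz) from the headers of the binary that crashes.
crashing = [
    ("LOAD",    0x080480D4, 0xED0),
    ("LOAD",    0x08049FA4, 0x054),
    ("INTERP",  0x0804AFEC, 0x013),
    ("DYNAMIC", 0x0804AFFF, 0x050),
    ("LOAD",    0x0804B04F, 0x09F),
]
# The same fields after the $f00 filler bytes were added (this one runs).
working = [
    ("LOAD",    0x080480D4, 0x1DD0),
    ("LOAD",    0x0804AEA4, 0x054),
    ("INTERP",  0x0804BEEC, 0x013),
    ("DYNAMIC", 0x0804BEFF, 0x050),
    ("LOAD",    0x0804BF4F, 0x09F),
]

print([hex(p) for p in unmapped_pages(crashing, "INTERP")])  # ['0x804a000']
print([hex(p) for p in unmapped_pages(working, "INTERP")])   # []
```

In the crashing layout the page at $0804a000, which holds the interpreter string, is not touched by any LOAD segment; in the layout with the filler bytes, INTERP shares a page with the last LOAD segment.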
JohnFound 07 Nov 2013, 20:26
Hm, my knowledge of the ELF format is really close to zero. How can the loader load the binary without mapping it into memory?
Tomasz Grysztar 07 Nov 2013, 20:42
The special segments like INTERP or DYNAMIC are meta-information that does not need to be present at run-time - that is why the specification gives them types different from LOAD, which is the actual program code/data that needs to be loaded into memory for the run-time. So a loader could load the special segments into its private (later discarded) memory only for the purpose of loading and preparing the executable. But this problem hints that the Linux loader tries to access them within the memory mapped for LOAD segments. Maybe it makes some additional assumptions.
Tomasz Grysztar 07 Nov 2013, 20:51
Wait, I just wanted to add that I never tried to create executables where the INTERP segment is not at the beginning, and then I remembered that there was in fact a very good reason for it:
ELF Specification wrote:
PT_INTERP The array element specifies the location and size of a null-terminated path name to invoke as an interpreter. This segment type is meaningful only for executable files (though it may occur for shared objects); it may not occur more than once in a file. If it is present, it must precede any loadable segment entry.
Perhaps fasm should check for this condition and show an error when you try to put the interpreter after some code or data.
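A check of that kind could look roughly like the following Python sketch. It is not how fasm is implemented; it only illustrates the rule quoted above on a list of program header types in table order, and the function name is hypothetical.

```python
def check_interp_placement(types):
    """Return an error message if INTERP breaks the quoted rule, else None.

    `types` is the list of program header types in table order.
    """
    if types.count("INTERP") > 1:
        return "INTERP may not occur more than once"
    seen_load = False
    for t in types:
        if t == "LOAD":
            seen_load = True
        elif t == "INTERP" and seen_load:
            return "INTERP must precede any loadable segment entry"
    return None

# Header order from the readelf output earlier in the thread.
print(check_interp_placement(["LOAD", "LOAD", "INTERP", "DYNAMIC", "LOAD"]))
# Order with the interpreter first.
print(check_interp_placement(["INTERP", "LOAD", "LOAD", "DYNAMIC", "LOAD"]))
```

The first call reports the ordering violation and the second returns None.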
JohnFound 07 Nov 2013, 21:19
Ah, surprise! So, I have to put the interpreter segment at the beginning, right? But what about the dynamic segment that follows the interpreter one?
Tomasz Grysztar 07 Nov 2013, 21:30
AFAIK the specification states no rules for the placement of the DYNAMIC segment like it does for INTERP, and when I look at some of the executables that come with Linux, they frequently have DYNAMIC stuck in the middle between some LOAD segments, so that should not be a problem.
JohnFound 08 Nov 2013, 07:11
After moving the interpreter segment to the very beginning of the executable (and with the suggested changes in FORMATS.INC), the only difference is that the segmentation fault is now generated at the start of the dynamic segment...
Tomasz Grysztar 08 Nov 2013, 09:05
So it probably needs the meta-data segments to reside in the memory prepared when mapping LOAD segments. The change that I proposed and you tested might have been a move in the wrong direction - we probably should not remove the overlapping, but refine it (so that the DYNAMIC segment becomes embedded completely inside the LOAD segment).
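What "embedded completely" would mean in terms of addresses can be sketched in Python as follows. This is only an illustration with a hypothetical helper name; the numbers are the ones readelf reported after the FORMATS.INC change, the last layout is invented for the example, and whether the Linux loader really requires this is exactly what is not yet known.

```python
def containing_load(segments, name):
    """Return the LOAD entry whose memory range fully contains `name`, or None."""
    target = next(s for s in segments if s[0] == name)
    t_start, t_end = target[1], target[1] + target[2]
    for kind, vaddr, memsz in segments:
        if kind == "LOAD" and vaddr <= t_start and t_end <= vaddr + memsz:
            return (kind, vaddr, memsz)
    return None

# (type, VirtAddr, MemSiz) as reported by readelf after the change.
segments = [
    ("LOAD",    0x080480D4, 0xED0),
    ("LOAD",    0x08049FA4, 0x054),
    ("INTERP",  0x0804AFEC, 0x013),
    ("DYNAMIC", 0x0804AFFF, 0x050),
    ("LOAD",    0x0804B04F, 0x09F),
]

print(containing_load(segments, "DYNAMIC"))  # None

# Invented layout in which the last LOAD starts at DYNAMIC and spans it.
embedded = segments[:4] + [("LOAD", 0x0804AFFF, 0x050 + 0x09F)]
print(containing_load(embedded, "DYNAMIC") is not None)  # True
```

With the reported headers no LOAD range contains DYNAMIC; it only ends where the last LOAD begins.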

Still, it would be better to know what the loader's exact expectations are. The ELF Specification is a bit fuzzy on specific details, and there are many variants that appear to be valid when interpreting the specs but are not accepted by specific implementations (like the Linux executable loader).
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
