flat assembler
Message board for the users of flat assembler.

format elf(64) executable segment alignment

s54820 20 Oct 2023, 23:20
A bug in the ELF formatter breaks alignment and produces holes in the ELF memory representation.

fasm manual wrote:
The origin of a non-special segment is aligned to page (4096 bytes).

However, when I compile the following code:
Code:
format elf executable 3 at 0x10000
entry start
segment readable executable
start: dd 0xaaaaaaaa
segment readable
dd 0xbbbbbbbb
    

I get the following output from readelf -e:
Code:
Entry point 0x10074
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00010000 0x00010000 0x00078 0x00078 R E 0x1000
  LOAD           0x000078 0x00011078 0x00011078 0x00004 0x00004 R   0x1000
    

The first segment includes the ELF headers, and having a start label not aligned to 4K might be okay, though it is a bit counter-intuitive. However, the second segment is not aligned at all, and there's a hole at 0x11000.
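
The addresses in the readelf output can be reproduced with a little arithmetic. The following is a minimal Python sketch, not fasm's actual code: the function name next_segment_vaddr and the placement rule (round the end of the previous segment up to the next page, then add the file offset modulo the page size) are assumptions, chosen only because they match the numbers in the listing.

```python
PAGE = 0x1000  # page size, assumed from the Align column of the listing


def next_segment_vaddr(prev_vaddr, prev_memsz, file_offset, page=PAGE):
    """Hypothetical placement rule that reproduces the readelf listing."""
    prev_end = prev_vaddr + prev_memsz
    # Round the end of the previous segment up to a page boundary.
    next_page = (prev_end + page - 1) // page * page
    # Keep the low bits of the address equal to the low bits of the file offset.
    return next_page + file_offset % page


vaddr = next_segment_vaddr(0x10000, 0x78, 0x78)
unused = vaddr - (0x10000 + 0x78)  # address range not covered by either segment
print(hex(vaddr), hex(unused))
```

For the listing above this gives 0x11078, with 0x1000 bytes of address space between the end of the first segment and the start of the second.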


Last edited by s54820 on 21 Oct 2023, 03:52; edited 1 time in total

revolution 21 Oct 2023, 00:22
The alignment is correct for the ELF format. The "Align" column shows 0x1000 (4096).

The offset of 0x78 is just how ELF works.

revolution 21 Oct 2023, 00:32
I made a patched version to show that it doesn't work when the segment offset is 0x00.
Code:
format elf executable 3 at 0x10000
entry start
segment readable executable
start:
        mov     eax,[0x11000]
        int     0x80
segment readable
exit_code: dd 1    
Then patched.
Code:
Elf file type is EXEC (Executable file)
Entry point 0x10074
There are 2 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00010000 0x00010000 0x0007b 0x0007b R E 0x1000
  LOAD           0x00007b 0x00011000 0x00011000 0x00004 0x00004 R   0x1000    
And the result
Code:
Segmentation fault    

s54820 21 Oct 2023, 01:41
revolution wrote:
The alignment is correct for the ELF format. The "Align" column shows 0x1000 (4096).

Yes, it is sort of correct. In ELF terms. But then fasm documentation is wrong.

revolution wrote:
I made a patched version to show that it doesn't work when the segment offset is 0x00.

I made a patched version to show that it does work. You have to make sure that the virtual address and the file offset are congruent modulo whatever you put in p_align.
Code:
Elf file type is EXEC (Executable file)
Entry point 0x10074
There are 2 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00010000 0x00010000 0x0007b 0x0007b R E 0x1000
  LOAD           0x001000 0x00011000 0x00011000 0x00004 0x00004 R   0x1000
    
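
The rule can be written as a one-line check. A minimal Python sketch follows; the function name offsets_congruent is hypothetical, the three header tuples are copied from the readelf listings in this thread, and treating a p_align of 0 or 1 as "no constraint" follows the wording of the ELF specification.

```python
def offsets_congruent(p_offset, p_vaddr, p_align):
    """True when p_offset and p_vaddr are congruent modulo p_align."""
    if p_align <= 1:
        # 0 and 1 mean that no alignment is required.
        return True
    return p_offset % p_align == p_vaddr % p_align


# (p_offset, p_vaddr, p_align) of the second LOAD segment in each listing
headers = {
    "fasm output": (0x78, 0x11078, 0x1000),
    "address patched only": (0x7B, 0x11000, 0x1000),
    "offset and address patched": (0x1000, 0x11000, 0x1000),
}

for name, header in headers.items():
    print(name, offsets_congruent(*header))
```

Only the middle header, the one that ended in a segmentation fault, fails the check.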


Attachment: elf_segment_alignment.zip (261 bytes)

Last edited by s54820 on 21 Oct 2023, 01:51; edited 1 time in total

revolution 21 Oct 2023, 01:48
s54820 wrote:
I made a patched version to show that it does work. You have to align both virtual address and an offset to whatever you put in p_align.
It kinda works, but not really; the result is probably not what you expected.

Using the same code I posted above and patching the offset:
Code:
~ fdbg ./s54820
0000000000010074  > mov eax,[00011000] ; []=464C457F    
It simply puts the elf header at the place where you expect to find the data. The exit code should be 1, not 0x464C457F.
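
Why the ELF magic shows up at that address can be sketched with a simple model of page-granular mapping. This is an illustration in Python, not the kernel's loader. It assumes 4096-byte pages, that whole file pages are mapped, and that the second segment's header is (offset 0x7b, address 0x1107b). Those two values are inferred from the sizes in the listings above and are not stated in the post; the helper name file_offset_for is hypothetical.

```python
import struct

PAGE = 0x1000  # assumed page size


def file_offset_for(addr, p_offset, p_vaddr):
    """File offset that backs `addr` when whole pages are mapped."""
    page_vaddr = p_vaddr - p_vaddr % PAGE      # start of the mapped page in memory
    page_offset = p_offset - p_offset % PAGE   # start of the mapped page in the file
    return page_offset + (addr - page_vaddr)


file_start = b"\x7fELF"  # the first four bytes of any ELF file

off = file_offset_for(0x11000, p_offset=0x7B, p_vaddr=0x1107B)
value = struct.unpack_from("<I", file_start, off)[0]  # little-endian dword
print(hex(off), hex(value))
```

Under this model the address 0x11000 is backed by file offset 0, so the dword read there is the magic bytes 7F 45 4C 46 interpreted as a little-endian value.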

s54820 21 Oct 2023, 01:56
revolution wrote:
Using the same code I posted above and patching the offset.

And inserting almost 4K bytes to make an offset correct? ;-)

Code:
0x00010074 in ?? ()
=> 0x00010074:  a1 00 10 01 00          mov    eax,ds:0x11000
(gdb) x/4xb 0x11000
0x11000:        0x01    0x00    0x00    0x00
    

revolution 21 Oct 2023, 02:03
s54820 wrote:
And inserting almost 4K bytes to make an offset correct?
I'm not sure what you mean by correct. The current ELF formatter is fine IMO. It is the same format as used by the major tools like GCC and others.

If you need a particular alignment for your data you can do it.
Code:
format elf executable 3 at 0x10000
entry start
segment readable executable
start:
        mov     eax,[exit_code]
        int     0x80
segment readable
align 4096
exit_code: dd 1    
Result:
Code:
0000000000010074  > mov eax,[00012000] ; []=00000001 ; perfect :)
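
The jump to 0x12000 follows from rounding up. A minimal Python sketch, assuming the second segment starts at 0x1107b (inferred from the 0x7b file size of the first segment in the earlier listing, not shown in this post) and that align 4096 pads up to the next multiple of 4096; the helper name align_up is hypothetical.

```python
def align_up(addr, boundary):
    """Round addr up to the next multiple of boundary."""
    return (addr + boundary - 1) // boundary * boundary


segment_start = 0x1107B  # assumed start address of the second segment
aligned = align_up(segment_start, 4096)
padding = aligned - segment_start  # bytes skipped before exit_code

print(hex(aligned), padding)
```

Under these assumptions exit_code lands at 0x12000, preceded by 3973 (0xf85) bytes of padding, which is nearly a full page.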

s54820 21 Oct 2023, 03:51
Well, okay. It is not a bug; fasm saves some space in the file. But it certainly could be improved by having something like "segment readable align 4096".

revolution wrote:
It is the same format as used by the major tools like GCC and others.

Not quite. "GCC and others" align at least some segments. If you inspect a random file with readelf -Wl /usr/bin/gcc | grep -A 1 'LOAD.*R E' you will see that both the executable segment (.text) and the following segment (usually .rodata) have offsets and virtual addresses aligned to a page boundary:
Code:
  LOAD           0x004000 0x0000000000404000 0x0000000000404000 0x114d71 0x114d71 R E 0x1000
  LOAD           0x119000 0x0000000000519000 0x0000000000519000 0x0babb5 0x0babb5 R   0x1000    

revolution wrote:
If you need a particular alignment for your data you can do it.

This way it inserts the same amount of bytes that it "saved" by making holes in memory. But okay, I can see now that this behavior is intentional and the documentation is wrong.

revolution 21 Oct 2023, 04:04
To have both the exe file and the in-memory layout appear at 4096 alignment, you have to pad with bytes, right? The Linux loader doesn't shift the bytes like Windows does. At least IME anyway. Has that changed lately?

Here is the result for GCC on my system.
Code:
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x000230 0x000230 R E 0x8
  INTERP         0x000270 0x0000000000400270 0x0000000000400270 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0dcaac 0x0dcaac R E 0x200000
  LOAD           0x0dcf20 0x00000000006dcf20 0x00000000006dcf20 0x001f98 0x004590 RW  0x200000
  DYNAMIC        0x0dddf8 0x00000000006dddf8 0x00000000006dddf8 0x0001e0 0x0001e0 RW  0x8
  NOTE           0x00028c 0x000000000040028c 0x000000000040028c 0x000044 0x000044 R   0x4
  TLS            0x0dcf20 0x00000000006dcf20 0x00000000006dcf20 0x000000 0x000010 R   0x8
  GNU_EH_FRAME   0x0ca394 0x00000000004ca394 0x00000000004ca394 0x002c74 0x002c74 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x0dcf20 0x00000000006dcf20 0x00000000006dcf20 0x0010e0 0x0010e0 R   0x1    
Notice that the lowest 12 bits of "Offset" and "VirtAddr" always match. This is a Linux loader requirement IIRC. So I thought the only way to force it to use 0xxxx000 is to insert padding bytes.

revolution 21 Oct 2023, 04:06
s54820 wrote:
... the documentation is wrong.
I think you misunderstand what it is saying. Each segment is aligned to 4096. That is all it says, nothing else about the contents of the segment. And to have it also place the contents at 4096 alignment, you necessarily need to insert padding.
Copyright © 1999-2024, Tomasz Grysztar.