flat assembler
Message board for the users of flat assembler.

Malloc in assembly, how far can I reach?

tianboh



Joined: 07 Jul 2023
Posts: 10
tianboh 08 Jul 2023, 01:57
I am writing a compiler for a simplified C language following the x86-64 ABI. This language supports pointers and alloc. I checked the related manuals, but found that malloc/calloc are declared in stdlib.h and provided by the C library. I am wondering how far I can get without these libraries. In other words, do I need to use system calls to allocate memory on the heap? If so, which functions can I use from assembly?

Edit by revolution: Moved to Linux forum
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20335
Location: In your JS exploiting you and your system
revolution 08 Jul 2023, 08:54
Unless you statically allocate all memory at the time you generate the exe file, you will always need to use the OS APIs to allocate memory.

For which OS?

For Linux you can use the SYS_BRK, SYS_MMAP, etc. system calls.
For Windows there is VirtualAlloc, etc.
For other OSes it could be anything.

So you can avoid the stdlib stuff, but it needs to be replaced with some other way to allocate memory.
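
To give a rough idea of what "some other way" could look like on Linux, here is a minimal sketch in C of a bump allocator that takes one region from mmap and hands out pieces of it. The names arena_init and arena_alloc and the 16-byte rounding are assumptions made up for this illustration; they are not part of libc, and nothing is ever freed.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical bump allocator: one mmap'd region, never freed. */
static uint8_t *arena_base;      /* start of the region            */
static size_t   arena_size;      /* total bytes obtained from mmap */
static size_t   arena_used;      /* bytes handed out so far        */

/* Ask the kernel for `size` bytes of zero-filled, private memory.
   Returns 0 on success, -1 if mmap fails. */
int arena_init(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    arena_base = p;
    arena_size = size;
    arena_used = 0;
    return 0;
}

/* Hand out `n` bytes, rounded up to a multiple of 16 (an assumed
   alignment). Returns NULL when the arena cannot satisfy the request. */
void *arena_alloc(size_t n)
{
    size_t rounded = (n + 15) & ~(size_t)15;
    if (rounded < n || rounded > arena_size - arena_used)
        return NULL;
    void *p = arena_base + arena_used;
    arena_used += rounded;
    return p;
}
```

A compiler's runtime could call arena_init once at startup and arena_alloc for every alloc; a real allocator would also need a way to release or reuse memory.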

tianboh 09 Jul 2023, 01:12
Thank you for your reply! In fact, I am using Linux.

Another question: if I use calloc from libc, can I simply write something like

mov $1, %rdi
mov $4, %rsi
call calloc
mov %rax, %TARGET_REG

to allocate an integer?

I tried generating assembly with GCC. It uses the PLT and generates code like

blah blah
call malloc@PLT
blah blah

However, I haven't seen any linker or assembler in my environment that can handle the PLT (I am self-studying a compiler course, and this is one of its assignments), so I am not sure how to implement alloc in my compiler. Any ideas?

revolution 09 Jul 2023, 07:47
On Linux, using SYS_BRK looks like this:
Code:
format ELF64 executable 0
entry main

RAM_NEEDED      = 10 shl 10

SYS_WRITE       = 1
SYS_BRK         = 12
SYS_EXIT_GROUP  = 231
STD_OUTPUT      = 1

segment executable readable

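main:
        mov     eax,SYS_BRK             ; brk(0): an invalid request, so the
        xor     edi,edi                 ; kernel returns the current break
        syscall
        lea     rdi,[rax + RAM_NEEDED]  ; desired new break = current + size
        mov     eax,SYS_BRK
        syscall                         ; rax = new break, or old one on failure
        cmp     rax,rdi                 ; did the break move to where we asked?
        lea     rsi,[mem_good]          ; lea/mov leave the flags untouched
        mov     edx,mem_good.len
        jz      .message_okay
        lea     rsi,[mem_bad]
        mov     edx,mem_bad.len
    .message_okay:
        mov     eax,SYS_WRITE           ; write(STD_OUTPUT, rsi, edx)
        mov     edi,STD_OUTPUT
        syscall
        mov     eax,SYS_EXIT_GROUP      ; exit_group(0)
        xor     edi,edi
        syscall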

mem_good:       db 'Allocation successful',10
mem_good.len    = $ - mem_good

mem_bad:        db 'Allocation failed',10
mem_bad.len     = $ - mem_bad    
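
For readers more at home in C, the two SYS_BRK calls above can be sketched roughly as follows, using glibc's generic syscall() wrapper to reach the raw system call. The function name grow_break is made up for this illustration. The sketch also assumes nothing else in the process (libc's malloc, for example) is moving the break at the same time.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: grow the program break by `bytes` using the raw
   brk system call. Returns the old break (the start of the new memory)
   on success, or NULL if the kernel did not move the break. */
void *grow_break(uintptr_t bytes)
{
    /* A request for address 0 cannot be honoured, so the kernel simply
       answers with the current break. */
    uintptr_t old_break = (uintptr_t)syscall(SYS_brk, 0L);
    uintptr_t wanted    = old_break + bytes;

    /* On success the raw call returns the new break; on failure it
       returns the unchanged one, so compare with what was asked for. */
    uintptr_t new_break = (uintptr_t)syscall(SYS_brk, wanted);
    return new_break == wanted ? (void *)old_break : NULL;
}
```

Calling grow_break(10 << 10) corresponds to RAM_NEEDED in the listing: a non-NULL result plays the role of the "Allocation successful" branch.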

tianboh 10 Jul 2023, 01:35
Wow! That's awesome. I still have some questions:

1. Why does syscall use a different calling convention? It takes the call number (SYS_BRK) in eax and the address pointer in rdi. I would have expected SYS_BRK/12 to be passed as the first parameter (edi) and the address pointer as the second (rsi).

2. It seems you made the syscall twice to allocate. What is the reasoning behind that?

revolution 10 Jul 2023, 02:30
The Linux kernel doesn't use the C calling convention. Typing "man syscall" in a terminal shows ...
Code:
~ man syscall
...
       arch/ABI      arg1  arg2  arg3  arg4  arg5  arg6  arg7  Notes
       ──────────────────────────────────────────────────────────────────
       arm/OABI      a1    a2    a3    a4    v1    v2    v3
       arm/EABI      r0    r1    r2    r3    r4    r5    r6
       arm64         x0    x1    x2    x3    x4    x5    -
       blackfin      R0    R1    R2    R3    R4    R5    -
       i386          ebx   ecx   edx   esi   edi   ebp   -
       ia64          out0  out1  out2  out3  out4  out5  -
       mips/o32      a0    a1    a2    a3    -     -     -     See below
       mips/n32,64   a0    a1    a2    a3    a4    a5    -
       parisc        r26   r25   r24   r23   r22   r21   -
       s390          r2    r3    r4    r5    r6    r7    -
       s390x         r2    r3    r4    r5    r6    r7    -
       sparc/32      o0    o1    o2    o3    o4    o5    -
       sparc/64      o0    o1    o2    o3    o4    o5    -
       x86_64        rdi   rsi   rdx   r10   r8    r9    -
       x32           rdi   rsi   rdx   r10   r8    r9    -    
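Another table in the same man page covers the call itself: on x86_64 the system call number goes in rax, the call is made with the syscall instruction, and the result comes back in rax.

Be careful when typing "man brk", because it mainly details the C library wrapper's inputs and outputs rather than the raw SYS_BRK call. It is best to look at the kernel source to see the actual underlying behaviour. So the first SYS_BRK call (with rdi = 0) returns the current break, and the second call extends it by the requested amount. The raw call returns the new break on success and the unchanged break on failure, which is why the code compares rax with rdi afterwards.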
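
To make the register assignment concrete, here is a small sketch in C with GCC-style inline assembly that issues a one-argument system call by hand. It assumes x86-64 Linux and a GCC-compatible compiler; raw_syscall1 is a made-up name, and the SYS_BRK constant simply mirrors the value used in the fasm listing.

```c
/* System call number for brk on x86-64 Linux, as in the fasm listing. */
enum { SYS_BRK = 12 };

/* Hypothetical one-argument raw system call for x86-64 Linux:
   number in rax ("a"), first argument in rdi ("D"), result back in rax.
   The syscall instruction itself overwrites rcx and r11, so they are
   listed as clobbered. */
long raw_syscall1(long number, long arg1)
{
    long result;
    __asm__ volatile ("syscall"
                      : "=a"(result)
                      : "a"(number), "D"(arg1)
                      : "rcx", "r11", "memory");
    return result;
}
```

With this helper, raw_syscall1(SYS_BRK, 0) reads the current break and raw_syscall1(SYS_BRK, current + size) tries to extend it, which is the same pair of calls as in the assembly version.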

tianboh 10 Jul 2023, 03:09
Thank you so much for the great explanation! It helps a lot!