flat assembler
Message board for the users of flat assembler.

Mach-O executables made with fasmg

Tomasz Grysztar 23 Aug 2017, 07:33
tthsqe wrote:
But this segment is marked executable. I thought this was a big no-no.
As far as I know there is nothing in the specifications that forbids giving more permissions to any given segment. This is something I routinely did in other formats, for example to combine all segments into a single, classic "flat" segment for both data and code (including potential for self-modifying code). But Mach-O loaders tend to be very picky, so you may be right and it might be worth trying some other combinations of sections and permissions. However, as I recall from debugging the 64-bit sample with lldb, the fault was at the (correct) address inside the linked library. The function pointers were resolved and updated correctly, and the address contained some actual code (I recognized the prologue instructions), but it resided in an area of memory without "executable" permission: it was treated as imported data.

tthsqe 23 Aug 2017, 10:08
Ok, so you currently want to direct the loader to update the symbol addresses in the __IMPORT.__nl_symbol_ptr section. Fine. My question is: how are you telling the loader that this is where it should fill in the address? Your LC_DYSYMTAB command gives an INDIRECTSYMOFF, which points us to an array of dwords 0, 1. These are indices of symbol table entries. The symbol table looks like

Code:
Symbol table:

  External symbols:
   0 _write, Section 0, Value 0x1010
    External, Undefined, no section
    Reference type: External non lazy,  Flags: 
   1 _exit, Section 0, Value 0x1016
    External, Undefined, no section
    Reference type: External non lazy,  Flags: 

  Indirect symbols:
   _write, type 0x1, sect 0, desc 0x0, val 0x1010
   _exit, type 0x1, sect 0, desc 0x0, val 0x1016    


These symbols have no section associated with them (I guess section 0 doesn't exist). So at best, there is only the information contained in the value member. These values (0x1010 and 0x1016) point not to the _write2: dq ? and _exit2: dq ? in the code that I posted before, but to the jump instructions themselves ( _write: jmp qword[_write2] and _exit: jmp qword[_exit2])! This looks bad.

Do you expect the loader to rewrite the jmp instructions? If not, then you expect the loader to rewrite the _write2 and _exit2 qwords; how do you expect the loader to know where _write2 and _exit2 are (I called them this way here but they are generated by "sym.ptr" in your macros)?

Tomasz Grysztar 23 Aug 2017, 10:46
tthsqe wrote:
Ok, so you currently want to direct the loader to update the symbol addresses in the __IMPORT.__nl_symbol_ptr section. Fine. My question is: how are you telling the loader that this is where it should fill in the address?
The section that contains the pointer variables is marked as S_NON_LAZY_SYMBOL_POINTERS and the "reserved1" field is then interpreted as an index into the indirect symbol table (always 0 in my samples). The sequence of consecutive entries in the indirect symbol table, starting at that index, is then mapped to the pointers in the "__nl_symbol_ptr" section.
The indirect symbol table is defined by the "indirectsymoff" and "nindirectsyms" fields in the LC_DYSYMTAB header, and it is simply an array of indexes into the actual symbol table. This way the correspondence between entries in the S_NON_LAZY_SYMBOL_POINTERS section and symbols is established. It should work the same way for S_LAZY_SYMBOL_POINTERS, with a different index into the indirect symbol table in "reserved1".
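
A rough sketch of that mapping in Python may make it easier to follow. The helper name `map_pointer_slots` and the section address 0x2000 are made up for illustration; only the indirect table contents (0, 1) and the symbol names come from the dump quoted earlier in the thread.

```python
# Hypothetical helper: pair each pointer slot of a symbol-pointer section
# with the symbol whose address the loader is expected to store there.
def map_pointer_slots(section_addr, section_size, reserved1,
                      indirect_table, symbol_names, pointer_size=8):
    slots = {}
    for i in range(section_size // pointer_size):
        # reserved1 is the index of the section's first indirect table entry
        symbol_index = indirect_table[reserved1 + i]
        slots[section_addr + i * pointer_size] = symbol_names[symbol_index]
    return slots

# Indirect table [0, 1] and symbols _write/_exit as in the thread;
# the section address 0x2000 is an assumption for the example.
slots = map_pointer_slots(0x2000, 16, 0, [0, 1], ["_write", "_exit"])
for addr, name in slots.items():
    print(hex(addr), name)
```

Note that the symbol's own "value" field plays no part in locating the slot: the slot address comes from the section, not from the symbol.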

tthsqe wrote:
Do you expect the loader to rewrite the jmp instructions? If not, then you expect the loader to rewrite the _write2 and _exit2 qwords; how do you expect the loader to know where _write2 and _exit2 are (I called them this way here but they are generated by "sym.ptr" in your macros)?
It rewrites the values in the pointer qwords - as I noted above, I have verified it with a debugger and this part works correctly.

tthsqe 23 Aug 2017, 13:50
The issues might be
- "LC_DYLD_INFO_ONLY" command is missing
- "__LINKEDIT" segment is malformed. (The LC_DYLD_INFO_ONLY command should actually point into this)

Will try to test soon.

Tomasz Grysztar 23 Aug 2017, 14:03
tthsqe wrote:
The issues might be
- "LC_DYLD_INFO_ONLY" command is missing
- "__LINKEDIT" segment is malformed. (The LC_DYLD_INFO_ONLY command should actually point into this)
LC_DYLD_INFO_[ONLY] is a new format for dynamic linking, which additionally uses some kind of interpreted language to script the entire dynamic linking process. I wanted to get the older (presumably simpler) style of dynamic linking working first - see my second post in this thread. To create an executable with the old style of dynamic linking structures you need to use "-mmacosx-version-min=" flag when compiling/linking.

tthsqe 23 Aug 2017, 18:15
Since you are using the simpler format for dynamic linking, I really couldn't see anything wrong with your macros, which is why I was confused that they didn't work.

However, now that I have access to a machine, I see it might have been an error on my part, as I was setting the MachO.Settings.BaseAddress to something much lower like 0x400000. With 0x100000000, it works! [EDIT: nope! it works with anything, so I really don't know what was going wrong before]. So, Tomasz, now I'm wondering why it didn't/doesn't work for you.

Some possible improvements to the macros (in order of importance):
- in macros "macro interpreter? path" and "macro uses? lib&", you need an align 8 (not just align 4) at the end so that the size of the command is divisible by 8.

- make the first command, which sets the zero address to no-read, no-write, and no-execute, not go all of the way to the base address. I'm not sure why you would not want to use the addresses below the base address.

- In the case where you want two symbols, each called "foo", but coming from different libraries, is the newer system required? Or is it possible to do it with the older system?

- when using an imported symbol, it is more minimalistic to use it as "call [printf]" with printf being the location written by the loader. The current ones do "call printf; printf: jmp [_printf_]" and then _printf_ is the one written by the loader. This also means that the execute flag could be removed from the __IMPORT section, which might be the cause of errors from buggy loaders.

- the program at the end works as expected when I change the align 4 to align 8 in your uses and interpreter macros. However, when I run it in a debugger, it complains about overlapping sections. Might be something to look into.
Code:
$ lldb ./demo
(lldb) target create "./demo"
Current executable set to './demo' (x86_64).
(lldb) run
Process 1002 launched: './demo' (x86_64)
warning: (x86_64) ./demo(0x0000000100000000) address 0x0000000100000000 maps to more than one section: demo.__TEXT and demo.__TEXT
warning: (x86_64) ./demo(0x0000000100000000) address 0x0000000100001000 maps to more than one section: demo.__DATA and demo.__DATA
warning: (x86_64) ./demo(0x0000000100000000) address 0x0000000100002000 maps to more than one section: demo.__IMPORT and demo.__IMPORT
Hello. This is linked.
Hello from thread 10
Hello from thread 9
Hello from thread 8
Hello from thread 7
Hello from thread 6
Hello from thread 5
Hello from thread 4
Hello from thread 3
Hello from thread 2
Hello from thread 1
Hello from thread 0
Process 1002 exited with status = 0 (0x00000000) 
(lldb) quit    


The program:
Code:
include 'x86/include/x64.inc'
use64

MachO.Settings.ProcessorType equ CPU_TYPE_X86_64
MachO.Settings.BaseAddress = 0x100000000; there is a no-write, no-read segment below this

include 'x86/macinc/macho.inc'
interpreter '/usr/lib/dyld'
uses '/usr/lib/libSystem.B.dylib' (1.0.0, 1225.0.0)
import exit,'_exit'
import pthread_create,'_pthread_create'
import pthread_mutex_init,'_pthread_mutex_init'
import pthread_mutex_destroy,'_pthread_mutex_destroy'
import pthread_mutex_lock,'_pthread_mutex_lock'
import pthread_mutex_unlock,'_pthread_mutex_unlock'
import pthread_join,'_pthread_join'
import printf,'_printf'
import write,'_write'


segment '__TEXT' readable executable

  section '__text' align 16

entry Start
Start:
        and     rsp, -16
        sub     rsp, 8*32

        mov     edi, 1
        lea     rsi, [sz_Hello]
        mov     edx, sz_Hello_end - sz_Hello
        call    write

        lea     rdi, [mutex]
        xor     esi, esi
        call    pthread_mutex_init

        mov     ebx, 10
.CreateNext:
        lea     rdi, [rsp+8*rbx]        ; thread handle
        xor     esi, esi
        mov     ecx, ebx                ; parameter to pass
        lea     rdx, [Thread_Routine]
        call    pthread_create
        sub     ebx, 1
        jns     .CreateNext

        mov     ebx, 10
.JoinNext:
        mov     rdi, qword[rsp+8*rbx]
        xor     esi, esi
        call    pthread_join
        sub     ebx, 1
        jns     .JoinNext

        lea     rdi, [mutex]
        call    pthread_mutex_destroy

        mov     rdi, rax
        call    exit
        int3

Thread_Routine:
        push    rbx
        mov     ebx, edi        ; parameter passed

        lea     rdi, [mutex]
        call    pthread_mutex_lock

        lea     rdi, [sz_ThreadHello]
        mov     esi, ebx
        xor     eax, eax        ; variadic call: AL = number of vector registers used
        call    printf

        lea     rdi, [mutex]
        call    pthread_mutex_unlock

        pop     rbx
        ret
         

section '__cstring' align 4

sz_Hello: db 'Hello. This is linked.', 10
sz_Hello_end:

sz_ThreadHello: db 'Hello from thread %d', 10, 0  ; printf needs a zero-terminated string
sz_ThreadHello_end:


segment '__DATA' readable writeable

  section '__data' align 16

mutex:  rb 64    ; pthread_mutex_t occupies 64 bytes on 64-bit macOS

Tomasz Grysztar 23 Aug 2017, 19:38
tthsqe wrote:
So, Tomasz, now I'm wondering why it didn't/doesn't work for you.
Perhaps it depends on the version of macOS? Also, I did not have a proper machine for testing, only a quickly patched up VM that a kind soul on our Discord channel provided.

tthsqe wrote:
- in macros "macro interpreter? path" and "macro uses? lib&", you need an align 8 (not just align 4) at the end so that the size of the command is divisible by 8.
Thanks, a good catch! This is required by the basic specification of the 64-bit variant, not negotiable.
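
To illustrate why the alignment matters, here is a small Python sketch that computes the padded size of an LC_LOAD_DYLINKER command. The function names are hypothetical; the 12-byte fixed part (cmd, cmdsize, name offset) follows the dylinker_command layout, and the path is the one used in the sample.

```python
def align_up(value, alignment):
    # Round value up to the next multiple of alignment (a power of two).
    return (value + alignment - 1) & ~(alignment - 1)

def dylinker_cmdsize(path, alignment=8):
    header = 12                                   # cmd + cmdsize + name offset
    raw = header + len(path.encode("ascii")) + 1  # +1 for the terminating NUL
    return align_up(raw, alignment)

size4 = dylinker_cmdsize('/usr/lib/dyld', 4)
size8 = dylinker_cmdsize('/usr/lib/dyld', 8)
print(size4, size4 % 8)   # padded with "align 4"
print(size8, size8 % 8)   # padded with "align 8"
```

With this particular path, padding to 4 bytes gives a size that is not a multiple of 8, which is the case the "align 8" fix addresses.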

tthsqe wrote:
- make the first command, which sets the zero address to no-read, no-write, and no-execute, not go all of the way to the base address. I'm not sure why you would not want to use the addresses below the base address.
This is something that the programs from the system folders that I analyzed do. My impression was that it is standard practice to extend __PAGEZERO to cover the entire range. The purpose is probably to catch some of the bugs with improperly initialized pointers.
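
For reference, a __PAGEZERO segment of this kind can be sketched as a 64-bit segment command with no file content and no permissions, whose size equals the base address. This Python fragment is only an illustration of the layout, not what the macros emit; `pagezero_command` is a hypothetical name.

```python
import struct

LC_SEGMENT_64 = 0x19

def pagezero_command(base_address):
    # segment_command_64: cmd, cmdsize, segname[16], vmaddr, vmsize,
    # fileoff, filesize, maxprot, initprot, nsects, flags
    return struct.pack(
        "<II16sQQQQiiII",
        LC_SEGMENT_64, 72, b"__PAGEZERO",
        0,             # vmaddr: starts at address zero
        base_address,  # vmsize: covers everything below the base address
        0, 0,          # fileoff, filesize: nothing is loaded from the file
        0, 0,          # maxprot, initprot: no read, write or execute
        0, 0)          # nsects, flags

cmd = pagezero_command(0x100000000)
print(len(cmd), cmd[8:24])
```

Any access through a small or truncated pointer then faults immediately instead of touching valid memory.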

tthsqe wrote:
- In the case where you want two symbols, each called "foo", but coming from different libraries, is the newer system required? Or is it possible to do it with the older system?
It is possible under the old system, too. If you turn on the MH_TWOLEVEL flag in the header, the high 8 bits of the n_desc field of every symbol are interpreted as the number of the library that this symbol resides in. External libraries are numbered from 1, in the order in which they are included in the headers (like what the "uses" macro does). The current variant of the macros does not support this, but it should be easy to make a variant that would.
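
The bit layout can be sketched in Python as follows. The two functions mirror the GET_LIBRARY_ORDINAL and SET_LIBRARY_ORDINAL macros from mach-o/nlist.h; the sample values are made up.

```python
def get_library_ordinal(n_desc):
    # The library number lives in the high 8 bits of the 16-bit n_desc field.
    return (n_desc >> 8) & 0xFF

def set_library_ordinal(n_desc, ordinal):
    # Keep the low 8 bits (reference type and flags), replace the high 8 bits.
    return (n_desc & 0x00FF) | ((ordinal & 0xFF) << 8)

# Two symbols named "foo" could differ only in this ordinal:
foo_from_first = set_library_ordinal(0x0000, 1)
foo_from_second = set_library_ordinal(0x0000, 2)
print(hex(foo_from_first), hex(foo_from_second))
```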

tthsqe wrote:
- when using an imported symbol, it is more minimalistic to use it as "call [printf]" with printf being the location written by the loader. The current ones do "call printf; printf: jmp [_printf_]" and then _printf_ is the one written by the loader. This also means that the execute flag could be removed from the __IMPORT section, which might be the cause of errors from buggy loaders.
The __jump_table section should actually reside in the __TEXT segment. I included it in __IMPORT to make the first draft of the macros a bit simpler. I'm going to correct this in the future.

tthsqe 23 Aug 2017, 22:32
One more thing: if I put a segment without execution privileges before a segment with execution privileges, the command line says
Code:
Killed: 9    

and lldb says:
Code:
error: error: ::posix_spawnp ( pid => 16121, path = './lwan', file_actions = 0x7fff5fbde278, attr = 0x7fff5fbde288, argv = 0x7fc380c75660, envp = 0x7fc380c71ac0 ) err = Malformed Mach-o file (0x00000058)    


so there are still some things to work out. I'm not sure what is wrong with data before code.

Tomasz Grysztar 26 Aug 2017, 09:29
I had another opportunity to test some executables and I think I found the reason why some of my dynamic linking samples did not work: the function pointers must reside within the initialized area of a segment (so I changed their initial values from "?" to "0"). This is probably also the reason why it seemed like the function stubs were necessary (by putting them after the pointers I was making the pointers zero-initialized).
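
One way to read this condition is that the pointer section must lie entirely inside the part of the segment that is backed by file data (the first "filesize" bytes), not in the zero-filled tail. A minimal Python check under that assumption follows; the function name and all addresses are invented for the example.

```python
def section_is_file_backed(seg_vmaddr, seg_filesize, sect_addr, sect_size):
    # True when the whole section falls within the initialized
    # (file-backed) part of the segment.
    return (seg_vmaddr <= sect_addr and
            sect_addr + sect_size <= seg_vmaddr + seg_filesize)

# Segment at 0x2000 with 0x40 bytes of initialized data (made-up numbers).
pointers_initialized = section_is_file_backed(0x2000, 0x40, 0x2000, 16)
pointers_reserved = section_is_file_backed(0x2000, 0x40, 0x2040, 16)
print(pointers_initialized, pointers_reserved)
```

Defining the pointers with "0" rather than "?" moves them from the second situation to the first.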

I also made the macros create function stubs in the __TEXT segment and pointer variables in the __DATA segment - this also demonstrates how to move anything around inside the executable. The sections needed for importing are inserted at the beginning of segments, to avoid unnecessarily filling any of the uninitialized data at the end of a segment. If you would prefer, for example, to move the __stubs section to the end of the __TEXT segment, it is enough to alter this fragment:
Code:
                if ~MachO.__TEXT & MachO.SEGNAME = '__TEXT'
                        MachO.__TEXT = 1
                        MachO.__stubs    
and make it:
Code:
                if ~MachO.__TEXT & MachO.SEGNAME = '__TEXT'
                        MachO.__TEXT = 1
                        macro MachO.close_segment
                                MachO.__stubs
                                purge MachO.close_segment
                                MachO.close_segment    


The packages I posted above have been updated with the new version of macros. I also think that they may have matured enough that I can include them in the official package of fasmg.

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
