flat assembler
Message board for the users of flat assembler.
flat assembler > Unix > Mach-O executables made with fasmg

Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 6518
Location: Kraków, Poland

tthsqe wrote:
But this segment is marked executable. I thought this was a big no-no.

As far as I know, there is nothing in the specifications that forbids giving more permissions to any given segment. This is something I routinely did in other formats, for example to combine all segments into a single, classic "flat" segment for both data and code (including the potential for self-modifying code). But Mach-O loaders tend to be very picky, so you may be right and it might be worth trying some other combinations of sections and permissions. However, as I recall from debugging the 64-bit sample with lldb, the fault was at the (correct) address inside the linked library. The function pointers were resolved and updated correctly, and the address contained some actual code (I recognized the prologue instructions), but it resided in an area of memory with no "executable" permissions; it was treated as imported data.
Post 23 Aug 2017, 07:33
tthsqe



Joined: 20 May 2009
Posts: 697
Ok, so you currently want to direct the loader to update the symbol addresses in the __IMPORT.__nl_symbol_ptr section. Fine. My question is: how are you telling the loader that this is where it should fill in the address? Your LC_DYSYMTAB command gives an INDIRECTSYMOFF, which points us to an array of dwords 0, 1. These are indices of entries in the symbol table. The symbol table looks like


Code:
Symbol table:

  External symbols:
   0 _write   Section 0   Value 0x1010
     External   Undefined   no section
     Reference type   External non lazy,  Flags
   1 _exit    Section 0   Value 0x1016
     External   Undefined   no section
     Reference type   External non lazy,  Flags

  Indirect symbols:
   _write   type 0x1   sect 0   desc 0x0   val 0x1010
   _exit    type 0x1   sect 0   desc 0x0   val 0x1016



These symbols have no section associated with them (I guess section 0 doesn't exist). So at best, there is only the information contained in the value member. These values (0x1010 and 0x1016) point not to the _write2: dq ? and _exit2: dq ? in the code that I posted before, but to the jump instructions themselves ( _write: jmp qword[_write2] and _exit: jmp qword[_exit2])! This looks bad.

Do you expect the loader to rewrite the jmp instructions? If not, then you expect the loader to rewrite the _write2 and _exit2 qwords; how do you expect the loader to know where _write2 and _exit2 are (I called them this way here but they are generated by "sym.ptr" in your macros)?
Post 23 Aug 2017, 10:08
Tomasz Grysztar

tthsqe wrote:
Ok, so you currently want to direct the loader to update the symbol addresses in the __IMPORT.__nl_symbol_ptr section. Fine. My question is: how are you telling the loader that this is where it should fill in the address?

The section that contains the pointer variables is marked as S_NON_LAZY_SYMBOL_POINTERS, and its "reserved1" field is then interpreted as an index into the indirect symbol table (always 0 in my samples). The sequence of consecutive entries in the indirect symbol table, starting at that index, is then mapped to the pointers in the "__nl_symbol_ptr" section.
The indirect symbol table is defined by the "indirectsymoff" and "nindirectsyms" fields in the LC_DYSYMTAB header, and it is simply an array of indexes into the actual symbol table. This way the correspondence between entries in the S_NON_LAZY_SYMBOL_POINTERS section and symbols is established. It should work the same way for S_LAZY_SYMBOL_POINTERS, with a different index into the indirect symbol table in "reserved1".
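
The correspondence described above can be sketched in a few lines of Python. This is only an illustration of the lookup, not the loader's actual code: the function name, its parameters, and the addresses are all made up for the example.

```python
def map_pointer_slots(section_addr, section_size, reserved1,
                      indirect_symbols, symbol_names, pointer_size=8):
    """Pair each pointer slot of a symbol-pointer section with a symbol name."""
    slots = {}
    for i in range(section_size // pointer_size):
        # reserved1 is the index of the section's first indirect symbol entry;
        # each indirect entry is itself an index into the symbol table.
        symbol_index = indirect_symbols[reserved1 + i]
        slots[section_addr + i * pointer_size] = symbol_names[symbol_index]
    return slots

# Hypothetical layout: two 8-byte pointers at 0x2000, indirect table [0, 1].
slots = map_pointer_slots(0x2000, 16, 0, [0, 1], ["_write", "_exit"])
for addr, name in sorted(slots.items()):
    print(hex(addr), name)
```

Note that the "value" field of the symbols plays no part in this lookup; only the position of the slot within the section matters.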


tthsqe wrote:
Do you expect the loader to rewrite the jmp instructions? If not, then you expect the loader to rewrite the _write2 and _exit2 qwords; how do you expect the loader to know where _write2 and _exit2 are (I called them this way here but they are generated by "sym.ptr" in your macros)?

It rewrites the values in the pointer qwords - as I noted above, I have verified it with a debugger and this part works correctly.
Post 23 Aug 2017, 10:46
tthsqe



The issues might be
- "LC_DYLD_INFO_ONLY" command is missing
- "__LINKEDIT" segment is malformed. (The LC_DYLD_INFO_ONLY command should actually point into this.)

Will try to test soon.
Post 23 Aug 2017, 13:50
Tomasz Grysztar

tthsqe wrote:
The issues might be
- "LC_DYLD_INFO_ONLY" command is missing
- "__LINKEDIT" segment is malformed. (The LC_DYLD_INFO_ONLY command should actually point into this.)

LC_DYLD_INFO_[ONLY] is a new format for dynamic linking, which additionally uses some kind of interpreted language to script the entire dynamic linking process. I wanted to get the older (presumably simpler) style of dynamic linking working first - see my second post in this thread. To create an executable with the old style of dynamic linking structures you need to use "-mmacosx-version-min=" flag when compiling/linking.
Post 23 Aug 2017, 14:03
tthsqe



Since you are using the simpler format for dynamic linking, I really couldn't see anything wrong with your macros, which is why I was confused that they didn't work.

However, now that I have access to a machine, I see it might have been an error on my part, as I was setting MachO.Settings.BaseAddress to something much lower, like 0x400000. With 0x100000000, it works! [EDIT: nope! It works with anything, so I really don't know what was going wrong before.] So, Tomasz, now I'm wondering why it didn't/doesn't work for you.

Some possible improvements to the macros (in order of importance):
- in macros "macro interpreter? path" and "macro uses? lib&", you need an align 8 (not just align 4) at the end so that the size of the command is divisible by 8.

- make the first command, the one that sets the zero address to no-read, no-write, and no-execute, not go all of the way to the base address. I'm not sure why you would not want to use the addresses below the base address.

- In the case where you want two symbols, each called "foo", but coming from different libraries, is the newer system required? Or is it possible to do it with the older system?

- when using an imported symbol, it is more minimalistic to use it as "call [printf]" with printf being the location written by the loader. The current ones do "call printf; printf: jmp [_printf_]" and then _printf_ is the one written by the loader. This also means that the execute flag could be removed from the __IMPORT section, which might be the cause of errors from buggy loaders.

- the program at the end works as expected when I change the align 4 to align 8 in your uses and interpreter macros. However, when I run it in a debugger, it complains about overlapping sections. Might be something to look into.

Code:
$ lldb ./demo
(lldb) target create "./demo"
Current executable set to './demo' (x86_64).
(lldb) run
Process 1002 launched: './demo' (x86_64)
warning: (x86_64) ./demo(0x0000000100000000) address 0x0000000100000000 maps to more than one section: demo.__TEXT and demo.__TEXT
warning: (x86_64) ./demo(0x0000000100000000) address 0x0000000100001000 maps to more than one section: demo.__DATA and demo.__DATA
warning: (x86_64) ./demo(0x0000000100000000) address 0x0000000100002000 maps to more than one section: demo.__IMPORT and demo.__IMPORT
Hello. This is linked.
Hello from thread 10
Hello from thread 9
Hello from thread 8
Hello from thread 7
Hello from thread 6
Hello from thread 5
Hello from thread 4
Hello from thread 3
Hello from thread 2
Hello from thread 1
Hello from thread 0
Process 1002 exited with status = 0 (0x00000000)
(lldb) quit



The program:

Code:

include 'x86/include/x64.inc'
use64

MachO.Settings.ProcessorType equ CPU_TYPE_X86_64
MachO.Settings.BaseAddress = 0x100000000 ; there is a no-write, no-read segment below this

include 'x86/macinc/macho.inc'
interpreter '/usr/lib/dyld'
uses '/usr/lib/libSystem.B.dylib' (1.0.0, 1225.0.0)
import exit,'_exit'
import pthread_create,'_pthread_create'
import pthread_mutex_init,'_pthread_mutex_init'
import pthread_mutex_destroy,'_pthread_mutex_destroy'
import pthread_mutex_lock,'_pthread_mutex_lock'
import pthread_mutex_unlock,'_pthread_mutex_unlock'
import pthread_join,'_pthread_join'
import printf,'_printf'
import write,'_write'


segment '__TEXT' readable executable

  section '__text' align 16

entry Start
Start:
        and     rsp, -16                ; align the stack to 16 bytes
        sub     rsp, 8*32               ; room for the thread handles

        mov     edi, 1                  ; fd 1 = stdout
        lea     rsi, [sz_Hello]
        mov     edx, sz_Hello_end - sz_Hello
        call    write

        lea     rdi, [mutex]
        xor     esi, esi                ; default mutex attributes
        call    pthread_mutex_init

        mov     ebx, 10
.CreateNext:
        lea     rdi, [rsp+8*rbx]        ; thread handle
        xor     esi, esi                ; default thread attributes
        mov     ecx, ebx                ; parameter to pass
        lea     rdx, [Thread_Routine]
        call    pthread_create
        sub     ebx, 1
        jns     .CreateNext

        mov     ebx, 10
.JoinNext:
        mov     rdi, qword[rsp+8*rbx]
        xor     esi, esi                ; the thread's return value is not needed
        call    pthread_join
        sub     ebx, 1
        jns     .JoinNext

        lea     rdi, [mutex]
        call    pthread_mutex_destroy

        mov     rdi, rax                ; exit code = result of pthread_mutex_destroy
        call    exit
        int3

Thread_Routine:
        push    rbx
        mov     ebx, edi                ; parameter passed

        lea     rdi, [mutex]
        call    pthread_mutex_lock

        lea     rdi, [sz_ThreadHello]
        mov     esi, ebx
        xor     eax, eax                ; variadic call: no vector registers used
        call    printf

        lea     rdi, [mutex]
        call    pthread_mutex_unlock

        pop     rbx
        ret


  section '__cstring' align 4

sz_Hello: db 'Hello. This is linked.', 10
sz_Hello_end:

sz_ThreadHello: db 'Hello from thread %d', 10, 0
sz_ThreadHello_end:


segment '__DATA' readable writeable

  section '__data' align 16

mutex:  rb 64                           ; sizeof(pthread_mutex_t) on 64-bit macOS

Post 23 Aug 2017, 18:15
Tomasz Grysztar

tthsqe wrote:
So, Tomasz, now I'm wondering why it didn't/doesn't work for you.

Perhaps it might depend on the version number of MacOS? Also, I did not have a proper machine for testing, only a quickly patched up VM that a kind soul on our Discord channel provided.


tthsqe wrote:
- in macros "macro interpreter? path" and "macro uses? lib&", you need an align 8 (not just align 4) at the end so that the size of the command is divisible by 8.

Thanks, a good catch! This is required by the basic specification of 64-bit variant, not negotiable.
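
For illustration, here is the padding arithmetic in Python. The helper name is made up; the 12-byte fixed part is the cmd, cmdsize, and name-offset fields of the LC_LOAD_DYLINKER command, followed by the zero-terminated path.

```python
def padded_command_size(fixed_size, string, alignment=8):
    """Size of a load command made of a fixed header plus a NUL-terminated string."""
    raw = fixed_size + len(string.encode("ascii")) + 1  # +1 for the terminating zero
    # Round up to the next multiple of the alignment.
    return (raw + alignment - 1) // alignment * alignment

path = "/usr/lib/dyld"
size_align4 = padded_command_size(12, path, alignment=4)
size_align8 = padded_command_size(12, path, alignment=8)
print(size_align4, size_align4 % 8)
print(size_align8, size_align8 % 8)
```

With this particular path the raw size is 26 bytes, so "align 4" pads it to 28, which is not divisible by 8, while "align 8" pads it to 32.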


tthsqe wrote:
- make the first command, the one that sets the zero address to no-read, no-write, and no-execute, not go all of the way to the base address. I'm not sure why you would not want to use the addresses below the base address.

This is something that the programs from system folders that I analyzed do. My impression was that it is standard practice to extend __PAGEZERO to cover the entire range. The purpose is probably to catch some of the bugs with improperly initialized pointers.


tthsqe wrote:
- In the case where you want two symbols, each called "foo", but coming from different libraries, is the newer system required? Or is it possible to do it with the older system?

It is possible under the old system, too. If you turn on the MH_TWOLEVEL flag in the header, the high 8 bits of the n_desc field of every symbol are interpreted as the number of the library that this symbol resides in. External libraries are numbered from 1, in the order in which they are included in the headers (like what the "uses" macro does). The current variant of the macros does not support this, but it should be easy to make a variant that would.
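
As a sketch of the bit layout only (the macros do not do this yet), the library number can be packed into and read back from n_desc like this. The Python function names are made up for the example, and the two "foo" symbols are hypothetical.

```python
def set_library_ordinal(n_desc, ordinal):
    """Store the library number in the high 8 bits of a 16-bit n_desc value."""
    return (n_desc & 0x00FF) | ((ordinal & 0xFF) << 8)

def get_library_ordinal(n_desc):
    """Read the library number back from the high 8 bits of n_desc."""
    return (n_desc >> 8) & 0xFF

# Two hypothetical symbols with the same name, taken from the first and
# the second library listed in the headers.
foo_from_first = set_library_ordinal(0x0000, 1)
foo_from_second = set_library_ordinal(0x0000, 2)
print(hex(foo_from_first), hex(foo_from_second))
```

The low 8 bits of n_desc are left untouched, so any other flags stored there are preserved.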


tthsqe wrote:
- when using an imported symbol, it is more minimalistic to use it as "call [printf]" with printf being the location written by the loader. The current ones do "call printf; printf: jmp [_printf_]" and then _printf_ is the one written by the loader. This also means that the execute flag could be removed from the __IMPORT section, which might be the cause of errors from buggy loaders.

The __jump_table section should actually reside in the __TEXT segment. I included it in __IMPORT to make the first draft of the macros a bit simpler. I'm going to correct this in the future.
Post 23 Aug 2017, 19:38
tthsqe



One more thing: if I put a segment without execution privileges before a segment with execution privileges, cmd line says

Code:
Killed: 9


and lldb says:

Code:
error: error: ::posix_spawnp ( pid => 16121, path = './lwan', file_actions = 0x7fff5fbde278, attr = 0x7fff5fbde288, argv = 0x7fc380c75660, envp = 0x7fc380c71ac0 ) err = Malformed Mach-o file (0x00000058)



so there are still some things to work out. I'm not sure what is wrong with data before code.
Post 23 Aug 2017, 22:32
Tomasz Grysztar
I had another opportunity to test some executables and I think I found the reason why some of my dynamic linking samples did not work: the function pointers must reside within the initialized area of a segment (so I changed their initial values from "?" to "0"). This is probably also the reason why it seemed like the function stubs were necessary (by putting them after the pointers I was making the pointers zero-initialized).
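
In header terms, the "initialized area" presumably corresponds to the part of a segment that is backed by file contents (the "filesize" field of the segment command), as opposed to the remainder up to "vmsize". A small Python check under that assumption; the helper name and all addresses are made up:

```python
def in_initialized_area(addr, size, seg_vmaddr, seg_filesize):
    """True if [addr, addr+size) lies in the file-backed part of the segment."""
    return seg_vmaddr <= addr and addr + size <= seg_vmaddr + seg_filesize

# Hypothetical segment at 0x100002000 with 0x10 bytes of initialized data,
# the rest of its vmsize being uninitialized.
seg_vmaddr, seg_filesize = 0x100002000, 0x10

pointer_defined_with_zero = in_initialized_area(0x100002008, 8, seg_vmaddr, seg_filesize)
pointer_defined_with_question_mark = in_initialized_area(0x100002010, 8, seg_vmaddr, seg_filesize)
print(pointer_defined_with_zero, pointer_defined_with_question_mark)
```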

I also made the macros create function stubs in the __TEXT segment and pointer variables in the __DATA segment - this also demonstrates how to move anything around inside the executable. The sections needed for importing are inserted at the beginning of segments, to avoid unnecessarily filling any of the uninitialized data at the end of a segment. If you would prefer, for example, to move the __stubs section to the end of the __TEXT segment, it is enough to alter this fragment:

Code:
                if ~MachO.__TEXT & MachO.SEGNAME = '__TEXT'
                        MachO.__TEXT = 1
                        MachO.__stubs

and make it:

Code:
                if ~MachO.__TEXT & MachO.SEGNAME = '__TEXT'
                        MachO.__TEXT = 1
                        macro MachO.close_segment
                                MachO.__stubs
                                purge MachO.close_segment
                                MachO.close_segment



The packages I posted above have been updated with the new version of macros. I also think that they may have matured enough that I can include them in the official package of fasmg.
Post 26 Aug 2017, 09:29
Copyright © 2004-2016, Tomasz Grysztar.