flat assembler
Message board for the users of flat assembler.
Learning binary file formats (work in progress)
Tomasz Grysztar 15 Jul 2018, 11:48
This is an unfinished book-like tutorial on the executable/object formats, written with heavy assistance from fasmg. I started with PE and I plan to follow up with ELF (including object varieties), then perhaps Mach-O and possibly some others. While writing the main text I have also been posting little sidenotes and outtakes, now gathered elsewhere.
Introduction
  Getting started with fasmg
Chapter 1 PE (Portable Executable)
  1.1 Building a simple program
  1.2 Adding relocations
  1.3 Making a library
  1.4 Embedding resources
  1.5 Moving to 64 bits with PE+
  1.6 Experimenting further
Chapter 2 ELF (Executable and Linkable Format)
  2.1 A minimal executable file
To be continued...
Last edited by Tomasz Grysztar on 17 Feb 2022, 11:16; edited 5 times in total |
Tomasz Grysztar 02 Aug 2018, 02:22
Introduction
Getting started with fasmg
To learn the inner workings of binary files we are going to construct them manually, with the help of fasmg. This is a command line tool that takes the source text, which is like a script that defines how to assemble the binary file out of its components (down to bytes or even individual bits), and saves the produced file under a given name:
Code:
fasmg source.asm output.bin
The source contains a series of commands, each on its own line of text. One of the basic instructions is DB, which defines data as a series of bytes:
Code:
db 49,50,51
The definitions of data can use units larger than a byte. Among the other available instructions there is DW to define 16-bit (2-byte) "words", DD for 32-bit (4-byte) "double words" and DQ for 64-bit (8-byte) "quad words". They all store values as little-endian (there are easy methods to define big-endian data too, but we are not going to need them here). Data can also be defined as a string of bytes copied as-is from the source text. Such a sequence of characters needs to be enclosed in either single or double quotes:
Code:
db 'Hi!'
Code:
db 3 dup '!'
Every definition of data has an address assigned, starting from zero. Data can be given a name, by writing it before the DB or other similar instruction as a so-called label:
Code:
digits db 49,50,51,52,53,54,55,56,57
Code:
dd digits
What makes the assembler especially useful is that we can define various things out of order and fasmg is going to compute and put the right values in the right places, like:
Code:
dd null - digits
digits db 49,50,51,52,53,54,55,56,57
null db 0
A label can also be created without a data definition on the same line, in such case the name needs to be followed by a colon:
Code:
dd eof
db 'Hello!'
eof:
Code:
dd length
digits db 49,50,51,52,53,54,55,56,57
null db 0
length := null - digits
Code:
dd length
digits db 49,50,51,52,53,54,55,56,57
length := $ - digits
db 0
Various portions of executable files may end up loaded at different addresses in memory. The instruction ORG allows us to change the assumed address for the data definitions that follow, without altering the position in file. This changes the value of $ (the current address) and the values of all labels defined after this point. Since this decouples $ from the position within the generated file, there is another special name, $%, that always equals the position in file regardless of the assumed address.
Code:
org 0x100
start:
offset = $%
dd start ; equals 0x100 (256)
dd offset ; equals 0
Data can be defined with ? in place of its value. This creates so-called reserved bytes, which are stripped when they are at the end of file but not when they are followed by regular data.
Code:
db ? ; this one is not cut
db 'a'
db ? ; this one is cut
The SECTION instruction in the language of fasmg is very similar to ORG, except that it strips all the reserved bytes that were defined just before it, similarly to how they are normally stripped at the end of file. This is going to become useful when the file we make needs to contain sections that are larger in memory than in the file, with additional bytes reserved at the end of each section. In addition to the special symbols $ and $% there is also $%%. It equals the current offset in file and unlike $% it does not count the reserved bytes (which may end up discarded):
Code:
dd SIZE_IN_FILE
dd SIZE_IN_MEMORY
section 0x1000
ADDRESS_IN_MEMORY = $
OFFSET_IN_FILE = $%
db 'example'
db 0x30 dup ?
SIZE_IN_MEMORY := $ - ADDRESS_IN_MEMORY
SIZE_IN_FILE := $%% - OFFSET_IN_FILE
Code:
include 'listing.inc'
Code:
fasmg source.asm output.bin -i "include 'listing.inc'"
Code:
fasmg source.asm output.bin -i include\ 'listing.inc'
Code:
[0000000000000000] 00000000: 07 00 00 00 dd SIZE_IN_FILE
[0000000000000004] 00000004: 37 00 00 00 dd SIZE_IN_MEMORY
[0000000000000008] section 0x1000
[0000000000001000] ADDRESS_IN_MEMORY = $
[0000000000001000] OFFSET_IN_FILE = $%
[0000000000001000] 00000008: 65 78 61 6D 70 6C 65 db 'example'
[0000000000001007] db 0x30 dup ?
[0000000000001037] SIZE_IN_MEMORY := $ - ADDRESS_IN_MEMORY
[0000000000001037] SIZE_IN_FILE := $%% - OFFSET_IN_FILE
When the assembly scripts we write become more complex, we may notice some repeating patterns of commands that would become much more pleasant if we could replace them with specialized instructions. To help there, the assembler allows us to define macro-instructions. This way we can create a new command, named however we wish, and make it execute a customized sequence of instructions every time it is encountered. For example:
Code:
macro distanceto address
    dd address - $
end macro
Code:
distanceto digits
Code:
dd digits - $
Code:
macro align? pow2*,value:?
    db (-$) and (pow2-1) dup value
end macro
Code:
ALIGN 4
Code:
db (-$) and (4-1) dup ?
We are going to learn more about the language of fasmg when a need arises, but at the moment we know enough to start making some files.
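The expression used by the ALIGN macro, (-$) and (pow2-1), may look cryptic at first. The following is a small Python sketch of the same arithmetic, not part of the fasmg sources; the function names are made up for this illustration. It relies on the alignment being a power of two, just like the macro does.

```python
def padding_to_align(address: int, pow2: int) -> int:
    """Number of bytes needed to advance `address` to a multiple of `pow2`."""
    if pow2 <= 0 or pow2 & (pow2 - 1):
        raise ValueError("alignment must be a power of two")
    # Same computation as "(-$) and (pow2-1)" in the ALIGN macro:
    # negate the address and keep only the low bits.
    return (-address) & (pow2 - 1)


def align_up(address: int, pow2: int) -> int:
    """Address reached after emitting the padding."""
    return address + padding_to_align(address, pow2)


for address in (0, 1, 3, 4, 0x3D):
    print(hex(address), "needs", padding_to_align(address, 4), "byte(s) of padding")
```

An address that is already aligned needs no padding, which is why "align 16" right after a structure that ends on a multiple of 16 generates nothing.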
Tomasz Grysztar 04 Aug 2018, 23:44
Chapter 1
PE (Portable Executable) The road we are going to take is to learn inner workings of file formats by constructing some files from scratch. This approach is focused on experimentation, so we will use samples designed in a way that encourages playing with them and learning through direct experience. The first file we construct is going to be an executable for Windows operating system, in the format called Portable Executable. PE was designed in 1993 for Windows NT (the first 32-bit system in the family), and has been used from then on by the 32-bit and 64-bit implementations of Windows. Subsequently it has been adopted for some other uses, like EFI, but at this time we are going to focus on its original environment. 1.1 Building a simple program Before we go on, a few preparations. We should take the ALIGN macro we discussed earlier, it is going to become useful quite soon. We may also need to create some machine code for the actual program inside our executable and for this we need to include an instruction set for a processor architecture we need to work with. Our first choice is going to be very traditional, the 32-bit x86 architecture, so we include an instruction set for processors compatible with 80386: Code: include '80386.inc'
use32 A term that often pops up when discussing PE files is the program image. This refers to the layout of the program after it is loaded into memory to be executed, which is not necessarily the same as the structure of the program in the file on disk. The executable needs to define a mapping of sections of the file onto the corresponding areas in memory. Nevertheless, both the file on disk and image in memory start the same way - with the headers. These structures from the beginning of file become the initial portion of loaded program, at the address called the base of the image. All the other sections created in memory have to be placed after that. Any PE executable is constructed with an assumed value for the base of the image, for 32-bit programs this is usually 0x400000. We are going to define a constant with this value and use it as the base for our labels: Code: IMAGE_BASE := 0x400000 org IMAGE_BASE The next two constants choose the alignment settings for the disk and for the memory. This is one of the sources of discrepancy between the layouts of the file and of the image. Code: FILE_ALIGNMENT := 512 SECTION_ALIGNMENT := 4096 In memory, the sections are aligned to the size of page (which is 4096 bytes in the basic setup of x86 CPU). This is partly because memory can be allocated only in such increments, but also because different sections may require distinct attributes for the memory (like write-protection) and CPU can have them set up only for entire pages at once. These constants are better left with the standard values. While it is possible to tweak them in such way that it should still be possible for the operating system to construct the image, the loader may distrust and refuse to load an executable with a non-standard layout. There are also some additional constraints if chosen alignment in memory is smaller than the size of page (we may get back to it later). It is time to start writing the headers. 
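To see how the two alignment settings make the file and the image diverge, here is a rough Python sketch (again outside of fasmg). The header size and the section sizes passed at the bottom are made-up illustrative numbers, and the helper names are hypothetical; only the two alignment constants come from the text above.

```python
FILE_ALIGNMENT = 512
SECTION_ALIGNMENT = 4096


def align_up(value: int, alignment: int) -> int:
    """Round `value` up to the nearest multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment


def layout(headers_size: int, section_sizes: list) -> list:
    """Return (rva, offset_in_file, size_in_file) for each section in order."""
    rva = align_up(headers_size, SECTION_ALIGNMENT)    # position in the image
    offset = align_up(headers_size, FILE_ALIGNMENT)    # position in the file
    placed = []
    for size in section_sizes:
        size_in_file = align_up(size, FILE_ALIGNMENT)
        placed.append((rva, offset, size_in_file))
        rva = align_up(rva + size, SECTION_ALIGNMENT)
        offset += size_in_file
    return placed


# Illustrative sizes only: 392 bytes of headers, two small sections.
for rva, offset, size_in_file in layout(392, [32, 160]):
    print(f"RVA {rva:#x}  file offset {offset:#x}  size in file {size_in_file:#x}")
```

With these numbers the sections sit 4096 bytes apart in memory but only 512 bytes apart in the file, which is the discrepancy mentioned above.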
The very first bytes of the file are usually a unique signature of the format, but in the case of PE the matter is a bit more complicated. At the time when the PE format was designed DOS was still a popular operating system and many of the new formats - like NE (16-bit format used by the earliest versions of Windows), LE (used by OS/2, but also by drivers in Windows 9x) and finally PE - were based on the old MZ format used for the .EXE files in DOS. All these formats were made in such a way that the initial portion of the file was a valid MZ program that could be executed by DOS. Usually it was a tiny program that just displayed a message like "This program cannot be run in DOS mode". This small program was called a stub and its MZ header was extended to contain a special field, ignored by older software, containing the offset of the actual new executable header later in the file. This way it was even possible to have an executable that would contain two versions of the same software - one for DOS and one for Windows. This was not a usual thing to do, though. Mostly, the stub programs just informed in one way or another that the file was not intended to be run from DOS. Nowadays we do not need to worry much about someone mistakenly trying to execute our PE file in DOS, therefore we are going to make a minimal stub - not a real program, just something that resembles one enough for our PE executable to be valid:
Code:
Stub:
  .Signature dw "MZ"
  .BytesInLastSector dw SIZE_OF_STUB mod 512
  .NumberOfSectors dw (SIZE_OF_STUB-1)/512 + 1
  .NumberOfRelocations dw 0
  .NumberOfHeaderParagraphs dw SIZE_OF_STUB_HEADER / 16
  db 0x3C - ($-Stub) dup 0
  .NewHeaderOffset dd Header-IMAGE_BASE
  align 16
  SIZE_OF_STUB_HEADER := $ - Stub
  ; The code of a DOS program would go here.
  SIZE_OF_STUB := $ - Stub
We compute the offset of the main PE header by subtracting IMAGE_BASE from its address (available through a label that we are going to define below).
For all the headers there is such a clear correspondence between addresses in the image and positions in the file. We also fill a couple of fields in the MZ header that are crucial for its integrity, namely the size of the header and of the entire program. The header is measured in 16-byte units (in DOS they were called paragraphs) and the "align 16" is there to make sure that this is a multiple of 16 (though in this case nothing needs to be done, the position immediately after the NewHeaderOffset is 64). The size of a DOS program is given as a count of 512-byte sectors, but the last one of them is allowed to be not fully filled and BytesInLastSector gives the number of bytes in it. On a side note, when a label starts with a dot, it belongs to the namespace of the regular label that preceded it. The labels defined here could be accessed from elsewhere with identifiers like "Stub.Signature" or "Stub.NewHeaderOffset". With the stub ready, we can move on to the main header, which is where the actual PE signature is going to be. This header must be aligned on an 8-byte boundary, hence we put an "align 8" here, though it again does nothing (but if we had put a real DOS program above, the position in file might have been misaligned).
Code:
align 8
Header:
  .Signature dw "PE",0
  .Machine dw 0x14C ; IMAGE_FILE_MACHINE_I386
  .NumberOfSections dw NUMBER_OF_SECTIONS
  .TimeDateStamp dd %t
  .PointerToSymbolTable dd 0
  .NumberOfSymbols dd 0
  .SizeOfOptionalHeader dw SectionTable - OptionalHeader
  .Characteristics dw 0x102 ; IMAGE_FILE_32BIT_MACHINE + IMAGE_FILE_EXECUTABLE_IMAGE
According to the plan, the first example is going to be for the 32-bit mode of an x86 CPU and we state this in the Machine field, but also by including the IMAGE_FILE_32BIT_MACHINE value in the Characteristics. The latter field is a set of flags and there is another one that we unquestionably need there - IMAGE_FILE_EXECUTABLE_IMAGE tells that the file contains executable code.
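As a cross-check of the Header block, the same 24 bytes can be produced and decoded with a few lines of Python. This is only a sketch for experimenting, not something the tutorial depends on: the function names are invented here, and the value 224 passed as the size of the optional header is an assumption (it is what a 32-bit optional header with 16 RVA/size pairs adds up to).

```python
import struct

# 4-byte signature followed by the seven fields of the file header, little-endian.
FILE_HEADER = struct.Struct("<4sHHIIIHH")

IMAGE_FILE_MACHINE_I386 = 0x14C
IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002
IMAGE_FILE_32BIT_MACHINE = 0x0100


def pack_file_header(number_of_sections, timestamp, optional_header_size):
    """Build the bytes corresponding to the Header block of the tutorial."""
    characteristics = IMAGE_FILE_32BIT_MACHINE | IMAGE_FILE_EXECUTABLE_IMAGE
    return FILE_HEADER.pack(
        b"PE\0\0", IMAGE_FILE_MACHINE_I386, number_of_sections, timestamp,
        0, 0,                      # PointerToSymbolTable, NumberOfSymbols
        optional_header_size, characteristics)


def describe_file_header(raw):
    """Decode the fields again and split Characteristics into its two flags."""
    signature, machine, sections, timestamp, _, _, optional_size, flags = \
        FILE_HEADER.unpack(raw)
    return {
        "signature": signature,
        "machine": machine,
        "sections": sections,
        "timestamp": timestamp,
        "optional_header_size": optional_size,
        "executable": bool(flags & IMAGE_FILE_EXECUTABLE_IMAGE),
        "is_32bit": bool(flags & IMAGE_FILE_32BIT_MACHINE),
    }


header = pack_file_header(2, 0, 224)   # 224 is an assumed optional header size
print(len(header), describe_file_header(header))
```

The constant 0x102 from the source is simply the sum of the two flag values, 0x100 and 0x002.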
PE is closely related to COFF, which is a format of object files that are created by compilers as an intermediate stage before they are finally linked to create code that can be executed. These two formats have mostly identical headers (except for the PE signature, which is missing in COFF) and they share the values of various constants. The value of IMAGE_FILE_EXECUTABLE_IMAGE has been used by COFF to distinguish the object files from the executable ones (when we later talk about the ELF format, which superseded COFF on the Unix systems, we are going to see that it has similar variants). In NumberOfSections we need to state how many sections we plan to create. We do not know that yet, but we can use the name of a constant that we define later with the right value. TimeDateStamp needs to tell when the file was created, in the "seconds since Unix epoch" format. A special symbol %t is provided by fasmg with such a value. PointerToSymbolTable and NumberOfSymbols are another relic of the COFF format. They are not used in PE and we just fill them with zeros. After the main header comes the so-called "optional header". This name is also a legacy of COFF, as this structure contains crucial information about the entry point of the executable code and is definitely required for any PE image. It was only optional in COFF, when the file could be an intermediate object, not yet made into an executable. The optional header follows immediately after the main one and is in turn followed by the section table. Thus to obtain the size that we need to put in SizeOfOptionalHeader we just compute the difference between the OptionalHeader and SectionTable addresses.
Code:
OptionalHeader:
  .Magic dw 0x10B
  .MajorLinkerVersion db 0
  .MinorLinkerVersion db 0
  .SizeOfCode dd 0
  .SizeOfInitializedData dd 0
  .SizeOfUninitializedData dd 0
  .AddressOfEntryPoint dd EntryPoint-IMAGE_BASE
  .BaseOfCode dd 0
  .BaseOfData dd 0
The value of Magic identifies a variant of PE format.
For classic 32-bit PE it is always 0x10B (a ZMAGIC value which COFF inherited from the old a.out format), while 0x20B is used to mark PE+ files, a variety intended mainly for 64-bit architectures. They differ slightly in the format of the structures that follow; we are going to look at these differences later, when we create a 64-bit executable. Of the other fields in this initial portion of the "optional" header the only important one is AddressOfEntryPoint, which should contain the address of the entry point relative to the base of the image. The specification calls this kind of value an RVA (Relative Virtual Address), while VA (Virtual Address) is just a direct address in memory. To compute an RVA we simply subtract IMAGE_BASE from the address (VA). The EntryPoint label is going to be defined later, in the code of our program. MajorLinkerVersion and MinorLinkerVersion are filled by a linker when it creates the executable, which allows the linker to put some mark of authorship on the executable. We are not a linker, so we can decide for ourselves what kind of mark to leave there. A simple choice is just zeros. The other fields, like SizeOfCode and BaseOfCode, are remnants of the original COFF model (which in turn inherited them from the old a.out) and they do not really matter to the PE loader. Various kinds of code and data sections may be intermixed within the image and the true authority on their sizes and placement is held by the section table. The fields here are just supplementary information and, for instance, if there were several sections of data with some code in-between, the sum of their sizes would serve only a statistical role. If we wanted to be pedantic about it, we could fill these fields with values copied from our section table, but for now we just leave them zeroed.
An additional sign of the irrelevancy of these numbers is that in PE+ the entire BaseOfData field is readily sacrificed to allow the subsequent ImageBase field to be enlarged to 64-bit without moving the later ones. Code: .ImageBase dd IMAGE_BASE .SectionAlignment dd SECTION_ALIGNMENT .FileAlignment dd FILE_ALIGNMENT .MajorOperatingSystemVersion dw 3 .MinorOperatingSystemVersion dw 10 .MajorImageVersion dw 0 .MinorImageVersion dw 0 .MajorSubsystemVersion dw 3 .MinorSubsystemVersion dw 10 .Win32VersionValue dd 0 .SizeOfImage dd SIZE_OF_IMAGE .SizeOfHeaders dd SIZE_OF_HEADERS .CheckSum dd 0 .Subsystem dw 2 ; IMAGE_SUBSYSTEM_WINDOWS_GUI .DllCharacteristics dw 0 .SizeOfStackReserve dd 4096 .SizeOfStackCommit dd 4096 .SizeOfHeapReserve dd 65536 .SizeOfHeapCommit dd 0 .LoaderFlags dd 0 .NumberOfRvaAndSizes dd NUMBER_OF_RVA_AND_SIZES In contrast, this part of headers holds many important values. All the constants we defined earlier - the base of the image and the alignment values - are stored here exactly as they are. We also use two constants we have not yet defined to fill SizeOfImage and SizeOfHeaders, we are going to calculate these values later. MajorOperatingSystemVersion together with MinorOperatingSystemVersion as well as MajorSubsystemVersion with MinorSubsystemVersion declare what version of operating system is needed to execute the image. Programs created for older versions are allowed to run on the newer ones, and this example is not going to use any features that were not in Windows since the beginning, so to not unnecessarily limit the execution of program we put 3.10 there (this is the version number of first Windows NT that supported PE format). MajorImageVersion and MinorImageVersion could indicate the version of our program, but they are usually unused. And Win32VersionValue is just a reserved field, with currently unknown purpose; it needs to be zero. The same goes for LoaderFlags further below. 
CheckSum is a value computed over all the bytes of the executable that can be used to check whether the file has been modified in any way since the time when it was calculated. Normal programs are not required to have a valid checksum, so in this example we are going to skip this step. But even when we plan to compute the checksum, the value of this field should not partake in the summation so it is better to have it initially zeroed. Subsystem identifies the environment where the program wants to be run. For normal applications this is either GUI or console. DllCharacteristics is an additional set of flags supplementary to Characteristics in the main header. This is another case of a misnomer, the flags here are not necessarily related to whether the file is a DLL. Nevertheless, at the moment we do not need to set any of them. SizeOfStackReserve and SizeOfStackCommit set up the size of stack for our program, the former states how large the stack is allowed to become, while the latter determines the initial size. We go with a single page for both. SizeOfHeapReserve and SizeOfHeapCommit provide similar settings for the local heap, which is a pool from which program may allocate small blocks of memory whenever needed. We set up some usual values, though we are not going to use heap in our simple program. Finally, NumberOfRvaAndSizes specifies how many pairs consisting of a relative address and a size follow immediately after. This forms a sort of catalogue of specialized data structures present in the image. 
They come in a fixed order, as follows:
Code:
RvaAndSizes:
  .Export.Rva dd 0
  .Export.Size dd 0
  .Import.Rva dd ImportTable-IMAGE_BASE
  .Import.Size dd ImportTable.End-ImportTable
  .Resource.Rva dd 0
  .Resource.Size dd 0
  .Exception.Rva dd 0
  .Exception.Size dd 0
  .Certificate.Rva dd 0
  .Certificate.Size dd 0
  .BaseRelocation.Rva dd 0
  .BaseRelocation.Size dd 0
  .Debug.Rva dd 0
  .Debug.Size dd 0
  .Architecture.Rva dd 0
  .Architecture.Size dd 0
  .GlobalPtr.Rva dd 0
  .GlobalPtr.Size dd 0
  .TLS.Rva dd 0
  .TLS.Size dd 0
  .LoadConfig.Rva dd 0
  .LoadConfig.Size dd 0
  .BoundImport.Rva dd 0
  .BoundImport.Size dd 0
  .IAT.Rva dd 0
  .IAT.Size dd 0
  .DelayImport.Rva dd 0
  .DelayImport.Size dd 0
  .COMPlus.Rva dd 0
  .COMPlus.Size dd 0
  .Reserved.Rva dd 0
  .Reserved.Size dd 0
Here the optional header ends, immediately followed by the section table - a crucial component of the headers.
Code:
SectionTable:
  .1.Name dq +'.text'
  .1.VirtualSize dd Section.1.End - Section.1
  .1.VirtualAddress dd Section.1 - IMAGE_BASE
  .1.SizeOfRawData dd Section.1.SIZE_IN_FILE
  .1.PointerToRawData dd Section.1.OFFSET_IN_FILE
  .1.PointerToRelocations dd 0
  .1.PointerToLineNumbers dd 0
  .1.NumberOfRelocations dw 0
  .1.NumberOfLineNumbers dw 0
  .1.Characteristics dd 0x60000000 ; IMAGE_SCN_MEM_EXECUTE + IMAGE_SCN_MEM_READ
  .2.Name dq +'.rdata'
  .2.VirtualSize dd Section.2.End - Section.2
  .2.VirtualAddress dd Section.2 - IMAGE_BASE
  .2.SizeOfRawData dd Section.2.SIZE_IN_FILE
  .2.PointerToRawData dd Section.2.OFFSET_IN_FILE
  .2.PointerToRelocations dd 0
  .2.PointerToLineNumbers dd 0
  .2.NumberOfRelocations dw 0
  .2.NumberOfLineNumbers dw 0
  .2.Characteristics dd 0x40000000 ; IMAGE_SCN_MEM_READ
SectionTable.End:
The name of the section is stored in an 8-byte field, padded with zeros. We use DQ to define this as a 64-bit value and convert the string to a number with the + operator, in order to enable a range check. A DQ with a string argument would allow text of any length and it would simply pad it so that the size was a multiple of 8 bytes.
By converting text to a number we ensure that it has to fit in a single 64-bit cell so the field is always exactly 8 bytes long. VirtualAddress and VirtualSize define the boundaries of a section within the image in memory. The starting address needs to be set up consistently with the SectionAlignment, we need to keep this in mind later when we define the labels used here. PointerToRawData and SizeOfRawData define the placement of the contents of a section within the file. Both values have to be aligned accordingly to the FileAlignment, so it is possible for section's data in file to be larger than the size of that section in memory. It can also be the other way around, since a section may reserve more memory than it contains actual data. In an extreme case the size in file might be 0 when a section contains nothing but reserved memory. We are going to compute the constants used there with help of the $% symbol, after ensuring the proper alignment within the file. The fields that refer to relocations and line numbers are in these structures because COFF objects use them, but for PE images they should be zeroed. Although PE could contain some relocations, they would be very different from the ones used by COFF and defined elsewhere (we are going to discuss them a bit later, the first example can work without them). Characteristics contain various flags, here we just mark both sections as a readable memory and the code section as executable. These settings translate directly into the attributes of allocated memory pages, so they are quite important. We could also use values like IMAGE_SCN_CNT_CODE and IMAGE_SCN_CNT_INITIALIZED_DATA (connected to the fields like SizeOfCode and SizeOfInitializedData in the main header), but this would mostly be just decorative. The end of the section table is also the end of the contents of the headers. Before we go further, we are going to fill up a few of the related constants. 
They are a bit redundant, the effect would be the same if we plugged the corresponding expressions directly in the places where we used their names earlier. But the use of middlemen constants helps to comfortably alter the way they are computed when this comes up in the future. Code: NUMBER_OF_RVA_AND_SIZES := (SectionTable-RvaAndSizes)/8 NUMBER_OF_SECTIONS := (SectionTable.End-SectionTable)/40 SIZE_OF_HEADERS := Section.1.OFFSET_IN_FILE As for the total size of headers, it has to be rounded up to the nearest multiple of FILE_ALIGNMENT, and this is at the same time the position where the contents of the initial section is going to begin. Therefore we can cheat a little and shift the responsibility to another constant, the one defining the offset in file for the first section. However, to correctly position our initial section we need to do some actual work. Code: align SECTION_ALIGNMENT
Section.1:
Code:
section $%%
align FILE_ALIGNMENT,0
Section.1.OFFSET_IN_FILE:
With the use of $%% as an argument to SECTION we temporarily switch from in-memory addressing to one tracing the actual position in file. This makes the address $ equal to the offset $% until we change this with another SECTION or ORG. After that we use the alignment macro once more, this time to align the offset in file to the nearest multiple of FILE_ALIGNMENT. While the previous alignment just moved our address in memory without adding anything to the file, this time we provide the second argument to the macro to make it write the necessary number of zeroed bytes to the output. Then Section.1.OFFSET_IN_FILE can be defined simply as a label, thanks to the address being the same as the position in file. Finally we switch back to in-memory addressing, at the address of the Section.1 label. A simple ORG would suffice, but we use SECTION for the visual appeal:
Code:
section Section.1
EntryPoint:
  push 0
  push CaptionString
  push MessageString
  push 0
  call [MessageBoxA]
  push 0
  call [ExitProcess]
Section.1.End:
Now we need to perform the full alignment ritual again, this time to set up the position of the second section. We also calculate the size of the first one in file simply by computing the difference between the aligned offsets.
Code:
align SECTION_ALIGNMENT
Section.2:
section $%%
align FILE_ALIGNMENT,0
Section.1.SIZE_IN_FILE := $ - Section.1.OFFSET_IN_FILE
Section.2.OFFSET_IN_FILE:
Code:
section Section.2
We start with the import table, which allows us to direct the loader to fill up our pointers with the addresses of the functions from system DLL files. This is actually a complex structure that consists of several smaller tables. First, there is an Import Directory Table.
Code: ImportTable: .1.ImportLookupTableRva dd KernelLookupTable-IMAGE_BASE .1.TimeDateStamp dd 0 .1.ForwarderChain dd 0 .1.NameRva dd KernelDLLName-IMAGE_BASE .1.ImportAddressTableRva dd KernelAddressTable-IMAGE_BASE .2.ImportLookupTableRva dd UserLookupTable-IMAGE_BASE .2.TimeDateStamp dd 0 .2.ForwarderChain dd 0 .2.NameRva dd UserDLLName-IMAGE_BASE .2.ImportAddressTableRva dd UserAddressTable-IMAGE_BASE dd 0,0,0,0,0 NameRva is a relative address of the name of DLL file. We are going to put these names near the end of the import-related data. ImportLookupTableRva and ImportAddressTableRva point to two parallel tables. The former contains relative addresses of structures declaring functions to be imported, while the latter is going to contain actual addresses of imported functions. The functions can be in any order, as long as the same one is used for both tables. When our image is loaded into memory, the operating system is going to look for all the functions defined by the first table and fill the second one with corresponding addresses. TimeDateStamp and ForwarderChain fields are used when the imports are bound - that is, when the second table is pre-filled with addresses of imported functions to save time when loading the image. This obviously can work correctly only when all the addresses in imported library are exactly as they were upon binding, and TimeDateStamp keeps the value of the timestamp of the DLL to provide a way to verify that it is exactly the same file. If the timestamps match, the loader can skip looking up all the functions, otherwise it does it as usual. Our imports are not bound, we need the loader to fill the addresses for us, therefore we keep TimeDateStamp zeroed in every case. If the imports were bound, ForwarderChain would be interpreted as an index of a function that could not be bound because it was a forwarded import from another DLL. 
The value of the corresponding entry in the import address table would be an index of another such function, and so on. If we wanted to indicate that there were no such functions, we should put -1 in this field, but since we do not use binding (as indicated by the zeroed TimeDateStamp) this value is irrelevant. Now we need to create lookup tables and address tables for every DLL. The initial contents of the parallel tables should be the same: both should contain relative addresses of the lookup entries defining the functions. When the image is loaded, the IAT (import address table) is rewritten with the matching addresses. We can then use these values directly, therefore we label them with the names of the functions and this is exactly what is needed to get the CALL instructions in our code to work.
Code:
KernelLookupTable:
  dd ExitProcessLookup-IMAGE_BASE
  dd 0
KernelAddressTable:
  ExitProcess dd ExitProcessLookup-IMAGE_BASE ; this is going to be replaced with the address of the function
  dd 0
UserLookupTable:
  dd MessageBoxALookup-IMAGE_BASE
  dd 0
UserAddressTable:
  MessageBoxA dd MessageBoxALookup-IMAGE_BASE ; this is going to be replaced with the address of the function
  dd 0
We import only one function from each DLL, so the tables are short. The end of a table is marked by a zeroed entry. Next come the lookup definitions for individual functions. Each such structure contains a 16-bit hint followed by the name of the function as a null-terminated string. The hint is an index into the export table of the DLL, where the loader may look for the function with such a name. If the hint fails, the loader continues to search for the function as usual, thus we do not have to know the right values to put there.
Code:
ExitProcessLookup:
  .Hint dw 0
  .Name db 'ExitProcess',0
align 2
MessageBoxALookup:
  .Hint dw 0
  .Name db 'MessageBoxA',0
Finally, we conclude the import table with the names of the DLL files that we import from. They are plain null-terminated strings.
Code: KernelDLLName db 'KERNEL32.DLL',0 UserDLLName db 'USER32.DLL',0 ImportTable.End: Code: CaptionString db "PE tutorial",0 MessageString db "I am alive and well!",0 Section.2.End: Code: align SECTION_ALIGNMENT SIZE_OF_IMAGE := $ - IMAGE_BASE section $%% align FILE_ALIGNMENT,0 Section.2.SIZE_IN_FILE := $ - Section.2.OFFSET_IN_FILE This is it, the source for our first PE image is ready (a copy is in the attached "basic.asm" file). We can now assemble it into a file with the "exe" extension and let it run. We can also combine it with the "listing.inc" script to contemplate the binary data juxtaposed with the commands that generated it. You may notice that numerous lines from "80386.inc" show up in the listing. To get rid of them, we can hide the included file inside a simple macro: Code: macro use? file* include file end macro use '80386.inc' use32 Code: use 'ntimage.inc' Code: .Characteristics dw IMAGE_FILE_32BIT_MACHINE + IMAGE_FILE_EXECUTABLE_IMAGE Code: .DllCharacteristics dw IMAGE_DLLCHARACTERISTICS_NX_COMPAT It was a first step towards making our source more maintainable. Another one could be to automate some of the tasks. For example, we can generate all the entries in the section table with a simple repetition: Code: SectionTable: repeat NUMBER_OF_SECTIONS, n:1 .n.Name dq Section.n.NAME .n.VirtualSize dd Section.n.End - Section.n .n.VirtualAddress dd Section.n - IMAGE_BASE .n.SizeOfRawData dd Section.n.SIZE_IN_FILE .n.PointerToRawData dd Section.n.OFFSET_IN_FILE .n.PointerToRelocations dd 0 .n.PointerToLineNumbers dd 0 .n.NumberOfRelocations dw 0 .n.NumberOfLineNumbers dw 0 .n.Characteristics dd Section.n.CHARACTERISTICS end repeat SectionTable.End: This approach requires that we define several more constants. 
We also have to change how the NUMBER_OF_SECTIONS is defined, we can no longer compute it from the size of the section table, as this would create a circular dependence:

Code:
NUMBER_OF_SECTIONS := 2
Section.1.NAME := +'.text'
Section.1.CHARACTERISTICS := IMAGE_SCN_MEM_EXECUTE + IMAGE_SCN_MEM_READ
Section.2.NAME := +'.rdata'
Section.2.CHARACTERISTICS := IMAGE_SCN_MEM_READ

Code:
CURRENT_SECTION = 0
macro section? name*, characteristics:0
    CURRENT_SECTION = CURRENT_SECTION + 1
    repeat 1, new:CURRENT_SECTION, previous:CURRENT_SECTION-1
        Section.previous.End:
        align SECTION_ALIGNMENT
        Section.new.NAME := +name
        Section.new.CHARACTERISTICS := characteristics
        Section.new:
        section $%%
        align FILE_ALIGNMENT,0
        if previous > 0
            Section.previous.SIZE_IN_FILE := $ - Section.previous.OFFSET_IN_FILE
        end if
        Section.new.OFFSET_IN_FILE:
        org Section.new
    end repeat
end macro

To define labels and constants that correspond to enumerated section entries, we need to extract the number from the CURRENT_SECTION variable and somehow place it into names. The trick in fasmg is to use REPEAT with just a single repetition, solely for the purpose of defining counters that get replaced with numbers before the repeated text is assembled.

The macro does everything that we have previously done manually when starting a new section. The ending address and the size in file get defined only when the next section is started, so we need to define an additional false (not counted into the total number) section at the end, together with the definition of the NUMBER_OF_SECTIONS and the SIZE_OF_IMAGE.
Code:
postpone
    NUMBER_OF_SECTIONS := CURRENT_SECTION
    section ''
    SIZE_OF_IMAGE := $ - IMAGE_BASE
end postpone

This macro required us to learn a bit more of the assembler's trickery, but it makes the section definitions much more pleasant to the eye:

Code:
section '.text', IMAGE_SCN_MEM_EXECUTE + IMAGE_SCN_MEM_READ

    EntryPoint:
        push 0
        push CaptionString
        push MessageString
        push 0
        call [MessageBoxA]
        push 0
        call [ExitProcess]

section '.rdata', IMAGE_SCN_MEM_READ

    ImportTable:
        .1.ImportLookupTableRva dd KernelLookupTable-IMAGE_BASE
        .1.TimeDateStamp dd 0
        .1.ForwarderChain dd 0
        .1.NameRva dd KernelDLLName-IMAGE_BASE
        .1.ImportAddressTableRva dd KernelAddressTable-IMAGE_BASE
        .2.ImportLookupTableRva dd UserLookupTable-IMAGE_BASE
        .2.TimeDateStamp dd 0
        .2.ForwarderChain dd 0
        .2.NameRva dd UserDLLName-IMAGE_BASE
        .2.ImportAddressTableRva dd UserAddressTable-IMAGE_BASE
        dd 0,0,0,0,0
    KernelLookupTable:
        dd ExitProcessLookup-IMAGE_BASE
        dd 0
    KernelAddressTable:
        ExitProcess dd ExitProcessLookup-IMAGE_BASE ; this is going to be replaced with the address of the function
        dd 0
    UserLookupTable:
        dd MessageBoxALookup-IMAGE_BASE
        dd 0
    UserAddressTable:
        MessageBoxA dd MessageBoxALookup-IMAGE_BASE ; this is going to be replaced with the address of the function
        dd 0
    align 2
    ExitProcessLookup:
        .Hint dw 0
        .Name db 'ExitProcess',0
    align 2
    MessageBoxALookup:
        .Hint dw 0
        .Name db 'MessageBoxA',0
    KernelDLLName db 'KERNEL32.DLL',0
    UserDLLName db 'USER32.DLL',0
    ImportTable.End:

    CaptionString db "PE tutorial",0
    MessageString db "I am alive and well!",0

Code:
iterate name, Export, Import, Resource, Exception, Certificate, BaseRelocation, Debug, Architecture, GlobalPtr, TLS, LoadConfig, BoundImport, IAT, DelayImport, COMPlus, Reserved
    if defined name#Table
        .name.Rva dd name#Table-IMAGE_BASE
        .name.Size dd name#Table.End-name#Table
    else
        .name.Rva dd 0
        .name.Size dd 0
    end if
end iterate

A variant of the first source that has all these improvements is in the attached "basic_template.asm" file.
We are going to use it as a base for the continued experiments.
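The single-repetition REPEAT trick used by the SECTION macro can also be tried in isolation. Below is a minimal sketch that should assemble on its own with plain fasmg. The macro name "item" and the symbols ITEM_COUNT and Item.n are made up for this illustration, they are not part of the template.

```asm
; standalone sketch, hypothetical names (not part of basic_template.asm)
ITEM_COUNT = 0

macro item? value*
    ITEM_COUNT = ITEM_COUNT + 1
    repeat 1, n:ITEM_COUNT      ; "n" is replaced with the current number
        Item.n := value         ; defines Item.1, Item.2, and so on
    end repeat
end macro

item 10
item 20

assert Item.1 = 10 & Item.2 = 20    ; every call created its own constant
db Item.1, Item.2                   ; stores the two collected values as bytes
```

Without the REPEAT wrapper there would be no way to turn the value of ITEM_COUNT into a part of a symbol name, since a variable is never substituted into the text of a line the way a parameter is.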
Tomasz Grysztar 08 Aug 2018, 09:03
1.3 Making a library
We already know how to import functions from a DLL, now it is time to make our own one. The relocatable image that we have just prepared should be our starting point. It is easy for libraries to have clashing addresses, even if we try to come up with a unique one, therefore having a relocation table is practically mandatory. What distinguishes a DLL is a single bit in Characteristics, with a self-explanatory name:

Code:
.Characteristics dw IMAGE_FILE_32BIT_MACHINE + IMAGE_FILE_EXECUTABLE_IMAGE + IMAGE_FILE_DLL

Let us rewrite the '.text' section then.

Code:
section '.text', IMAGE_SCN_MEM_EXECUTE + IMAGE_SCN_MEM_READ

    EntryPoint:
        mov eax,1
        ret 12

What comes next is the function that we are going to export.

Code:
    ShowOff:
        push 0
        push CaptionString
        push MessageString
        push 0
        call [MessageBoxA]
        ret

The function is ready, but now we need to construct an export table for our DLL. We can put it in the '.rdata' section, just like the import table, we only need to make sure that the starting address is aligned. A table that lies right at the start of the section inherits a nice alignment, but when we put one after some other data, we had better use ALIGN to keep the address round enough.

Code:
    align 4
    ExportTable:
        .ExportFlags dd 0
        .TimeDateStamp dd %t
        .MajorVersion dw 0
        .MinorVersion dw 0
        .NameRva dd LibraryName-IMAGE_BASE
        .OrdinalBase dd 1
        .AddressTableEntries dd 1
        .NumberOfNamePointers dd 1
        .ExportAddressTableRva dd ExportAddressTable-IMAGE_BASE
        .NamePointerRva dd ExportNamePointerTable-IMAGE_BASE
        .OrdinalTableRva dd ExportOrdinalTable-IMAGE_BASE

ExportFlags is an unused field and should always be zero. TimeDateStamp is the value used to verify whether the bound imports refer to the same version of the library, as mentioned earlier. Nowadays ASLR makes bound imports mostly obsolete and we are not going to use them, but we put a good timestamp there just in case someone wanted to try binding to our library.
MajorVersion and MinorVersion we can set to any values we wish, they are mostly irrelevant. NameRva points to the name of the library, a string that we are going to define near the end of the table to not mess up the alignment.

To understand the purpose of OrdinalBase we need to discuss one thing about the import table that got omitted earlier. It is possible to specify a function to import not by the name, but by an ordinal number which identifies a record within the Export Address Table. The addresses (and therefore the functions they point to) are numbered starting from the value of OrdinalBase. A usual choice of this offset is 1, then the first address in the table has ordinal 1, the second one - ordinal 2, and so on. An entry in Import Lookup Table that would normally contain an RVA may be marked as containing an ordinal number by having the highest bit set. Its value is then 0x80000000 plus the ordinal of a function to import. However, this is a discouraged method. Different versions of a library may not have the same numbering of the functions unless someone paid close attention to it. Therefore we are not going to use this technique and the value of OrdinalBase has no other uses.

AddressTableEntries gives the number of addresses in EAT, which has a relative address specified in ExportAddressTableRva. NumberOfNamePointers defines the number of entries in two other tables, pointed to by NamePointerRva and OrdinalTableRva. These tables run in parallel, the records of the first one point to the names of the functions while the second one defines corresponding indexes in EAT. The latter are 16-bit numbers that count the records in EAT starting from zero and do not depend on the contents of OrdinalBase.

EAT can have a different length than the other two tables. It is possible to have addresses that have no associated name and such functions could be imported only through their ordinal numbers.
On the other hand it is also possible to have multiple names pointing to the same entry in EAT so a single function can have several aliases. In this example we define just one function with one name, thus we put 1 everywhere.

Code:
    ExportAddressTable:
        dd ShowOff-IMAGE_BASE

Code:
    ExportNamePointerTable:
        dd ShowOff.Name-IMAGE_BASE
    ExportOrdinalTable:
        dw 0

All that remains is to define the needed strings, at this point we no longer have to worry about them causing any misalignment.

Code:
    LibraryName db 'LIBRARY.DLL',0
    ShowOff.Name db 'ShowOff',0
    ExportTable.End:

Code:
    MessageString db "This is a message from the depth of the library.",0

But to test the library, we also need to make a program that uses it. We should once more copy our template and modify the contents of the '.text' section to make it call the function from our DLL:

Code:
    EntryPoint:
        call [ShowOff]
        push 0
        call [ExitProcess]

Code:
macro import? items&
    align 4
    ImportTable:
    iterate item, items
        match name.=DLL?, item
            .name.ImportLookupTableRva dd ImportLookupTable.name-IMAGE_BASE
            .name.TimeDateStamp dd 0
            .name.ForwarderChain dd 0
            .name.NameRva dd ImportLibraryName.name-IMAGE_BASE
            .name.ImportAddressTableRva dd ImportAddressTable.name-IMAGE_BASE
        else if % = 1
            err 'please start with a name of a DLL'
        end if
    end iterate
    dd 0,0,0,0
    iterate item, items
        match name.=DLL?, item
            dd 0
            ImportLookupTable.name:
        else
            dd ImportLookup.item-IMAGE_BASE
        end match
    end iterate
    iterate item, items
        match name.=DLL?, item
            dd 0
            ImportAddressTable.name:
        else
            item dd ImportLookup.item-IMAGE_BASE
        end match
    end iterate
    dd 0
    iterate item, items
        match name.=DLL?, item
            ImportLibraryName.name db `item,0
        else
            align 2
            ImportLookup.item:
            dw 0
            db `item,0
        end match
    end iterate
    ImportTable.End:
end macro

A few new tricks have been used here. When a pattern given to MATCH contains a name that is not preceded by equality sign, it is a wildcard matching any non-empty text.
Moreover, that name is then replaced with the corresponding text everywhere inside the MATCH block. In this case "name" becomes a parameter containing the name of the library without the extension. The "dll" text is matched literally because of the equality sign, though the trailing question mark makes this a case-insensitive requirement. The macro should not allow function names to be given without a library being defined first, so if the initial item does not match the pattern, an error is signalled. The number of the processed item is taken from the special counter %, which is available in any repeating block.

The first loop looks only at the names of the libraries and defines Import Directory Table with records for each one of them. The table should end with five zeroed fields, but it only makes four of them. This is a little tricky - in the following loops every time a new ILT or IAT is started, it generates a zeroed field to close the previous one. The very first time it happens, that zero becomes the missing fifth one to complete the Import Directory Table.

When it comes to defining the names of libraries and functions, the backquote is used to make the text of a parameter into a string.

The internal workings of the macro are normally hidden when we generate listing, but there is a way to prioritize the expansion of the macro and make the lines it generates show up. An exclamation mark after the name gives the macro such priority:

Code:
macro import?! items&

With the assistance of the macro we can now define an entire '.rdata' section for our testing program in just a couple of lines:

Code:
section '.rdata', IMAGE_SCN_MEM_READ

    import LIBRARY.DLL, ShowOff, \
           KERNEL32.DLL, ExitProcess

We can now assemble the program and the library, keeping in mind that the latter should be written into a file named "library.dll". The attached files "library.asm" and "library_user.asm" contain sources prepared according to the above process.
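To see how the MATCH pattern and the backquote used in the IMPORT macro behave, they can be tried in a tiny standalone source. This is only a sketch, the macro name "classify" is made up for the illustration, and it uses DISPLAY to show the results on the console instead of generating any data.

```asm
; standalone sketch; "classify" is a hypothetical macro name
macro classify? item*
    match name.=DLL?, item
        ; "name" holds the text before the dot, "item" still holds everything
        display 'library ', `name, ' (given as ', `item, ')', 10
    else
        display 'function ', `item, 10
    end match
end macro

classify KERNEL32.DLL   ; matches the pattern
classify user32.dll     ; also matches, =DLL? is case-insensitive
classify ExitProcess    ; no dot followed by DLL, handled by ELSE
```

The first two invocations should be reported as libraries and the third one as a function, which is the same decision that IMPORT makes for every item on its list.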
If we keep experimenting, we might want to create libraries with more functions. This is a good excuse to create another macro, we should not have to concern ourselves with manual creation of the export tables.

Code:
macro export? library*,functions&
    align 4
    iterate name, functions
        EXPORT_ADDRESS.% := name-IMAGE_BASE
        EXPORT_NAME.% = `name
        EXPORT_ORDINAL.% = %-1
        EXPORT_COUNT = %%
    end iterate

We define numbered constants with values that should go into appropriate sub-tables. We need them in this form, because we may have to shuffle some around, remembering that the table of names must have them ordered lexically. We also define EXPORT_COUNT value using a special symbol %%, which is a companion to % and is the total number of repetitions. The same definition is redone with every iteration, but this is quite harmless. Actually, that %% could be replaced with % to the same effect, as only the last assigned value counts.

Code:
    D = EXPORT_COUNT
    while D > 1
        D = D shr 1
        repeat EXPORT_COUNT-D
            X = D+%
            while X-D > 0
                repeat 1, i:X-D, j:X
                    if lengthof EXPORT_NAME.i > lengthof EXPORT_NAME.j
                        S = lengthof EXPORT_NAME.i
                    else
                        S = lengthof EXPORT_NAME.j
                    end if
                    if EXPORT_NAME.i bswap S > EXPORT_NAME.j bswap S
                        T = EXPORT_NAME.i
                        EXPORT_NAME.i = EXPORT_NAME.j
                        EXPORT_NAME.j = T
                        T = EXPORT_ORDINAL.i
                        EXPORT_ORDINAL.i = EXPORT_ORDINAL.j
                        EXPORT_ORDINAL.j = T
                    else
                        break
                    end if
                end repeat
                X = X-D
            end while
        end repeat
    end while

While the outer layers should more or less speak for themselves, the interior is a bit convoluted because of the peculiarities of the language. The innermost REPEAT is not a real loop but an idiomatic expression that we have already seen before. It makes a text substitution that replaces "i" and "j" with numbers computed from associated expressions.

The assembler performs comparisons numerically and the strings are converted into numbers using the little-endian encoding - the first byte of the text is the least significant byte of the number.
Therefore to compare texts lexically we need to reverse the order of bytes in corresponding numbers and this is done with BSWAP. The size of the reversed numeric data should be the same for both strings, because then compared values have the same position of the most significant byte (which contains what was the first byte of the text). The greater of the two lengths becomes the chosen size, since the numbers must be large enough to contain the strings.

When the list of names has some of its values swapped during the sorting, the corresponding ordinal numbers are exchanged as well, while the list of addresses remains unaltered. This way every function keeps the ordinal number it was assigned based on the order of names given to the macro.

Code:
    ExportTable:
        .ExportFlags dd 0
        .TimeDateStamp dd %t
        .MajorVersion dw 0
        .MinorVersion dw 0
        .NameRva dd ExportLibraryName-IMAGE_BASE
        .OrdinalBase dd 1
        .AddressTableEntries dd EXPORT_COUNT
        .NumberOfNamePointers dd EXPORT_COUNT
        .ExportAddressTableRva dd ExportAddressTable-IMAGE_BASE
        .NamePointerRva dd ExportNamePointerTable-IMAGE_BASE
        .OrdinalTableRva dd ExportOrdinalTable-IMAGE_BASE
    ExportAddressTable:
        repeat EXPORT_COUNT
            dd EXPORT_ADDRESS.%
        end repeat
    ExportNamePointerTable:
        repeat EXPORT_COUNT
            dd ExportName.%-IMAGE_BASE
        end repeat
    ExportOrdinalTable:
        repeat EXPORT_COUNT
            dw EXPORT_ORDINAL.%
        end repeat
    ExportLibraryName db `library,0
    repeat EXPORT_COUNT
        ExportName.% db EXPORT_NAME.%,0
    end repeat
    ExportTable.End:
end macro

With the macro defined this way, the export table of our library could be defined simply as:

Code:
    export LIBRARY.DLL, ShowOff
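The effect of BSWAP on the comparison of names, as used in the sorting loop of the EXPORT macro, can be checked with two short strings. This is a standalone sketch with made-up variables A, B and S, relying only on the operators already used in the macro.

```asm
; standalone sketch, hypothetical names
A = 'ab'
B = 'b'
S = 2                           ; the greater of the two lengths

assert lengthof A = 2 & lengthof B = 1
assert A > B                    ; 0x6261 > 0x62, the plain comparison puts 'ab' after 'b'
assert A bswap S < B bswap S    ; 0x6162 < 0x6200, lexical order puts 'ab' before 'b'
```

The second assertion shows why the size has to be shared by both operands. The shorter string is padded to two bytes before the reversal, so its first character lands in the same most significant byte as the first character of the longer one.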
Tomasz Grysztar 08 Aug 2018, 09:18
1.4 Embedding resources
The resources are like small files enclosed within the executable. They contain supplementary data of various kinds, often in formats that could just as well function in separate files. Plain resource tables that were present in some older formats (like NE used by 16-bit Windows) in case of PE were replaced with a sorted tree structure. This allows resources to be found efficiently when needed, but it also means that we need to do a little more work to construct a resource directory in our image.

While the structure of the resource tree was designed in a way that could allow arbitrarily many levels, in practice there are always exactly three. On the initial level there are branches for different types of data, on the second level they are split according to the identifiers of individual resources. The final level allows each of the resources to have versions in different languages.

To start with a tree that is as simple as possible, we are going to define a solitary resource that does not need translations into other languages. Each level is then going to be a single list with just one entry.

The type of resource that we are going to define first is the manifest file. This is an XML that is read by the system when it loads the image and it allows us to specify various requirements that the program may have. We need to be able to immediately see the result of embedding it in our PE, therefore we are going to use this one:

Code:
<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<assembly xmlns='urn:schemas-microsoft-com:asm.v1' manifestVersion='1.0'>
  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
    <security>
      <requestedPrivileges>
        <requestedExecutionLevel level='requireAdministrator' uiAccess='false' />
      </requestedPrivileges>
    </security>
  </trustInfo>
</assembly>

The attached "manifest.xml" contains the text of this manifest, we are going to insert the contents of this file directly into our image.
We also need to include an additional header that gives names to standard resource types and identifiers:

Code:
use 'winuser.inc'

All the data associated with resources is usually stored in the ".rsrc" section. Thanks to our macros we can add such section to our image easily:

Code:
section '.rsrc', IMAGE_SCN_MEM_READ

Code:
    ResourceTable:
        .Characteristics dd 0
        .TimeDateStamp dd 0
        .MajorVersion dw 0
        .MinorVersion dw 0
        .NumberOfNameEntries dw 0
        .NumberOfIdEntries dw 1
        .1.Id dd RT_MANIFEST
        .1.Offset dd 0x80000000 + ResourceDirectory_Manifest-ResourceTable

Characteristics was intended to hold some flags, but it has never been used for anything and should always be zero. TimeDateStamp should be the time when the resource was created by the compiler. We could put the value of %t here, but by zeroing it we can emphasize that we are not really a resource compiler and this structure is not made by one. MajorVersion and MinorVersion as usual can be set to whatever we want.

NumberOfNameEntries and NumberOfIdEntries define how many entries are in the directory, their total number is the sum of these two values. The table that follows should list named entries first, then ones with numeric identifiers. Both sub-lists should be sorted, but in our first sample every one of them is going to contain just a single element, so we do not have to worry about the ordering.

The root directory lists the types of resources and all standard ones are identified by numbers, therefore we use only the second kind of an entry. The only element of the table points to a subdirectory containing RT_MANIFEST resources. The highest bit of Offset field indicates that this is a relative address of a subdirectory and not a data entry (this is what allows in theory to terminate the tree at any level). The offset given in the lower bits is not an RVA, but a distance from the beginning of resource table.
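The encoding of the Offset field can be verified with a reduced model of the directory. The sketch below is standalone and uses made-up labels Table and Subdirectory in place of the real ones. It reads the generated field back with LOAD and separates the flag from the distance.

```asm
; standalone sketch, hypothetical labels
Table:                          ; stands in for ResourceTable
    db 16 dup 0                 ; directory header: 4+4+2+2+2+2 bytes
    dd 24                       ; Id of the only entry (24 is the value of RT_MANIFEST)
    dd 0x80000000 + Subdirectory-Table
Subdirectory:
    db 16 dup 0                 ; header of the next level

load OFFSET_FIELD:dword from Table+20   ; read back the Offset field
assert OFFSET_FIELD shr 31 = 1          ; highest bit set, so this is a subdirectory
assert OFFSET_FIELD and 0x7FFFFFFF = Subdirectory-Table
```

Note that neither IMAGE_BASE nor any RVA appears here, the lower bits only measure the distance between two places inside the resource data.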
Next we define a subdirectory listing all the resources of this type:

Code:
    ResourceDirectory_Manifest:
        .Characteristics dd 0
        .TimeDateStamp dd 0
        .MajorVersion dw 0
        .MinorVersion dw 0
        .NumberOfNameEntries dw 0
        .NumberOfIdEntries dw 1
        .1.Id dd CREATEPROCESS_MANIFEST_RESOURCE_ID
        .1.Offset dd 0x80000000 + ResourceDirectory_Manifest_CreateProcess-ResourceTable

We point to another subdirectory, to reach the required three levels:

Code:
    ResourceDirectory_Manifest_CreateProcess:
        .Characteristics dd 0
        .TimeDateStamp dd 0
        .MajorVersion dw 0
        .MinorVersion dw 0
        .NumberOfNameEntries dw 0
        .NumberOfIdEntries dw 1
        .1.Id dd 0 ; LANG_NEUTRAL + SUBLANG_NEUTRAL
        .1.Offset dd ResourceDataEntry_Manifest_CreateProcess-ResourceTable

The highest bit of Offset is cleared, because this time it does not point to a subdirectory, but to a data entry.

Code:
    ResourceDataEntry_Manifest_CreateProcess:
        .DataRva dd Manifest_CreateProcess-IMAGE_BASE
        .Size dd Manifest_CreateProcess.End-Manifest_CreateProcess
        .Codepage dd 65001 ; UTF-8
        .Reserved dd 0

What remains is to provide the data we just pointed to. We can use FILE to insert the content of a file directly into assembled output instead of having to define it with commands like DB:

Code:
    Manifest_CreateProcess:
        file 'manifest.xml'
        .End:

    ResourceTable.End:

Before we move on to experiment with other types of resources, we had better prepare another macro to make the construction of resource directories less tedious. To be able to freely define contents of individual resources, we need a macro that can incorporate blocks of definitions. To generate the same resource section as in the basic example we would be using syntax like:

Code:
    resource_table
        resource RT_MANIFEST, CREATEPROCESS_MANIFEST_RESOURCE_ID
            file 'manifest.xml'
        end resource
    end resource_table

Code:
macro resource_table?
    RC_INDEX = 0
    macro resource? type*,id*,lang:0
        RC_INDEX = RC_INDEX + 1
        repeat 1, i:RC_INDEX
            RC_TYPE.i := type
            RC_ID.i := id
            RC_LANG.i := lang
            RC_RVA.i := $-IMAGE_BASE
            macro end?.resource?
                RC_SIZE.i := $-IMAGE_BASE-RC_RVA.i
                purge end?.resource?
            end macro
        end repeat
    end macro

The inner macro in turn defines another one, specialized to end the definition of the started resource. When a macro is defined in the namespace of case-insensitive END symbol, it can be invoked by an END command looking similar to how the blocks of assembly commands are usually closed. In simple terms this means that such macro can be called by putting a space instead of a dot between the END and the specific name. Because "i" is replaced with a number before the text inside REPEAT block is interpreted, the innermost macro has a text tailored to close the definition of one specific resource. To ensure that it cannot be executed more than once, the macro removes its own definition with PURGE.

Similarly, the main macro also defines another one that is going to end the entire table:

Code:
    macro end?.resource_table?
        ResourceTable.End:
        RESOURCE_COUNT := RC_INDEX
        purge resource?, end?.resource_table?

There is more to do at the end of the table, though.
Code:
        repeat RESOURCE_COUNT
            RC_ORDER.% = %
        end repeat
        D = RESOURCE_COUNT
        while D > 1
            D = D shr 1
            repeat RESOURCE_COUNT-D
                X = D+%
                while X-D > 0
                    repeat 1, x_d:X-D, x:X
                        repeat 1, i:RC_ORDER.x_d, j:RC_ORDER.x
                            if RC_TYPE.i > RC_TYPE.j |\
                               (RC_TYPE.i = RC_TYPE.j & RC_ID.i > RC_ID.j ) |\
                               (RC_TYPE.i = RC_TYPE.j & RC_ID.i = RC_ID.j & RC_LANG.i > RC_LANG.j)
                                RC_ORDER.x_d = j
                                RC_ORDER.x = i
                            else
                                break
                            end if
                        end repeat
                    end repeat
                    X = X-D
                end while
            end repeat
        end while
        repeat RESOURCE_COUNT, i:1
            repeat 1, n:RC_ORDER.%
                RESOURCE_TYPE.i := RC_TYPE.n
                RESOURCE_ID.i := RC_ID.n
                RESOURCE_LANG.i := RC_LANG.n
                RESOURCE_RVA.i := RC_RVA.n
                RESOURCE_SIZE.i := RC_SIZE.n
            end repeat
        end repeat
    end macro

If we had some resources identified by names instead of numbers, a simple comparison clause as above would not suffice to sort them correctly. Therefore we continue with the assumption that all identifiers are numeric. The additions needed to make the macro properly handle named resources would only make it harder to follow at the moment.

Back to the main macro, we can now use the values collected and sorted by inner macros to construct the tree. We still need a bit of preparation, though.

Code:
    TYPE_COUNT = 0
    repeat RESOURCE_COUNT
        if % = 1 | RESOURCE_TYPE.% <> TYPE
            TYPE = RESOURCE_TYPE.%
            TYPE_COUNT = TYPE_COUNT + 1
            ID = RESOURCE_ID.%
            ID_FIRST = %
            ID_COUNT.% = 1
            LANG_FIRST = %
            LANG_COUNT.% = 1
        else
            if RESOURCE_ID.% <> ID
                ID = RESOURCE_ID.%
                repeat 1, first:ID_FIRST
                    ID_COUNT.first = ID_COUNT.first + 1
                end repeat
                LANG_FIRST = %
                LANG_COUNT.% = 1
            else
                repeat 1, first:LANG_FIRST
                    LANG_COUNT.first = LANG_COUNT.first + 1
                end repeat
            end if
        end if
    end repeat

First, it increments TYPE_COUNT for every new type of resource it encounters, so this variable ends up holding the total number of types. Second, it counts how many different identifiers there are for a given type and keeps this value in ID_COUNT at a position being the number of the first entry with such type.
Finally, it counts how many entries there are that have the same type and identifier, storing this value in LANG_COUNT at a position being the number of the first such entry.

With all that sorting and counting done, the construction of resource directories is going to be pretty straightforward.

Code:
    align 4
    ResourceTable:
        .Characteristics dd 0
        .TimeDateStamp dd 0
        .MajorVersion dw 0
        .MinorVersion dw 0
        .NumberOfNameEntries dw 0
        .NumberOfIdEntries dw TYPE_COUNT
        repeat RESOURCE_COUNT
            if defined ID_COUNT.%
                dd RESOURCE_TYPE.%
                dd 0x80000000 + ResourceTypeDirectory#%-ResourceTable
            end if
        end repeat

Code:
    repeat RESOURCE_COUNT
        if defined ID_COUNT.%
            ResourceTypeDirectory#%:
                .Characteristics dd 0
                .TimeDateStamp dd 0
                .MajorVersion dw 0
                .MinorVersion dw 0
                .NumberOfNameEntries dw 0
                .NumberOfIdEntries dw ID_COUNT.%
        end if
        if defined LANG_COUNT.%
            dd RESOURCE_ID.%
            dd 0x80000000 + ResourceIdDirectory#%-ResourceTable
        end if
    end repeat

Code:
    repeat RESOURCE_COUNT
        if defined LANG_COUNT.%
            ResourceIdDirectory#%:
                .Characteristics dd 0
                .TimeDateStamp dd 0
                .MajorVersion dw 0
                .MinorVersion dw 0
                .NumberOfNameEntries dw 0
                .NumberOfIdEntries dw LANG_COUNT.%
        end if
        dd RESOURCE_LANG.%
        dd ResourceDataEntry#%-ResourceTable
    end repeat

Code:
    repeat RESOURCE_COUNT
        ResourceDataEntry#%:
            .DataRva dd RESOURCE_RVA.%
            .Size dd RESOURCE_SIZE.%
            .Codepage dd 65001
            .Reserved dd 0
    end repeat
end macro

With the complexities of sorted directories out of the way, we can now focus more on the contents of resources. We are going to construct a simple icon for our program.

An icon is defined through a resource of type RT_GROUP_ICON, which is a table that lists variants of the same image that differ in color depth or resolution. The actual images are stored in RT_ICON resources, each in its own one. It is a bit as if there was another level of the tree, a version of an icon chosen according to the language may in turn have multiple variants suitable for different display modes.
But this arrangement predates the PE format and the tree structure of resource directories. Every image associated with an icon is a separate resource with its own identifier.

The identifier of an icon as seen through the API is the identifier of an RT_GROUP_ICON resource. We can choose an identifier freely. When Windows looks for an icon to represent the program file, it simply selects the first one in the directory.

Code:
IDI_MAIN_ICON := 101

The format of RT_GROUP_ICON resource is very similar to that of the main header of an ICO file.

Code:
    resource_table

        resource RT_GROUP_ICON, IDI_MAIN_ICON
            MainIcon:
                .idReserved dw 0
                .idType dw 1
                .idCount dw 1
                .1.bWidth db 32
                .1.bHeight db 32
                .1.bColorCount db 16
                .1.bReserved db 0
                .1.wPlanes dw 1
                .1.wBitCount dw 4
                .1.dwBytesInRes dd Icon_1.End-Icon_1
                .1.nId dw 1
        end resource

If we had more than one image, we would need to make sure that each gets a unique number. Here we just use 1 as the identifier of our single picture.

Code:
        resource RT_ICON, 1
            Icon_1:
                .biSize dd 40
                .biWidth dd 32
                .biHeight dd 64
                .biPlanes dw 1
                .biBitCount dw 4
                .biCompression dd 0
                .biSizeImage dd .Bitmap.end-.Bitmap
                .biXPelsPerMeter dd 0
                .biYPelsPerMeter dd 0
                .biClrUsed dd 16
                .biClrImportant dd 16
            .Palette:
                repeat 16, i:0
                    db i*16, i*16, 0, 0
                end repeat
            .Bitmap:
                repeat 32, y:0
                    repeat 32, x:0
                        COLOR = (x + y) / 4
                        if x and 1 = 1
                            db HIGH shl 4 + COLOR
                        else
                            HIGH = COLOR
                        end if
                    end repeat
                end repeat
            .Bitmap.mask:
                db 32 * 32 / 8 dup 0
            .Bitmap.end:
            Icon_1.End:
        end resource

    end resource_table

The mask provides an old and simple method of defining transparency. Where the bits in mask are zeroed, the icon is opaque. But in places where bits in mask are set, a XOR operation is performed to combine the background with the icon (it gives a true transparency only when the color of the image is zero at such point). For simplicity, we zero the entire mask and make an icon that is just a solid square filled with a simple diagonal gradient.
With a 16-color palette a single pixel occupies four bits, so there are two of them per byte. The HIGH variable temporarily holds the color of one pixel that is then combined with the next one. The layout of the image is as usual in bitmap files, with the first byte corresponding to the lower left corner of the square.

A program assembled with these resources should show the generated icon in the directory view and file properties. An example source is provided in the "resource_icon.asm" file.
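The sizes that end up in biSizeImage and dwBytesInRes for this icon follow directly from its dimensions, and the arithmetic can be written down as a few constants. This is a standalone sketch, all the names in it are made up for the illustration.

```asm
; standalone sketch, hypothetical constant names
WIDTH := 32
HEIGHT := 32
BITS_PER_PIXEL := 4

HEADER_SIZE := 40                               ; the value of biSize
PALETTE_SIZE := (1 shl BITS_PER_PIXEL) * 4      ; 16 entries, 4 bytes each
BITMAP_SIZE := WIDTH * HEIGHT * BITS_PER_PIXEL / 8
MASK_SIZE := WIDTH * HEIGHT / 8                 ; one bit per pixel

assert PALETTE_SIZE = 64
assert BITMAP_SIZE = 512
assert MASK_SIZE = 128
assert BITMAP_SIZE + MASK_SIZE = 640            ; the span from .Bitmap to .Bitmap.end
assert HEADER_SIZE + PALETTE_SIZE + BITMAP_SIZE + MASK_SIZE = 744   ; the span from Icon_1 to Icon_1.End
```

The sketch assumes that no padding is needed, which holds here because a row of 32 pixels takes 16 bytes in the bitmap and 4 bytes in the mask, both already multiples of four.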
Tomasz Grysztar 08 Aug 2018, 12:46
1.5 Moving to 64 bits with PE+
PE+ is a variant used when the target architecture uses 64-bit addresses instead of 32-bit. It has some of the fields adjusted so that they can hold larger values, though there are only a few such places. All RVA values are expected to still fit in 32 bits and their corresponding fields are unchanged.

To make an example that uses such format we need to move to an architecture with large addresses. An obvious choice is the 64-bit successor of x86, traditionally called x86-64 (or x64 for short). To set it up we need to replace the lines we have been using to select the 32-bit 80386 instruction set:

Code:
use 'x64.inc'
use64

Consequently, we need to change the Machine field in the main header to reflect the new choice:

Code:
.Machine dw IMAGE_FILE_MACHINE_AMD64

Code:
.Characteristics dw IMAGE_FILE_EXECUTABLE_IMAGE + IMAGE_FILE_LARGE_ADDRESS_AWARE

To indicate that the image uses PE+ format, Magic value in the optional header needs to be set to 0x20B. To make the template a bit more flexible, we are going to define another constant next to DEFAULT_IMAGE_BASE:

Code:
MAGIC := 0x20B

Code:
.Magic dw MAGIC

In places where the structure of PE+ differs from the classic one, we will change it conditionally, depending on the value of MAGIC. This should allow the template to correctly produce either variant of PE, with switching done in a single place.

The value of ImageBase is an absolute address and it needs to be 64-bit, but the required extension is done in such way that the offsets of fields further down remain unchanged. The larger address occupies the space of two fields of the original header. BaseOfData, which preceded ImageBase, is sacrificed for this purpose and is no longer present in PE+ header:

Code:
if MAGIC = 0x20B
    .ImageBase dq DEFAULT_IMAGE_BASE
else
    .BaseOfData dd 0
    .ImageBase dd DEFAULT_IMAGE_BASE
end if

The other fields that need extending are the ones related to sizes of stack and heap. They are near the end of the header and are simply expanded to the larger size.

Code:
if MAGIC = 0x20B
    .SizeOfStackReserve dq 4096
    .SizeOfStackCommit dq 4096
    .SizeOfHeapReserve dq 65536
    .SizeOfHeapCommit dq 0
else
    .SizeOfStackReserve dd 4096
    .SizeOfStackCommit dd 4096
    .SizeOfHeapReserve dd 65536
    .SizeOfHeapCommit dd 0
end if

Another place where some fields need to be enlarged is the import table. The entries in IAT should be 64-bit to be able to contain the imported addresses. And because the initial content of IAT should be identical to ILT, the other table is affected as well.
We can modify the IMPORT macro as follows to make it generate appropriate structures for either variant of PE:

Code:
macro import? items&
    align 4
    ImportTable:
    iterate item, items
        match name.=DLL?, item
            .name.ImportLookupTableRva dd ImportLookupTable.name-IMAGE_BASE
            .name.TimeDateStamp dd 0
            .name.ForwarderChain dd 0
            .name.NameRva dd ImportLibraryName.name-IMAGE_BASE
            .name.ImportAddressTableRva dd ImportAddressTable.name-IMAGE_BASE
        else if % = 1
            err 'please start with a name of a DLL'
        end if
    end iterate
    if MAGIC <> 0x20B
        dd 0,0,0,0
    else
        dd 0,0,0
        align 8, 0
    end if
    iterate item, items
        match name.=DLL?, item
            if MAGIC <> 0x20B
                dd 0
            else
                dq 0
            end if
            ImportLookupTable.name:
        else
            if MAGIC <> 0x20B
                dd ImportLookup.item-IMAGE_BASE
            else
                dq ImportLookup.item-IMAGE_BASE
            end if
        end match
    end iterate
    iterate item, items
        match name.=DLL?, item
            if MAGIC <> 0x20B
                dd 0
            else
                dq 0
            end if
            ImportAddressTable.name:
        else
            if MAGIC <> 0x20B
                item dd ImportLookup.item-IMAGE_BASE
            else
                item dq ImportLookup.item-IMAGE_BASE
            end if
        end match
    end iterate
    if MAGIC <> 0x20B
        dd 0
    else
        dq 0
    end if
    iterate item, items
        match name.=DLL?, item
            ImportLibraryName.name db `item,0
        else
            align 2
            ImportLookup.item:
            dw 0
            db `item,0
        end match
    end iterate
    ImportTable.End:
end macro

In places like the export table or resource section there is nothing to do, all these structures use only relative addresses and these are kept in 32-bit fields. The relocations may need some attention, but first we need to have an actual code, and this is a section that needs to be rewritten completely.

As we have switched to a different machine architecture, we need to adapt to the new instruction set and calling conventions used by the functions of the operating system.
This portion of our program should now look like:
Code:
section '.text', IMAGE_SCN_MEM_EXECUTE + IMAGE_SCN_MEM_READ

EntryPoint:
    sub rsp,8*5                 ; reserve parameter space and realign the stack
    mov r9d,0                   ; uType = MB_OK
    lea r8,[CaptionString]      ; lpCaption
    lea rdx,[MessageString]     ; lpText
    mov rcx,0                   ; hWnd = NULL
    call [MessageBoxA]
    mov ecx,0                   ; uExitCode
    call [ExitProcess]

section '.rdata', IMAGE_SCN_MEM_READ

    import USER32.DLL, MessageBoxA, \
           KERNEL32.DLL, ExitProcess

    CaptionString db "PE tutorial",0
    MessageString db "I am 64-bit, alive and well!",0

The initial SUB instruction adjusts the stack pointer to reserve the space required by the calling convention for the later function calls, and at the same time it corrects the alignment of the stack. The operating system requires that address to be a multiple of 16 before a function is called, but at the entry point it is misaligned by 8 bytes. This is analogous to the starting point of a function: the misalignment is caused by the 64-bit return address stored on the stack.

The function calling convention used by Windows on x86-64 uses registers to pass the first four parameters, while the further ones are passed on the stack. None of the functions we call in our simple program takes more than four parameters, so we only need to put the values in registers. Nevertheless, space for the first four parameters has to be reserved on the stack anyway, because a function is allowed to use that area to keep their values. This means that we need to reserve at least four 64-bit units of stack space, but to fix the misalignment there must be an odd number of them, therefore we take five units in total.

In the long mode of an x86-64 processor, when an instruction has a memory operand (these are the ones enclosed in square brackets), the corresponding machine code does not contain an absolute address but an offset relative to the instruction pointer. The distances within the same image never change, therefore such code does not need relocating. To set up parameters that contain addresses we used LEA instead of MOV to take advantage of this feature.
It makes the entire program freely movable with no relocations to apply. So even if we get rid of the DD macro (so far our only way to gather relocation entries), we can assemble a PE+ image with a dynamic base and run it with no issues. However, in case we wanted to experiment with 64-bit instructions that may need relocations after all, we should prepare macros like this:
Code:
if MAGIC <> 0x20B
    macro dd? data&
        iterate unit, data
            match ?, unit
                dd ?
            else if unit relativeto BASE_RELOCATION
                repeat 1, i:FIXUP_INDEX
                    FIXUP_RVA_#i := $ - IMAGE_BASE
                    FIXUP_TYPE_#i := IMAGE_REL_BASED_HIGHLOW
                end repeat
                FIXUP_INDEX = FIXUP_INDEX + 1
                dd unit-BASE_RELOCATION
            else
                dd unit
            end if
        end iterate
    end macro
else
    macro dq? data&
        iterate unit, data
            match ?, unit
                dq ?
            else if unit relativeto BASE_RELOCATION
                repeat 1, i:FIXUP_INDEX
                    FIXUP_RVA_#i := $ - IMAGE_BASE
                    FIXUP_TYPE_#i := IMAGE_REL_BASED_DIR64
                end repeat
                FIXUP_INDEX = FIXUP_INDEX + 1
                dq unit-BASE_RELOCATION
            else
                dq unit
            end if
        end iterate
    end macro
end if

Code:
repeat (INDEX-FIRST), i:FIRST
    dw FIXUP_RVA_#i and 0xFFF + FIXUP_TYPE_#i shl 12
end repeat

Code:
mov r9d,0
mov r8,CaptionString
mov rdx,MessageString
mov rcx,0
call [MessageBoxA]

Last edited by Tomasz Grysztar on 28 Jan 2023, 14:46; edited 3 times in total
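The 16-bit fixup entries produced by the REPEAT loop in this post combine the low 12 bits of an RVA with a 4-bit type. As a cross-check outside the assembler, the same packing can be sketched in Python. The function names here are hypothetical, the numeric type values are the ones given in the PE specification, and the padding of a block to a multiple of 4 bytes with an ABSOLUTE entry is the specification's convention rather than something shown in this post:

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, skipped by the loader
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit absolute address
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address

def fixup_entry(rva: int, fixup_type: int) -> int:
    """Pack one entry: offset within the 4 KiB page in the low 12 bits,
    fixup type in the top 4 bits (same value as the DW expression above)."""
    return (rva & 0xFFF) | (fixup_type << 12)

def relocation_block(page_rva: int, fixups) -> bytes:
    """Build one relocation block for a single page from (rva, type) pairs."""
    entries = [fixup_entry(rva, kind) for rva, kind in fixups]
    if len(entries) % 2:
        # Keep the block size a multiple of 4 bytes.
        entries.append(fixup_entry(0, IMAGE_REL_BASED_ABSOLUTE))
    size = 8 + 2 * len(entries)  # 8-byte block header plus the entries
    header = struct.pack("<II", page_rva, size)
    return header + struct.pack(f"<{len(entries)}H", *entries)
```

Comparing the bytes returned by `relocation_block` with the relocation section of an assembled file is one way to confirm that the macros gathered the expected entries.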
Tomasz Grysztar 16 Aug 2018, 09:08
1.6 Experimenting further
We already have several constants defined near the beginning of our source that could be used to customize the produced image, like DEFAULT_IMAGE_BASE or SECTION_ALIGNMENT. But if we are to start toying with them, we should consider the constraints imposed by the specification. For example, the base of the image is required to be a multiple of 0x10000. To make sure that our custom values do not break the rules, we should add some checks that would remind us of the guidelines when we cross them. The simplest way to do it is an ASSERT statement:
Code:
assert DEFAULT_IMAGE_BASE mod 0x10000 = 0

As mentioned earlier, the ALIGN macro could also use a check in case it is used with freely chosen numbers:
Code:
macro align? pow2*,value:?
    assert bsf(pow2) = bsr(pow2)
    if $ relativeto BASE_RELOCATION
        assert pow2 <= 0x10000
        db (BASE_RELOCATION-$)and(pow2-1) dup value
    else
        db (-$)and(pow2-1) dup value
    end if
end macro

The specification gives a few additional rules concerning non-standard alignments. The SECTION_ALIGNMENT must not be smaller than FILE_ALIGNMENT, and if it is smaller than the page size (0x1000 bytes for x86 architectures) the two values need to be equal:
Code:
assert SECTION_ALIGNMENT >= FILE_ALIGNMENT
if SECTION_ALIGNMENT < 0x1000
    assert FILE_ALIGNMENT = SECTION_ALIGNMENT
end if

Code:
if SECTION_ALIGNMENT < 0x1000
    org $%%
else
    section $%%
end if
align FILE_ALIGNMENT,0

A 64-bit template with these additions is in the attached "universal_template.asm" file. It may serve as a starting point for further experiments.

For example, to morph the template back into a 32-bit program, only a few simple steps are needed. To adjust the format of the file, it is enough to change the value of the Machine field to IMAGE_FILE_MACHINE_I386 and set MAGIC to 0x10B. In addition to that, the actual program instructions in the ".text" section need to be replaced, and this requires USE32 in place of USE64. The "x64.inc" can stay included, as it can handle both modes.
Nevertheless, there is more that could be done to make the template easily customizable. Headers and macros could be moved to a separate file that normally would not need to be modified, and constants defining more options could be introduced. At this point it should be an easy exercise for anyone interested in experimenting more with these examples. Instead, we are now going to try some new tricks.

Earlier we omitted the computation of the checksum, because it is not needed for usual programs. But it is something worth having in a toolbox, in case it ever becomes necessary. To calculate the correct checksum for a program made from our template, the following block of commands should suffice:
Code:
postpone ?
    CHECKSUM = 0
    repeat $% shr 1, POSITION:0
        load H:word from :POSITION shl 1
        CHECKSUM = CHECKSUM + H
    end repeat
    while CHECKSUM shr 16
        CHECKSUM = CHECKSUM shr 16 + CHECKSUM and 0FFFFh
    end while
    CHECKSUM = CHECKSUM + $%
    store CHECKSUM:dword at :OptionalHeader.CheckSum-IMAGE_BASE
end postpone

LOAD allows reading from the previously generated output, with the colon before the address meaning that it is an offset within the produced file. After the calculation is done, the value in the optional header is updated with STORE, which has syntax analogous to LOAD.

Because Windows does not require a correct checksum in the normal programs it executes, some third-party tool may be needed to verify that the result of this computation is valid. On the other hand, if there is no way to tell whether the checksum is correct, it also means that it is not really necessary.

Obviously, the checksum can also be computed and updated on a previously generated image.
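For an independent cross-check, the same computation can be repeated outside the assembler on a finished file. The following is a minimal sketch in Python, not a definitive tool: the function names are hypothetical, the offset arithmetic follows the standard header layout (4-byte signature, 20-byte file header, CheckSum at offset 64 of the optional header), and the handling of an odd-sized file is an assumption, since the template never produces one.

```python
import struct

def pe_checksum(image: bytes) -> int:
    """Recompute the checksum of a complete PE file held in memory."""
    if image[:2] != b"MZ":
        raise ValueError("PE format not recognized")
    # The offset of the PE signature is stored at 0x3C in the stub header.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE format not recognized")
    # 4-byte signature + 20-byte file header, then 64 bytes into the optional header.
    checksum_offset = pe_offset + 24 + 64

    data = bytearray(image)
    data[checksum_offset:checksum_offset + 4] = bytes(4)  # stored value must not count
    if len(data) % 2:
        data.append(0)  # assumption: pad an odd-sized file with a zero byte

    # Sum all 16-bit little-endian words of the file.
    total = sum(word for (word,) in struct.iter_unpack("<H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total >> 16) + (total & 0xFFFF)
    return total + len(image)

def stored_checksum(image: bytes) -> int:
    """Read the CheckSum field currently stored in the optional header."""
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    (value,) = struct.unpack_from("<I", image, pe_offset + 24 + 64)
    return value
```

Reading a file with `open(path, "rb").read()` and comparing `pe_checksum(image)` against `stored_checksum(image)` then tells whether the value written by the assembler agrees with this recomputation.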
We can even do it with the same assembler, by using the FILE command to read the contents of a PE file and then update it with STORE before it is written to the new output:
Code:
file "universal_template.exe"
load STUB_SIGNATURE:2 from 0
load PE_OFFSET:4 from 0x3C
load PE_SIGNATURE:4 from PE_OFFSET
if STUB_SIGNATURE = "MZ" & PE_SIGNATURE = "PE"
    CHECKSUM_OFFSET = PE_OFFSET + 24 + 64
    CHECKSUM = 0
    store CHECKSUM:4 at CHECKSUM_OFFSET
    repeat $% shr 1, POSITION:0
        load H:word from :POSITION shl 1
        CHECKSUM = CHECKSUM + H
    end repeat
    while CHECKSUM shr 16
        CHECKSUM = CHECKSUM shr 16 + CHECKSUM and 0FFFFh
    end while
    CHECKSUM = CHECKSUM + $%
    store CHECKSUM:4 at CHECKSUM_OFFSET
else
    err 'PE format not recognized'
end if

The same template can also be adapted to produce files for non-x86 architectures. For example, just a couple of changes suffice to make an ARM64 executable. Obviously, we need to switch the CPU instruction set. Replace
Code:
use 'x64.inc'
use64

with
Code:
use 'aarch64.inc'
define xIP0? x16
define xIP1? x17

The default image base needs to be different for ARM64:
Code:
DEFAULT_IMAGE_BASE := 0x140000000

Code:
.Machine dw IMAGE_FILE_MACHINE_ARM64

Code:
EntryPoint:
    mov x0,0
    adr x1,MessageString
    adr x2,CaptionString
    mov x3,0
    bl stub_MessageBoxA
    bl stub_ExitProcess

stub_MessageBoxA:
    adr xip0,MessageBoxA
    ldr xip0,[xip0]
    br xip0

stub_ExitProcess:
    adr xip0,ExitProcess
    ldr xip0,[xip0]
    br xip0

Code:
adrp xip0,MessageBoxA
ldr xip0,[xip0,(MessageBoxA-IMAGE_BASE) and 0FFFh]

Last edited by Tomasz Grysztar on 30 Jan 2023, 09:08; edited 2 times in total
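The ADRP and LDR pair in the last fragment of this post splits an address into a 4 KiB page and an offset within that page: ADRP produces the page address relative to the page of the current instruction, and the low 12 bits are supplied as the LDR offset. Since the image base is a multiple of 0x1000, taking the low 12 bits of the RVA gives the same offset as taking them from the full address. A small Python sketch of that arithmetic, with hypothetical function names and example addresses chosen only for illustration:

```python
PAGE_MASK = 0xFFF  # low 12 bits select a byte within a 4 KiB page

def adrp_ldr_operands(pc: int, target: int):
    """Return (page_delta, low12) needed to reach `target` from code at `pc`."""
    page_delta = (target >> 12) - (pc >> 12)  # signed page count encoded by ADRP
    low12 = target & PAGE_MASK                # offset added by the LDR operand
    return page_delta, low12

def resolve(pc: int, page_delta: int, low12: int) -> int:
    """Address formed by ADRP (page of pc plus page_delta pages) plus low12."""
    return (pc & ~PAGE_MASK) + (page_delta << 12) + low12
```

For instance, with `pc = 0x140001010` and `target = 0x140002038`, the pair `adrp_ldr_operands(pc, target)` resolves back to the target through `resolve`, regardless of where in its page the instruction sits.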
Tomasz Grysztar 08 Sep 2018, 16:32
Chapter 2
ELF (Executable and Linkable Format)

ELF was initially designed for Unix systems as a successor to COFF, having a more extendable structure and fewer limitations. It should be noted that the original COFF was not nearly as complex as its evolved forms known today. Around 1989, when the development of Windows NT was starting and its PE/COFF variations were most likely being conceived, Unix System V Release 4 was already out, using a fresh design of ELF in place of the older COFF. Interestingly, around the same time NeXT machines started showing up, with a system that used another new format called Mach-O (arguably the most powerful of the three). Each of these formats uses a different approach to arrange its contents, even though there may have been at least some convergent evolution in their capabilities.

As the name implies, ELF can be used both for directly executable files and for linkable objects that are an intermediate stage of compilation. Since there are many structures that are required by one of these variants but not the other, ELF has flexible headers that allow including only the tables that are relevant. Nevertheless, it is always possible for an ELF file to contain all the tables, even the ones that are not mandatory for a given variant. For example, an executable file produced by a compiler is likely to contain a symbol table, even though its presence is only required in the object files.

2.1 A minimal executable file

Nowadays the most common system using ELF is Linux, therefore it is going to be the platform for our experiments. The first one is going to be a tiny executable containing only the parts of the ELF format that are absolutely necessary for a valid program. Because some of the Linux systems on the x86-64 architecture may not be able to run 32-bit programs, this time we have no common denominator. Therefore, right from the start, we will prepare examples in a flexible form that may produce either a 32-bit or 64-bit file.
The differences in structure between the two variants are relatively few. To select which variant we want to create, we are going to define a constant using the "-i" switch in the command line. A command to assemble a 32-bit file could look like:
Code:
fasmg basic.asm -i include\ 'listing.inc' -i MACHINE:=EM_386

Code:
fasmg basic.asm -i include\ 'listing.inc' -i MACHINE:=EM_X86_64

Code:
fasmg basic.asm -i include\ 'listing.inc' -i MACHINE:=EM_AARCH64

Let us start with the same couple of simple macros as in the previous chapter, and also read a set of constants from the "elf.inc" file:
Code:
macro align? pow2*,value:?
    db (-$) and (pow2-1) dup value
end macro

macro use? file*
    include file
end macro

use 'elf.inc'

Code:
if MACHINE = EM_386
    CLASS := ELFCLASS32
    BASE_ADDRESS := 0x8048000
    use '80386.inc'
    use32
else if MACHINE = EM_X86_64
    CLASS := ELFCLASS64
    BASE_ADDRESS := 0x400000
    use 'x64.inc'
    use64
else if MACHINE = EM_AARCH64
    CLASS := ELFCLASS64
    BASE_ADDRESS := 0x400000
    use 'aarch64.inc'
end if

To be continued...
Code:
org BASE_ADDRESS

Header:

.e_ident db 0x7F, 'ELF', CLASS, ELFDATA2LSB, EV_CURRENT, ELFOSABI_LINUX, (.e_ident+16-$) dup 0
.e_type dw ET_EXEC
.e_machine dw MACHINE
.e_version dd EV_CURRENT
if CLASS = ELFCLASS32
    .e_entry dd start
    .e_phoff dd PROGRAM_HEADER_OFFSET
    .e_shoff dd 0
else
    .e_entry dq start
    .e_phoff dq PROGRAM_HEADER_OFFSET
    .e_shoff dq 0
end if
.e_flags dd 0
.e_ehsize dw HEADER_LENGTH
.e_phentsize dw SEGMENT_HEADER_LENGTH
.e_phnum dw NUMBER_OF_SEGMENTS
.e_shentsize dw SECTION_HEADER_LENGTH
.e_shnum dw 0
.e_shstrndx dw 0

HEADER_LENGTH := $%

PROGRAM_HEADER_OFFSET := $%

ProgramHeader:

if CLASS = ELFCLASS32
    repeat NUMBER_OF_SEGMENTS, n:1
        .n.p_type dd Segment.n.TYPE
        .n.p_offset dd Segment.n.OFFSET
        .n.p_vaddr dd Segment.n.ADDRESS
        .n.p_paddr dd Segment.n.ADDRESS
        .n.p_filesz dd Segment.n.SIZE_IN_FILE
        .n.p_memsz dd Segment.n.SIZE_IN_MEMORY
        .n.p_flags dd Segment.n.FLAGS
        .n.p_align dd Segment.n.ALIGN
    end repeat
else
    repeat NUMBER_OF_SEGMENTS, n:1
        .n.p_type dd Segment.n.TYPE
        .n.p_flags dd Segment.n.FLAGS
        .n.p_offset dq Segment.n.OFFSET
        .n.p_vaddr dq Segment.n.ADDRESS
        .n.p_paddr dq Segment.n.ADDRESS
        .n.p_filesz dq Segment.n.SIZE_IN_FILE
        .n.p_memsz dq Segment.n.SIZE_IN_MEMORY
        .n.p_align dq Segment.n.ALIGN
    end repeat
end if

SEGMENT_HEADER_LENGTH := ($ - ProgramHeader) / NUMBER_OF_SEGMENTS

virtual at 0
    if CLASS = ELFCLASS32
        .sh_name dd ?
        .sh_type dd ?
        .sh_flags dd ?
        .sh_addr dd ?
        .sh_offset dd ?
        .sh_size dd ?
        .sh_link dd ?
        .sh_info dd ?
        .sh_addralign dd ?
        .sh_entsize dd ?
    else
        .sh_name dd ?
        .sh_type dd ?
        .sh_flags dq ?
        .sh_addr dq ?
        .sh_offset dq ?
        .sh_size dq ?
        .sh_link dd ?
        .sh_info dd ?
        .sh_addralign dq ?
        .sh_entsize dq ?
    end if
    SECTION_HEADER_LENGTH := $
end virtual

NUMBER_OF_SEGMENTS := 2

Segment.1.TYPE := PT_LOAD
Segment.1.FLAGS := PF_R+PF_X
Segment.1.ALIGN := 1000h
Segment.1.OFFSET := 0
Segment.1.ADDRESS := BASE_ADDRESS

start:
if MACHINE = EM_386
    mov eax,4 ; sys_write
    mov ebx,1
    mov ecx,msg
    mov edx,msg.length
    int 0x80
    mov eax,1 ; sys_exit
    xor ebx,ebx
    int 0x80
else if MACHINE = EM_X86_64
    mov eax,1 ; sys_write
    mov edi,1
    lea rsi,[msg]
    mov edx,msg.length
    syscall
    mov eax,60 ; sys_exit
    xor edi,edi
    syscall
else if MACHINE = EM_AARCH64
    mov x8,64 ; sys_write
    mov x0,1
    adr x1,msg
    mov x2,msg.length
    svc 0
    mov x8,93 ; sys_exit
    mov x0,0
    svc 0
end if

Segment.1.SIZE_IN_FILE := $%% - Segment.1.OFFSET
Segment.1.SIZE_IN_MEMORY := $ - Segment.1.ADDRESS

Segment.2.TYPE := PT_LOAD
Segment.2.FLAGS := PF_R+PF_W
Segment.2.ALIGN := 1000h
Segment.2.OFFSET = $%%
align Segment.2.ALIGN
Segment.2.ADDRESS := $ + Segment.2.OFFSET and (Segment.2.ALIGN-1)
section Segment.2.ADDRESS

msg db "I am alive and well!",0xA
.length = $ - .

Segment.2.SIZE_IN_FILE := $%% - Segment.2.OFFSET
Segment.2.SIZE_IN_MEMORY := $ - Segment.2.ADDRESS

Code:
macro align? pow2*,value:?
    db (-$) and (pow2-1) dup value
end macro

macro use? file*
    include file
end macro

use 'elf.inc'

if MACHINE = EM_386
    CLASS := ELFCLASS32
    BASE_ADDRESS := 0x8048000
    use '80386.inc'
    use32
else if MACHINE = EM_X86_64
    CLASS := ELFCLASS64
    BASE_ADDRESS := 0x400000
    use 'x64.inc'
    use64
else if MACHINE = EM_AARCH64
    CLASS := ELFCLASS64
    BASE_ADDRESS := 0x400000
    use 'aarch64.inc'
end if

org BASE_ADDRESS

Header:

.e_ident db 0x7F, 'ELF', CLASS, ELFDATA2LSB, EV_CURRENT, ELFOSABI_LINUX, (.e_ident+16-$) dup 0
.e_type dw ET_EXEC
.e_machine dw MACHINE
.e_version dd EV_CURRENT
if CLASS = ELFCLASS32
    .e_entry dd start
    .e_phoff dd PROGRAM_HEADER_OFFSET
    .e_shoff dd 0
else
    .e_entry dq start
    .e_phoff dq PROGRAM_HEADER_OFFSET
    .e_shoff dq 0
end if
.e_flags dd 0
.e_ehsize dw HEADER_LENGTH
.e_phentsize dw SEGMENT_HEADER_LENGTH
.e_phnum dw NUMBER_OF_SEGMENTS
.e_shentsize dw SECTION_HEADER_LENGTH
.e_shnum dw 0
.e_shstrndx dw 0

HEADER_LENGTH := $%

SEGMENT_NUMBER = 0
HEADERS_UNMAPPED = 1

macro segment? type, flags:PF_R
    SEGMENT_NUMBER = SEGMENT_NUMBER + 1
    local SEGMENT_BASE, SEGMENT_OFFSET
    repeat 1, n:SEGMENT_NUMBER
        Segment.n.TYPE := type
        Segment.n.FLAGS := flags
        if Segment.n.TYPE = PT_LOAD
            Segment.n.ALIGN := 1000h
        else
            Segment.n.ALIGN := 1
        end if
        if HEADERS_UNMAPPED & Segment.n.TYPE = PT_LOAD
            SEGMENT_OFFSET = 0
            SEGMENT_BASE = BASE_ADDRESS
            HEADERS_UNMAPPED = 0
        else
            SEGMENT_OFFSET = $%%
            if Segment.n.TYPE = PT_LOAD
                align Segment.n.ALIGN
                section $ + SEGMENT_OFFSET and (Segment.n.ALIGN-1)
            end if
            SEGMENT_BASE = $
        end if
        macro end?.segment?
            Segment.n.OFFSET := SEGMENT_OFFSET
            Segment.n.ADDRESS := SEGMENT_BASE
            Segment.n.SIZE_IN_FILE := $%% - SEGMENT_OFFSET
            Segment.n.SIZE_IN_MEMORY := $ - SEGMENT_BASE
        end macro
    end repeat
end macro

postpone
    NUMBER_OF_SEGMENTS := SEGMENT_NUMBER
    if HEADERS_UNMAPPED
        err 'At least one PT_LOAD segment should be present'
    end if
end postpone

segment PT_PHDR, PF_R

    PROGRAM_HEADER_OFFSET := $%

    ProgramHeader:

    if CLASS = ELFCLASS32
        repeat NUMBER_OF_SEGMENTS, n:1
            .n.p_type dd Segment.n.TYPE
            .n.p_offset dd Segment.n.OFFSET
            .n.p_vaddr dd Segment.n.ADDRESS
            .n.p_paddr dd Segment.n.ADDRESS
            .n.p_filesz dd Segment.n.SIZE_IN_FILE
            .n.p_memsz dd Segment.n.SIZE_IN_MEMORY
            .n.p_flags dd Segment.n.FLAGS
            .n.p_align dd Segment.n.ALIGN
        end repeat
    else
        repeat NUMBER_OF_SEGMENTS, n:1
            .n.p_type dd Segment.n.TYPE
            .n.p_flags dd Segment.n.FLAGS
            .n.p_offset dq Segment.n.OFFSET
            .n.p_vaddr dq Segment.n.ADDRESS
            .n.p_paddr dq Segment.n.ADDRESS
            .n.p_filesz dq Segment.n.SIZE_IN_FILE
            .n.p_memsz dq Segment.n.SIZE_IN_MEMORY
            .n.p_align dq Segment.n.ALIGN
        end repeat
    end if

    SEGMENT_HEADER_LENGTH := ($ - ProgramHeader) / NUMBER_OF_SEGMENTS

end segment

virtual at 0
    if CLASS = ELFCLASS32
        .sh_name dd ?
        .sh_type dd ?
        .sh_flags dd ?
        .sh_addr dd ?
        .sh_offset dd ?
        .sh_size dd ?
        .sh_link dd ?
        .sh_info dd ?
        .sh_addralign dd ?
        .sh_entsize dd ?
    else
        .sh_name dd ?
        .sh_type dd ?
        .sh_flags dq ?
        .sh_addr dq ?
        .sh_offset dq ?
        .sh_size dq ?
        .sh_link dd ?
        .sh_info dd ?
        .sh_addralign dq ?
        .sh_entsize dq ?
    end if
    SECTION_HEADER_LENGTH := $
end virtual

segment PT_LOAD, PF_R+PF_X

    start:
    if MACHINE = EM_386
        mov eax,4 ; sys_write
        mov ebx,1
        mov ecx,msg
        mov edx,msg.length
        int 0x80
        mov eax,1 ; sys_exit
        xor ebx,ebx
        int 0x80
    else if MACHINE = EM_X86_64
        mov eax,1 ; sys_write
        mov edi,1
        lea rsi,[msg]
        mov edx,msg.length
        syscall
        mov eax,60 ; sys_exit
        xor edi,edi
        syscall
    else if MACHINE = EM_AARCH64
        mov x8,64 ; sys_write
        mov x0,1
        adr x1,msg
        mov x2,msg.length
        svc 0
        mov x8,93 ; sys_exit
        mov x0,0
        svc 0
    end if

end segment

segment PT_LOAD, PF_R+PF_W

    msg db "I am alive and well!",0xA
    .length = $ - .

end segment

Last edited by Tomasz Grysztar on 05 Feb 2023, 19:48; edited 14 times in total
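The listings in this post define the program header table in two layouts, with p_flags placed right after p_type in the 64-bit variant. To inspect a file produced by the template, the headers can be read back with a short script. The following Python sketch is only an illustration under stated assumptions: the function name is hypothetical, only little-endian files are handled, and the field offsets are the standard ones for the 52-byte and 64-byte ELF headers.

```python
import struct

ELFCLASS32, ELFCLASS64 = 1, 2
ELFDATA2LSB = 1

# Field order of one program header entry; p_flags moves in the 64-bit layout.
PHDR_LAYOUT = {
    ELFCLASS32: ("<8I", ("p_type", "p_offset", "p_vaddr", "p_paddr",
                         "p_filesz", "p_memsz", "p_flags", "p_align")),
    ELFCLASS64: ("<2I6Q", ("p_type", "p_flags", "p_offset", "p_vaddr",
                           "p_paddr", "p_filesz", "p_memsz", "p_align")),
}

def read_segments(image: bytes):
    """Return (e_machine, e_entry, list of program header dicts)."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class, data_encoding = image[4], image[5]
    if elf_class not in PHDR_LAYOUT or data_encoding != ELFDATA2LSB:
        raise ValueError("unsupported class or byte order")
    (machine,) = struct.unpack_from("<H", image, 18)
    if elf_class == ELFCLASS32:
        # e_entry and e_phoff are 32-bit; e_phentsize sits at offset 42.
        entry, phoff = struct.unpack_from("<2I", image, 24)
        phentsize, phnum = struct.unpack_from("<2H", image, 42)
    else:
        # e_entry and e_phoff are 64-bit; e_phentsize sits at offset 54.
        entry, phoff = struct.unpack_from("<2Q", image, 24)
        phentsize, phnum = struct.unpack_from("<2H", image, 54)
    fmt, names = PHDR_LAYOUT[elf_class]
    segments = []
    for index in range(phnum):
        values = struct.unpack_from(fmt, image, phoff + index * phentsize)
        segments.append(dict(zip(names, values)))
    return machine, entry, segments
```

Calling `read_segments` on the bytes of an assembled file and printing the returned dictionaries shows whether the offsets, addresses and sizes computed by the segment definitions ended up in the expected fields.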
Tomasz Grysztar 23 Feb 2019, 17:11
It may take quite some time before I manage to finish the chapter on ELF, and Mach-O seems almost out of sight. If you can't wait, you may at least take a look at the sets of macros I made for creation of Mach-O executables and objects, with a commentary on some issues I encountered.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.