


DEX4U_RELOCATABLE_EXE_32 FORMAT 

* DRAFT PROPOSAL *
Author 0x4e71


GOALS
* format should be simple
* format should be compact
* format should be relocatable
* it must not require drastic changes in OS architecture
* it should be possible to convert from other exe/obj formats (e.g. PE, a.out) to this format

STRUCTURE OF AN EXECUTABLE
 header
 relocations (if needed)
 binary image

DESCRIPTION
To keep the format simple, there is only one section (code + data + anything else) in the executable.
Multi-section programs are not possible and there are no section attributes (the program's memory must always be read/write/execute).
The header contains the following information:

 signature (1 dword)
 consists of the 3 byte string "DEX" + 1 byte for the version of the executable format
 (1 for this version). This way the loader can be adapted to recognize future
 versions too (64 bit, whatever)
 
 program size (dword)
 Specifies how much memory the program image will need.
 Useful if memory management for program loading is implemented.
 Can be bigger than the size of the image, to allow pre-allocating space for vars, buffers etc.
 
 image offset (dword)
 Offset (relative to start of file) of the binary image of the program. 
 Used by the loader.
 
 entry offset (dword)
 Offset (relative to start of image) of the entry point, i.e. the address to jump to.
 
RELOCATIONS 
 The only type of relocation supported at the moment is 32 bit absolute: the
 absolute address of the program start is added to the dword that the relocation points to.
 Example:
 
 mov eax,[var] 
  because of ORG 0, this may be assembled to
 mov eax,[0x120]
  if the program is loaded at 0x400000, this will be patched and become
 mov eax,[0x400120]
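
To make the example concrete, here is a small Python sketch of the patch step (an illustration only, not part of the format).
It assumes the usual x86 encoding of mov eax,[0x120] as opcode A1 followed by a little-endian dword, so the dword to patch sits at offset 1.
The name patch_dword is made up for this illustration.

```python
import struct

def patch_dword(image: bytearray, offset: int, base: int) -> None:
    """Add base to the little-endian dword stored at image[offset:offset+4]."""
    (value,) = struct.unpack_from("<I", image, offset)
    # Wrap to 32 bits, like an ADD on a 32 bit register would.
    struct.pack_into("<I", image, offset, (value + base) & 0xFFFFFFFF)

# mov eax,[0x120] assembled at ORG 0: opcode A1, then the address 0x00000120
image = bytearray(b"\xA1\x20\x01\x00\x00")
patch_dword(image, 1, 0x400000)   # program loaded at 0x400000
print(image.hex())                # a120014000, i.e. mov eax,[0x400120]
```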

If possible I'd like to avoid other types of relocation, to keep things simple.
How it works:
The relocation data in the executable describes a list of offsets (zero-based, relative to the start of the image),
each pointing to a double word in the image that needs to be patched by adding the 
program start address (like in the above example).
To keep the executable compact, however, the relocation data is not stored as a 
plain list of 32 bit offsets.
Instead, we exploit the fact that code needing
relocation is often very near other code needing relocation.
Therefore the relocations are grouped in one or more relocation entries,
each entry consisting of:
 DWORD 32 bit offset, relative to start of image
 WORD  number of BYTE offsets following
 BYTE  relative offset, to be added to the above 32 bit offset,
       result = the offset of a dword needing relocation
 more BYTEs, each to be added to the previously computed offset,
       each giving the offset of another dword needing relocation.
If the distance to the next relocation does not fit in an 8 bit offset (more than 255), a new 
entry with a 32 bit offset is created.
An offset of 0xFFFFFFFF marks the end of relocations.
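
The grouping described above could be produced by a converter along these lines. This is only a sketch under some assumptions:
fields are little-endian (as on x86), the first offset of each group is used as the 32 bit base so the first byte is always 0,
and a group is also split when the WORD count is full. The name encode_relocations is hypothetical.

```python
import struct

END_MARKER = 0xFFFFFFFF
MAX_BYTES_PER_ENTRY = 0xFFFF   # the byte count is stored in a WORD

def encode_relocations(offsets):
    """Pack image-relative offsets into the grouped relocation table."""
    offsets = sorted(set(offsets))
    out = bytearray()
    i = 0
    while i < len(offsets):
        base = offsets[i]
        deltas = [0]               # assumption: first byte points at the base itself
        prev = base
        i += 1
        # Keep adding relocations while the next one is reachable with one byte.
        while (i < len(offsets)
               and offsets[i] - prev <= 0xFF
               and len(deltas) < MAX_BYTES_PER_ENTRY):
            deltas.append(offsets[i] - prev)
            prev = offsets[i]
            i += 1
        out += struct.pack("<IH", base, len(deltas))
        out += bytes(deltas)
    out += struct.pack("<I", END_MARKER)
    return bytes(out)

table = encode_relocations([0x101, 0x106, 0x300])
```

Here 0x101 and 0x106 share one entry (bytes 0 and 5), while 0x300 is too far away and gets an entry of its own.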

It's easy to make a "relocator" for this format; here is an example:

; ESI = points to relocation data
; EDX = points to the start of program image in memory

        cld                  ; make lodsd/lodsw/lodsb walk forward
.relloop:       
        lodsd                ; load 32 bit base offset
        cmp eax,0xFFFFFFFF   ; end?
        jz .donereloc      
        mov ebx,eax          ; ebx = base offset
        xor eax,eax          ; clear eax, lodsw only loads ax
        lodsw                ; get number of byte entries
        mov ecx,eax          ; load loop counter
        jecxz .relloop       ; no byte entries: go to next entry
.r:     xor eax,eax          ; clear eax
        lodsb                ; get byte entry 
        add ebx,eax          ; add to offset
        mov eax,[edx+ebx]    ; get dword to patch in binary image 
        add eax,edx          ; add program address to it
        mov [edx+ebx],eax    ; store it back
        loop .r              ; repeat for all byte entries
        jmp .relloop         ; repeat until end of relocations
.donereloc:       

   
BINARY IMAGE
The binary image of the program. Its origin address must be 0 (ORG 0).
The loader patches in the correct addresses before executing it.
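
As a rough illustration of how a loader might place the image in memory before relocating it, here is a Python sketch.
It rests on assumptions this draft does not state: fields are little-endian, the image runs from image_offset to the end of the file
(there is no image size field in the header), and the extra memory up to program_size is zero-filled.
Relocations are not applied here. The name build_memory_image is hypothetical.

```python
import struct

def build_memory_image(file_data: bytes, load_address: int):
    """Return (memory, entry_address) for a DEX file, before relocation."""
    sig, version, program_size, image_offset, entry_offset = struct.unpack_from(
        "<3sBIII", file_data, 0)
    if sig != b"DEX" or version != 1:
        raise ValueError("not a version 1 DEX executable")
    image = file_data[image_offset:]       # assumption: image runs to end of file
    if len(image) > program_size:
        raise ValueError("image does not fit in program_size")
    memory = bytearray(program_size)       # assumption: extra space is zero-filled
    memory[:len(image)] = image
    return memory, load_address + entry_offset
```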


STRUCTURE SPECIFICATION

HEADER:
 byte  'D','E','X'   = Exe signature
 byte  VERSION       = Exe format version: 1=DEX4U_RELOCATABLE_EXE_32
 dword program_size  = Total size of the program in memory (excluding header and relocations) 
                       (can be bigger than the size of the image in the file, to allow BSS and similar)
 dword image_offset  = Offset to start of binary image (relative to start of file)
 dword entry_offset  = Offset of entry point (relative to start of image)
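
For tools written outside the OS (converters, dumpers), the header above could be read like this. A minimal sketch, assuming
little-endian fields with no padding (16 bytes in total); parse_header and the dictionary keys are names chosen for this illustration.

```python
import struct

# 'D','E','X', version byte, then three dwords
HEADER = struct.Struct("<3sBIII")

def parse_header(data: bytes) -> dict:
    """Decode and sanity-check the 16 byte DEX header."""
    if len(data) < HEADER.size:
        raise ValueError("file too short for a DEX header")
    sig, version, program_size, image_offset, entry_offset = HEADER.unpack_from(data, 0)
    if sig != b"DEX":
        raise ValueError("bad signature")
    if version != 1:
        raise ValueError("unsupported version %d" % version)
    return {
        "version": version,
        "program_size": program_size,
        "image_offset": image_offset,
        "entry_offset": entry_offset,
    }

example = parse_header(b"DEX\x01" + struct.pack("<III", 0x2000, 0x14, 0x10))
```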
 
RELOCATION TABLE:
  Zero or more of these entries: [
	  dword abs_offset   = 32 bit absolute offset from start of image
	  word  num_entries  = Number of byte entries following	  
	  num_entries times [ 
	     byte  rel_offset = 8 bit relative offset of dword to be patched in the image, 
	                        the offset is:
	                         - relative to absolute offset, for the first byte
	                         - relative to the offset computed by the previous byte, for all other bytes
	  ]
	                            
  ]	      
  dword 0xFFFFFFFF   = end relocations marker
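
The table layout above can be walked as in the following Python sketch, which mirrors the assembly relocator for use in host-side tools.
It assumes little-endian fields; apply_relocations is a hypothetical name, and a truncated table simply raises struct.error.

```python
import struct

def apply_relocations(table: bytes, image: bytearray, base: int) -> int:
    """Patch image in place; return the number of dwords patched."""
    pos = 0
    patched = 0
    while True:
        (abs_offset,) = struct.unpack_from("<I", table, pos)
        pos += 4
        if abs_offset == 0xFFFFFFFF:       # end relocations marker
            return patched
        (num_entries,) = struct.unpack_from("<H", table, pos)
        pos += 2
        target = abs_offset
        for rel_offset in table[pos:pos + num_entries]:
            target += rel_offset           # each byte is relative to the previous target
            (value,) = struct.unpack_from("<I", image, target)
            struct.pack_into("<I", image, target, (value + base) & 0xFFFFFFFF)
            patched += 1
        pos += num_entries

# One entry: base offset 1, two byte entries (0 and 5), then the end marker
table = struct.pack("<IH", 1, 2) + b"\x00\x05" + b"\xff\xff\xff\xff"
image = bytearray(10)
struct.pack_into("<I", image, 1, 0x120)
struct.pack_into("<I", image, 6, 0x10)
count = apply_relocations(table, image, 0x400000)
```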
  
IMAGE:  
  Zero or more [
  	byte XX
]
  

  
  

 