flat assembler
Message board for the users of flat assembler.
Thread: aarch64 includes for fasmg (forum section: Non-x86 architectures), page 1 of 2
revolution 03 Sep 2017, 16:17
tthsqe wrote: 1. Are you going to use them?
tthsqe wrote: 2. aarch64 syntax does have some strange points; what about renaming the identifiers "v0.16b" etc. to "v0$16b"? The ".16b" does not play well with fasmg and is currently a thorn in the side of my parser.
tthsqe wrote: 3. There is an instruction "as". This conflicts with the keyword in fasmg. How to get around this?
tthsqe wrote: 4. The relocations have identifiers ":lo12:", ":abs_g0_nc:", ":abs_g1_nc:", ":abs_g2:", ":pg_hi21:". I'm not sure how well these play with fasmg.
tthsqe 03 Sep 2017, 16:28
and some relocations should be generated automatically, like on adrp and jumps. It's not even clear what relocations the assembler should support. For example, in ELF format, add x0, x1, :lo12:symbol should generate an add instruction as well as an R_AARCH64_ADD_ABS_LO12_NC relocation for this instruction. Can this R_AARCH64_ADD_ABS_LO12_NC also relocate the immediate field of a subtract instruction, for example? Because of the fixed instruction width, we have different problems than on x86...
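To make the question concrete, here is a minimal Python sketch of what applying a low-12-bit relocation to an immediate-form instruction involves at the bit level. The function names are hypothetical. The field positions are those of the A64 ADD/SUB (immediate) encoding, where the 12-bit immediate sits in bits [21:10]. The sketch only shows that ADD and SUB share that field layout; whether a given linker accepts R_AARCH64_ADD_ABS_LO12_NC on a subtract instruction is not something it can answer.

```python
# Hypothetical helper names; bit positions follow the A64 ADD/SUB (immediate) encoding.
ADD_X_IMM = 0x91000000  # add xd, xn, #imm12 (64-bit, unshifted immediate)
SUB_X_IMM = 0xD1000000  # sub xd, xn, #imm12 (64-bit, unshifted immediate)


def encode_addsub_imm(base, rd, rn, imm12=0):
    """Build an ADD/SUB (immediate) instruction word from its fields."""
    if not 0 <= imm12 < 4096:
        raise ValueError("imm12 out of range")
    return base | (imm12 << 10) | (rn << 5) | rd


def apply_lo12_nc(insn, symbol_address, addend=0):
    """Overwrite bits [21:10] with bits [11:0] of S+A, with no overflow check."""
    lo12 = (symbol_address + addend) & 0xFFF
    return (insn & ~(0xFFF << 10)) | (lo12 << 10)


# add x0, x1, :lo12:symbol with the symbol assumed to resolve to 0x401123
patched = apply_lo12_nc(encode_addsub_imm(ADD_X_IMM, 0, 1), 0x401123)
```

Mechanically the same patch works on the SUB word, since only the opcode bits differ; the open point in the post is whether the tools permit it.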
revolution 03 Sep 2017, 16:49
tthsqe wrote: and some relocations should be generated automatically, like on adrp and jumps. It's not even clear what relocations the assembler should support. For example, in ELF format, add x0, x1, :lo12:symbol should generate an add instruction as well as an R_AARCH64_ADD_ABS_LO12_NC relocation for this instruction. Can this R_AARCH64_ADD_ABS_LO12_NC also relocate the immediate field of a subtract instruction, for example? Because of the fixed instruction width, we have different problems than on x86...
There are also composite relocations that span sequences of several instructions. The spec is silent on whether the instruction order can be changed, whether intervening instructions are permitted, or whether omitting one or more of them is permitted. So an assembler should be able to examine multiple lines of code in order to check for violations and report errors accordingly.
Last edited by revolution on 04 Sep 2017, 11:19; edited 1 time in total
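One common multi-instruction case is the adrp/add pair, where the first instruction carries the page part of an address (:pg_hi21:) and the second carries the low 12 bits (:lo12:). The Python sketch below, with hypothetical function names and example addresses chosen only for illustration, encodes such a pair and then decodes it again. It shows why the two instructions are coupled: the result is only meaningful if both refer to the same target.

```python
def page(addr):
    """Clear the low 12 bits, giving the 4 KiB page address."""
    return addr & ~0xFFF


def encode_adrp(rd, pc, target):
    """adrp xd, target: 21-bit signed page delta split into immlo/immhi."""
    delta = (page(target) - page(pc)) >> 12
    if not -(1 << 20) <= delta < (1 << 20):
        raise ValueError("target out of adrp range")
    imm = delta & 0x1FFFFF
    return 0x90000000 | ((imm & 3) << 29) | ((imm >> 2) << 5) | rd


def encode_add_lo12(rd, rn, target):
    """add xd, xn, #(target & 0xFFF)."""
    return 0x91000000 | ((target & 0xFFF) << 10) | (rn << 5) | rd


def resolve_pair(pc, adrp_insn, add_insn):
    """Recover the address that an adrp/add pair located at pc computes."""
    imm = (((adrp_insn >> 5) & 0x7FFFF) << 2) | ((adrp_insn >> 29) & 3)
    if imm & (1 << 20):          # sign-extend the 21-bit field
        imm -= 1 << 21
    return page(pc) + (imm << 12) + ((add_insn >> 10) & 0xFFF)
```

A checker of the kind described above would have to confirm that the add following an adrp names the same symbol and uses the register the adrp wrote, which is exactly the multi-line examination the post asks for.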
Tomasz Grysztar 03 Sep 2017, 17:57
tthsqe wrote: 2. aarch64 syntax does have some strange points; what about renaming the identifiers "v0.16b" etc. to "v0$16b"? The ".16b" does not play well with fasmg and is currently a thorn in the side of my parser.
For the latter case, for example when you need a symbol to be available as an ELEMENT, it could help if fasmg allowed numeric names after the dot (that is, anywhere except the first name in the chained identifier). I think I'm going to work on such a change.
tthsqe wrote: 3. There is an instruction "as". This conflicts with the keyword in fasmg. How to get around this?
But even when there is an actual conflict and you need a name that is normally an internal directive of fasmg, you can still redefine everything freely, and there are some tricks you may use to keep the original directive available in some contexts. The 8051 example I mentioned earlier redefines the EQU keyword. In my Mach-O macros you can see how SECTION is overloaded with a new meaning while still allowing the macros to use the original directive when needed.
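For readers unfamiliar with the "v0.16b" notation under discussion: the part after the dot starts with a digit, which is what makes it awkward as an identifier. A minimal Python sketch of splitting such an operand follows. It is not how the fasmg macros do it; the function name and the table of arrangements (limited to the common ones) are assumptions for illustration.

```python
import re

# Common vector arrangements: suffix -> (lane count, lane width in bits).
ARRANGEMENTS = {
    "8b": (8, 8), "16b": (16, 8),
    "4h": (4, 16), "8h": (8, 16),
    "2s": (2, 32), "4s": (4, 32),
    "1d": (1, 64), "2d": (2, 64),
}

# v0..v31, a dot, then a suffix that begins with a digit.
_VREG = re.compile(r"^v([0-9]|[12][0-9]|3[01])\.([0-9]+[bhsd])$", re.IGNORECASE)


def parse_vector_register(text):
    """Return (register number, lanes, lane bits), or None if not recognised."""
    match = _VREG.match(text)
    if match is None:
        return None
    arrangement = match.group(2).lower()
    if arrangement not in ARRANGEMENTS:
        return None
    lanes, bits = ARRANGEMENTS[arrangement]
    return int(match.group(1)), lanes, bits
```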
Tomasz Grysztar 03 Sep 2017, 18:24
Tomasz Grysztar wrote: I think I'm going to work on such change.
tthsqe 03 Sep 2017, 18:35
As for the "as" instruction, it is actually "at" and the macro seems to be working fine now. It must have been one of my bugs at the time.
tthsqe 04 Sep 2017, 09:25
Quote: I think I'm going to work on such change.
Very good! With version g.hwx32 and a simplified parser, the assembly time is down to 8.5 seconds, compared with 12.9 seconds for the previous version. However, there are now as many elements as there are aarch64 registers, which is well over 600. I hope the storage needs of elements (each with simple metadata) are not excessive.
EDIT: now it is down to 6.3 seconds! You have to keep the critical macros small.
Tomasz Grysztar 04 Sep 2017, 10:09
tthsqe wrote: EDIT: now it is down to 6.3 seconds! You have to keep the critical macros small.
tthsqe 04 Sep 2017, 10:19
I know this might be heretical, but as fasmg engine is quite nice, have you any plans for a C/C++ port?
Tomasz Grysztar 04 Sep 2017, 10:28
tthsqe wrote: I know this might be heretical, but as fasmg engine is quite nice, have you any plans for a C/C++ port?
My current plans focus mostly on writing more macros and more tutorials or other materials that could help others develop new interesting things with fasmg. I am also seriously considering writing a book about assembly language (in general, not limited to just x86) that would use fasmg as its main teaching tool.
revolution 04 Sep 2017, 11:20
tthsqe wrote: EDIT: now it is down to 6.3 seconds! You have to keep the critical macros small.
tthsqe 04 Sep 2017, 15:49
The 6.3 seconds was for a plain 'format binary' that assembled in 2 passes. The executable is over 100KB and contains about 6KB of data. The rest is code (or, I guess, padding), which is mostly integer instructions.
These are all timings for executable formats.
Code:
x86-64   pe      5 passes, 23.6 seconds, 114176 bytes
x86-64   elf     4 passes, 18.3 seconds, 110966 bytes
x86-64   macho   4 passes, 18.7 seconds, 123591 bytes
aarch64  elf     3 passes,  9.6 seconds, 132658 bytes
tthsqe 15 Sep 2017, 21:02
I am getting these approximate timings for fasmg on code that is mostly integer and not special in any way:
Code:
x86-64:  4 seconds/pass/100KB output
aarch64: 2 seconds/pass/100KB output

The elf64 executable format is working and the resulting program plays chess on Android phones. However, no other formats are supported yet and, as such, relocations have not been settled. Besides the ELF object format, which I hope to try soon, what other formats are there and can they be easily tested?
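The "seconds/pass/100KB" unit can be checked against the per-format figures posted on 04 Sep. The short Python sketch below does that arithmetic. It assumes that 100KB means 100,000 bytes and that the earlier figures are comparable with this post, neither of which the thread states; the function name is hypothetical.

```python
# (label, passes, seconds, output bytes) as posted on 04 Sep 2017
RUNS = [
    ("x86-64 pe", 5, 23.6, 114176),
    ("x86-64 elf", 4, 18.3, 110966),
    ("x86-64 macho", 4, 18.7, 123591),
    ("aarch64 elf", 3, 9.6, 132658),
]


def seconds_per_pass_per_100kb(passes, seconds, size_bytes, unit=100_000):
    """Normalise a timing by pass count and output size (unit is an assumption)."""
    return seconds / passes / (size_bytes / unit)


NORMALISED = {label: seconds_per_pass_per_100kb(p, s, n) for label, p, s, n in RUNS}
```

Under those assumptions the x86-64 runs come out at roughly 4 and the aarch64 run somewhat above 2, which is in line with the approximate figures quoted in the post.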
Tomasz Grysztar 15 Sep 2017, 21:44
Could you include the link to the repository in this thread? I would also add it to the list if you agree.
tthsqe 19 Sep 2017, 10:47
So I finally got dynamic linking in ELF format to work. However, it is not a real solution because it required a hex editor. I hope you can improve the macros or give some hints on how to improve them.
The problem is that the loader complains/crashes when the INTERP segment or the DYNAMIC segment is not contained in a LOAD segment. The error from readelf also seems to be bogus: https://github.com/autc04/Retro68/blob/master/binutils/binutils/readelf.c#L4968
You can see how I had to edit the program headers here:
Code:
$ ./fasmg arm/include/hello/elf_exe_dylink.arm prog; chmod 755 ./prog
flat assembler  version g.hwx32
3 passes, 816 bytes.
$ readelf -l ./prog

Elf file type is EXEC (Executable file)
Entry point 0x400140
There are 4 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  INTERP         0x0000000000000120 0x0000000000400120 0x0000000000400120
                 0x000000000000001b 0x000000000000001b  R      1
      [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
  LOAD           0x000000000000013b 0x000000000040013b 0x000000000040013b
                 0x0000000000000039 0x0000000000000039  R E    1000
  LOAD           0x0000000000000174 0x0000000000401174 0x0000000000401174
                 0x00000000000001bc 0x00000000000001bc  RW     1000
  DYNAMIC        0x0000000000000330 0x0000000000402330 0x0000000000402330
                 0x0000000000000000 0x0000000000000000  RW     1
readelf: Error: the dynamic segment offset + size exceeds the size of the file

... some hex editing for first LOAD and DYNAMIC segment ...

$ readelf -l ./prog

Elf file type is EXEC (Executable file)
Entry point 0x400140
There are 4 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  INTERP         0x0000000000000120 0x0000000000400120 0x0000000000400120
                 0x000000000000001b 0x000000000000001b  R      1
      [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x0000000000000174 0x0000000000000039  R E    1000
  LOAD           0x0000000000000174 0x0000000000401174 0x0000000000401174
                 0x00000000000001bc 0x00000000000001bc  RW     1000
  DYNAMIC        0x0000000000000260 0x0000000000401260 0x0000000000401260
                 0x00000000000000d0 0x00000000000000d0  RW     1
readelf: Error: the dynamic segment offset + size exceeds the size of the file
$ qemu-aarch64 ./prog
1 Hello World!
2 Hello World!

It is not too important, but the source that I assembled and then hex edited is:
Code:
include 'format/format.inc'

format ELF64 executable 0
entry Start

macro balign boundary
        local size
        size = (boundary-1)-($+boundary-1) mod boundary
        ;db size dup ?
        db (size mod 4) dup 0
        dd (size/4) dup 0xd503201f
end macro

segment interpreter readable
        db '/lib/ld-linux-aarch64.so.1',0

segment readable executable
        align 64
Start:
        mov     x0, 1
        adr     x1, Test
        mov     x2, TestEnd - Test
        mov     x8, 64
        svc     0

        mov     x0, 1
        adr     x1, Message
        mov     x2, MessageEnd - Message
        ldr     x8, write
        blr     x8

        mov     x0, 0
        ldr     x8, exit
        blr     x8

segment readable writeable
        align 64
write:  dq 0
exit:   dq 0

symtab:
        Elf64_Sym 0, 0, 0, 0, 0, 0, 0
        Elf64_Sym _write - strtab, 0, 0, STB_GLOBAL, STT_FUNC, 0, 0
        Elf64_Sym _exit - strtab, 0, 0, STB_GLOBAL, STT_FUNC, 0, 0

rela:
        Elf64_Rela write, 1, R_AARCH64_ABS64, 0
        Elf64_Rela exit, 2, R_AARCH64_ABS64, 0
relasz = $ - rela

rel:
relsz = $ - rel

hash:
        dd 1, 3         ; size of bucket and size of chain
        dd 0            ; fake bucket, just one hash value
        repeat 3
                dd %    ; chain for all symbol table entries
        end repeat

Test:   db '1 Hello World!', 10
TestEnd:
Message: db '2 Hello World!', 10
MessageEnd:

strtab:
        _null   db 0
        _libc   db 'libc.so.6',0
        _write  db 'write',0
        _exit   db 'exit',0
strsz = $ - strtab

        balign 16
        dq DT_NEEDED, _libc-strtab
        dq DT_STRTAB, strtab
        dq DT_STRSZ, strsz
        dq DT_SYMTAB, symtab
        dq DT_SYMENT, sizeof.Elf64_Sym
        dq DT_RELA, rela
        dq DT_RELASZ, relasz
        dq DT_RELAENT, sizeof.Elf64_Rela
        dq DT_REL, rel
        dq DT_RELSZ, relsz
        dq DT_RELENT, sizeof.Elf64_Rela
        dq DT_HASH, hash
        dq DT_NULL, 0

segment dynamic readable writeable
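The containment condition described in this post can be checked mechanically from the program header values in the two readelf listings. The Python sketch below does so using file offsets and file sizes only. The names are hypothetical, and treating a zero-size segment as not covered is an assumption of the sketch, not something the loader is documented to do.

```python
from collections import namedtuple

Phdr = namedtuple("Phdr", "type offset filesz")


def inside_some_load(segment, headers):
    """True if the segment's file range lies wholly within one LOAD segment."""
    if segment.filesz == 0:      # assumption: an empty segment is not covered
        return False
    end = segment.offset + segment.filesz
    return any(
        h.type == "LOAD" and h.offset <= segment.offset and end <= h.offset + h.filesz
        for h in headers
    )


def uncovered(headers):
    """List the INTERP/DYNAMIC segments that no LOAD segment contains."""
    return [
        h.type for h in headers
        if h.type in ("INTERP", "DYNAMIC") and not inside_some_load(h, headers)
    ]


# Values copied from the readelf output before and after the hex edit.
BEFORE = [Phdr("INTERP", 0x120, 0x1B), Phdr("LOAD", 0x13B, 0x39),
          Phdr("LOAD", 0x174, 0x1BC), Phdr("DYNAMIC", 0x330, 0x0)]
AFTER = [Phdr("INTERP", 0x120, 0x1B), Phdr("LOAD", 0x0, 0x174),
         Phdr("LOAD", 0x174, 0x1BC), Phdr("DYNAMIC", 0x260, 0xD0)]
```

With these values, `uncovered(BEFORE)` reports both special segments and `uncovered(AFTER)` reports none, which matches the behaviour described: the edited file runs and the unedited one does not.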
Tomasz Grysztar 19 Sep 2017, 11:04
We could add a "subsegment" macro to allow defining overlapping segments. I have never needed such overlap before.
tthsqe 19 Sep 2017, 11:50
I couldn't find any references on the ELF format that require the interpreter segment or the dynamic segment to be contained within a load segment. However, it really only works with this ld-linux-aarch64.so.1 if this is the case. If you could add some macros to your ELF macros that allow one to define overlapping segments (or simply to put INTERP or DYNAMIC segments inside another LOAD segment), then I think this can be settled.
Next is the object format.
Tomasz Grysztar 19 Sep 2017, 14:37
I have made some alterations in the macros. The syntax may look a bit strange, but I tried to keep the changes minimal, and I like how simple they are (you can easily add some wrapper macros anyway):
Code:
ELFCLASSNONE = 0
ELFCLASS32   = 1
ELFCLASS64   = 2

ELFDATANONE = 0
ELFDATA2LSB = 1
ELFDATA2MSB = 2

ELFOSABI_NONE     = 0
ELFOSABI_HPUX     = 1
ELFOSABI_NETBSD   = 2
ELFOSABI_GNU      = 3
ELFOSABI_LINUX    = 3
ELFOSABI_SOLARIS  = 6
ELFOSABI_AIX      = 7
ELFOSABI_IRIX     = 8
ELFOSABI_FREEBSD  = 9
ELFOSABI_TRU64    = 10
ELFOSABI_MODESTO  = 11
ELFOSABI_OPENBSD  = 12
ELFOSABI_OPENVMS  = 13
ELFOSABI_NSK      = 14
ELFOSABI_AROS     = 15
ELFOSABI_FENIXOS  = 16
ELFOSABI_CLOUDABI = 17
ELFOSABI_OPENVOS  = 18

ET_NONE   = 0
ET_REL    = 1
ET_EXEC   = 2
ET_DYN    = 3
ET_CORE   = 4
ET_LOPROC = 0xff00
ET_HIPROC = 0xffff

EM_NONE   = 0
EM_M32    = 1
EM_SPARC  = 2
EM_386    = 3
EM_68K    = 4
EM_88K    = 5
EM_860    = 7
EM_MIPS   = 8
EM_X86_64 = 62

EV_NONE    = 0
EV_CURRENT = 1

PT_NULL         = 0
PT_LOAD         = 1
PT_DYNAMIC      = 2
PT_INTERP       = 3
PT_NOTE         = 4
PT_SHLIB        = 5
PT_PHDR         = 6
PT_GNU_EH_FRAME = 0x6474e550
PT_GNU_STACK    = 0x6474e551
PT_LOPROC       = 0x70000000
PT_HIPROC       = 0x7fffffff

PF_X        = 1
PF_W        = 2
PF_R        = 4
PF_MASKOS   = 0x0ff00000
PF_MASKPROC = 0xf0000000

macro align boundary,value:?
        db (boundary-1)-($+boundary-1) mod boundary dup value
end macro

ELF::

namespace ELF

        if defined Settings.Class
                CLASS := Settings.Class
        else
                CLASS := ELFCLASS32
        end if

        if defined Settings.Machine
                MACHINE := Settings.Machine
        else
                MACHINE := EM_386
        end if

        if defined Settings.ABI
                ABI := Settings.ABI
        else
                ABI := ELFOSABI_NONE
        end if

        if defined Settings.BaseAddress
                BASE_ADDRESS := Settings.BaseAddress
        else
                BASE_ADDRESS := 8048000h
        end if

        Header:

        e_ident         db 0x7F,'ELF',CLASS,ELFDATA2LSB,EV_CURRENT,ABI,(16-$) dup 0
        e_type          dw ET_EXEC
        e_machine       dw MACHINE
        e_version       dd EV_CURRENT
        if CLASS <> ELFCLASS64
         e_entry        dd start
         e_phoff        dd ProgramHeader
         e_shoff        dd 0
         e_flags        dd 0
         e_ehsize       dw ProgramHeader
         e_phentsize    dw SEGMENT_HEADER_LENGTH
         e_phnum        dw NUMBER_OF_SEGMENTS
         e_shentsize    dw 28h
         e_shnum        dw 0
         e_shstrndx     dw 0
        else
         e_entry        dq start
         e_phoff        dq ProgramHeader
         e_shoff        dq 0
         e_flags        dd 0
         e_ehsize       dw ProgramHeader
         e_phentsize    dw SEGMENT_HEADER_LENGTH
         e_phnum        dw NUMBER_OF_SEGMENTS
         e_shentsize    dw 40h
         e_shnum        dw 0
         e_shstrndx     dw 0
        end if

        ProgramHeader:
        if CLASS <> ELFCLASS64
         p_type         dd PT_LOAD
         p_offset       dd 0
         p_vaddr        dd 0
         p_paddr        dd 0
         p_filesz       dd 0
         p_memsz        dd 0
         p_flags        dd PF_R+PF_W+PF_X
         p_align        dd 1000h
        else
         p_type         dd PT_LOAD
         p_flags        dd PF_R+PF_W+PF_X
         p_offset       dq 0
         p_vaddr        dq 0
         p_paddr        dq 0
         p_filesz       dq 0
         p_memsz        dq 0
         p_align        dq 1000h
        end if

        SEGMENT_HEADER_LENGTH = $ - ProgramHeader

        db (NUMBER_OF_SEGMENTS-1)*SEGMENT_HEADER_LENGTH dup 0

        SEGMENT_INDEX = 0
        NEXT_SEGMENT_INDEX = 0

        SEGMENT_TYPE = PT_LOAD
        FILE_OFFSET = $%
        SEGMENT_BASE = BASE_ADDRESS + FILE_OFFSET and 0FFFh
        org SEGMENT_BASE
        start:

        store SEGMENT_BASE at ELF:p_vaddr
        store SEGMENT_BASE at ELF:p_paddr
        store FILE_OFFSET at ELF:p_offset

end namespace

macro segment?
        namespace ELF

                if SEGMENT_TYPE = PT_LOAD

                        RAW_DATA_SIZE = $%% - FILE_OFFSET
                        FILE_OFFSET = $%%
                        SEGMENT_SIZE = $ - SEGMENT_BASE
                        store RAW_DATA_SIZE at ELF:p_filesz+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH
                        store SEGMENT_SIZE at ELF:p_memsz+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH

                        if NEXT_SEGMENT_INDEX > 0
                                align 1000h
                                SEGMENT_BASE = $ + FILE_OFFSET and 0FFFh
                        end if

                        section SEGMENT_BASE

                else

                        FILE_OFFSET = $%
                        SEGMENT_SIZE = $ - SEGMENT_BASE
                        store SEGMENT_SIZE at ELF:p_filesz+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH
                        store SEGMENT_SIZE at ELF:p_memsz+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH

                        SEGMENT_BASE = SEGMENT_BASE and not 0FFFh + FILE_OFFSET and 0FFFh

                        if $% > $%%
                                store 0:byte at $-1
                        end if
                        org SEGMENT_BASE

                end if

                if SEGMENT_SIZE > 0 & NEXT_SEGMENT_INDEX = 0
                        NEXT_SEGMENT_INDEX = 1
                end if

        end namespace
end macro

macro segment? attributes*
        namespace ELF

                match }, attributes

                        segment
                        restore SEGMENT_TYPE,SEGMENT_FLAGS,SEGMENT_INDEX,SEGMENT_BASE,FILE_OFFSET
                        org SEGMENT_BASE + $% - FILE_OFFSET

                else

                        local seq,list
                        match { _attributes, attributes
                                define seq _attributes
                                SEGMENT_TYPE =: PT_NULL
                                SEGMENT_FLAGS =: 0
                                SEGMENT_INDEX =: SEGMENT_INDEX
                                SEGMENT_BASE =: SEGMENT_BASE
                                FILE_OFFSET =: FILE_OFFSET
                                if NEXT_SEGMENT_INDEX = 0
                                        NEXT_SEGMENT_INDEX = 1
                                end if
                        else
                                segment
                                define seq attributes
                                SEGMENT_TYPE = PT_LOAD
                                SEGMENT_FLAGS = 0
                        end match

                        SEGMENT_INDEX = NEXT_SEGMENT_INDEX
                        NEXT_SEGMENT_INDEX = NEXT_SEGMENT_INDEX + 1

                        store SEGMENT_BASE at ELF:p_vaddr+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH
                        store SEGMENT_BASE at ELF:p_paddr+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH
                        store FILE_OFFSET at ELF:p_offset+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH

                        while 1
                                match car cdr, seq
                                        define list car
                                        define seq cdr
                                else
                                        match any, seq
                                                define list any
                                        end match
                                        break
                                end match
                        end while
                        irpv attribute, list
                                match =readable?, attribute
                                        SEGMENT_FLAGS = SEGMENT_FLAGS or PF_R
                                else match =writeable?, attribute
                                        SEGMENT_FLAGS = SEGMENT_FLAGS or PF_W
                                else match =executable?, attribute
                                        SEGMENT_FLAGS = SEGMENT_FLAGS or PF_X
                                else match =interpreter?, attribute
                                        SEGMENT_TYPE = PT_INTERP
                                else match =dynamic?, attribute
                                        SEGMENT_TYPE = PT_DYNAMIC
                                else match =note?, attribute
                                        SEGMENT_TYPE = PT_NOTE
                                else match =gnustack?, attribute
                                        SEGMENT_TYPE = PT_GNU_STACK
                                else match =gnuehframe?, attribute
                                        SEGMENT_TYPE = PT_GNU_EH_FRAME
                                else
                                        err 'invalid argument'
                                end match
                        end irpv

                        store SEGMENT_TYPE at ELF:p_type+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH
                        store SEGMENT_FLAGS at ELF:p_flags+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH

                        if SEGMENT_TYPE = PT_LOAD
                                store 1000h at ELF:p_align+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH
                        else
                                store 1 at ELF:p_align+SEGMENT_INDEX*SEGMENT_HEADER_LENGTH
                        end if

                end match

        end namespace
end macro

macro entry? address*
        namespace ELF
                store address at ELF:e_entry
        end namespace
end macro

postpone
        purge segment?
        segment
        namespace ELF
                NUMBER_OF_SEGMENTS := NEXT_SEGMENT_INDEX
        end namespace
end postpone

Code:
segment readable executable

segment { interpreter readable
        db '/lib/ld-linux-aarch64.so.1',0
segment }

; continue defining the content of outer segment
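As a cross-check of the 64-bit branch of these macros, the Python sketch below packs a program header in the same field order (p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align) and computes where segment content starts once the ELF header and all program headers have been emitted. The helper names are hypothetical; the field order and sizes are those of the standard Elf64_Phdr layout.

```python
import struct

ELF64_HEADER_SIZE = 64                   # size of the ELF64 file header
PHDR64 = struct.Struct("<IIQQQQQQ")      # little-endian Elf64_Phdr, no padding

PT_LOAD, PT_DYNAMIC, PT_INTERP = 1, 2, 3
PF_X, PF_W, PF_R = 1, 2, 4


def pack_phdr(p_type, flags, offset, vaddr, filesz, memsz=None, align=1):
    """Pack one 64-bit program header; paddr mirrors vaddr as in the macros."""
    if memsz is None:
        memsz = filesz
    return PHDR64.pack(p_type, flags, offset, vaddr, vaddr, filesz, memsz, align)


def first_content_offset(number_of_segments):
    """File offset of the first byte after the header and program header table."""
    return ELF64_HEADER_SIZE + number_of_segments * PHDR64.size
```

With four program headers the content starts at 64 + 4 * 56 = 0x120, which agrees with the INTERP offset in the readelf listing earlier in the thread.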
tthsqe 19 Sep 2017, 19:18
So the main repo is
https://github.com/tthsqe12/asm
and I have just put the different working ELF formats in
https://github.com/tthsqe12/asm/tree/master/arm/include/hello
and the crazy different relocations are demoed in
https://github.com/tthsqe12/asm/blob/master/arm/include/hello/elf_obj.arm
The problem is that the interpreter still doesn't like elf_exe_dylink. The first load command needs to start at file offset zero, otherwise the interpreter blows up. I tried starting before the interpreter path but after the start of the file, and it blew up just the same. I will investigate further in the future, but it is nominally working now.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.