flat assembler
Message board for the users of flat assembler.
how to put data in .data and code in .text in same macro?
Tomasz Grysztar 17 Mar 2009, 15:26
Your error must be caused by something different, because - for instance - this code:
Code:
    macro Init
    {
      section '.data' writeable
        MyAddress dd 0
      section '.text' executable
        mov ecx, [esp]
        ; more code that will eventually put an address into MyAddress:
        mov [MyAddress], ecx
    }

    format ELF
    Init
assembles without errors. However, you must remember that fasm always generates the output exactly in the order specified in the source, so if you write something like this:
Code:
    section '.data' writeable
      MyAddress dd 0
    section '.text' executable
      mov [MyAddress],eax
    section '.data' writeable
      MyAddress2 dd 0
    section '.text' executable
      mov [MyAddress2],eax
the object file will contain four separate sections, in exactly that order.
Anyway, the problem of defining data from some macro inside the code is a known one, and there are some existing macro solutions for it, see for example this thread: http://board.flatassembler.net/topic.php?t=8619
buzzkill 17 Mar 2009, 16:30
Tomasz Grysztar wrote:
    Your error must be caused by something different
It was, it was caused by me thinking that it would be a good idea to code this up at 6am this morning when I couldn't sleep... You're right, there's nothing wrong with the macro, I just forgot to actually call it, that's why the address variable was not accessible... I'm a moron...
But, on the bright side, I learned something again from your post: I didn't know there could be two sections with the same name in one ELF binary, but I looked it up in the ELF specs, and there can be. I also put your code example in a src file, assembled and linked it, and was pleasantly surprised when I saw that ld had combined both .data sections into one, and both .text sections into one, so that means I don't have to do any linking voodoo.
In conclusion, let me share with you the practical result of my macro fiddling: a way to use faster (sysenter) syscalls on linux, should you ever find yourself in need of those.
libfasm.inc
Code:
    ; vim: set ft=fasm:

    ; Set up syscalls. Must be called _before_ the first syscall is made.
    macro Init
    {
        ; The address of __kernel_vsyscall
        section '.data' writeable
        vsyscall dd 0

        ; Determine the __kernel_vsyscall address
        section '.text' executable
        local ..l1, ..l2
        mov ecx, [esp]            ; argc (at least 1: program name)
        lea edi, [esp+ecx*4+8]    ; skip argc + argv[] + term. null -> start of envp[]
    ..l1:
        mov ecx, [edi]            ; current envp[]-entry
        add edi, 4                ; prepare for next entry, or skip term. null before jecxz
        jecxz ..l2                ; end of envp[]? continue with searching aux.vector
        jmp ..l1                  ; proceed with next envp[]-entry
    ..l2:
        mov ecx, [edi]            ; entry type of aux.vector (consists of type/value pairs)
        add edi, 8                ; prepare for next type, skipping value of current entry
        cmp ecx, 32               ; AT_SYSINFO?
        jne ..l2                  ; no: try the next type
        mov ecx, [edi-4]          ; yes: value for type AT_SYSINFO is __kernel_vsyscall address
        mov [vsyscall], ecx       ; from now on we can make syscalls like so: call [vsyscall]
    }
    ;EOF
fl_tst.asm
Code:
    ; vim: set ft=fasm:

    ; assemble/link:
    ;   $ fasm fl_tst.asm fl_tst.o
    ;   $ ld fl_tst.o -o fl_tst

    format ELF

    section '.text' executable
    include 'libfasm.inc'

    public _start
    _start:
        Init                      ; macro that sets up address for syscalls

        mov eax, 4                ; write
        mov ebx, 1                ; fd 1: stdout
        mov ecx, msg              ; string to print (not 0-terminated!)
        mov edx, msg.len          ; string length
        call [vsyscall]

        mov eax, 1                ; exit
        mov ebx, 0                ; return value
        call [vsyscall]

    section '.data' writeable
    msg db "It works!", 0Ah
    .len = $-msg
    ;EOF
Anyway, thanks for your help (again).
Tomasz Grysztar 17 Mar 2009, 16:44
buzzkill wrote:
    I also put your code example in a src file, assembled and linked it, and was pleasantly surprised when I saw that ld had combined both .data sections into one, and both .text sections into one, so that means I don't have to do any linking voodoo.
Well, combining sections in such a way is exactly what the linker is for. However, I would still recommend that you add "align 1" to the definition of the second code section (and all the following ones), because the default alignment is 4, and thus the linker will put some alignment bytes in the middle of your code. On the other hand, the linker is perhaps wise enough to put NOP instructions there, so that may not be really harmful.
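The "align 1" suggestion above could look roughly like the sketch below. The variable names first_var and second_var are made up for illustration, and whether a given ld version then really places the two .text fragments back to back is something to check on your own toolchain (for example with objdump -d).

```asm
format ELF

section '.data' writeable
first_var dd 0                  ; hypothetical variable, for illustration only

section '.text' executable
public _start
_start:
        mov [first_var], eax

section '.data' writeable
second_var dd 0                 ; hypothetical variable, for illustration only

; "align 1" lowers the alignment of this fragment from the default of 4,
; so the linker has no reason to insert padding bytes in front of it
; when it merges this fragment with the .text fragment above.
section '.text' executable align 1
        mov [second_var], eax
        mov eax, 1              ; sys_exit
        xor ebx, ebx            ; return value 0
        int 80h
```

Only the continuation fragment needs the lower alignment here; the first .text fragment can keep the default.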
buzzkill 17 Mar 2009, 19:13
Damn, you're right again. The merging of the sections by ld isn't as clean as I first thought: like you said, there is "padding with NOPs" to alignment. And while NOPs aren't harmful, I don't like to see them, especially in an asm program. And I think it's sloppy to force a user of this macro to "align 1" his own text section(s); I want my macros to be non-intrusive.
I tried it with a custom linker script, but to no avail, and I also couldn't find any objcopy trickery to help me with this... I could of course put the code in an .init section and the data perhaps in a .bss, but that's also sloppy, I feel. So let me ask you: is there any way to force fasm to complete all passes first and then construct the sections, merging them like ld, but more intelligently than ld? Although I have a feeling what your answer will be. To me, having multiple .data or .text sections in the same binary seems more "hackish" than having just one... Suppose you have several macros that need to create global variables, you'd wind up with a big pile of sections... Anyway, I guess I'm just disappointed my macro trickery doesn't work like I want it to. But should you have any more insights into this, I'd love to hear them...
Tomasz Grysztar 17 Mar 2009, 19:17
buzzkill wrote:
    And I think it's sloppy to force a user of this macro to "align 1" his own text section(s), I want my macros to be non-intrusive.
buzzkill wrote:
    So let me ask you: is there any way to force fasm to complete all passes first and then construct the sections, merging them like ld, but more intelligently than ld? Although I have a feeling what your answer will be
Check out the macros from the thread I linked a few posts earlier.
buzzkill 17 Mar 2009, 20:30
Unfortunately, I can't get those to work, it seems:
Code:
    format ELF
    include 'globals.inc'

    section '.data' writeable
    MyAddress_1 db 1
    .idata
    {
      MyAddress_2 db 2
    }
    iData

    section '.text' executable
    public _start
    _start:
        mov [MyAddress_1], al
        mov [MyAddress_2], al
        mov [MyAddress_3], al
        mov eax, 1
        mov ebx, 0
        int 80h

    section '.data' writeable
    .idata
    {
      MyAddress_3 db 3
    }
    iData
gives me:
Code:
    $ fasm qqq.asm qqq.o -s qqq.fas
    flat assembler version 1.67.35 (16384 kilobytes memory)
    qqq.asm [34]:
    iData
    globals.inc [53] iData [3]:
    instr
    qqq.asm [13] z?0 [1]:
    MyAddress_2 db 2
    error: symbol already defined.
and:
Code:
    format ELF
    include 'globals.inc'

    section '.data' writeable
    MyAddress_1 db 1
    .idata
    {
      MyAddress_2 db 2
    }
    iData

    section '.text' executable
    .idata
    {
      MyAddress_3 db 3
    }
    iData
    public _start
    _start:
        mov [MyAddress_1], al
        mov [MyAddress_2], al
        mov [MyAddress_3], al
        mov eax, 1
        mov ebx, 0
        int 80h
gives me:
Code:
    $ fasm qqq.asm qqq.o -s qqq.fas
    flat assembler version 1.67.35 (16384 kilobytes memory)
    qqq.asm [23]:
    iData
    globals.inc [53] iData [3]:
    instr
    qqq.asm [13] z?0 [1]:
    MyAddress_2 db 2
    error: symbol already defined.
So, basically the same error. Only this works:
Code:
    format ELF
    include 'globals.inc'

    section '.data' writeable
    MyAddress_1 db 1
    .idata
    {
      MyAddress_2 db 2
    }
    iData

    section '.text' executable
    public _start
    _start:
        mov [MyAddress_1], al
        mov [MyAddress_2], al
        mov eax, 1
        mov ebx, 0
        int 80h
But that's no different from just:
Code:
    section '.data' writeable
    MyAddress_1 db 1
    MyAddress_2 db 2
So I don't get how I can put something into the .data section from another place than the .data section itself. Or I just don't understand what you mean, that's possible too of course.
Tomasz Grysztar 17 Mar 2009, 20:34
It should be used this way:
Code:
    format ELF
    include 'globals.inc'

    .idata
    {
      MyAddress_1 db 1
    }

    section '.text' executable
    .idata
    {
      MyAddress_2 db 2
    }
    public _start
    _start:
        mov [MyAddress_1], al
        mov [MyAddress_2], al
        mov [MyAddress_3], al
        mov eax, 1
        mov ebx, 0
        int 80h
    .idata
    {
      MyAddress_3 db 3
    }

    section '.data' writeable
    iData
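To make the "declare anywhere, instantiate once" pattern above less opaque, here is a minimal sketch of how such a pair of macros can be built. It is not the code from globals.inc: the names defer_data and emit_data are hypothetical, and the sketch relies on the documented fasm behaviour that a redefined macro can call its own previous definition by name.

```asm
; Hypothetical macros, NOT the ones from globals.inc.

macro emit_data {}              ; the collection starts out empty

macro defer_data [line]
{
  common
    ; Redefine emit_data. Inside the new body the name emit_data still
    ; refers to the previous definition, so everything collected earlier
    ; is kept and the new line is appended after it.
    macro emit_data
    \{
      emit_data
      line
    \}
}

format ELF

section '.text' executable
public _start
_start:
        defer_data counter dd 0 ; declared next to the code that uses it
        mov [counter], eax
        defer_data flag db 1
        mov [flag], al
        mov eax, 1              ; sys_exit
        xor ebx, ebx
        int 80h

section '.data' writeable
        emit_data               ; every collected definition is assembled here
```

As in the usage shown above, the emitting macro is invoked exactly once, after the last declaration; invoking it twice would assemble the same labels twice, which matches the "symbol already defined" error reported earlier in the thread.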
buzzkill 17 Mar 2009, 21:00
Ah thanks, I see. I'll look into incorporating those, but they look kinda "baroque" to me: making declaring and instantiating variables two separate things is almost starting to look like a HLL construction. It seems the fasm macro style takes some getting used to for me; I remember the nasm preprocessor stuff being easier, I think. Maybe I'd better stick to writing actual code first before trying to hide it all away behind macros.
Tomasz Grysztar 17 Mar 2009, 21:09
Yes, fasm's preprocessor is a bit tricky and definitely non-standard. I'd recommend reading the "Understanding fasm" article if you want to know how those things work. After the 1.68 release I plan to go on with writing it, as it now ends in the middle of the interesting stuff.
buzzkill 18 Mar 2009, 20:21
That was also an interesting read; if you ever expand on it, let me know (since you obviously have more interesting stuff to say).
One thing though: just before the last paragraph you say:
Quote:
    As for the exact explanation of IRPS directive, we first need to know a few more details about how preprocessor perceives the source text.
I then looked it up in the manual, but I thought the functions of IRP and IRPS seemed very similar. So I assembled the IRPS example from the manual:
Code:
    irps reg, al bx ecx
     { xor reg,reg }
and then assembled this:
Code:
    irp reg, al,bx,ecx
     { xor reg,reg }
and both those fragments assemble to exactly the same thing. So I'm still wondering: what exactly is the difference between IRP and IRPS? Why/when would you choose one over the other?
BTW, I really liked the paragraph "Using flat assembler as pure interpreter", I never would have thought to use an assembler in that way.
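The question above is not answered in this thread, so the following is only a hedged illustration based on the fasm manual's description of the two directives: irp splits its list at commas, so one item may consist of several symbols, while irps treats every single symbol as an item. The macro clear_regs is a hypothetical helper invented for this example.

```asm
use32

; irp: items are separated by commas, and one item may hold several symbols
; (angle brackets protect the commas that belong to a single item).
irp operands, <eax,1>, <ebx,2> {
        mov operands            ; becomes: mov eax,1  and  mov ebx,2
}

; irps: every symbol is its own item; no commas are needed between them.
irps reg, ecx edx {
        xor reg, reg            ; becomes: xor ecx,ecx  and  xor edx,edx
}

; A list that arrives as ONE macro argument can still be walked with irps.
; clear_regs is a hypothetical helper, not something from the manual.
macro clear_regs regs
{
  irps reg, regs \{ xor reg, reg \}
}

        clear_regs esi edi      ; becomes: xor esi,esi  and  xor edi,edi
```

With a list of plain register names, as in the manual's example, both directives generate the same code; they only differ once an item needs more than one symbol (irp) or the list has no commas to split on (irps).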
Tomasz Grysztar 18 Mar 2009, 20:54
buzzkill wrote:
    but after that the IRPS command is never mentioned again, so I'm not sure I got the "exact explanation"...
Yes, I just haven't written that part yet. I'm sorry, I'll try to go on with it as soon as possible.
buzzkill 18 Mar 2009, 21:27
No hurry, I'm not trying to rush you. It's just something I noticed, that's all.
buzzkill 19 Mar 2009, 13:31
Tomasz,
I'd like to bring one other thing to your attention: if I assemble the fl_tst.asm src file from my post above, fasm generates an object file with a NOBITS .text section (the [ 1] section in this output):
Code:
    $ fasm fl_tst.asm fl_tst.o
    flat assembler version 1.67.35 (16384 kilobytes memory)
    3 passes, 643 bytes.
    $ readelf -S fl_tst.o
    There are 8 section headers, starting at offset 0x143:

    Section Headers:
      [Nr] Name       Type      Addr     Off    Size   ES Flg Lk Inf Al
      [ 0]            NULL      00000000 000000 000000 00      0   0  0
      [ 1] .text      NOBITS    00000000 000034 000000 00  AX  0   0  4
      [ 2] .data      PROGBITS  00000000 000034 000004 00  WA  0   0  4
      [ 3] .text      PROGBITS  00000000 000038 00004d 00  AX  0   0  4
      [ 4] .rel.text  REL       00000000 00008f 000020 08      6   3  4
      [ 5] .data      PROGBITS  00000000 000085 00000a 00  WA  0   0  4
      [ 6] .symtab    SYMTAB    00000000 0000af 000060 10      7   5  4
      [ 7] .strtab    STRTAB    00000000 00010f 000034 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings)
      I (info), L (link order), G (group), x (unknown)
      O (extra OS processing required) o (OS specific), p (processor specific)
I think this is a bug, because (AFAIK) a .text section can never be NOBITS. As a result of this, when the linker merges both .text sections, the resulting .text section is also NOBITS, which as you know leads to problems with the standard linux tools (like you can't disassemble such a .text section). I can see how fasm would come to generate this, because after preprocessing you would have first the declaration of a .text section, then a .data section, and then another .text section, and in this last .text section it would put all the code. But still, I don't feel this is the correct way to go about this: it would lead to a "non-standard" executable, and it's obviously not what the programmer (me) intended.
I still think that an assembler should interpret a "section <.name>" line as:
    IF there is already a section called <.name>
    THEN add the following contents to that section
    ELSE create a new section called <.name> and put these contents in it.
This method would work for all sections, be they .data or .text or whatever. Also, this may make it easier to work with large codebases that are split up into different modules/units/whatever you like to call them. Now, if you feel this isn't the way to go with fasm, I will respect your decision of course and not bring it up again. Maybe some other linux fasm users could chime in here and let us know how they feel about this?
Tomasz Grysztar 19 Mar 2009, 14:52
buzzkill wrote:
    As a result of this, when the linker merges both .text sections, the resulting .text section is also NOBITS, which as you know leads to problems with the standard linux tools (like you can't disassemble such a .text section).
No, it's not possible for the resulting section to be NOBITS when it actually contains some code. The NOBITS thing is only related to how this particular section is represented in the file. After it gets merged with some section that contains some data in the file, the resulting section can no longer be NOBITS. Please check it out.
buzzkill wrote:
    I can see how fasm would come to generate this, because after preprocessing you would have first the declaration of a .text section, then a .data section, and then another .text section, and in this last .text section it would put all the code. But still, I don't feel this is the correct way to go about this: it would lead to a "non-standard" executable, and it's obviously not what the programmer (me ) intended.
fasm's design is such that everything is put into the output in exactly the same order as it was put into the source. Well, it was designed for a bit different kind of programmer's intentions, I guess. Still, this should not affect the final executable in any bad way, really. The linker combines all the sections appropriately, and it will tell you if there's anything strange with them (like if you generate one .bss section that actually has some data in it, and this way you force the entire final .bss section to become PROGBITS).
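The .bss case mentioned at the end of the post above can be reproduced with a sketch like the following. The labels scratch and answer are hypothetical, and the section types named in the comments are an expectation derived from the behaviour described in this thread, not verified output; readelf -S on the object file shows what your fasm version really emits.

```asm
format ELF

; Only reserved (uninitialized) data: nothing has to be stored in the file,
; so this fragment is expected to be marked NOBITS.
section '.bss' writeable
scratch rb 256                  ; hypothetical buffer

; Initialized data: the bytes must be stored in the file, so this fragment
; is expected to be PROGBITS although it carries the same section name.
section '.bss' writeable
answer dd 42                    ; hypothetical variable
```

When both fragments are linked into one .bss, the merged section has file contents and therefore cannot stay NOBITS, which is the situation in which a correctly working ld prints its "type changed to PROGBITS" warning.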
buzzkill 19 Mar 2009, 15:10
Tomasz Grysztar wrote:
    No, it's not possible for the resulting section to be NOBITS when it actually contains some code. The NOBITS things is only related to how this particular section is represented in file. After it gets merged with some section that contains some data in file, the resulting section can no longer be NOBITS. Please check it out.
For the record:
Code:
    $ cat fl_tst.asm
    ; assemble/link:
    ;   $ fasm fl_tst.asm fl_tst.o
    ;   $ ld fl_tst.o -o fl_tst

    format ELF

    section '.text' executable
    include 'libfasm.inc'

    public _start
    _start:
        Init                      ; macro that sets up address for syscalls

        mov eax, 4                ; write
        mov ebx, 1                ; fd 1: stdout
        mov ecx, msg              ; string to print (not 0-terminated!)
        mov edx, msg.len          ; string length
        call [vsyscall]

        mov eax, 1                ; exit
        mov ebx, 0                ; return value
        call [vsyscall]

    section '.data' writeable
    msg db "It works!", 0Ah
    .len = $-msg
Code:
    $ cat libfasm.inc
    ; Set up syscalls. Must be called _before_ the first syscall is made.
    macro Init
    {
        ; The address of __kernel_vsyscall
        section '.data' writeable
        vsyscall dd 0

        ; Determine the __kernel_vsyscall address
        section '.text' executable
        local ..l1, ..l2
        mov ecx, [esp]            ; argc (at least 1: program name)
        lea edi, [esp+ecx*4+8]    ; skip argc + argv[] + term. null -> start of envp[]
    ..l1:
        mov ecx, [edi]            ; current envp[]-entry
        add edi, 4                ; prepare for next entry, or skip term. null before jecxz
        jecxz ..l2                ; end of envp[]? continue with searching aux.vector
        jmp ..l1                  ; proceed with next envp[]-entry
    ..l2:
        mov ecx, [edi]            ; entry type of aux.vector (consists of type/value pairs)
        add edi, 8                ; prepare for next type, skipping value of current entry
        cmp ecx, 32               ; AT_SYSINFO?
        jne ..l2                  ; no: try the next type
        mov ecx, [edi-4]          ; yes: value for type AT_SYSINFO is __kernel_vsyscall address
        mov [vsyscall], ecx       ; from now on we can make syscalls like so: call [vsyscall]
    }
Code:
    $ fasm fl_tst.asm fl_tst.o
    flat assembler version 1.67.35 (16384 kilobytes memory)
    3 passes, 643 bytes.
    $ ld fl_tst.o -o fl_tst
    $ readelf -S fl_tst
    There are 6 section headers, starting at offset 0x11c:

    Section Headers:
      [Nr] Name       Type      Addr     Off    Size   ES Flg Lk Inf Al
      [ 0]            NULL      00000000 000000 000000 00      0   0  0
      [ 1] .text      NOBITS    08048094 000094 00004d 00  AX  0   0  4
      [ 2] .data      PROGBITS  080490e4 0000e4 00000e 00  WA  0   0  4
      [ 3] .shstrtab  STRTAB    00000000 0000f2 000027 00      0   0  1
      [ 4] .symtab    SYMTAB    00000000 00020c 000070 10      5   3  4
      [ 5] .strtab    STRTAB    00000000 00027c 000020 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings)
      I (info), L (link order), G (group), x (unknown)
      O (extra OS processing required) o (OS specific), p (processor specific)
Tomasz Grysztar 19 Mar 2009, 15:22
Hmmm, it seems to be some bug in your ld. This is what I get:
Code:
    $ fasm fl_tst.asm
    flat assembler version 1.67.35 (16384 kilobytes memory)
    3 passes, 643 bytes.
    $ ld fl_tst.o -o fl_tst
    ld: warning: section `.text' type changed to PROGBITS
    $ readelf -S fl_tst
    There are 6 section headers, starting at offset 0xfc:

    Section Headers:
      [Nr] Name       Type      Addr     Off    Size   ES Flg Lk Inf Al
      [ 0]            NULL      00000000 000000 000000 00      0   0  0
      [ 1] .text      PROGBITS  08048074 000074 00004d 00  AX  0   0  4
      [ 2] .data      PROGBITS  080490c4 0000c4 00000e 00  WA  0   0  4
      [ 3] .shstrtab  STRTAB    00000000 0000d2 000027 00      0   0  1
      [ 4] .symtab    SYMTAB    00000000 0001ec 000070 10      5   3  4
      [ 5] .strtab    STRTAB    00000000 00025c 000020 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings)
      I (info), L (link order), G (group), x (unknown)
      O (extra OS processing required) o (OS specific), p (processor specific)
    $ ld -V
    GNU ld version 2.18.50.0.9-7.fc10 20080822
      Supported emulations:
       elf_i386
       i386linux
       elf_x86_64
buzzkill 19 Mar 2009, 15:32
Ok then, it hadn't occurred to me that my tools would be buggy.
I run
Code:
    $ ld -V
    GNU ld (GNU Binutils) 2.18
      Supported emulations:
       elf_i386
       i386linux
       elf_x86_64
which is the stable version for my distro, but apparently not as stable as it should be. Let's put an end to this discussion then, and I will go look for a new distro.
pelaillo 19 Mar 2009, 18:36
buzzkill wrote:
    ... I will go look for a new distro
Don't be so drastic. The bug was caught a few months ago. I had to upgrade to the unstable release of binutils in order to get the proper result:
Code:
    $ ld -V
    GNU ld (GNU Binutils for Debian) 2.19.51.20090315
      Supported emulations:
       elf_i386
       i386linux
       elf_x86_64
    $ ld fl_tst.o -o fl_tst
    ld: warning: section `.text' type changed to PROGBITS
buzzkill 19 Mar 2009, 18:56
OK, that would be a drastic measure, you're right. OTOH, updating binutils on Gentoo might be a little bit more work than on other distros... Besides, I'm a bit disappointed that the stable 'version' of a source-based distro provides a binutils with an (as you say) months-old bug in it... That's really the kind of thing I'd expect my distro to take care of for me, because I don't have time to follow bugtrackers/mailing-lists/etc for every (major) package myself.
(btw, that's a pretty recent build you're running there, 4 days old)
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.