flat assembler
Message board for the users of flat assembler.
Macroinstructions > Help me to optimise this macro for less RAM usage
vid 07 Dec 2007, 15:30
First quick idea: write the first macro separately for each of "push" and "pop"; then you can get rid of the "match =push, instr" and "match =pop, instr" parts.
EDIT: changed my mind, this won't matter much. It will only save memory before preprocessing, not after.
vid 07 Dec 2007, 15:57
I think this macro cannot be optimized considerably.
The "push" instruction is an extra bitch, because it allows multiple space-separated arguments. Combined with the "ptr" syntax, there is no way to parse the arguments other than parsing the generated opcodes, like you did. Even push itself can be pretty ambiguous, and depends on how the assembler decides to assemble it:
Code:
push -5 dword ptr ebx
push dword ptr ebx -5
...
revolution 08 Dec 2007, 08:48
vid wrote: I think this macro cannot be considerably optimized.
vid wrote: Even push itself can be pretty ambiguous, and depends on how assembler decides to assemble it:
vid 08 Dec 2007, 10:03
Quote: Hmm, pity about that. The testing part is also very memory intensive with the irp's. If it was to be completely unrolled with just a long list of push/pop instruction it will use much less memory also. Of course that would be more tedious to write and debug.
I doubt that. IMO it will have similar requirements. FASM will unroll it to the same code you'd write, and macro nesting needs just a couple of bytes on the stack. I'm not sure, though; we'd need to ask Tomasz.
revolution 08 Dec 2007, 10:51
vid wrote: I doubt about that. IMO it will have similar requirement. FASM will unroll it to same code you'd write, and macro nesting needs just couple of bytes on stack. I'm not sure, we'd need to ask tomasz
vid 08 Dec 2007, 13:24
I tried to compile it with the maximum available to me (250 MB), and it still didn't compile. But please try it if you have more RAM.
LocoDelAssembly 08 Dec 2007, 16:15
Compiles at 768 MB, and if you pull the "use" detection code out and place it only where it is needed you save around 64 MB. Removing "local" also reduces memory requirements (the xxxx?xxxxxxxx names are a lot longer than just "i", "a", etc.).
Still, note that these macros will not solve all cases. Imagine this bitchy code:
Code:
proc_esp foo, arg
  mov ecx, 3
.loop:
  push ecx
  dec ecx
  jnz .loop
  invoke bar, arg ; How could it detect the arg position correctly?
  add esp, 3*4
  ret
endp
The code is very unrealistic, but it just shows that push will not always be tracked properly. Here is another, more realistic one:
Code:
proc_esp bar, arg1, arg2, arg3, arg4
  cmp [arg1], 1
  jne .push_arg2
  push [arg3] ; Will be counted even though it has only a 50% chance of being executed
  jmp .call_func
.push_arg2:
  push [arg2] ; Will be counted even though it has only a 50% chance of being executed (and the arg2 position will be wrongly considered as shifted)
  jmp .call_func
.call_func:
  push [arg4] ; What is going to happen here?
  call func
  ret
endp
These problems don't exist with EBP-based frames, and I'm particularly worried about the second example; it is actually not very uncommon. Optimizing this is a good challenge, though. To address this problem, I think that instead of trying to provide automated tracking we need to feed the code with the programmer's info, so that when you access a parameter the macros compute deltas based on the info provided by hand. To make this handy, the delta could be modified in both relative and absolute fashion.
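A minimal sketch of the hand-fed delta idea described above, in fasm 1 syntax. The names stack_delta, delta_set, delta_add, push_arg and the stub func are made up for illustration and are not part of the macros discussed in this thread. It assumes 32-bit code, dword arguments and a callee that removes its own arguments.

```
use32

stack_delta = 0                     ; assembly-time count of bytes pushed since entry

macro delta_set n { stack_delta = (n) }                 ; absolute: "n bytes are pushed here"
macro delta_add n { stack_delta = stack_delta + (n) }   ; relative: "n more bytes were pushed"

; Push the dword argument with 0-based index idx, then account for the push.
; On entry [esp] holds the return address, so argument idx lives at [esp+4+4*idx].
macro push_arg idx
{
        push    dword [esp + stack_delta + 4 + 4*(idx)]
        delta_add 4
}

bar:                                ; hypothetical procedure with four dword arguments
        delta_set 0
        cmp     dword [esp + stack_delta + 4], 1        ; arg1
        jne     .push_arg2
        push_arg 2                                      ; arg3
        jmp     .call_func
.push_arg2:
        delta_set 0                 ; by hand: nothing is pushed yet on this path
        push_arg 1                                      ; arg2
.call_func:
        delta_set 4                 ; by hand: both paths arrive with one dword pushed
        push_arg 3                                      ; arg4
        call    func
        retn    4*4                 ; assumes func already removed the two pushed dwords

func:                               ; stub standing in for the real callee
        retn    2*4
```

The two delta_set lines after the labels are the programmer's info: the macros never try to follow jumps, they only do the arithmetic from whatever value they were last given.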
vid 08 Dec 2007, 17:13
I think trying to make this automated is not the best idea.
If you want to keep track of the stack, you should use win64 style: one "sub rsp,XXX" at the beginning, and then just moves to/from memory. But optimizing the macro is still a good challenge. Changing the length of names is still a good idea: maybe Tomasz could use "a?1" instead of "a?00000001", and save quite a lot of memory if a lot of macros with locals are used.
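A rough sketch of the win64 style mentioned above, in fasm 1 syntax. The names frame_size, my_proc and callee and the slot layout are assumptions made for illustration; the frame holds 32 bytes of shadow space for callees, two qword locals and 8 bytes of padding so rsp stays 16-byte aligned.

```
use64

frame_size = 8*7                    ; 32 shadow + 2 locals + 8 padding (hypothetical layout)

my_proc:
        sub     rsp, frame_size             ; the only stack adjustment in the body
        mov     [rsp + frame_size + 8], rcx ; save first argument in its home slot
        mov     qword [rsp + 32], 0         ; first local, just above the shadow space
        mov     rcx, [rsp + frame_size + 8] ; offsets stay fixed: nothing is pushed
        call    callee
        add     rsp, frame_size
        ret

callee:                             ; stub standing in for a real callee
        ret
```

Because rsp never moves between the sub and the add, every offset is a constant and there is nothing for a macro to track.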
Tomasz Grysztar 08 Dec 2007, 17:16
vid wrote: But optimizing macro is still good challenge. Changing length of names is still good idea: maybe tomasz could use "a?1" instead of "a?00000001", and save quite lot of memory if lot of macros with locals are used
Good idea, and very simple to do.
revolution 08 Dec 2007, 18:30
LocoDelAssembly wrote: ... but to address this problem I think that instead of trying to provide automated tracking we need to feed the code with programmer's info instead so when you are going to access a parameter, the macros will compute deltas based on the info provided by hand.
Pulling the 'use' detection code out means creating a macro for each 'use' and assigning globally unique names for the two parameters. Not too much of an overhead, and it can be done quite quickly.
I think I need to seriously change my estimate above of 1/3 memory for unrolling the irp's. I calculated there are 120987 pushes and pops combined. With each uncompressed macro call being ~1500 bytes, that comes to ... well, a big number. So I re-evaluate to more like 90%.
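Working the figures above through, taking the ~1500 bytes per call at face value (the MiB conversion is added here and is not from the post):

```latex
120987 \times 1500\ \text{bytes} = 181\,480\,500\ \text{bytes} \approx 173\ \text{MiB}
```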
bitRAKE 08 Dec 2007, 19:06
Code:
pushad
mov dword [esp+_ESP_],.x
jmp MyFunk
.x: popad
. . .
MyFunk:
...
jmp dword [esp+_ESP_]
revolution 08 Dec 2007, 22:59
bitRAKE wrote:
Code:
mov [this_procs_global_store],esp
do_whatever:
nop
...
mov esp,[this_procs_global_store]
retn arg_count
bitRAKE 08 Dec 2007, 23:40
revolution wrote: Umm, no it doesn't. esp is not free; any form of stack access that doesn't get restored later will kill your code. You need a procedure-unique global memory slot to save and restore esp if you want all registers free.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.