flat assembler
Message board for the users of flat assembler.
fasm-metaprogramming
Tyler 18 Apr 2010, 10:39
Windows won't allow it to work, and you also have to consider the cache: if a part of memory is already in the instruction cache, I don't think changes in memory are reflected there. At least that's what I read somewhere around here.
revolution 18 Apr 2010, 10:46
Tyler: Any change to memory will be reflected back into the CPU at all cache levels. This is by design so that self modifying code will work. It might not be fast or efficient but it will work.
ander-skirnir: Perhaps you are looking for fasm.dll; it exists on here somewhere. It has a few limitations but will basically allow on-the-fly assembly.
Tyler 18 Apr 2010, 10:57
It was the prefetch queue that I remembered reading about in the thread about me trying to find a virus. There's also an example of fake SMC (self-modifying code) in that thread.
revolution 18 Apr 2010, 11:17
Tyler: No one runs 8086/286 systems anymore. And anyone that does can't run fasm anyway so the point is moot.
ander-skirnir 18 Apr 2010, 11:30
> fasm.dll
Okay, thanks, I'll try it. By the way, I need it to implement a toy Common Lisp compiler (for a very small subset of the language) that will allow defun (defining functions) at runtime without any layering or bytecode. I'm wondering: is it right that generating assembly on the fly is the best and fastest way to compile functions in dynamic compilers?
revolution 18 Apr 2010, 11:37
LISP ---> assembly ---> binary ---> execute.
That is a common path for almost all native apps written today; just replace LISP with C, or C++, or whatever, and the remainder is still the same. But as for on-the-fly, that is not the usual case. Most often for on-the-fly you would have this:
JAVA/C# ---> bytecode ---> JIT compiler ---> execute.
Or this:
PERL/JS ---> interpreter.
ander-skirnir 18 Apr 2010, 11:54
But Common Lisp has no compile-time/runtime distinction. It must provide dynamic compilation to machine code in a running system. I can define a function that defines functions, and all of them must be honestly compiled to native code at runtime. Can I do that without on-the-fly compilation?
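For reference, the runtime step itself (get executable memory, write machine code into it, call it) is small. What follows is only a minimal sketch, assuming a 32-bit Windows target and fasm's standard win32ax.inc headers. The six bytes are encoded by hand as a stand-in for whatever an assembler or compiler back end would emit; nothing here is fasm.dll's actual interface, and the label names are made up for illustration.

```
format PE console
include 'win32ax.inc'

.code

start:
        ; Ask Windows for one page that is readable, writable and executable.
        ; 0x3000 = MEM_COMMIT or MEM_RESERVE, 0x40 = PAGE_EXECUTE_READWRITE.
        invoke  VirtualAlloc,0,4096,0x3000,0x40
        test    eax,eax
        jz      .failed
        mov     edi,eax                 ; edi -> start of the new code buffer

        ; Emit a tiny function by hand: "mov eax,42" followed by "ret".
        ; (Assumption: a real compiler would write its own output here.)
        mov     byte [edi],0xB8         ; opcode of mov eax,imm32
        mov     dword [edi+1],42        ; the imm32 operand
        mov     byte [edi+5],0xC3       ; opcode of ret

        call    edi                     ; run the freshly written function
        invoke  ExitProcess,eax         ; exit code is whatever it returned
  .failed:
        invoke  ExitProcess,1

.end start
```

If the allocation succeeds, the process should exit with code 42, the value returned by the generated function. A Lisp-style defun at runtime would repeat the same pattern with a larger buffer and keep the returned address as the function's entry point.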
revolution 18 Apr 2010, 12:08
BTW: Here is the dll thread: http://board.flatassembler.net/topic.php?t=6239
cthug 18 Apr 2010, 12:33
You could cheat and use tcc, and use its on-the-fly C compilation and assembly.
ander-skirnir 18 Apr 2010, 12:56
> http://board.flatassembler.net/topic.php?t=6239
Thanks again.
> tcc
Yeah, I like it a lot, it's an awesome compiler, but clever people say that the way C is compiled to native code is very different from what a good Common Lisp compiler could do. CL has both static and dynamic scoping, lambdas that are not emulated by structs/classes, closures and so on: things that cannot be done efficiently by translating to C.
cthug 18 Apr 2010, 13:14
tcc has a built-in assembler (GAS compatible), so you can pass your dynamically generated code to libtcc to assemble and execute it. The only problem is bloat: it adds about 400KB to your executable, so you might have to edit the tcc source.
MazeGen 20 Apr 2010, 09:18
revolution wrote:
> Tyler: Any change to memory will be reflected back into the CPU at all cache levels. This is by design so that self modifying code will work. It might not be fast or efficient but it will work.

It is not reflected back in all cases. The following test returns "cache not updated" on my PC:

Code:
    format PE GUI
    entry start

    section '.text' code readable writeable executable

    start:
            mov     al, 90h                 ; NOP instruction
            mov     ecx, last_eip - get_eip
            call    get_eip
    get_eip:
            mov     edi, [esp]
            rep     stosb                   ; rewrite bytes from get_eip to last_eip by NOPs
    is_40:
            DB      40h                     ; dummy INC EAX
    last_eip:
            pop     edi
            ; if byte 40h was rewritten, REP STOSB didn't rewrite itself
            ; - code cache was not updated
            cmp     byte [edi+(is_40-get_eip)], 40h
            jne     cache_not_updated
    cache_updated:
            push    0
            push    caption
            push    updated
            push    0
            call    [MessageBoxA]
            push    1
            call    [ExitProcess]
    cache_not_updated:
            push    0
            push    caption
            push    not_updated
            push    0
            call    [MessageBoxA]
            push    0
            call    [ExitProcess]

    section '.data' data readable writeable

      caption     db 'x',0
      not_updated db 'cache not updated',0
      updated     db 'cache updated',0

    section '.idata' import data readable writeable

      dd 0,0,0,RVA kernel_name,RVA kernel_table
      dd 0,0,0,RVA user_name,RVA user_table
      dd 0,0,0,0,0

      kernel_table:
        ExitProcess dd RVA _ExitProcess
        dd 0
      user_table:
        MessageBoxA dd RVA _MessageBoxA
        dd 0

      kernel_name db 'KERNEL32.DLL',0
      user_name   db 'USER32.DLL',0

      _ExitProcess dw 0
        db 'ExitProcess',0
      _MessageBoxA dw 0
        db 'MessageBoxA',0
revolution 20 Apr 2010, 09:54
That is a neat trick. It goes against what the Intel manual states:
18.29.1 Self-Modifying Code with Cache Enabled wrote:
> On the Intel486 processor, a write to an instruction in the cache will modify it in both the cache and memory. If the instruction was prefetched before the write, however, the old version of the instruction could be the one executed. To prevent this problem, it is necessary to flush the instruction prefetch unit of the Intel486 processor by coding a jump instruction immediately after any write that modifies an instruction. The P6 family and Pentium processors, however, check whether a write may modify an instruction that has been prefetched for execution. This check is based on the linear address of the instruction. If the linear address of an instruction is found to be present in the prefetch queue, the P6 family and Pentium processors flush the prefetch queue, eliminating the need to code a jump instruction after any writes that modify an instruction.

I guess this is a consequence of the special circuitry for rep stosx speed-ups.
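The jump-after-write rule from the quoted passage can be sketched as follows. This is only an illustration, assuming a 32-bit Windows target and fasm's win32ax.inc headers; the label name patched and the exit-code convention are made up for the example. According to the quoted text the jump is only required on the Intel486, so on later processors it should simply be harmless.

```
format PE console
include 'win32ax.inc'

; The section must be writeable as well as executable,
; otherwise the store below faults.
section '.text' code readable writeable executable

start:
        mov     byte [patched+1],7      ; overwrite the imm8 operand of the MOV below
        jmp     patched                 ; jump right after the write, as the quoted text
                                        ; asks for on the Intel486
patched:
        mov     al,0                    ; assembled as B0 00, patched to B0 07 at run time
        movzx   eax,al
        invoke  ExitProcess,eax         ; exit code 7 = patched version ran, 0 = stale version ran

.end start
```

Unlike the REP STOSB test above, the modified instruction here has not started executing when the write happens, which is the case the manual text actually describes.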
MazeGen 20 Apr 2010, 10:48
revolution wrote:
> Is it enough to check if ecx is zero? Single stepping makes ecx=2, and direct execution makes ecx=0.

I think so.

revolution wrote:
> I guess this is a consequence of the special circuitry for rep stosx speed-ups.

It seems so. I would need more processors to test it.
revolution 20 Apr 2010, 11:29
Of note is that eax is not incremented during direct execution. So at least that instruction is properly flushed.
LocoDelAssembly 20 Apr 2010, 13:04
Same behaviour in Athlon (K7), and Phenom II.
Perhaps the manual explains the complete spec in more detail somewhere else? It wouldn't be the first time that an Intel manual gives an overly general description of something and a contradiction appears later in the manual.

PS: Or perhaps "REP STOSB" is defined as an instruction rather than just a prefixed instruction equivalent to the code below?

Code:
    rep_stosb:
            test    ecx, ecx
            jz      .out
    .stosb:
            stosb
            loop    .stosb
    .out:
revolution 20 Apr 2010, 14:46
LocoDelAssembly wrote:
> Or perhaps "REP STOSB" is defined as an instruction rather than just a prefixed instruction ...
LocoDelAssembly 20 Apr 2010, 17:03
revolution wrote:
> But then interrupts might screw you if you tried to rely upon that idea.
revolution 21 Apr 2010, 08:41
We can see the interrupt in action with this code:

Code:
    MAXIMUM_LENGTH = 1 shl 26

    format pe console
    include 'win32ax.inc'

    .code

    virtual
            inc     eax
            load    instr_inc_eax byte from $$
    end virtual

    virtual
            ret
            load    instr_ret byte from $$
    end virtual

    virtual
            nop
            load    instr_nop byte from $$
    end virtual

    virtual
            nop
            nop
            rep     stosb
            load    instr_rep_stosb dword from $$
    end virtual

    virtual
            nop
            rep     stosw
            load    instr_rep_stosw dword from $$
    end virtual

    virtual
            nop
            nop
            rep     stosd
            load    instr_rep_stosd dword from $$
    end virtual

    proc begin uses ebx
            invoke  GetStdHandle,STD_OUTPUT_HANDLE
            mov     ebx,eax
            stdcall test_lengths,instr_rep_stosb,'STOSB',ebx,0
            stdcall print_string,ebx,<13,10>
            stdcall test_lengths,instr_rep_stosw,'STOSW',ebx,1
            stdcall print_string,ebx,<13,10>
            stdcall test_lengths,instr_rep_stosd,'STOSD',ebx,2
            stdcall print_string,ebx,<13,10>
            invoke  ExitProcess,0
    endp

    proc test_lengths uses ebx,rep_instruction,name,handle,shift
            mov     ebx,4
        .loop:
            mov     ecx,[shift]
            lea     eax,[ebx+4]
            shr     eax,cl
            stdcall make_code_section,eax,[rep_instruction]
            ccall   cprint_formatted_string,[handle],<'%s length = 0x%08x, bytes written before interrupt: 0x%08x',13,10>,[name],ebx,eax
            add     ebx,ebx
            cmp     ebx,MAXIMUM_LENGTH
            jbe     .loop
            ret
    endp

    proc make_code_section uses edi,run_length,rep_instruction
            mov     eax,[rep_instruction]
            mov     edi,rep_stos_test
            stosd
            mov     eax,instr_inc_eax * 0x01010101
            mov     ecx,MAXIMUM_LENGTH shr 2
            rep     stosd
            mov     eax,instr_ret * 0x01010101
            stosd
            mov     edi,rep_stos_test
            mov     ecx,[run_length]
            mov     eax,instr_nop * 0x01010101
            call    rep_stos_test
            sub     eax,instr_nop * 0x01010101
            sub     eax,MAXIMUM_LENGTH
            neg     eax
            ret
    endp

    proc cprint_formatted_string c handle,format,parameters
            stdcall print_formatted_string,[handle],[format],addr parameters
            ret
    endp

    proc print_formatted_string handle,format,parameters
        locals
            string rb 1024
        endl
            invoke  wvsprintf,addr string,[format],[parameters]
            stdcall print_string,[handle],addr string
            ret
    endp

    proc print_string handle,string
        locals
            written dd ?
        endl
            invoke  lstrlen,[string]
            invoke  WriteFile,[handle],[string],eax,addr written,NULL
            ret
    endp

    section 'rep_stos' code readable writeable executable

    rep_stos_test:
            rb MAXIMUM_LENGTH + 1 shl 12

    .end begin

A small section of screen dump:

Code:
    ...
    STOSB length = 0x00100000, bytes written before interrupt: 0x00100000
    STOSB length = 0x00200000, bytes written before interrupt: 0x001bda5c
    STOSB length = 0x00400000, bytes written before interrupt: 0x00400000
    STOSB length = 0x00800000, bytes written before interrupt: 0x004558dc
    ...

If you were to single step every test then all the "bytes written" figures will be zero, meaning that the rep stosx overwrites itself and can never even get one byte written past its own location.

So basically the CPU tries to completely run the rep stosx from its internal state and won't re-read the instruction from memory unless it gets interrupted and has to return to restart.
Copyright © 1999-2024, Tomasz Grysztar.