flat assembler
Message board for the users of flat assembler.
Windows > Interesting Time Problem
asmcoder 19 Jun 2009, 20:09
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 19 Jun 2009, 21:19
asmcoder, there is a mistake in your example: it should be "jmp $-5". As for your question, what would happen is that the last patch applied to the 5 NOPs would be the effective one, which means that only one hook will be active at a time (I told you this earlier). Note that the winner could be using the "jmp $-5" from the loser, since a preemption may occur after patching the NOPs, but because the "jmp $-5" is the same for all the "competitors" it doesn't create any problem.
The only complicated part is the uninstallation of the hook: in that case you would need to perform a CMPXCHG8B to restore the "mov edi, edi" ONLY if your hook is the currently active one.
Not sure if you got the answer already, but there are no problems on SMP systems with installing a hook this way. If you replace "mov edi, edi" with "jmp $-5" while another core is executing the function's entry point, then the processor will execute "mov edi, edi" because it was already in the pipeline. If the change occurs at the same time the instruction is entering the pipeline of any core, then either the mov or the jmp will be executed, but for sure no crash will occur.
PS: It would still be possible to maintain a hook chain if CMPXCHG8B were used for installation, but I don't think it would be very effective, as it relies on the competitors using the very same technique and some extra protocol between them to remove hooks from the chain (or on trusting that no one will attempt to unload itself).
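The conditional uninstall described above can be modelled in portable C. This is only a sketch under stated assumptions: the 8 bytes starting at function-5 are represented as one 64-bit word, C11 `atomic_compare_exchange_strong` stands in for LOCK CMPXCHG8B, and the names `hooked_window`, `restored_window` and `uninstall_if_active` are hypothetical. It does not touch real code pages.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the 8 bytes starting at function-5, packed little-endian:
   bytes 0..4 = the 5-byte slot (E9 rel32 once hooked),
   bytes 5..6 = the 2-byte entry instruction,
   byte  7    = first byte of whatever follows (left untouched). */

/* Window contents while a hook with displacement rel32 is active. */
static uint64_t hooked_window(uint32_t rel32, uint8_t next_byte)
{
    return (uint64_t)0xE9                  /* jmp rel32 opcode */
         | ((uint64_t)rel32 << 8)          /* rel32, little-endian */
         | ((uint64_t)0xEB << 40)          /* jmp rel8 opcode */
         | ((uint64_t)0xF9 << 48)          /* rel8 = -7, i.e. "jmp $-5" */
         | ((uint64_t)next_byte << 56);
}

/* Same window with the entry restored to "mov edi, edi" (8B FF). */
static uint64_t restored_window(uint64_t w)
{
    w &= ~((uint64_t)0xFFFF << 40);
    return w | ((uint64_t)0x8B << 40) | ((uint64_t)0xFF << 48);
}

/* Restore the entry only if the window still holds OUR hook.
   Returns false (and changes nothing) if another hook won the race. */
static bool uninstall_if_active(_Atomic uint64_t *window,
                                uint32_t my_rel32, uint8_t next_byte)
{
    uint64_t expected = hooked_window(my_rel32, next_byte);
    uint64_t desired  = restored_window(expected);
    return atomic_compare_exchange_strong(window, &expected, desired);
}
```

A competitor whose rel32 no longer matches the window gets `false` back and leaves the winner's hook in place, which is the "ONLY if your hook is the currently active one" condition.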
asmcoder 19 Jun 2009, 21:25
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 2 times in total
asmcoder 19 Jun 2009, 21:27
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
arigity 19 Jun 2009, 21:48
asmcoder wrote: from example:
It's an example. If you're looking for something more serious, Microsoft has a detouring library specifically for runtime hooking/detouring.
LocoDelAssembly 19 Jun 2009, 21:59
WAIT, there IS a problem with multiple competitors... The VirtualProtect part may cause some competitors to crash, or even leave the page(s) with write access.
Perhaps it would be better not to restore the protection.
Quote:
OK, but who would patch like that? I don't see any reason for using the CALL instruction; if you want the hook to forward the call after doing something, then just jmp to function_address+2. If you want to use the same hook for multiple functions, then replace the 5 NOPs with "call ADDRESS_OF_HOOKPROC" and forward the call with pop eax/add eax,2/jmp eax in your hook procedure.
As for the pipeline, it depends on the processor, but I think mine grabs 16 bytes at a time. What is certain is that NO processor (starting from the 80386 at least) executes an instruction directly from memory; it is first loaded into internal memory (a queue on older processors, the pipeline since the 80486). What you mention about alignment is true, however: fetching half of the "jmp $-5" could be possible if the instruction is written at an odd address, I think (I don't remember now if the LOCK prefix works well on unaligned writes).
asmcoder 19 Jun 2009, 22:02
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 19 Jun 2009, 22:19
Quote:
Code:
    mov eax, [IAT@function] ; example: mov eax, [GetSystemTime]
    add eax, 2
    jmp eax
No hardcoding there: Windows prepares the import table for you, and you just load the address of the function from it and add two before jumping. What you say about the instruction cache is right too, but since it is a limited resource, sooner or later the patched code will be fetched by the processor.
Quote:
Probably right, but unfortunately there is no call instruction that fits in two bytes.
asmcoder 19 Jun 2009, 22:27
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 19 Jun 2009, 22:51
Quote:
Yes, and actually the fetch occurs from the L1 cache, and in turn the cache lines are filled in a burst read cycle of 16, 32 or 64 bytes (processor dependent; my CPU does the latter). In the case of misalignment perhaps this may help:
Code:
    mov byte [ebx-5], $E9
    mov eax, hook
    sub eax, ebx
    mov [ebx-4], eax
    mov byte [ebx], $84 ; TEST opcode ("mov edi, edi" is transformed into "test BH, BH")
    mov byte [ebx+1], $F9 ; $-5 (also transforms "test BH, BH" into "test CL, BH")
    mov byte [ebx], $EB ; JMP rel8 opcode
    ; JMP $-5 fully assembled at this point.
But there may still be problems in the five NOPs if they were partially cached (if cache snooping/whatever can't solve this problem by itself).
PS: The burst read can't be interrupted in the middle by another processor; it is a different type of access in which the memory starts sending data in a burst starting at a given address (the processor does not signal every address; it does so once and then the memory increments the address on every cycle by itself).
[edit]Code changed to not use double NOP anymore[/edit]
[edit2]Ignore the code, it won't be any better than the two NOPs. Any suggestion about proper "mov edi, edi" patching (in unaligned cases) is welcome[/edit2]
Last edited by LocoDelAssembly on 20 Jun 2009, 02:46; edited 2 times in total
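The displacement arithmetic in the listing above (hook minus ebx for the rel32, $F9 for the rel8) can be checked with a small C model. It is a sketch only: it writes into an ordinary 7-byte buffer standing in for function-5 .. function+1, the helper names are hypothetical, and the hook address in the usage note is made up. It says nothing about whether the write order is safe on SMP, which the post itself retracts.

```c
#include <stdint.h>

/* Write "jmp hook" (E9 rel32) followed by "jmp $-5" (EB F9) into code[0..6].
   code[0] models the byte at func_addr-5; addresses are 32-bit as in the thread. */
static void encode_hook(uint8_t code[7], uint32_t func_addr, uint32_t hook_addr)
{
    /* rel32 is relative to the end of the 5-byte jmp, which is func_addr.
       The subtraction wraps modulo 2^32, like "sub eax, ebx". */
    uint32_t rel32 = hook_addr - func_addr;

    code[0] = 0xE9;                    /* jmp rel32 opcode */
    code[1] = (uint8_t)(rel32);        /* rel32 stored little-endian */
    code[2] = (uint8_t)(rel32 >> 8);
    code[3] = (uint8_t)(rel32 >> 16);
    code[4] = (uint8_t)(rel32 >> 24);
    code[5] = 0xEB;                    /* jmp rel8 opcode */
    code[6] = 0xF9;                    /* rel8 = -7 */
}

/* Target of a 2-byte "jmp rel8" located at addr (rel8 is sign-extended). */
static uint32_t short_jmp_target(uint32_t addr, uint8_t rel8)
{
    uint32_t disp = (uint32_t)rel8 - ((rel8 & 0x80u) ? 0x100u : 0u);
    return addr + 2u + disp;
}

/* Target of a 5-byte "jmp rel32" located at addr. */
static uint32_t near_jmp_target(uint32_t addr, uint32_t rel32)
{
    return addr + 5u + rel32;
}
```

For a function at $7C8017E9 and a hypothetical hook at $00401000, `encode_hook` produces E9 17 F8 BF 83 EB F9; the short jump decodes back to function-5 and the near jump back to the hook.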
asmcoder 19 Jun 2009, 23:10
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 19 Jun 2009, 23:39
Quote:
You don't need XCHG to make the write atomic; a single MOV will do it. What XCHG does is implicitly lock the bus so no other core can write to memory, then copy the memory contents to EAX and the previous EAX contents to memory, and then release the bus. However, if I remember right, unaligned writes are not atomic on some (or all) processors even if LOCK is used. I think I've read that in some old Intel manual.
I have put in the NOPs because they are one byte each, so I can then patch in "JMP $-5" byte by byte without problems. If a processor has already fetched one of the NOPs, then it will execute either the STC or the second NOP; otherwise, it will execute "JMP $-5".
BTW, unfortunately "mov edi, edi" isn't always aligned; on my PC, GetSystemTimeAsFileTime is located at $7C8017E9...
Yet, damn, I'm not solving the problem: a processor may misinterpret "mov edi, edi" when I apply the two-NOPs patch... Well, at least for GetSystemTimeAsFileTime perhaps nothing wrong will happen, because "mov edi, edi" isn't crossing a 16-byte boundary (but if that is the case then my original code was correct already).
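The alignment and boundary remarks above are easy to check numerically. The following C helpers are a sketch with hypothetical names; the block size is a parameter because the thread does not settle which granularity (2, 8, 16 bytes or a cache line) actually matters on a given processor.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the byte range [addr, addr+size) spans more than one block of
   `block` bytes. `block` must be a power of two and size must be >= 1. */
static bool crosses_boundary(uint32_t addr, uint32_t size, uint32_t block)
{
    uint32_t mask = ~(block - 1u);
    return (addr & mask) != ((addr + size - 1u) & mask);
}

/* True if addr is a multiple of size (size must be a power of two). */
static bool is_aligned(uint32_t addr, uint32_t size)
{
    return (addr & (size - 1u)) == 0;
}
```

With the address quoted in the post, `is_aligned(0x7C8017E9, 2)` is false (the two-byte entry starts on an odd address), while `crosses_boundary(0x7C8017E9, 2, 16)` is also false, matching the observation that this particular "mov edi, edi" does not straddle a 16-byte boundary. The 8-byte window starting at function-5 ($7C8017E4) does cross an 8-byte boundary, though not a 16-byte one.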
asmcoder 19 Jun 2009, 23:52
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
asmcoder 20 Jun 2009, 00:01
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 20 Jun 2009, 00:19
Quote:
Performance? I think it is better to execute a single instruction than two.
Quote:
Core 2 will read either the old value or the value just written by core 1, but not a mix and/or complete corruption (if the address was a multiple of four; otherwise I'm not sure). I have edited my code above; I think that "mov edi, edi" patching is now safe. What happens with the five NOPs must still be investigated, however: if the CPUs don't realize they must flush their L1 cache line(s) possibly containing part of the NOPs, then a crash can still occur.
asmcoder 20 Jun 2009, 14:31
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
Borsuc 21 Jun 2009, 00:17
Aren't there hook APIs for this thing?
asmcoder 21 Jun 2009, 00:50
[content deleted]
Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
Borsuc 21 Jun 2009, 02:13
hmm let me investigate and see what JauntePE uses... I tried to google for "madhook library", didn't find anything relevant yet.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.