flat assembler
Message board for the users of flat assembler.

Interesting Time Problem

asmcoder 19 Jun 2009, 20:09
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
arigity 19 Jun 2009, 20:40
Take a look at the fasm hooking example to see how it might look. It will crash if you try to hook something that is already hooked, because it wasn't written with hooking hooks in mind, but it can easily be modified to do so.
LocoDelAssembly 19 Jun 2009, 21:19
asmcoder, there is a mistake in your example: it should be "jmp $-5". As for your question, the last patch applied to the 5 NOPs is the one that takes effect, which means that only one hook will be active at a time (I told you this earlier). Note that the winner could end up using the "jmp $-5" written by the loser, since a preemption may occur after patching the NOPs, but because the "jmp $-5" is the same for all the "competitors", that doesn't create any problem.
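
The two-step install described above (fill the five padding bytes first, then replace "mov edi, edi" with "jmp $-5") can be sketched as a small fasm program. This is only an illustration under stated assumptions: it patches a dummy function inside its own writable code section instead of a real API, runs on a single thread (so it says nothing about the SMP questions in this thread), and the labels target and hook are hypothetical names. It is meant to exit with code 2, the value returned by hook, if the patch takes effect.

```asm
; Demo on a private dummy function; nothing outside this program is patched.
format PE console
entry start
include 'win32a.inc'

; writeable only so that the demo can patch its own code
section '.text' code readable writeable executable

start:
        mov     ebx, target             ; EBX = address of "mov edi, edi"

        ; step 1: build "jmp hook" (E9 rel32) in the five padding bytes
        mov     eax, hook
        sub     eax, ebx                ; rel32 = hook - ((ebx-5) + 5)
        mov     byte [ebx-5], 0E9h
        mov     [ebx-4], eax

        ; step 2: replace "mov edi, edi" with "jmp $-5" (EB F9) in one 16-bit store
        mov     word [ebx], 0F9EBh      ; little-endian: EB at [ebx], F9 at [ebx+1]

        call    target                  ; now reaches hook through the two jumps
        invoke  ExitProcess, eax        ; exit code = value returned in EAX

        align 16
        db 3 dup (0CCh)                 ; filler so that target lands on offset 8
        db 5 dup (90h)                  ; five NOPs of padding
target: db 8Bh, 0FFh                    ; mov edi, edi, spelled as bytes
        mov     eax, 1                  ; original behaviour
        ret

hook:   mov     eax, 2                  ; replacement behaviour
        ret

section '.idata' import data readable
        library kernel32, 'KERNEL32.DLL'
        import  kernel32, ExitProcess, 'ExitProcess'
```

Writing the padding before the entry point matters: until the final 16-bit store, callers still see an intact "mov edi, edi" and never reach the half-built jump.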

The only complicated part is uninstalling the hook. In that case you would need to perform a CMPXCHG8B to restore the "mov edi, edi" ONLY if your hook is the currently active one.
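
A minimal sketch of such a conditional uninstall, again on a dummy function in the program's own writable code section. The assumptions are mine, not taken from the thread: the 8 compared bytes are the five padding bytes, the two entry bytes and one byte of the function body (assumed never to change), and target, hook and not_ours are hypothetical labels. The program installs the hook, removes it with LOCK CMPXCHG8B, and is meant to exit with code 1 (the original function's result), or 255 if the compare fails.

```asm
format PE console
entry start
include 'win32a.inc'

section '.text' code readable writeable executable

start:
        mov     esi, target-5           ; ESI = start of the five padding bytes

        ; install: "jmp hook" in the padding, then "jmp $-5" at the entry
        mov     eax, hook
        sub     eax, target             ; rel32 of the E9 jump
        mov     byte [esi], 0E9h
        mov     [esi+1], eax
        mov     word [esi+5], 0F9EBh

        ; EDX:EAX = the 8 bytes expected at [esi] while OUR hook is active
        mov     edx, eax                ; EAX still holds rel32
        shl     eax, 8
        mov     al, 0E9h                ; EAX = E9, rel32 bytes 0..2
        shr     edx, 24                 ; DL  = rel32 byte 3
        movzx   ecx, byte [esi+7]       ; 8th byte belongs to the function body
        shl     ecx, 24
        or      edx, ecx
        or      edx, 00F9EB00h          ; EDX = rel32 byte 3, EB, F9, body byte

        ; ECX:EBX = same bytes, but with "mov edi, edi" (8B FF) back at the entry
        mov     ebx, eax
        mov     ecx, edx
        and     ecx, 0FF0000FFh
        or      ecx, 00FF8B00h

        lock cmpxchg8b qword [esi]      ; stores only if memory equals EDX:EAX
        jnz     not_ours                ; ZF=0: a different patch is active

        call    target                  ; entry restored, original body runs
        invoke  ExitProcess, eax
not_ours:
        invoke  ExitProcess, 0FFh

        align 16
        db 3 dup (0CCh)                 ; filler so that target lands on offset 8
        db 5 dup (90h)                  ; five NOPs of padding
target: db 8Bh, 0FFh                    ; mov edi, edi
        mov     eax, 1
        ret

hook:   mov     eax, 2
        ret

section '.idata' import data readable
        library kernel32, 'KERNEL32.DLL'
        import  kernel32, ExitProcess, 'ExitProcess'
```

The stale "jmp hook" is deliberately left in the padding: once the entry point is "mov edi, edi" again, nothing jumps there. Whether other cores fetch the restored bytes coherently is a separate question that this sketch does not address.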

Not sure if you got the answer already, but there are no problems on an SMP system when installing a hook this way. If you replace "mov edi, edi" with "jmp $-5" while another core is executing the function's entry point, then that processor will execute "mov edi, edi" because it was already in the pipeline. If the change occurs at the same time the instruction is entering the pipeline of some core, then either the mov or the jmp will be executed, but for sure no crash will occur.

PS: It would still be possible to maintain a hook chain if CMPXCHG8B were used for installation, but I don't think it would be very effective, as it relies on the competitors using the very same technique plus some extra protocol between them to remove hooks from the chain (or on trusting that no one will attempt to unload itself).
asmcoder 19 Jun 2009, 21:25
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 2 times in total
asmcoder 19 Jun 2009, 21:27
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
arigity 19 Jun 2009, 21:48
asmcoder wrote:
From the example:
Code:
    ; here we write the actual detour (esi = address being patched, edi = detour target)
    mov   byte [esi], $E9       ; opcode of JMP rel32
    mov   edx, edi
    sub   edx, esi
    sub   edx, $5               ; rel32 = target - (patch address + 5)
    mov   dword [esi+1], edx    ; store the displacement after the opcode

Yes, I know, very lame. I was playing like this some time ago, but now it's time for serious coding. This is a bad way of writing a hook.

So at least I'm right: you can't hook in memory with active threads.


It's an example. If you're looking for something more serious, Microsoft has a detouring library (Detours) specifically for runtime hooking/detouring.
LocoDelAssembly 19 Jun 2009, 21:59
WAIT, there IS a problem with multiple competitors... The VirtualProtect part may cause some competitors to crash, or even leave the page(s) with write access.

Perhaps it would be better not to restore the protection.

Quote:

mov edi,edi -> call $-5 (xchg can't handle 5 bytes, but it can handle 2)
5 previous nops -> jmp ADDRESS_OF_HOOKPROC
and just ret from hookproc to the normal dll function


OK, but who would patch like that? I don't see any reason to use the CALL instruction. If you want the hook to forward the call after doing something, then just jmp to function_address+2. If you want to use the same hook for multiple functions, then replace the 5 NOPs with "call ADDRESS_OF_HOOKPROC" and forward the call with pop eax / add eax, 2 / jmp eax in your hook procedure.
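
The shared-hook variant can be sketched like this, once more on dummy functions in the program's own writable code section. The names install, hookproc, func_a, func_b and calls are hypothetical. The padding of each function receives "call hookproc", so the return address pushed by that CALL is the function's entry point, and hookproc forwards with pop eax / add eax, 2 / jmp eax. The program is meant to exit with code 32 (10 + 20 from the two functions, plus 2 counted hook invocations).

```asm
format PE console
entry start
include 'win32a.inc'

section '.text' code readable writeable executable

start:
        mov     ebx, func_a
        call    install
        mov     ebx, func_b
        call    install

        call    func_a                  ; hookproc runs first, then func_a's body
        mov     esi, eax
        call    func_b
        add     esi, eax
        add     esi, [calls]            ; number of times hookproc was entered
        invoke  ExitProcess, esi

; EBX = address of a hot-patchable entry point
install:
        mov     eax, hookproc
        sub     eax, ebx                ; rel32 = hookproc - ((ebx-5) + 5)
        mov     byte [ebx-5], 0E8h      ; CALL rel32 in the five padding bytes
        mov     [ebx-4], eax
        mov     word [ebx], 0F9EBh      ; jmp $-5 over "mov edi, edi"
        ret

hookproc:
        inc     dword [calls]           ; the "do something" part of the hook
        pop     eax                     ; pushed by the CALL = entry point address
        add     eax, 2                  ; skip the two patched bytes
        jmp     eax                     ; forward; the caller's return address is on top again

        align 16
        db 3 dup (0CCh)
        db 5 dup (90h)                  ; padding for func_a
func_a: db 8Bh, 0FFh                    ; mov edi, edi
        mov     eax, 10
        ret

        align 16
        db 3 dup (0CCh)
        db 5 dup (90h)                  ; padding for func_b
func_b: db 8Bh, 0FFh                    ; mov edi, edi
        mov     eax, 20
        ret

section '.data' data readable writeable
calls   dd 0

section '.idata' import data readable
        library kernel32, 'KERNEL32.DLL'
        import  kernel32, ExitProcess, 'ExitProcess'
```

Note that hookproc clobbers EAX, which is harmless for stdcall functions because EAX is a scratch register on entry.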

As for the pipeline, it depends on the processor, but I think mine fetches 16 bytes at a time. What is certain is that NO processor (starting from the 80386 at least) executes an instruction directly from memory; it is first loaded into internal memory (a prefetch queue on older processors, the pipeline since the 80486). What you mention about alignment is true, however: fetching half of the "jmp $-5" could be possible if the instruction is written at an odd address, I think (I don't remember now whether the LOCK prefix works well on unaligned writes).
asmcoder 19 Jun 2009, 22:02
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 19 Jun 2009, 22:19
Quote:

With jump you have to hardcode the return address.

Code:
mov eax, [IAT@function] ; example: mov eax, [GetSystemTime]
add eax, 2
jmp eax    

No hardcoding there: Windows prepares the import table for you, so you just load the address of the function from it and add two before jumping.

You are right about the instruction cache too, but since it is a limited resource, sooner or later the patched code will be fetched by the processor.

Quote:

With call you just return with ret, and a call in '2 bytes' is better because you don't have to add 2 bytes afterwards. It won't cause any problems; it works the same as jmp, I believe.

Probably right, but unfortunately there is no relative call instruction that fits in two bytes (CALL has no rel8 form).
asmcoder 19 Jun 2009, 22:27
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 19 Jun 2009, 22:51
Quote:

OK, I still need to know: can an unaligned memory access be atomic, and are you 100% sure the CPU fetches 16 bytes 'at once', without allowing anything else to access RAM?

Yes, and actually the fetch occurs from the L1 cache, and in turn the cache lines are filled in a burst read cycle of 16, 32 or 64 bytes (processor dependent; my CPU does the latter).

In the case of misalignment, perhaps this may help:
Code:
      mov     byte [ebx-5], $E9
      mov     eax, hook
      sub     eax, ebx
      mov     [ebx-4], eax

      mov     byte [ebx], $84 ; TEST opcode ("mov edi, edi" is transformed into "test BH, BH")

      mov     byte [ebx+1], $F9 ; displacement for $-5 (also transforms "test BH, BH" into "test CL, BH")
      mov     byte [ebx], $EB   ; JMP rel8 opcode
      ; JMP $-5 Fully assembled at this point.    


But there may still be problems with the five NOPs if they were partially cached (if cache snooping or whatever can't solve this problem by itself).

PS: The burst read can't be interrupted in the middle by another processor. It is a different type of access in which the memory starts sending data in a burst from a given address (the processor does not signal every address; it does so once and then the memory increments the address itself on every cycle).

[edit]Code changed to not use double NOP anymore[/edit]

[edit2]Ignore the code, it won't be any better than the two NOPs. Any suggestion about proper "mov edi, edi" patching (in unaligned cases) is welcome[/edit2]


Last edited by LocoDelAssembly on 20 Jun 2009, 02:46; edited 2 times in total
asmcoder 19 Jun 2009, 23:10
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 19 Jun 2009, 23:39
Quote:

mov edi,edi is always aligned properly, so if I use xchg eax,[mov_edi_edi_address] will it be safe?

You don't need XCHG to make the write atomic; a single MOV will do. What XCHG does is implicitly lock the bus so no other core can write to memory, then copy the memory contents to EAX and the previous EAX contents to memory, and then release the bus. However, if I remember right, unaligned writes are not atomic on some (or all) processors even if LOCK is used. I think I read that in some old Intel manual.

I used the NOPs because they are one byte each, so I can then patch in "JMP $-5" byte by byte without problems. If a processor has already fetched one of the NOPs, then it will execute either the STC or the second NOP; otherwise, it will execute "JMP $-5".

BTW, unfortunately "mov edi, edi" isn't always aligned; on my PC, GetSystemTimeAsFileTime is located at $7C8017E9...

Yet, damn, I'm not solving the problem: a processor may misinterpret "mov edi, edi" when I apply the two-NOPs patch... Well, at least for GetSystemTimeAsFileTime perhaps nothing will go wrong, because "mov edi, edi" isn't crossing a 16-byte boundary (but if that is the case then my original code was already correct).
asmcoder 19 Jun 2009, 23:52
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
asmcoder 20 Jun 2009, 00:01
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
LocoDelAssembly 20 Jun 2009, 00:19
Quote:

So why didn't MS put 2 NOPs there instead?

Performance? I think it is better to execute a single instruction than two.

Quote:

core 1: mov dword [address],eax
core 2: mov eax,dword [address]

Core 1 won't have finished its mov, and core 2 will read partly old and partly new data.


Core 2 will read either the old value or the value just written by core 1, but not a mix and/or complete corruption (if the address is a multiple of four; otherwise I'm not sure).

I have edited my code above; I think that "mov edi, edi" patching is now safe. What happens with the five NOPs must still be investigated, however: if the CPUs don't realize they must flush the L1 cache line(s) possibly containing part of the NOPs, then a crash can still occur.
asmcoder 20 Jun 2009, 14:31
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
Borsuc 21 Jun 2009, 00:17
Aren't there hook APIs for this thing?
asmcoder 21 Jun 2009, 00:50
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:50; edited 1 time in total
Borsuc 21 Jun 2009, 02:13
Hmm, let me investigate and see what JauntePE uses... I tried to google "madhook library" but haven't found anything relevant yet.

Copyright © 1999-2025, Tomasz Grysztar.