flat assembler
Message board for the users of flat assembler.
Fastest way to read ticks without function
windwakr 11 Oct 2008, 15:05
Does anyone know how often InterruptTime is updated? I read somewhere it's every 100 ns, is that right? Is it an accurate way to do timing? 100 ns is one ten-millionth of a second, right?
revolution 11 Oct 2008, 15:16
FileTime has a theoretical granularity of 100 ns, but I doubt that Windows has access to any timer with that accuracy. There is the high performance timer on the motherboard, but it does not interrupt at a rate of 10 MHz. Your best bet for any high accuracy timing is the QueryPerformanceCounter/QueryPerformanceFrequency APIs.
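To make the arithmetic concrete, here is a minimal sketch of how a tick delta from a counter such as QueryPerformanceCounter could be turned into nanoseconds, given the rate reported by QueryPerformanceFrequency. The helper name ticks_to_ns is hypothetical (it is not a Windows API), and the code is plain portable C that only does the conversion; it does not call the Windows functions itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of any Windows API): convert a tick delta
 * to nanoseconds, given the counter frequency in ticks per second.
 * Splitting into whole seconds and a remainder avoids overflowing the
 * 64-bit product for large tick counts. The remainder product still
 * assumes freq is below roughly 1.8e10 Hz.
 */
uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq)
{
    assert(freq != 0);                      /* a zero frequency is meaningless */
    uint64_t whole = ticks / freq;          /* complete seconds */
    uint64_t rem   = ticks % freq;          /* leftover ticks, always < freq */
    return whole * 1000000000ULL + (rem * 1000000000ULL) / freq;
}
```

For example, one tick of a 10 MHz clock gives ticks_to_ns(1, 10000000) == 100, which is the 100 ns granularity discussed above.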
LocoDelAssembly 11 Oct 2008, 15:16
Code:
typedef struct _Device{
    long val;
    long dev_bus_reads;
} Device;

volatile Device mem_mapped_device;

void read(Device *pDevice){
    pDevice->val = mem_mapped_device.val;
    pDevice->dev_bus_reads = mem_mapped_device.dev_bus_reads;
}

Now, the question: is the compiler supposed to be "smart enough" to merge those two assignments into a single "long long int" (64-bit) assignment?
revolution 11 Oct 2008, 15:20
LocoDelAssembly wrote:
LocoDelAssembly 11 Oct 2008, 15:39
Quote:
I wrote it. It represents "the compiler was told to do it in two reads with the keyword volatile and can't disobey that". Here is the Visual C++ 2005 disassembly view of a fragment of code using that:

Code:
typedef struct _Device{
    long val;
    long bus_reads;
} Device;

volatile Device mem_mapped_device;

void read(Device *pDevice){
    pDevice->val = mem_mapped_device.val;
    pDevice->bus_reads = mem_mapped_device.bus_reads;
}

int main(){
    Device dev;
    _asm{int 3}
00401000  int         3
    read(&dev);
00401001  mov         eax,dword ptr [mem_mapped_device (403374h)]
00401006  mov         ecx,dword ptr [mem_mapped_device+4 (403378h)]
    GetTickCount(); //Just to convince the compiler that registers must be preserved
0040100C  call        dword ptr [__imp__GetTickCount@0 (402000h)]
    return 0;
00401012  xor         eax,eax
}

As you can see, it only read the values and did not store them on the stack, because they weren't used later. Removing volatile:

Code:
typedef struct _Device{
    long val;
    long bus_reads;
} Device;

Device mem_mapped_device;

void read(Device *pDevice){
    pDevice->val = mem_mapped_device.val;
    pDevice->bus_reads = mem_mapped_device.bus_reads;
}

int main(){
    Device dev;
    _asm{int 3}
00401000  int         3
    read(&dev);
    GetTickCount(); //Just to convince the compiler that registers must be preserved
00401001  call        dword ptr [__imp__GetTickCount@0 (402000h)]
    return 0;
00401007  xor         eax,eax
}

Now it also realized that the copy is not worth doing at all. And what is my point? It is the HLL programmer who is not smart enough, not the compiler, which was told that doing two separate reads is different from doing only one (my pseudo-example shows that: doing two reads and doing a single read would give different values for Device.bus_reads).
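A small simulation may help show why the two-read and one-read versions are not interchangeable. This is only a sketch under a stated assumption: the device is imagined to bump bus_reads on every bus access, which is one reading of the pseudo-example. The names SimDevice, Snapshot, read_two_accesses and read_one_access are hypothetical, and ordinary memory stands in for the memory-mapped hardware.

```c
/* Hypothetical simulated device: every bus access increments bus_reads.
   Real memory-mapped hardware would do this itself; here it is modelled
   by hand in ordinary memory. */
typedef struct {
    long val;
    long bus_reads;
} SimDevice;

typedef struct {
    long val;
    long bus_reads;
} Snapshot;

/* Two separate 32-bit accesses, which is what volatile obliges the
   compiler to emit. */
Snapshot read_two_accesses(SimDevice *dev)
{
    Snapshot s;
    s.val = dev->val;               /* first bus access */
    dev->bus_reads++;               /* device counts it */
    s.bus_reads = dev->bus_reads;   /* second access sees the updated count */
    dev->bus_reads++;               /* device counts the second access too */
    return s;
}

/* One combined 64-bit access: both fields come from the same bus read. */
Snapshot read_one_access(SimDevice *dev)
{
    Snapshot s;
    s.val = dev->val;
    s.bus_reads = dev->bus_reads;   /* count as it was before this access */
    dev->bus_reads++;               /* device counts a single access */
    return s;
}
```

Starting from bus_reads == 0, the two-access version reports 1 and the one-access version reports 0, so merging the reads would change what the program observes.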
revolution 11 Oct 2008, 16:09
If the compiler is treated as a robot then we expect such behaviour. Is it wrong to want the compiler to be smarter and just know what we want? Even when we tell it we want something else?
windwakr 11 Oct 2008, 16:33
I have been trying to track down the code in the DLLs for QueryPerformanceCounter/QueryPerformanceFrequency. I have found that those just call NtQueryPerformanceCounter, and that makes a system call to A5, but how do I find the code for that?
revolution 11 Oct 2008, 17:10
windwakr wrote: ...a systemcall to A5, but how do I find the code to that?
But a simpler solution, if you are looking for code to read the mobo counter, is to go straight to the hardware of the standard PC mobos. The Linux source code will almost certainly have a driver that reads the counter/timer. I would give the I/O ports directly, but I have long since forgotten them.
windwakr 11 Oct 2008, 23:15
EDIT: I updated the code; it now sets itself to high priority when it starts, which gives about 20,000 more unique numbers a second for me. I didn't see any difference from high when I tried realtime.
Is this a reasonably good way of figuring out the resolution of QueryPerformanceCounter? With this code I 'think' I have around 1.2 microsecond resolution, but I don't know how good the code is. Please forgive the badly written code; I hope someone can understand what I'm doing...

Code:
include 'win32ax.inc'

.data
  n1    rd 2    ; current counter value (64-bit; only the low dword is used below)
  n2    dd ?    ; frequency, low dword (the API writes 64 bits, so the high dword spills into n3, which is set afterwards)
  n3    dd ?    ; low dword of start value plus one second of ticks
  n4    dd ?    ; previous counter value, low dword
  n5    dd 0    ; number of distinct values seen
  title db '...',0
  fmt   db 'Last time measured: %u',13,10,'Frequency: %u',13,10,'Start number plus one second: %u',13,10,'"Unique" values encountered: %u',0
  buf   rb 512

.code
start:
  invoke GetCurrentProcess
  invoke SetPriorityClass,eax,HIGH_PRIORITY_CLASS
  invoke QueryPerformanceCounter,n1
  invoke QueryPerformanceFrequency,n2
  mov eax,[n1]
  add eax,[n2]          ; end value = start + one second of ticks
  mov [n3],eax
looper:
  mov eax,[n1]
  mov [n4],eax          ; remember the previous reading
  invoke QueryPerformanceCounter,n1
  mov eax,[n1]
  cmp eax,[n4]
  je looper             ; same value as before: read again
  inc [n5]              ; a new value: count it
  cmp eax,[n3]
  jb looper             ; keep going until one second has passed
  cinvoke wsprintf,buf,fmt,[n1],[n2],[n3],[n5]
  invoke MessageBox,NULL,buf,title,MB_OK
  invoke ExitProcess,0
.end start

Just divide one by the 'Unique' number to get the resolution in seconds.
Last edited by windwakr on 12 Oct 2008, 03:27; edited 3 times in total
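For comparison, the counting loop above could be sketched in portable C roughly as follows. This is not a translation of the FASM program: it uses full 64-bit values and an elapsed-ticks comparison so that counter wraparound does not end the loop early, and it takes the counter as a callback. The names tick_reader, FakeCounter, fake_read, count_unique_ticks and resolution_seconds are all hypothetical; fake_read is a deterministic stand-in for QueryPerformanceCounter so the logic can be checked without Windows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type: returns the current counter value. */
typedef uint64_t (*tick_reader)(void *ctx);

/* Hypothetical stand-in for a hardware counter: the value advances by
   `step` once every `repeat` reads. */
typedef struct {
    uint64_t calls;
    uint64_t step;
    uint64_t repeat;
} FakeCounter;

uint64_t fake_read(void *ctx)
{
    FakeCounter *c = (FakeCounter *)ctx;
    uint64_t n = c->calls++;
    return (n / c->repeat) * c->step;
}

/* Count the distinct values observed while the counter advances by
   `span` ticks (pass the counter frequency to measure one second).
   Never returns if the counter never advances. */
uint64_t count_unique_ticks(tick_reader read, void *ctx, uint64_t span)
{
    uint64_t start = read(ctx);
    uint64_t prev = start;
    uint64_t unique = 0;
    for (;;) {
        uint64_t now = read(ctx);
        if (now == prev)
            continue;               /* same value as before: read again */
        prev = now;
        unique++;                   /* a new value: count it */
        if (now - start >= span)    /* elapsed ticks, safe across wraparound */
            break;
    }
    return unique;
}

/* Observed resolution in seconds, from the distinct values seen in one second. */
double resolution_seconds(uint64_t unique_per_second)
{
    assert(unique_per_second != 0);
    return 1.0 / (double)unique_per_second;
}
```

On Windows the callback would wrap QueryPerformanceCounter and span would be the value from QueryPerformanceFrequency. The result is an upper bound on the step size, since a preempted loop misses values.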
revolution 12 Oct 2008, 02:44
You forgot to initialise n5.
But don't forget that the OS is multitasking. You may miss some values because the OS is doing other tasks. You might want to try raising the priority to realtime and see if you get more unique values.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.