flat assembler
Message board for the users of flat assembler.

Fastest way to read ticks without function



revolution 11 Oct 2008, 08:16
hopcode: Thanks for the excellent posting.

However I would hesitate to say the PIB and TIB are always at the fixed addresses. Since they are not documented there is no guarantee as to where they will be situated. What about Windows 7? Or embedded NT? Embedded XP? Vista? XP64? Vista64? There is much variation in OSes and future OSes might just break your code (and usually at the most inconvenient time). Is that worth the minuscule theoretical saving of a few clock ticks?



windwakr 11 Oct 2008, 15:05
Does anyone know how often InterruptTime is updated? I read somewhere that it's every 100 ns; is that right? Is it an accurate way to do timing? 100 ns is one ten-millionth of a second, right?


revolution 11 Oct 2008, 15:16
FileTime has a theoretical granularity of 100 ns, but I doubt that Windows has access to any timer with that accuracy. There is the high performance timer on the motherboard, but it does not interrupt at a rate of 10 MHz. Your best bet for any high accuracy timing is the QueryPerformanceCounter/QueryPerformanceFrequency APIs.
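
For reference, the arithmetic behind those units can be sketched in C. This is only a minimal sketch, not Windows code: the helper names are hypothetical, and it assumes the counter delta and the frequency have already been obtained (for example from QueryPerformanceCounter and QueryPerformanceFrequency) as unsigned 64-bit values, with a frequency below roughly 1.8e10 Hz so the intermediate product cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

/* FILETIME-style units are 100 ns each, so there are 10,000,000 per second. */
#define FILETIME_TICKS_PER_SECOND 10000000ULL

/* Hypothetical helper: convert a count of 100 ns units to microseconds. */
uint64_t filetime_ticks_to_microseconds(uint64_t ticks)
{
    return ticks / 10;  /* 10 units of 100 ns make 1 microsecond */
}

/* Hypothetical helper: convert a performance-counter delta to nanoseconds.
 * 'frequency' is the counter rate in ticks per second. Whole seconds and
 * the remainder are converted separately to keep the product in range. */
uint64_t counter_delta_to_nanoseconds(uint64_t delta, uint64_t frequency)
{
    assert(frequency > 0);
    uint64_t whole_seconds = delta / frequency;
    uint64_t remainder     = delta % frequency;
    return whole_seconds * 1000000000ULL
         + (remainder * 1000000000ULL) / frequency;
}
```

With a frequency of 10,000,000 a single tick converts to 100 ns, which is the "ten-millionth of a second" asked about above. The stated unit says nothing about how often the underlying value is actually updated.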


LocoDelAssembly 11 Oct 2008, 15:16
Code:
typedef struct _Device{
  long val;
  long dev_bus_reads;
} Device;

volatile Device mem_mapped_device;

void read(Device *pDevice){
  pDevice->val = mem_mapped_device.val;
  pDevice->dev_bus_reads = mem_mapped_device.dev_bus_reads;
}    


Now the question: is the compiler supposed to be "smart enough" to merge those two assignments into one "long long int" (64-bit) assignment?


revolution 11 Oct 2008, 15:20
LocoDelAssembly wrote:
Now the question: is the compiler supposed to be "smart enough" to merge those two assignments into one "long long int" (64-bit) assignment?
Where is the code from? What does the code represent?


LocoDelAssembly 11 Oct 2008, 15:39
Quote:

Where is the code from? What does the code represent?

I wrote it. It represents "the compiler was told to do it in two reads with the keyword volatile and can't disobey that".

Visual C++ 2005 disassembly view of a fragment of code using that:
Code:
typedef struct _Device{
  long val;
  long bus_reads;
} Device;

volatile Device mem_mapped_device;

void read(Device *pDevice){
  pDevice->val = mem_mapped_device.val;
  pDevice->bus_reads = mem_mapped_device.bus_reads;
}

int main(){
  Device dev;
  _asm{int 3}
00401000  int         3
  read(&dev);
00401001  mov         eax,dword ptr [mem_mapped_device (403374h)]
00401006  mov         ecx,dword ptr [mem_mapped_device+4 (403378h)]
  GetTickCount(); //Just to convince the compiler that registers must be preserved
0040100C  call        dword ptr [__imp__GetTickCount@0 (402000h)]
  return 0;
00401012  xor         eax,eax
}


As you can see, it only read the values but did not store them on the stack because they were not used later.

Removing volatile:
Code:
typedef struct _Device{
  long val;
  long bus_reads;
} Device;

Device mem_mapped_device;

void read(Device *pDevice){
  pDevice->val = mem_mapped_device.val;
  pDevice->bus_reads = mem_mapped_device.bus_reads;
}

int main(){
  Device dev;
  _asm{int 3}
00401000  int         3
  read(&dev);
  GetTickCount(); //Just to convince the compiler that registers must be preserved
00401001  call        dword ptr [__imp__GetTickCount@0 (402000h)]
  return 0;
00401007  xor         eax,eax
}


Now it also realized that the copy is not worth doing at all.

And what is my point? It is the HLL programmer who is not smart enough, not the compiler, which was told that doing two separate reads is different from doing only one (my pseudo-example shows that: doing two reads and doing a single read would give different values for Device.bus_reads).
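
To make that last parenthesis concrete, here is a small portable simulation in C. It is only a sketch under stated assumptions: the "device" is an ordinary struct, the rule that every bus access increments bus_reads is made up for illustration, and all function names are hypothetical. It shows why a split read and a merged read are observably different, which is exactly what volatile forbids the compiler from ignoring.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int32_t val;
    int32_t bus_reads;   /* number of bus accesses the device has seen */
} Device;

/* Simulated device state (an assumption, not real hardware). */
static Device sim_state = { 42, 0 };

void sim_reset(void)
{
    sim_state.val = 42;
    sim_state.bus_reads = 0;
}

/* One 32-bit access to the 'val' register. */
static int32_t bus_read_val(void)
{
    int32_t v = sim_state.val;
    sim_state.bus_reads++;          /* the access itself is counted */
    return v;
}

/* One 32-bit access to the 'bus_reads' register. */
static int32_t bus_read_count(void)
{
    int32_t c = sim_state.bus_reads;
    sim_state.bus_reads++;
    return c;
}

/* One 64-bit access that fetches both fields at once. */
static Device bus_read_both(void)
{
    Device d = sim_state;
    sim_state.bus_reads++;
    return d;
}

/* What the volatile version asks for: two separate accesses. */
Device read_split(void)
{
    Device d;
    d.val = bus_read_val();
    d.bus_reads = bus_read_count(); /* already sees the first access */
    return d;
}

/* What a "smart" merge into one 64-bit read would do instead. */
Device read_merged(void)
{
    return bus_read_both();
}
```

Starting from a reset device, read_split() returns bus_reads equal to 1 while read_merged() returns 0, so the two strategies are not interchangeable for such a device.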


revolution 11 Oct 2008, 16:09
If the compiler is treated as a robot then we expect such behaviour. Is it wrong to want the compiler to be smarter and just know what we want? Even when we tell it we want something else? ;)



windwakr 11 Oct 2008, 16:33
I have been trying to track down the code in the DLLs for QueryPerformanceCounter/QueryPerformanceFrequency. I have found that those just call NtQueryPerformanceCounter, and that makes a system call to A5, but how do I find the code for that?



revolution 11 Oct 2008, 17:10
windwakr wrote:
...a systemcall to A5, but how do I find the code to that?
You would need a kernel mode debugger.

But a simpler solution, if you are looking for code to read the mobo counter, is to go to the hardware for the standard PC mobos. Linux source code will almost certainly have a driver to read the counter/timer. I would give the I/O ports directly but I have long since forgotten them now.



windwakr 11 Oct 2008, 23:15
EDIT: I updated the code. It now sets itself to high priority when it starts, which gives about 20,000 more unique numbers a second for me. I didn't see any difference from high when I tried realtime.

Is this a somewhat good way of figuring out resolution of QueryPerformanceCounter? With this code I 'think' I have around 1.2 microsecond resolution with it, but I don't know how good the code is.

Please forgive me for badly written code, I hope someone can understand what I'm doing...
Code:
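include 'win32ax.inc'

.data
n1 rd 2        ; current counter value (64-bit LARGE_INTEGER)
n2 rd 2        ; counter frequency (the API writes a 64-bit LARGE_INTEGER)
n3 dd ?        ; low dword of start value plus one second of ticks
n4 dd ?        ; low dword of the previous reading
n5 dd 0        ; number of distinct readings seen
title db '...',0
fmt db 'Last time measured: %u',13,10,'Frequency: %u',13,10,'Start number plus one second: %u',13,10,'"Unique" values encountered: %u',0
buf rb 512

.code
start:
; raise the priority so the loop is preempted less often
invoke GetCurrentProcess
invoke SetPriorityClass,eax,HIGH_PRIORITY_CLASS

invoke QueryPerformanceCounter,n1
invoke QueryPerformanceFrequency,n2
; only the low dwords are used below, so this assumes the frequency fits in
; 32 bits and the low dword of the counter does not wrap during the second
mov eax,[n1]
add eax,[n2]
mov [n3],eax

looper:
mov eax,[n1]
mov [n4],eax
invoke QueryPerformanceCounter,n1
mov eax,[n1]
cmp eax,[n4]
je looper        ; same value as last time, read again
inc [n5]         ; new value, count it
cmp eax,[n3]
jb looper        ; keep going until one second of ticks has passed

cinvoke wsprintf,buf,fmt,[n1],[n2],[n3],[n5]
invoke MessageBox,NULL,buf,title,MB_OK
invoke ExitProcess,0
.end start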
    


Just divide one by the "unique" count to get the resolution in seconds.
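
The same measurement can be sketched in portable C with full 64-bit comparisons. This is a rough sketch, not a replacement for the program above: the tick source is passed in as a function pointer so any counter can be plugged in, all names are hypothetical, and fake_counter is only a stand-in that advances by 5 ticks on every third call so the logic can be checked without real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Any function that returns the current counter value. */
typedef uint64_t (*tick_source)(void);

/* Count how many distinct counter values are observed while 'frequency'
 * ticks (one second's worth) elapse. Elapsed time is computed as
 * now - start, so unsigned wraparound of the counter is harmless. */
uint64_t count_distinct_ticks(tick_source read_counter, uint64_t frequency)
{
    assert(read_counter != 0 && frequency > 0);
    uint64_t start = read_counter();
    uint64_t last = start;
    uint64_t distinct = 0;
    for (;;) {
        uint64_t now = read_counter();
        if (now == last)
            continue;               /* counter has not advanced yet */
        last = now;
        distinct++;                 /* a new value was observed */
        if (now - start >= frequency)
            break;                  /* one second of ticks has passed */
    }
    return distinct;
}

/* One divided by the distinct count: the estimate described above.
 * Missed values (preemption) make this an upper bound, not an exact figure. */
double estimated_resolution_seconds(uint64_t distinct)
{
    return distinct ? 1.0 / (double)distinct : 0.0;
}

/* Stand-in tick source for experiments: +5 ticks on every third call. */
uint64_t fake_counter(void)
{
    static uint64_t calls = 0, value = 0;
    if (++calls % 3 == 0)
        value += 5;
    return value;
}
```

With the stand-in source and a frequency of 1000, the loop observes 200 distinct values, one per 5-tick step. On a real system the count also depends on how often the loop gets to run, so the result is only an upper bound on the true granularity.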



Last edited by windwakr on 12 Oct 2008, 03:27; edited 3 times in total


revolution 12 Oct 2008, 02:44
You forgot to initialise n5.

But don't forget that the OS is multitasking. You may miss some values because the OS is doing other tasks. You might want to try raising the priority to realtime and see if you get more unique values.


Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
