flat assembler
Message board for the users of flat assembler.

Edit:1:Fastest way to read ticks without function

hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 10 Oct 2008, 17:19
Hello all,
This is one of the most interesting structures I have ever seen.
At the moment I have no time to play deeply with it (I am away from
my Fasm lab for a few days).
But I can say it is connected with a LOT, LOT, LOT of features of Windows NT+.
It resides permanently at 0x7ffe0000 in every version of Windows NT+, mapped into EVERY process.

AFAIK, it is a read-only block on XP SP2.
Here is the structure:


KUSER_SHARED_DATA

0x000 TickCountLow : Uint4B
0x004 TickCountMultiplier : Uint4B
0x008 InterruptTime : _KSYSTEM_TIME
0x014 SystemTime : _KSYSTEM_TIME
0x020 TimeZoneBias : _KSYSTEM_TIME
0x02c ImageNumberLow : Uint2B
0x02e ImageNumberHigh : Uint2B
0x030 NtSystemRoot : [260] Uint2B
0x238 MaxStackTraceDepth : Uint4B
0x23c CryptoExponent : Uint4B
0x240 TimeZoneId : Uint4B
0x244 Reserved2 : [8] Uint4B
0x264 NtProductType : _NT_PRODUCT_TYPE
0x268 ProductTypeIsValid : UChar
0x26c NtMajorVersion : Uint4B
0x270 NtMinorVersion : Uint4B
0x274 ProcessorFeatures : [64] UChar
0x2b4 Reserved1 : Uint4B
0x2b8 Reserved3 : Uint4B
0x2bc TimeSlip : Uint4B
0x2c0 AlternativeArchitecture : _ALTERNATIVE_ARCHITECTURE_TYPE
0x2c8 SystemExpirationDate : _LARGE_INTEGER
0x2d0 SuiteMask : Uint4B
0x2d4 KdDebuggerEnabled : UChar
0x2d5 NXSupportPolicy : UChar
0x2d8 ActiveConsoleId : Uint4B
0x2dc DismountCount : Uint4B
0x2e0 ComPlusPackage : Uint4B
0x2e4 LastSystemRITEventTickCount : Uint4B
0x2e8 NumberOfPhysicalPages : Uint4B
0x2ec SafeBootMode : UChar
0x2f0 TraceLogging : Uint4B
0x2f8 TestRetInstruction : Uint8B
0x300 SystemCall : Uint4B
0x304 SystemCallReturn : Uint4B
0x308 SystemCallPad : [3] Uint8B
; union----------------------------
0x320 TickCount : _KSYSTEM_TIME
0x320 TickCountQuad : Uint8B
; ---------------------------------
0x330 Cookie : Uint4B

The fastest way to get the tick count is to read,
for example, the first two fields of this structure.
The structure is updated constantly.
So, with a macro:
Code:

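macro readtick
 {
  mov edx,7FFE0000h   ; base of KUSER_SHARED_DATA
  mov eax,dword [edx] ; TickCountLow
  mul dword [edx+4]   ; edx:eax = TickCountLow * TickCountMultiplier
  shrd eax,edx,24     ; eax = low 32 bits of (edx:eax shr 24)
  ;...and the tick count ends up in eax
 }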
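
The arithmetic in the macro can be checked outside of assembly. What follows is a minimal Python sketch, not Windows code: `ticks_from_fields` is a made-up helper name and the sample inputs are illustrative (the multiplier 0FA00000h is the value that shows up in a disassembly later in this thread). It models `mul` as a 64-bit product and `shrd eax,edx,24` as keeping the low 32 bits of that product shifted right by 24.

```python
MASK32 = 0xFFFFFFFF

def ticks_from_fields(tick_count_low, tick_count_multiplier):
    """Model of: mul dword [edx+4] / shrd eax,edx,24."""
    # mul leaves the full 64-bit product in edx:eax
    product = (tick_count_low & MASK32) * (tick_count_multiplier & MASK32)
    # shrd eax,edx,24 keeps bits 24..55 of that product in eax
    return (product >> 24) & MASK32

# 64 raw ticks with a multiplier of 0x0FA00000 come out as 1000 ms
print(ticks_from_fields(64, 0x0FA00000))  # prints 1000
```

The multiplier acts as a fixed-point scale factor with 24 fractional bits, which is why the shift count is 24.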

    

As soon as I have a couple of minutes I will append important links
for this struct, about the use of all the features connected with it.

Enjoy,
hopcode[mrk]

;------------------------
/to windwakr/
change the line
shrd eax,edx,18
to
shrd eax,edx,24
;-------------------------


Last edited by hopcode on 10 Oct 2008, 17:51; edited 1 time in total
windwakr



Joined: 30 Jun 2004
Posts: 827
windwakr 10 Oct 2008, 17:30
EDIT:Thanks for the help, my example code is fixed now.

Code:
include 'win32ax.inc'

.data
title db '...',0
num1 dd ?
text db 'GetTickCount: %u',13,10,'Direct Access: %u',0
buf rb 256

.code
start:
invoke GetTickCount
mov [num1],eax

mov edx,7FFE0000h
mov eax,dword [edx]
mul dword [edx+4]
shrd eax,edx,24

cinvoke wsprintf,buf,text,[num1],eax
invoke MessageBox,NULL,buf,title,MB_OK
invoke ExitProcess,0
.end start
    


I find it strange that they return the same value. Wouldn't the time needed to execute the instructions make it off by some amount of time?

_________________
----> * <---- My star, won HERE
windwakr



Joined: 30 Jun 2004
Posts: 827
windwakr 10 Oct 2008, 19:08
EDIT: Made the output a little better and now it displays which is faster and by how much

Here's a poorly coded, probably inaccurate speed test... :D
Direct Access is about 450 ms faster... over 100 million iterations :D :D

I know, the code is HORRIBLE! No indents, bad label names, etc...
Code:
include 'win32ax.inc'

.data
DA    db 'Direct Access',0
GTT   db 'GetTickCount',0
time1 dd ?
time2 dd ?
title db '...',0
num1  dd ?
text  db '100,000,000 times:',13,10,'GetTickCount: %u ms',13,10,'Direct Access: %u ms',13,10,\
         13,10,'%s is faster by about %u ms',0
buf rb 256

.code
start:
; --- time 100,000,000 calls to GetTickCount ---
; esi is the loop counter here: ecx is not guaranteed to survive an API call
mov esi,100000000
invoke GetTickCount
mov [time1],eax
loop1:
invoke GetTickCount
dec esi
jnz loop1
sub eax,[time1]         ; elapsed ms = last reading - first reading
mov [time1],eax
; --- time 100,000,000 direct reads ---
invoke GetTickCount
mov [time2],eax
mov ecx,100000000       ; set after the call so the API cannot clobber it
loop2:
mov edx,7FFE0000h
mov eax,dword [edx]
mul dword [edx+4]
shrd eax,edx,24
loop loop2
sub eax,[time2]         ; elapsed ms for the direct reads
; --- pick the faster method and the difference ---
cmp eax,[time1]
ja label1
mov ebx,DA
mov ecx,[time1]
sub ecx,eax
mov [num1],ecx
jmp end1

label1:
mov ebx,GTT
mov ecx,eax
sub ecx,[time1]
mov [num1],ecx
end1:
cinvoke wsprintf,buf,text,[time1],eax,ebx,[num1]
invoke MessageBox,NULL,buf,title,MB_OK
invoke ExitProcess,0
.end start

revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20298
Location: In your JS exploiting you and your system
revolution 10 Oct 2008, 21:33
hopcode wrote:
It resides permanently at 0x7ffe0000 in every version of Windows NT+, mapped into EVERY process.
That is not true for all versions of Windows NT+; some versions are 3Gig aware.
windwakr



Joined: 30 Jun 2004
Posts: 827
windwakr 10 Oct 2008, 21:58
So, does this work on all windows versions?

revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20298
Location: In your JS exploiting you and your system
revolution 10 Oct 2008, 22:06
windwakr wrote:
So, does this work on all windows versions?
Of course not. None of these hacks, including NTDLL functions and fixed constants etc., work everywhere. Expect it to fail at the most inconvenient time.
windwakr



Joined: 30 Jun 2004
Posts: 827
windwakr 10 Oct 2008, 22:19
I guess it doesn't matter, it's not like I'm gonna need to do millions of GetTickCount's quickly.
bitRAKE



Joined: 21 Jul 2003
Posts: 4016
Location: vpcmpistri
bitRAKE 11 Oct 2008, 01:42
My kernel says:
Code:
KERNEL32.GetTickCount:
  mov ecx,[7FFE0004] ; =$0FA00000
  mov rax,[7FFE0320]
  imul rax,rcx
  shr rax,24
  ret    
...clearly, the granularity is far from 1ms.

Edit: whoops, that 18 should be hex $18 = 24, which means the multiplier works out to $0FA00000 / 2^24 = 15.625 ms per tick.
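
To make the 15.625 figure concrete, here is a small Python check of the fixed-point arithmetic. It assumes only what the disassembly above shows (the multiplier $0FA00000 and a right shift by 24 bits); the names are illustrative, and `tick_count_ms` is a model of the two instructions, not the real API.

```python
MULTIPLIER = 0x0FA00000          # value seen at [7FFE0004] in the disassembly
SHIFT = 24                       # shr rax,24
MASK64 = 0xFFFFFFFFFFFFFFFF

ms_per_tick = MULTIPLIER / (1 << SHIFT)
ticks_per_second = 1000 / ms_per_tick

def tick_count_ms(tick_count_quad):
    """Model of: imul rax,rcx / shr rax,24 (product truncated to 64 bits)."""
    return ((tick_count_quad * MULTIPLIER) & MASK64) >> SHIFT

print(ms_per_tick)                               # 15.625
print(ticks_per_second)                          # 64.0
print([tick_count_ms(n) for n in range(1, 5)])   # [15, 31, 46, 62]
```

The last line shows the granularity in practice: with this multiplier, successive raw ticks move the returned value in steps of 15 or 16 ms.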

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup


Last edited by bitRAKE on 11 Oct 2008, 01:49; edited 1 time in total
Hrstka



Joined: 05 May 2008
Posts: 56
Location: Czech republic
Hrstka 11 Oct 2008, 01:42
Some time ago I was looking at GetSystemTimeAsFileTime and it seemed weird to me: it basically just reads 2 dwords from memory. I didn't know about this structure. Btw, is there any further info about it, e.g. what does DismountCount mean?

Quote:
I guess it doesn't matter, it's not like I'm gonna need to do millions of GetTickCount's quickly.

It wouldn't help you anyway, since the resolution is 10 ms (Win XP), so the value only changes 100 times per second.
bitRAKE



Joined: 21 Jul 2003
Posts: 4016
Location: vpcmpistri
bitRAKE 11 Oct 2008, 01:58
Code:
KERNEL32.GetSystemTimeAsFileTime:
  mov edx,[7FFE0018] ; []=01C92B44
  mov r8d,[7FFE0014] ; []=00D8D2B0
  mov eax,[7FFE001C] ; []=01C92B44
  cmp edx,eax
  jnz KERNEL32.GetSystemTimeAsFileTime
  mov [rcx],r8d
  mov [rcx+04],edx
  ret    
...kind of interested in how long that delay could be?
Or if this is a realistic way to ensure the value is correct?
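
For readers wondering what the two dwords mean, the values in the comments can be decoded by hand. A minimal Python sketch follows, assuming the usual FILETIME convention (a 64-bit count of 100-nanosecond intervals since 1 January 1601, UTC); `filetime_to_datetime` is an illustrative helper, not a Windows API.

```python
from datetime import datetime, timedelta

FILETIME_EPOCH = datetime(1601, 1, 1)

def filetime_to_datetime(high, low):
    """Combine the two dwords and convert 100 ns units to a naive UTC datetime."""
    filetime = (high << 32) | low
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

# dwords taken from the comments in the disassembly above
stamp = filetime_to_datetime(0x01C92B44, 0x00D8D2B0)
print(stamp.date())  # 2008-10-11
```

The decoded date matches the date of the post, so the three loads are the high dword, the low dword, and a second copy of the high dword of the current system time.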

windwakr



Joined: 30 Jun 2004
Posts: 827
windwakr 11 Oct 2008, 02:01
I don't understand this... hopcode's original code was exactly the same as my DLL's code, but they returned different values... wtf?

Here is the code from my kernel32.dll:

Code:
=========
GetTickCount
=========
:7C80929C BA0000FE7F              mov edx, 7FFE0000
:7C8092A1 8B02                    mov eax, dword[edx]
:7C8092A3 F76204                  mul dword[edx+04]
:7C8092A6 0FACD018                shrd eax, edx, 18
:7C8092AA C3                      ret  
    


That's the code he originally posted, but it returned different values??

bitRAKE



Joined: 21 Jul 2003
Posts: 4016
Location: vpcmpistri
bitRAKE 11 Oct 2008, 02:04
They are still the same - the 18 is hex for decimal 24.

revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20298
Location: In your JS exploiting you and your system
revolution 11 Oct 2008, 02:08
bitRAKE wrote:
Code:
KERNEL32.GetSystemTimeAsFileTime:
  mov edx,[7FFE0018] ; []=01C92B44
  mov r8d,[7FFE0014] ; []=00D8D2B0
  mov eax,[7FFE001C] ; []=01C92B44
  cmp edx,eax
  jnz KERNEL32.GetSystemTimeAsFileTime
  mov [rcx],r8d
  mov [rcx+04],edx
  ret    
...kind of interested in how long that delay could be?
Or if this is a realistic way to ensure the value is correct?
The delay is worst case two times around the loop, and only then if it is preempted between reads. In most cases it runs through just once.
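
The retry loop from the disassembly can be sketched in Python to show how a second pass recovers from an update that lands between the loads. This is a model under stated assumptions, not kernel code: `read_dword` stands in for a memory load, `TornOnce` is a made-up fake memory, and treating the whole update as landing at one instant between two loads is a simplification.

```python
LOW, HIGH1, HIGH2 = 0x14, 0x18, 0x1C  # the three SystemTime dwords read above

def read_system_time(read_dword):
    """Reread until both copies of the high dword agree; return (value, passes)."""
    passes = 0
    while True:
        passes += 1
        high1 = read_dword(HIGH1)
        low = read_dword(LOW)
        high2 = read_dword(HIGH2)
        if high1 == high2:
            return (high1 << 32) | low, passes

class TornOnce:
    """Fake memory: an update lands right after the very first load."""
    def __init__(self, old, new):
        self.mem = {LOW: old & 0xFFFFFFFF, HIGH1: old >> 32, HIGH2: old >> 32}
        self.new = new
        self.reads = 0

    def read_dword(self, offset):
        value = self.mem[offset]
        self.reads += 1
        if self.reads == 1:  # simulate preemption between the first two loads
            self.mem = {LOW: self.new & 0xFFFFFFFF,
                        HIGH1: self.new >> 32, HIGH2: self.new >> 32}
        return value

old, new = 0x00000001_FFFFFFFF, 0x00000002_00000000
value, passes = read_system_time(TornOnce(old, new).read_dword)
print(hex(value), passes)  # 0x200000000 2
```

Without the comparison, the first pass would have combined the old high dword with the new low dword and returned 0x100000000, a value the clock never held.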
bitRAKE



Joined: 21 Jul 2003
Posts: 4016
Location: vpcmpistri
bitRAKE 11 Oct 2008, 02:34
Yeah, they are testing the high dword to ensure a carry isn't missed? Couldn't the qword be updated at once, eliminating the need? Seems obvious they just recompiled the code for amd64 without changing anything for fear of breaking something. :lol:

revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20298
Location: In your JS exploiting you and your system
revolution 11 Oct 2008, 03:02
Yes, an assembly programmer would spot the silliness of the 32bit reads in an instant. Unfortunately, compilers are not so smart as we might like them to be ... yet.
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 11 Oct 2008, 03:11
Perhaps the variables located at 7FFEXXXX are marked as "volatile", so it can't transform two 32-bit copies into one 64-bit wide copy automatically, that would violate the program semantics.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20298
Location: In your JS exploiting you and your system
revolution 11 Oct 2008, 03:15
LocoDelAssembly wrote:
Perhaps the variables located at 7FFEXXXX are marked as "volatile", so it can't transform two 32-bit copies into one 64-bit wide copy automatically, that would violate the program semantics.
Yeah, exactly the point, they are not smart enough to know when it is okay to ignore such semantics and just do what is right.
bitRAKE



Joined: 21 Jul 2003
Posts: 4016
Location: vpcmpistri
bitRAKE 11 Oct 2008, 05:19
revolution wrote:
Yes, an assembly programmer would spot the silliness of the 32bit reads in an instant. Unfortunately, compilers are not so smart as we might like them to be ... yet.
"Yet?" I doubt I'll ever be satisfied with them nor content even! They will never be able to rub two bits together and get three. :)

hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 11 Oct 2008, 05:53
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 11 Oct 2008, 07:46
/to revolution/
Please read this, from Table 13-1 of
Jeffrey Richter's "Programming Applications for Microsoft Windows", 4th Edition.

The "32Bit Windows 2000 (x86 w/3GB User-mode)" column in Table 13-1 shows how the address space looks when the /3GB switch is used in the BOOT.INI file. And from the book:

"Microsoft had to create a solution that allowed this application to work in a 3-GB environment.
When the system is about to run an application, it checks to see if the application was linked
with the /LARGEADDRESSAWARE linker switch. If so, the application is claiming that it does not
do anything funny with memory addresses and is fully prepared to take advantage of a 3-GB user-mode
address space. On the other hand, if the application was not linked with the /LARGEADDRESSAWARE
switch, the operating system reserves the 1-GB area between 0x80000000 and 0xBFFFFFFF.

This prevents any memory allocations from being created at a memory address whose high bit is set.
Note that the kernel was squeezed tightly into a 2-GB partition. When using the /3GB switch, the kernel
is barely making it into a 1-GB partition. Using the /3GB switch reduces the number of threads, stacks,
and other resources that the system can create. In addition, the system can only use a maximum of 16 GB
of RAM vs. the normal maximum of 64 GB because there isn't enough virtual address space in kernel mode
to manage the additional RAM. "

So, if the user links with the /LARGEADDRESSAWARE linker switch
and the /3GB flag is set in boot.ini, the application will work in a
3-GB environment following the "numbers" of the second column.
With normal linking, which is the majority of cases, the first column has the right numbers.

So it is basically a user choice,
or a matter of testing the link flag in the PE header or the switch in the boot.ini file.
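
The link-flag test mentioned here can be sketched as follows. This Python helper (the name `is_large_address_aware` is illustrative) reads the Characteristics field of the PE file header and tests the IMAGE_FILE_LARGE_ADDRESS_AWARE bit (0x0020), which is the bit the /LARGEADDRESSAWARE linker switch sets. The bytes used in the example come from a synthetic stub built by `make_stub`, not from a real executable.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020

def is_large_address_aware(image):
    """Return True if the PE image bytes carry the large-address-aware flag."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)  # offset of the PE header
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Characteristics is the last word of the 20-byte file header after the signature
    (characteristics,) = struct.unpack_from("<H", image, e_lfanew + 4 + 18)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)

def make_stub(characteristics):
    """Build a minimal synthetic header, for demonstration only."""
    stub = bytearray(0x40 + 24)
    stub[0:2] = b"MZ"
    struct.pack_into("<I", stub, 0x3C, 0x40)
    stub[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", stub, 0x40 + 4 + 18, characteristics)
    return bytes(stub)

print(is_large_address_aware(make_stub(0x0122)))  # True
print(is_large_address_aware(make_stub(0x0102)))  # False
```

For a real file the same function could be fed the result of `open(path, "rb").read()`. Whether /3GB is active on the machine is a separate question that this check does not answer.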

And...
revolution wrote:
hopcode wrote:
It resides permanently at 0x7ffe0000 in every version of Windows NT+, mapped into EVERY process.
That is not true for all versions of Windows NT+; some versions are 3Gig aware.


What you have said is misleading (certainly not for me, as I learned
this text many years ago), but for newcomers to this topic!!!

In conclusion,
dear revolution,
they reside permanently.

Do you agree?


[Attachment: process_space.jpg (partitioning with the /3GB flag)]



Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
