flat assembler
Message board for the users of flat assembler.
LocoDelAssembly 14 Sep 2006, 21:13
I tried this:
Code:
include 'win32axp.inc'

.code

start:
        ; note: hProcess is 0 here; GetCurrentProcess returns the handle for the calling process
        invoke  SetPriorityClass, 0, REALTIME_PRIORITY_CLASS

.loop:
        rdtsc
        push    eax                     ; save low dword of the start timestamp
        mov     eax, $12345678
        mov     esi, buffer
        call    toString
        rdtsc                           ; EAX = low dword of the end timestamp
        pop     edx                     ; EDX = low dword of the start timestamp
        dec     [counter]
        jnz     .loop
        int3

toString:       ; IN=eax
                ; OUT=qword[esi] binary converted to ASCII HEX
        mov     edi,0F0F0F0Fh
        mov     ecx,eax
        mov     ebx,eax
        shr     eax,4
        and     ebx,edi                 ; EBX = low nibble of each byte
        and     eax,edi                 ; EAX = high nibble of each byte
        mov     ecx,eax
        mov     edx,ebx
        add     ebx,06060606h           ; nibble+6 sets bit 4 when nibble >= 10
        add     eax,06060606h
        shr     ebx,4
        shr     eax,4
        and     ebx,edi                 ; 1 in each byte that needs a letter
        and     eax,edi
        lea     edx,[ebx*8+edx+30303030h]
        lea     ecx,[eax*8+ecx+30303030h]
        sub     edx,ebx                 ; net effect: add 7 to reach 'A'..'F'
        sub     ecx,eax
        xchg    ch,dl                   ; interleave high/low digits into output order
        rol     ecx,16
        rol     edx,16
        xchg    ch,dl
        rol     ecx,16
        xchg    cx,dx
        rol     edx,16
        mov     [esi],ecx
        mov     [esi+4],edx
        ret

.data
counter dd 3
buffer  rb 16

.end start

At int3 EDX-EAX=12596 and the string is "9681B46D" instead of "12345678". <- IGNORE THIS SENTENCE!!
How do you measure the clocks? I hope my Athlon64 isn't so bad that it really needs that terrible time of 12596 clocks.
[edit] I added code to repeat the call and the second time it took only 92 cycles (measuring with the same method)[/edit]
[edit2] With the new code I count only 22 cycles!![/edit2]
Last edited by LocoDelAssembly on 14 Sep 2006, 21:45; edited 2 times in total
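For checking the conversion arithmetic on its own, away from the timing code, here is a rough C model of what toString appears to compute. This is only a sketch: the names `nibbles_to_ascii` and `to_hex_swar` are made up for illustration, and the final digit interleave is written as a plain loop instead of the xchg/rol sequence, so it models the result and not the instruction scheduling.

```c
#include <stdint.h>

/* Converts four nibbles (one per byte of x, each 0..15) to ASCII hex
   digits in parallel, mirroring the add/shr/and/lea/sub steps above. */
static uint32_t nibbles_to_ascii(uint32_t x)
{
    /* nibble + 6 carries into bit 4 exactly when nibble >= 10 */
    uint32_t is_letter = ((x + 0x06060606u) >> 4) & 0x0F0F0F0Fu;
    /* '0' + nibble, plus 7 more for A..F (the lea *8 followed by sub) */
    return x + 0x30303030u + is_letter * 7u;
}

/* Writes the 8 hex digits of value to out, most significant first.
   No terminating NUL is written, same as the assembly routine. */
static void to_hex_swar(uint32_t value, char out[8])
{
    uint32_t lo = nibbles_to_ascii(value & 0x0F0F0F0Fu);
    uint32_t hi = nibbles_to_ascii((value >> 4) & 0x0F0F0F0Fu);

    for (int byte = 0; byte < 4; ++byte) {
        int shift = 8 * (3 - byte);          /* most significant byte first */
        out[2 * byte]     = (char)((hi >> shift) & 0xFFu);
        out[2 * byte + 1] = (char)((lo >> shift) & 0xFFu);
    }
}
```

With this model, `to_hex_swar(0x12345678u, buf)` fills `buf` with the characters 12345678, so the digit arithmetic itself looks sound for that input.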
Madis731 14 Sep 2006, 21:18
erm
1) mov eax,12345678h
2) RDTSC => eax=XXXXXXXXh
3) translate it to ASCII
Of course you get wrong results...
PS: you should make the loop run something like 100000 times, with ebp or another counter that isn't otherwise used... and I tested the time with CALL & RET included! So the thirty-something clocks are "Call-2-Call", which means they include decrementing ebp and testing it for zero.
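The "run it many times and divide" idea can be sketched in C as well. This is not the RDTSC method from the thread: it assumes the standard `clock()` timer, reports nanoseconds instead of cycles, and uses a made-up stand-in routine (`to_hex`) plus a made-up helper name (`average_ns_per_call`). Like the "Call-2-Call" figure, the result includes the loop overhead.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the routine being measured. */
static void to_hex(uint32_t v, char out[8])
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 7; i >= 0; --i) {
        out[i] = digits[v & 0xFu];
        v >>= 4;
    }
}

/* Average cost of one call in nanoseconds, loop overhead included.
   A single call is far too short to time reliably, so the total for
   many iterations is measured and divided by the iteration count. */
static double average_ns_per_call(long iterations)
{
    char buf[8];
    volatile char sink = 0;   /* keeps the compiler from dropping the work */

    clock_t t0 = clock();
    for (long i = 0; i < iterations; ++i) {
        to_hex((uint32_t)i, buf);
        sink ^= buf[7];
    }
    clock_t t1 = clock();
    (void)sink;

    return (double)(t1 - t0) * 1e9 / (double)CLOCKS_PER_SEC / (double)iterations;
}
```

The first pass through any such loop also pays for cold caches, which is one reason a single measurement can look far worse than the average.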
LocoDelAssembly 14 Sep 2006, 21:42
HAHAHAHAHAHA, you are right
Something here must be interrupting my CPU very often, because it typically spends more than 1000 cycles (repeating at least three times), but if I use realtime priority I get just 21 cycles!! It's a very big difference; what could be preempting the CPU so often?
Regards
PS: I'll edit my previous post with the new code.
LocoDelAssembly 14 Sep 2006, 21:54
mmm, I tested without SetPriorityClass and it took the same time. If I comment out ".code", ".data" and ".end start" I get ~1000 clocks again, BUT if I put "rb 256" between RET and "counter dd 3" I get 22 clocks. It seems that it is very bad to write too close to the code that is executing!!
f0dder 14 Sep 2006, 22:02
Quote:
Indeed, and this has been known for a long while.
Goplat 15 Sep 2006, 18:02
The first mov ecx,eax is redundant.
edit: Seems the code is actually slower without it. Average of 28 cycles with it, 28.357 without it. That's really strange...
Vasilev Vjacheslav 16 Sep 2006, 04:39
Goplat, maybe it was placed there for alignment
LocoDelAssembly 16 Sep 2006, 14:10
With the redundant MOV I get 16 cycles and without it 15 cycles.
(If I put CPUID at the .loop label, which is just before the time counting starts, I get 22 cycles with the redundant MOV and 21 cycles without it.)
r22 17 Sep 2006, 17:46
LUT might be a bit faster
Code:
        MOVZX   EDX,AL                  ; lowest byte of the value
        MOVZX   ECX,AH                  ; second byte
        MOVZX   EDX,WORD[LUT + EDX*2]   ; two ASCII digits for that byte
        SHR     EAX,16
        MOVZX   ECX,WORD[LUT + ECX*2]
        MOV     WORD[ESI+6],DX          ; last two characters of the string
        MOV     WORD[ESI+4],CX
        MOVZX   EDX,AL                  ; third byte
        MOVZX   ECX,AH                  ; highest byte
        MOVZX   EDX,WORD[LUT + EDX*2]
        MOVZX   ECX,WORD[LUT + ECX*2]
        MOV     WORD[ESI+2],DX
        MOV     WORD[ESI],CX            ; first two characters of the string

LUT dw '00','01','02','03','04','05','06','07','08','09','0A','0B','0C','0D','0E','0F'
...
    dw 'F0','F1','F2','F3','F4','F5','F6','F7','F8','F9','FA','FB','FC','FD','FE','FF'
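The same table-lookup idea, modeled in C for anyone who wants to check the output order. This is a sketch, not a translation of the code above: `lut`, `init_lut` and `to_hex_lut` are illustrative names, and the table is filled at run time here instead of being written out as data.

```c
#include <stdint.h>

/* lut[b] holds the two ASCII hex digits of byte b, high digit first,
   which matches the in-memory layout of the dw 'XY' entries. */
static char lut[256][2];

static void init_lut(void)
{
    static const char digits[] = "0123456789ABCDEF";
    for (int b = 0; b < 256; ++b) {
        lut[b][0] = digits[b >> 4];
        lut[b][1] = digits[b & 0xF];
    }
}

/* One lookup per byte instead of per-nibble arithmetic.
   Writes 8 characters, most significant byte first, no NUL. */
static void to_hex_lut(uint32_t value, char out[8])
{
    for (int i = 0; i < 4; ++i) {
        unsigned b = (value >> (8 * (3 - i))) & 0xFFu;
        out[2 * i]     = lut[b][0];
        out[2 * i + 1] = lut[b][1];
    }
}
```

The trade-off is the 512-byte table: it has to be in cache for the lookups to be cheap, so whether this beats the arithmetic version probably depends on how the routine is used.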
f0dder 17 Sep 2006, 17:58
You might want to take a look at this thread from asmcommunity.
rugxulo 18 Sep 2006, 17:21
r22, just for the record, you can do this:
Code:
LUT dw '000102030405060708090A0B0C0D0E0F'
For lots of little strings, you may (or may not) prefer it. Just FYI.
UCM 19 Sep 2006, 00:21
rugxulo: No, you can't, it will say "value out of range"
rugxulo 19 Sep 2006, 22:02
Oops, I meant db, not dw:
Code:
LUT db '000102030405060708090A0B0C0D0E0F'
UCM 20 Sep 2006, 00:41
OR, you could do this:
Code:
repeat 256
  a = %-1               ; % counts from 1, so a runs 0..255
  l = a and 0xF         ; low nibble
  h = a shr 4           ; high nibble
  if h > 9
    db 'A'+(h-10)
  else
    db '0'+h
  end if
  if l > 9
    db 'A'+(l-10)
  else
    db '0'+l
  end if
end repeat
(tested)
Okay, it's not great, but it's more compact
Copyright © 1999-2023, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.