flat assembler
Message board for the users of flat assembler.
Michael 24 Oct 2005, 16:21
Hello everyone,
I am relatively new to assembly. I have been reading Art of Assembly by Randall Hyde. From the book I understand that each instruction takes 1 clock if the operands are registers and the instructions are in the cache. Now I have found this site: http://www.online.ee/%7Eandre/i80386/Opcodes/. There, the minimum time for any instruction is 2 clocks and the maximum I have found so far is 41.
Thanks in advance.
Michael.
Michael 24 Oct 2005, 19:10
I don't need the exact timing of each instruction, just an approximation. Also, do similar instructions under the same conditions run at the same speed? For example mov eax,ebx; xor eax,eax; mul ebx etc.
Is there any documentation about instruction timings for newer processors?
Michael.
vid 24 Oct 2005, 19:34
Similar ones are mov, add, sub, cmp, test, xor, or, and. "mul" is much more complicated and takes a long time, and "div" is more complicated still. Yes, similar instructions take similar time. Usually the simpler and older instructions (those that existed on the 8086, 286 and so on) are faster, and it's better to use a combination of such instructions than a newer alternative. So don't waste time learning newer instructions like cmov, bt, etc. if you want speed.
Michael 24 Oct 2005, 20:38
Ok, got it now. Thanks for the quick responses.
MazeGen 25 Oct 2005, 12:49
vid wrote: it's better to use a combination of such instructions than a newer alternative. So don't waste time learning newer instructions like cmov, bt, etc. if you want speed.
Well, not always. For instance, look at this rule from the Intel Optimization Manual:
Quote:
vid 25 Oct 2005, 13:56
Well, you are better informed here...
What is a "predictable" control branch?
Octavio 25 Oct 2005, 14:11
vid wrote: well, you are better informed here...
The processor keeps information about how often the jmp is taken or not taken and makes predictions based on those statistics. If the jmp is taken about 50% of the time, then it is unpredictable.
MazeGen 25 Oct 2005, 14:26
The processor uses a cache called the Branch Target Buffer (BTB) to predict whether the jump will be taken or not, according to the branch history.
The jump is unpredictable if it was taken about as often as not taken, with no regular pattern.
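To make the idea of branch history a little more concrete, here is a minimal sketch in C of a 2-bit saturating counter, a textbook simplification of a branch predictor. It is an assumption for illustration only, not a description of how the BTB of any particular processor works, and the names `count_mispredictions` and `fill_pseudo_random` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook 2-bit saturating counter (illustrative assumption, not a real BTB).
 * Counter values 0..1 predict "not taken", 2..3 predict "taken".
 * outcomes[i] must be 1 if the i-th execution of the jump was taken, else 0. */
size_t count_mispredictions(const unsigned char *outcomes, size_t n)
{
    unsigned counter = 0;          /* start at "strongly not taken" */
    size_t misses = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned predicted_taken = (counter >= 2);
        if (predicted_taken != outcomes[i])
            misses++;
        /* Move the counter toward the actual outcome, saturating at 0 and 3. */
        if (outcomes[i]) {
            if (counter < 3) counter++;
        } else {
            if (counter > 0) counter--;
        }
    }
    return misses;
}

/* Fill outcomes[] with pseudo-random 0/1 values from a simple LCG
 * (hypothetical helper, only here to model a jump with no pattern). */
void fill_pseudo_random(unsigned char *outcomes, size_t n, uint32_t seed)
{
    uint32_t state = seed;
    for (size_t i = 0; i < n; i++) {
        state = state * 1103515245u + 12345u;
        outcomes[i] = (unsigned char)((state >> 30) & 1u);
    }
}
```

With a jump that is always taken, this counter settles after two misses. On a pattern-free sequence that is taken about half of the time, roughly every second prediction fails, which is the "unpredictable" case described in the posts above.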
Hayden 26 Oct 2005, 05:23
You can optimize instruction speed by pairing instructions together.
CPUs with two integer pipelines, such as the Pentium, can execute one instruction in the U-pipe and one in the V-pipe in parallel. Consider the following scenario...

    ...
    mov eax, dword [var]   ; U-pipe (var stands for any dword in memory)
    add eax, edx           ; V-pipe (depends on mov eax, dword [var])

This would cause the CPU to stall, since the V-pipe cannot execute the add until the U-pipe has completed the mov eax, dword [var] instruction. CPU stalls usually incur a number of clocks for recovery.

Solution...

    ...
    mov eax, dword [var]   ; U-pipe
    nop                    ; V-pipe
    add eax, edx           ; U-pipe
    ...                    ; V-pipe

Now both the U-pipe and the V-pipe are executed without any CPU stalls. It is much better to use a more meaningful instruction than the nop, but I hope this illustrates the idea.

Code and data alignment will also gain some extra speed. MMX/SIMD code should be aligned to 8 bytes, along with 64-bit code, but 32-bit code should be aligned to 4 bytes. although align(…

16-bit data aligned to 4 bytes: optimal for 32/64-bit systems.
32-bit data aligned to 4 bytes
64-bit data aligned to 8 bytes
etc...

Some coding ethics...
Make sure that all data is aligned to the appropriate boundary. When coding, align to the appropriate boundary and pair the instructions that can be executed at the same time, i.e.:

    align 4
    ; pair 1
    mov eax, [esp+4]       ; U-pipe
    mov ebx, [esp+8]       ; V-pipe
    ; pair 2
    mov ecx, [esp+12]      ; U-pipe
    add eax, ebx           ; V-pipe

Note: some instructions require more than 1 clock cycle, e.g. mul requires 3 or more, depending on the CPU. After coding a 'mul ebx', eax and edx should not be accessed for at least that many clock cycles, to avoid a CPU stall of the same length plus any recovery.

I hope this generalizes some CPU characteristics for you. I'm not that good at explaining these things, but more information can be found at the Intel website under pairing.
_________________
New User.. Hayden McKay.
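The alignment rules in this post can be checked in a few lines. The following is a minimal sketch in C rather than assembly, with made-up helper names (`is_aligned`, `align_up`). It assumes the boundary is a power of two, as 4 and 8 are, and it says nothing about which boundary is fastest on a given CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if address p is a multiple of boundary, else 0.
 * Assumption: boundary is a power of two (2, 4, 8, 16, ...). */
int is_aligned(const void *p, size_t boundary)
{
    return ((uintptr_t)p & (uintptr_t)(boundary - 1)) == 0;
}

/* Rounds offset up to the next multiple of boundary (same assumption).
 * This mirrors what an assembler's "align 4" does to the current offset. */
size_t align_up(size_t offset, size_t boundary)
{
    return (offset + boundary - 1) & ~(boundary - 1);
}
```

For example, `align_up(5, 4)` gives 8, the offset at which the next 4-byte aligned item would start, while an offset that is already a multiple of the boundary is returned unchanged.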
vid 26 Oct 2005, 05:54
There was a link to a good tutorial on the old FASM site... does anybody have it?
decard 26 Oct 2005, 07:25
vid 26 Oct 2005, 08:22
Yes, but I had the .hlp version of pentopt...
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.