flat assembler
Message board for the users of flat assembler.
UCM 16 Aug 2006, 23:53
The first is the fastest, beating the second by 1 clock (the mov instruction) and the third by 3 clocks on each iteration of the loop. All "add" and "inc" instructions take one cycle.
DustWolf 16 Aug 2006, 23:56
UCM wrote: The first is the fastest, beating the second by 1 clock (the mov instruction) and the third by 3 clocks on each iteration of the loop. All "add" and "inc" instructions take one cycle.
Yes, but the add eax,4 needs to wait for system memory to deliver the 4, right?
UCM 17 Aug 2006, 01:28
No. The 4 is an immediate operand encoded inside the instruction, so it is loaded with the rest of the instruction.
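To make this concrete, here is a small sketch in C (not code from this thread) showing one valid 32-bit encoding of `add eax, 4`, where the 4 is simply the last byte of the instruction. The helper name `decode_add_r32_imm8` is hypothetical and only handles this one encoding form.

```c
#include <stdint.h>

/* One valid 32-bit encoding of `add eax, 4` (the short imm8 form):
 *   0x83  opcode: arithmetic group, r/m32 with sign-extended imm8
 *   0xC0  ModRM: mod=11 (register), reg=000 (/0 = ADD), rm=000 (EAX)
 *   0x04  the immediate operand itself
 */
static const uint8_t add_eax_4[3] = { 0x83, 0xC0, 0x04 };

/* Hypothetical helper: decode a 3-byte `83 /0 ib` register-form ADD.
 * Stores the sign-extended immediate in *imm and returns the register
 * number (0 = EAX), or -1 if the bytes are not in this exact form. */
int decode_add_r32_imm8(const uint8_t *code, int32_t *imm)
{
    if (code[0] != 0x83)
        return -1;                       /* not the imm8 arithmetic group */
    if ((code[1] & 0xC0) != 0xC0)
        return -1;                       /* not register-direct */
    if (((code[1] >> 3) & 7) != 0)
        return -1;                       /* reg field /0 selects ADD */
    *imm = (int8_t)code[2];              /* immediate travels with the opcode */
    return code[1] & 7;
}
```

The immediate arrives with the opcode bytes during instruction fetch, so no separate data load is needed for it.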
r22 17 Aug 2006, 03:16
If you want it FASTER, unroll the loop and do an add esi,8 or 12 or 16.
In my experience you can get a 10-20% speed increase by unrolling a loop 2-5x.
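As a rough illustration of the unrolling described above, here is a sketch in C rather than assembly. It assumes the loop walks an array of 32-bit values, which this thread does not confirm, and the function names `sum_simple` and `sum_unrolled4` are hypothetical. Whether the unrolled version is actually faster depends on the CPU and the compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward loop: one element (4 bytes) per iteration. */
uint32_t sum_simple(const uint32_t *data, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += data[i];
    return total;
}

/* Unrolled 4x: four elements (16 bytes) per iteration, comparable to
 * advancing the pointer with `add esi,16` instead of `add esi,4`. */
uint32_t sum_unrolled4(const uint32_t *data, size_t count)
{
    uint32_t total = 0;
    size_t i = 0;

    /* Main body: loop overhead is paid once per four elements. */
    for (; i + 4 <= count; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }

    /* Remainder: handle the last 0-3 elements one at a time. */
    for (; i < count; i++)
        total += data[i];

    return total;
}
```

The remainder loop is what keeps the unrolled version correct when the element count is not a multiple of the unroll factor.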
MazeGen 17 Aug 2006, 11:06
UCM wrote: All "add" and "inc" instructions take one cycle.
INC and DEC take 2 uops on P4.
UCM 17 Aug 2006, 12:49
I was looking at the AMD cycle charts.
cod3b453 18 Aug 2006, 14:27
Where can/do you get cycle charts from?
UCM 18 Aug 2006, 14:42
It is an appendix of the AMD Software Optimization Manual, publication number 22007 (revision K). It has many optimization tips as well. For more recent charts, see the "Software Optimization Guide for AMD64 Processors", publication number 25112.
MazeGen 18 Aug 2006, 15:20
cod3b453 wrote: Where can/do you get cycle charts from?
http://www.agner.org/optimize/#manuals
Get number 4. It contains details for P1, PMMX, PPro, PII, PIII, P4, PM, Core2, P4E, AMD64.
cod3b453 18 Aug 2006, 15:39
Thanks UCM & MazeGen!
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.