flat assembler
Message board for the users of flat assembler.

add or inc
DustWolf



Joined: 26 Jan 2006
Posts: 373
Location: Ljubljana, Slovenia
DustWolf 16 Aug 2006, 23:30
Hello,

Which is faster (when done in a loop), this:
Code:
add esi,4    

or this:
Code:
mov edx,4 ;outside the loop
...
add esi,edx ;in the loop    

or this:
Code:
inc esi
inc esi
inc esi
inc esi    

?

Thanks for the answer. :)
UCM



Joined: 25 Feb 2005
Posts: 285
Location: Canada
UCM 16 Aug 2006, 23:53
The first is the fastest, beating the second by 1 clock (the mov instruction) and the third by 3 clocks each time the loop is executed. All "add" and "inc" instructions take one cycle.
DustWolf 16 Aug 2006, 23:56
UCM wrote:
The first is the fastest, beating the second by 1 clock (the mov instruction) and the third by 3 clocks each time the loop is executed. All "add" and "inc" instructions take one cycle.


Yes, but the add esi,4 needs to wait for system memory to pass on the 4, right?
UCM 17 Aug 2006, 01:28
No, the 4 is an immediate operand encoded in the instruction itself, so it is loaded with the rest of the instruction.
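To illustrate, here is a small sketch of the three variants from the first post, with the machine code a 32-bit assembler would typically emit shown in the comments. The byte values are the standard IA-32 encodings; the exact bytes chosen for add esi,edx are an assumption, since 03 F2 is an equally valid encoding and an assembler may pick either.

```asm
use32                           ; assemble as 32-bit code

        add     esi, 4          ; 83 C6 04 : opcode, ModRM, then the immediate byte 04
        add     esi, edx        ; 01 D6    : register form, no immediate at all
        inc     esi             ; 46       : one-byte form (32-bit mode only)
```

The 04 travels inside the instruction bytes, so no separate data access is needed to obtain it.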
r22



Joined: 27 Dec 2004
Posts: 805
r22 17 Aug 2006, 03:16
If you want it FASTER, unroll the loop and do an add esi,8 or 12 or 16.

In my experience you can get a 10-20% speed increase by unrolling a loop 2-5x.
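A minimal sketch of what such unrolling could look like, assuming a loop that sums 32-bit values. The routine names, the register roles (esi = pointer to the data, ecx = number of dwords, eax = running sum) and the requirement that ecx be a nonzero multiple of 4 are all assumptions made for the illustration, not something stated in the thread.

```asm
use32

; Rolled version: one dword per iteration, pointer advances by 4.
sum_rolled:
        xor     eax, eax
.next:
        add     eax, [esi]
        add     esi, 4
        sub     ecx, 1
        jnz     .next
        ret

; Unrolled 4x: four dwords per iteration, pointer advances by 16.
; Assumes ecx is a nonzero multiple of 4.
sum_unrolled:
        xor     eax, eax
        shr     ecx, 2          ; iteration count = dword count / 4
.next:
        add     eax, [esi]
        add     eax, [esi+4]
        add     eax, [esi+8]
        add     eax, [esi+12]
        add     esi, 16         ; one pointer update instead of four
        sub     ecx, 1
        jnz     .next
        ret
```

The unrolled body pays for the pointer update, the counter update and the branch once per four elements rather than once per element. Whether that gives a measurable gain depends on the processor and the loop body, so it is worth timing both.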
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen 17 Aug 2006, 11:06
UCM wrote:
All "add" and "inc" instructions take one cycle.

INC and DEC take 2 uops on P4.
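One commonly cited reason is that INC and DEC leave the carry flag untouched while updating the other arithmetic flags, which ties them to the previous flag state. The sketch below only shows that architectural difference between inc and add; it does not measure anything, and the use of eax as a counter is an assumption for the illustration.

```asm
use32

        stc                     ; set CF = 1
        inc     esi             ; writes OF/SF/ZF/AF/PF, leaves CF alone
        adc     eax, 0          ; adds the old CF (1) to eax

        stc                     ; set CF = 1 again
        add     esi, 1          ; writes all arithmetic flags, CF included
        adc     eax, 0          ; adds the carry out of the ADD, not the old CF
```

Where the carry flag does not need to survive, add esi,1 can replace inc esi without changing the result in esi.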
UCM 17 Aug 2006, 12:49
I was looking at the AMD cycle charts. ;)
cod3b453



Joined: 25 Aug 2004
Posts: 618
cod3b453 18 Aug 2006, 14:27
Where can/do you get cycle charts from?
UCM 18 Aug 2006, 14:42
It is an appendix of the AMD Software Optimization Manual, publication number 22007 (revision K). It has many optimization tips as well. For more recent charts, see the "Software Optimization Guide for AMD64 Processors", publication number 25112.
MazeGen 18 Aug 2006, 15:20
cod3b453 wrote:
Where can/do you get cycle charts from?

http://www.agner.org/optimize/#manuals

Get number 4. It contains details for P1, PMMX, PPro, PII, PIII, P4, PM, Core2, P4E, AMD64. :)
cod3b453 18 Aug 2006, 15:39
Thanks UCM & MazeGen! 8-)
Copyright © 1999-2025, Tomasz Grysztar.
