flat assembler
Message board for the users of flat assembler.

FAST COSINE DOUBLE PRECISION FP

Xorpd!



Joined: 21 Dec 2006
Posts: 161
@Madis731: When I showed that moving through memory was faster than through a register on Core 2 Duo, the reason wasn't latency but that it forced the operation to go through ports that were otherwise unused. The latency is, I think, 4 clocks more for the memory move than for the register move.

I prefer movddup for duplicating inputs into a register, as a processor that doesn't have SSE3 is pretty old by now. Are you aware that when Agner Fog leaves the latency field blank in his instruction tables, it means that his tests weren't able to determine the latency, not that the latency is zero? Operations from memory normally have the same latency for the destination (register) operand and something like 2 or 3 clocks more latency for the source (memory) operand. There are some operations, such as movsd, that are semantically different for a register source and a memory source, where this may not be true. It's difficult to measure the latency of a load on its own because it can't be disentangled from the latency of the store that created the event you had to wait on in the first place. Core 2 Duo also has an additional clock of latency between a load and a floating point operation on the modified register, and between a floating point operation and a following store, IIRC. Latencies through memory are obviously much worse if store forwarding rules aren't followed. I timed some of this explicitly at some point but I'm too lazy to dig out the code just now.

The fastest way to compute SIN and COS is to store up a few first quadrant angles, then compute their trig functions simultaneously. Argument reduction is the longest part of the algorithm if it can't be avoided, and current processors need lots of parallel work to be efficient, even for single-threaded code. Apologies if these facts have already been pointed out in this thread.
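
To illustrate the batching idea, here is a minimal sketch in C, not the code I timed and not tuned in any way. It assumes the angles have already been reduced to the first quadrant [0, pi/2], and it uses a plain Taylor polynomial in x*x instead of a proper minimax fit, which should be good to roughly 1e-14 on that range. The names BATCH, COS_COEF and cos_batch_q1 are made up for this example. The point is only that the Horner steps of the different angles don't depend on each other, so the processor (or a compiler emitting SSE2) can overlap them instead of waiting on one long dependency chain.

```c
#include <stddef.h>

#define BATCH 4   /* hypothetical batch size, chosen for illustration */
#define NCOEF 9

/* Taylor coefficients of cos(x) in powers of x*x, highest order first.
   The constant term 1.0 is added at the end. */
static const double COS_COEF[NCOEF] = {
    -1.0 / 6402373705728000.0,  /* x^18 / 18! */
     1.0 / 20922789888000.0,    /* x^16 / 16! */
    -1.0 / 87178291200.0,       /* x^14 / 14! */
     1.0 / 479001600.0,         /* x^12 / 12! */
    -1.0 / 3628800.0,           /* x^10 / 10! */
     1.0 / 40320.0,             /* x^8  / 8!  */
    -1.0 / 720.0,               /* x^6  / 6!  */
     1.0 / 24.0,                /* x^4  / 4!  */
    -1.0 / 2.0                  /* x^2  / 2!  */
};

/* Compute cos for BATCH angles at once.
   Assumption: every x[i] is already reduced to [0, pi/2]. */
static void cos_batch_q1(const double x[BATCH], double out[BATCH])
{
    double x2[BATCH], acc[BATCH];
    size_t i, k;

    for (i = 0; i < BATCH; i++) {
        x2[i]  = x[i] * x[i];
        acc[i] = COS_COEF[0];
    }

    /* Horner's rule in x*x. The inner loop runs over the batch, so each
       step consists of BATCH independent multiply-adds. */
    for (k = 1; k < NCOEF; k++)
        for (i = 0; i < BATCH; i++)
            acc[i] = acc[i] * x2[i] + COS_COEF[k];

    for (i = 0; i < BATCH; i++)
        out[i] = acc[i] * x2[i] + 1.0;
}
```

Calling cos_batch_q1 with four reduced angles fills out[] with their cosines. A real implementation would pick better coefficients and handle the argument reduction separately, which, as said above, is the expensive part.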
Post 19 Jan 2008, 03:32
Madis731



Joined: 25 Sep 2003
Posts: 2140
Location: Estonia
Really??? Then all my calculations for about... 5 months have been wrong. That is sad :(
I must revisit Agner's tables and go through all those entries I thought were zero. I just rechecked quickly and there was no info on these "blanks" :S Better dig deeper.

Sorry!
Post 19 Jan 2008, 10:44

Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.

Website powered by rwasa.