flat assembler
Message board for the users of flat assembler.

Index > Main > How big should I expect the data cache to be?

Hugh Aguilar



Joined: 15 Nov 2011
Posts: 62
Location: Arizona
Hugh Aguilar 26 Jan 2012, 10:47
How big should I expect the data cache to be? I suppose this is going to vary from one processor to another, but what can I assume as a reasonable size for a typical processor?

I have a record that is going to be accessed a lot. I want to align it to the size of the data cache, to make sure that it fits. I can use unions to reduce its size if necessary, but that is a hassle, so I would rather avoid it unless I have to.
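
For illustration, something along these lines is what I have in mind (a sketch only --- the 64-byte line size and the pool size are assumptions on my part):
Code:
LINE_SIZE = 64                    ; assumed cache-line size on current x86 parts

align LINE_SIZE                   ; start the pool on a line boundary
node_pool rb LINE_SIZE * 1024     ; 1024 records, one line each, so no record straddles a line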

Also, is it appropriate to ask questions about the x86 here, or is this forum only for questions about the FASM assembler?
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 26 Jan 2012, 10:57
Hugh Aguilar wrote:
How big should I expect the data cache to be? I suppose this is going to vary from one processor to another, but what can I assume as a reasonable size for a typical processor?
There is no such thing as a typical processor. Cache can vary from 0B to more than 12MB. It also depends upon which level of cache you are thinking of: L1, L2 or L3.
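
If you want to know what the machine you are actually running on has, you can ask the CPU itself. A rough sketch using CPUID leaf 4 (Intel's deterministic cache parameters; AMD reports similar data through leaves 80000005h/80000006h), assuming a CPU new enough to support that leaf:
Code:
        mov     eax, 4          ; CPUID leaf 4: deterministic cache parameters
        xor     ecx, ecx        ; sub-leaf 0 is normally the L1 data cache
        cpuid
        mov     edx, ebx
        and     edx, 0FFFh
        inc     edx             ; EDX = line size in bytes        (EBX[11:0]  + 1)
        mov     esi, ebx
        shr     esi, 22
        inc     esi             ; ESI = ways of associativity     (EBX[31:22] + 1)
        mov     edi, ebx
        shr     edi, 12
        and     edi, 3FFh
        inc     edi             ; EDI = physical line partitions  (EBX[21:12] + 1)
        inc     ecx             ; ECX = number of sets            (ECX + 1)
        imul    edx, esi
        imul    edx, edi
        imul    edx, ecx        ; EDX = total cache size in bytes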
AsmGuru62



Joined: 28 Jan 2004
Posts: 1670
Location: Toronto, Canada
AsmGuru62 26 Jan 2012, 11:50
@Hugh Aguilar:

This link has some information on caches:
http://www.agner.org/optimize/

A case of premature optimization, perhaps?
Are you certain that "accessing that data a lot" is really what is slowing down your code?
Did you measure the speed of the code?
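
Measuring does not have to be fancy. A bare-bones RDTSC sketch (assuming the CPU has a usable timestamp counter; run it many times and keep the lowest reading to damp out interrupts and task switches):
Code:
        rdtsc                   ; EDX:EAX = timestamp counter before the code under test
        mov     [start_lo], eax
        mov     [start_hi], edx

        ; ... the code being measured goes here ...

        rdtsc
        sub     eax, [start_lo]
        sbb     edx, [start_hi] ; EDX:EAX = elapsed clock ticks

start_lo dd ?
start_hi dd ?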
edfed



Joined: 20 Feb 2006
Posts: 4353
Location: Now
edfed 26 Jan 2012, 18:26
revolution wrote:
Cache can vary from 0B to more than 12MB. It also depends upon which level of cache you are thinking of: L1, L2 or L3.


For the moment. Maybe there will be more cache layers, and more cache capacity (of course). What about an L0 cache? L-1, etc... Smile And a 4-gigabyte L0 cache.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 26 Jan 2012, 18:49
L0 would be the CPU registers?
edfed



Joined: 20 Feb 2006
Posts: 4353
Location: Now
edfed 26 Jan 2012, 21:07
Why not? If everything is a register, it can be fast. In other words, you completely delete all the layers and concentrate on the most important thing: the data-bus-to-ALU connection. Let's say you have an all-in-one CPU+RAM. No registers, just RAM. Add RAM to RAM, mov RAM to RAM, mul RAM by RAM, etc.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 27 Jan 2012, 01:58
The larger you make a memory structure, the slower it will be. 4G of CPU registers is never going to happen, because a smaller memory structure will always be more efficient for overall performance.
Hugh Aguilar



Joined: 15 Nov 2011
Posts: 62
Location: Arizona
Hugh Aguilar 27 Jan 2012, 04:35
AsmGuru62 wrote:
@Hugh Aguilar:

This link has some information on caches:
http://www.agner.org/optimize/

A case of premature optimization, perhaps?
Are you certain that "accessing that data a lot" is really what is slowing down your code?
Did you measure the speed of the code?


No, I didn't measure. It is not really meaningful to measure when interrupts may be occurring and the OS is preemptive. It might be possible to shut all of that off during the execution of a section of code if I knew how the OS worked --- the Menuet crowd may be able to comment on that.

In many cases, my whole program is oriented around processing data in a big data structure --- usually a linked list or a tree. All of the nodes are the same type of record. If that record fits in the L1 cache, then I'm pretty sure the program is going to be significantly faster than it would be if the record straddled a cache-line boundary.
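
For example, a node laid out roughly like this (the field names are made up) occupies exactly one 64-byte line, assuming that is the line size:
Code:
struc TREENODE {
  .key    dd ?
  .value  dd ?
  .left   dd ?            ; pointer to left child
  .right  dd ?            ; pointer to right child
  .pad    rb 64 - 16      ; pad the record out to a full 64-byte line
}

align 64
root TREENODE             ; one instance, aligned so it sits on its own line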

Although I am writing a program similar to what I described above, this is mostly a general question, because a lot of my programs are like this.

I haven't delved into Agner's writings yet, but I should --- a lot of people have said that it is the best place to start in learning about optimization, which is a subject I know very little about. I did read Abrash's two Zen books, but I'm told they are obsolete now.

Isn't it true, though, that nobody really knows how to optimize code nowadays, because the processor manufacturers don't tell anybody how their processors work internally? We have guidelines for optimization, but we don't know how much of an effect, if any, they have. We can measure, but that is just empirical evidence; it doesn't prove anything. This reminds me of the voodoo rituals that some folks do --- there is a lot of anecdotal evidence to indicate that they work --- but there is no proof and there never will be. Laughing
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 27 Jan 2012, 05:35
Hugh Aguilar wrote:
Isn't it true, though, that nobody really knows how to optimize code nowadays, because the processor manufacturers don't tell anybody how their processors work internally? We have guidelines for optimization, but we don't know how much of an effect, if any, they have. We can measure, but that is just empirical evidence; it doesn't prove anything. This reminds me of the voodoo rituals that some folks do --- there is a lot of anecdotal evidence to indicate that they work --- but there is no proof and there never will be. Laughing
I think that is somewhat true. Optimisation for many significant apps is really just cache-utilisation optimisation (with some pathological exceptions, of course). But it is not voodoo: there are deterministic things you can do to make the best of the CPU system the code is running on. It just takes a lot of time, patience and understanding to get it perfect. However, beware that in most situations there are diminishing returns: the more time you spend trying to optimise, the less you can save at each iteration and the more you lose by not actually having the code running and producing worthwhile results.
Hugh Aguilar



Joined: 15 Nov 2011
Posts: 62
Location: Arizona
Hugh Aguilar 27 Jan 2012, 08:02
revolution wrote:
Hugh Aguilar wrote:
Isn't it true, though, that nobody really knows how to optimize code nowadays, because the processor manufacturers don't tell anybody how their processors work internally? We have guidelines for optimization, but we don't know how much of an effect, if any, they have. We can measure, but that is just empirical evidence; it doesn't prove anything. This reminds me of the voodoo rituals that some folks do --- there is a lot of anecdotal evidence to indicate that they work --- but there is no proof and there never will be. Laughing
I think that is somewhat true. Optimisation for many significant apps is really just cache-utilisation optimisation (with some pathological exceptions, of course). But it is not voodoo: there are deterministic things you can do to make the best of the CPU system the code is running on. It just takes a lot of time, patience and understanding to get it perfect. However, beware that in most situations there are diminishing returns: the more time you spend trying to optimise, the less you can save at each iteration and the more you lose by not actually having the code running and producing worthwhile results.


The key word there is "understanding" --- at least on my part, this is sorely lacking. There is a huge difference between familiarity and understanding --- and the difference between the smart, the dumb, and the pragmatic is that the smart know what this difference is, the dumb don't know, and the pragmatic don't care.

Your phrase "diminishing returns" pretty much hits the nail on the head though. For the most part, there are no returns. Desktop software is given away for free (as in free beer). Software increases the value of the hardware that it is running on, but software has no value in itself. Crying or Very sad