flat assembler
Message board for the users of flat assembler.
Azu 30 Mar 2009, 17:42
buzzkill wrote:
buzzkill wrote:
revolution 30 Mar 2009, 17:44
Well, since you are discussing L1 cache then you should also realise the importance of L1 cache line size and the read-allocate policy. If your usual string length is <= the L1 line size and you align the strings on cache line boundaries then you get the whole string in the cache, whether you want it or not, just by reading one byte of it.
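A minimal sketch in C of the alignment idea above, not taken from the thread. It assumes a 64-byte L1 line (the figure buzzkill gives for his CPU further down; other CPUs differ), and the helper names `dup_line_aligned` and `fits_one_line` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 64-byte L1 cache lines. Check your own CPU. */
#define CACHE_LINE 64

/* Hypothetical helper: copy s into a buffer that starts on a cache-line
   boundary, so a string shorter than CACHE_LINE sits in a single line.
   Returns NULL on allocation failure; release the result with free(). */
static char *dup_line_aligned(const char *s)
{
    size_t need = strlen(s) + 1;
    /* C11 aligned_alloc wants a size that is a multiple of the alignment. */
    size_t size = (need + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
    char *p = aligned_alloc(CACHE_LINE, size);
    if (p != NULL)
        memcpy(p, s, need);
    return p;
}

/* Hypothetical helper: true when the len bytes starting at p all fall
   inside one cache line, so touching one byte pulls in all of them. */
static int fits_one_line(const void *p, size_t len)
{
    uintptr_t first = (uintptr_t)p / CACHE_LINE;
    uintptr_t last = ((uintptr_t)p + len - 1) / CACHE_LINE;
    return len > 0 && first == last;
}
```

The cost is the padding: every string occupies at least one full line, which is the "large holes in your data" trade-off raised later in the thread.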
Azu 30 Mar 2009, 17:46
revolution wrote: Well since you are discussing L1 cache then you should also realise the importance of L1 cache line size and the read-allocate policy. If your usual string length is <= the L1 line size and you align the strings on cache line boundaries then you get the whole string in the cache, whether you want it or not, just by reading one byte of it.
Last edited by Azu on 30 Mar 2009, 17:47; edited 2 times in total
revolution 30 Mar 2009, 17:46
Azu wrote: ... use a register (which is part of the L1 cache right?
buzzkill 30 Mar 2009, 17:47
Quote: I don't know about Pascal strings.
By this I mean strings with a length byte prepended. These were the strings in the Pascal language, so I call them Pascal(-style) strings.
Quote:
Well, there is for instance the scas(b) instruction if you want to search for the null byte to determine string length with a C-string. As for copying, I guess with a Pascal-string you could read the length byte and then movs(b) the string itself, but whether that would be a great performance increase I don't know. For comparison, the length byte only gives an advantage if your two strings have different lengths, and even then you need to consider what you call equal (i.e. "hello" vs "hello "). Also, you will want e.g. the position in the strings where they differ, so you'll have to traverse them anyway.
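To make the comparison concrete, here is a rough C sketch, not from the thread. The `pstr` layout (one length byte, up to 255 characters, no terminator) and all function names are assumptions for illustration. The byte scan in `cstr_len` corresponds roughly to what `repne scasb` does, and the block copy in `pstr_copy` to loading the count and using `rep movsb`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical Pascal-style layout: length byte first, no terminator. */
typedef struct {
    unsigned char len;
    char data[255];
} pstr;

/* C-string length: walk the bytes until the null terminator is found. */
static size_t cstr_len(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Build a pstr from a C string, truncating at 255 characters. */
static pstr pstr_from(const char *s)
{
    pstr p;
    size_t n = cstr_len(s);
    if (n > 255)
        n = 255;
    p.len = (unsigned char)n;
    memcpy(p.data, s, n);
    return p;
}

/* Pascal-string copy: read the length byte, then copy exactly that many
   bytes. No scan for a terminator is needed. */
static void pstr_copy(pstr *dst, const pstr *src)
{
    dst->len = src->len;
    memcpy(dst->data, src->data, src->len);
}

/* Index of the first differing byte, or -1 if the strings are equal.
   Differing length bytes prove inequality at once, but finding where the
   strings differ still means walking the common prefix. */
static long pstr_first_diff(const pstr *a, const pstr *b)
{
    size_t n = a->len < b->len ? a->len : b->len;
    for (size_t i = 0; i < n; i++)
        if (a->data[i] != b->data[i])
            return (long)i;
    return a->len == b->len ? -1 : (long)n;
}
```

With "hello" and "hello " the length bytes (5 and 6) already differ, but `pstr_first_diff` still has to compare five bytes before it can report position 5, which is the point about having to traverse the strings anyway.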
Azu 30 Mar 2009, 17:50
buzzkill wrote:
buzzkill wrote:
buzzkill wrote: As for copying, I guess with a Pascal-string you could read the length byte and then movs(b) the string itself, but whether that would be a great performance increase I don't know. For comparison, the length byte only gives an advantage if your two strings have different lengths, and even then you need to consider what you call equal (i.e. "hello" vs "hello "). Also, you will want e.g. the position in the strings where they differ, so you'll have to traverse them anyway.
revolution wrote:
revolution 30 Mar 2009, 17:54
Azu wrote: Thanks. It takes more time to scan through the whole string than the first byte of it though, even in the L1 cache, doesn't it?
If you are really determined to get peak performance out of string operations then you have to know about caches, memory bandwidths, etc. The CPU is only a small part of optimising code for best performance; you also have to get data into and out of the CPU to make it useful, and the path in/out is the memory interface.
buzzkill 30 Mar 2009, 17:56
revolution wrote: Well since you are discussing L1 cache then you should also realise the importance of L1 cache line size and the read-allocate policy. If your usual string length is <= the L1 line size and you align the strings on cache line boundaries then you get the whole string in the cache, whether you want it or not, just by reading one byte of it.
That's very interesting, revolution, I didn't know that. Lining up strings to cache-line sizes is not something I have seen done before (at least not that I've noticed; I wonder if any HLL compilers take this into account). For my CPU, cache line size is 64 bytes, so I personally wouldn't align to that, I think (either large holes in your data, or you'd have to rearrange other data). And whether or not typical string usage in a program would be strlen() <= 64 would, I think, be hard to say beforehand. Nevertheless, you've given me something to think about.
Azu 30 Mar 2009, 17:57
revolution wrote:
And for strings that don't fit in L1 wouldn't the scanning still be a lot slower?
buzzkill 30 Mar 2009, 18:00
Quote:
Not twice as much:
C-string: append new data to end of string.
Pascal-string: append new data to end of string and update string length.
Quote:
That depends, if we take into account what revolution said about cache lines.
Quote:
I didn't say that. If you really want to know for sure, first study some string-processing algorithms, then take some benchmarks...
*Edit* E.g., with length byte strings, you may have to keep the string length in a register (as mentioned above) and that can affect the efficiency of your algorithm; x86 is not known as register-starved for nothing.
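A hedged C sketch of the two append operations compared above, not from the thread. The `pstr` layout and both function names are assumptions for illustration. Note that in this sketch the C-string version also has to find the current end and write a new terminator, while the Pascal-style version already knows the end and only updates the length byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical Pascal-style layout: length byte first, no terminator. */
typedef struct {
    unsigned char len;
    char data[255];
} pstr;

/* C-string append: locate the end by scanning for the terminator, then
   copy the new data together with its terminator.
   cap is the total size of dst. Returns 0 on success, -1 if no room. */
static int cstr_append(char *dst, size_t cap, const char *src)
{
    size_t dlen = strlen(dst);
    size_t slen = strlen(src);
    if (dlen + slen + 1 > cap)
        return -1;
    memcpy(dst + dlen, src, slen + 1);
    return 0;
}

/* Pascal-string append: the end is known from the length byte, so copy
   the new data there and update the length.
   Returns 0 on success, -1 if the result would exceed 255 characters. */
static int pstr_append(pstr *dst, const char *src, size_t slen)
{
    if (slen > sizeof dst->data - dst->len)
        return -1;
    memcpy(dst->data + dst->len, src, slen);
    dst->len = (unsigned char)(dst->len + slen);
    return 0;
}
```

Which of the two is cheaper in practice depends on string lengths and on where the data sits in the cache, so it is something to measure rather than assume.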
Azu 30 Mar 2009, 18:03
buzzkill wrote:
Not putting the length in the front AND putting a null on the end.
buzzkill wrote:
buzzkill wrote:
|
revolution 30 Mar 2009, 18:09
Azu: Do you have the app ready, debugged and running now? If not then I would respectfully suggest that you are trying to optimise too early.
The basic rule is always: Get it working, then get it fast.
If you have already passed the working stage then you can start to characterise the usage model and start to give real world figures and ratios etc. Only then can you start to answer the sample Q's that I proposed some posts back there somewhere.
Also, if your app is already working then you can just code up a few variations and test to see which is fastest in your runtime situation. Remember that other people's computers may give different results. With differing cache sizes and differing CPUs etc. the results are almost certain to be different. One size does not fit all!
I cannot tell you what will be faster for you, because your situation will be very specific and I cannot duplicate it here to tell what is best for performance.
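As a rough illustration of "code up a few variations and test", here is a small C timing sketch, not from the thread. The names `len_fn`, `len_bytewise`, `len_libc` and `time_variant` are hypothetical, and `clock()` is a coarse CPU-time counter, so treat the numbers as a first indication only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Common signature for the variations under test (hypothetical). */
typedef size_t (*len_fn)(const char *);

/* Variation 1: plain byte-by-byte scan for the terminator. */
static size_t len_bytewise(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Variation 2: whatever the C library provides. */
static size_t len_libc(const char *s)
{
    return strlen(s);
}

/* Call fn on s iters times and return the elapsed CPU time in seconds.
   The results are accumulated into a volatile so that the compiler is
   less likely to discard the calls. */
static double time_variant(len_fn fn, const char *s, long iters)
{
    volatile size_t sink = 0;
    clock_t t0 = clock();
    for (long i = 0; i < iters; i++)
        sink += fn(s);
    clock_t t1 = clock();
    (void)sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling `time_variant(len_bytewise, s, n)` and `time_variant(len_libc, s, n)` with strings taken from the real workload gives figures for that machine only; as noted above, other cache sizes and CPUs are likely to rank the variations differently.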
Azu 30 Mar 2009, 18:12
revolution wrote: Azu: Do you have the app ready, debugged and running now? If not then I would respectfully suggest that you are trying to optimise too early.
buzzkill 30 Mar 2009, 18:17
Quote:
OT: you could consider this an argument against very low-level optimizations in assembly, of course.
Azu 30 Mar 2009, 18:19
buzzkill wrote:
|
revolution 30 Mar 2009, 18:20
Azu wrote: I want to code it right to begin with instead of finishing it only to find out that I have to rewrite the entire thing because I chose a sub-optimal foundation.
Last edited by revolution on 30 Mar 2009, 18:24; edited 1 time in total
Azu 30 Mar 2009, 18:22
revolution wrote:
|
revolution 30 Mar 2009, 18:23
buzzkill wrote:
|
Azu 30 Mar 2009, 18:24
revolution wrote:
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.