flat assembler
Message board for the users of flat assembler.
Index
> Windows > Disable SMT for my process? Goto page Previous 1, 2, 3, 4, 5, 6, 7 Next |
Author |
|
Azu 04 Sep 2009, 07:55
Let me make a simpler example
Code: ;func 1 loopy: lea ecx,[ecx] lea ecx,[ecx] lea ecx,[ecx] lea ecx,[ecx] lea ecx,[ecx] lea ecx,[ecx] lea ecx,[ecx] lea ecx,[ecx] loop loopy ;func 2 loopy: sub ecx,1 je loopy Code: ;func 1 loopy: nop loop loopy ;func 2 nop nop nop nop nop nop nop nop nop nop ;func 3 mov eax,ecx loopyA: loopyB: mov ecx,eax nop loop loopyB sub eax,1 je loopyA Last edited by Azu on 04 Sep 2009, 08:02; edited 1 time in total |
|||
04 Sep 2009, 07:55 |
|
revolution 04 Sep 2009, 08:01
Well you asked the same question again (3rd time now), and rather than try to answer in yet another different way I just want to know if this is even remotely an answer to what you want to know:
revolution wrote: AFAIK there is no notation for such a thing. Mathematicians are generally not interested in linear speed-ups so I doubt they have developed any special notation for it (if someone knows otherwise I would be happy to learn). Only computer programmers seem interested in linear speed-ups, so perhaps the measure of runtime, or clock counts, or instruction count, or byte count, or something, is the best you can get. |
|||
04 Sep 2009, 08:01 |
|
Azu 04 Sep 2009, 08:03
I'm not asking about linear speed-ups. I'm asking about how many times you need to run the whole algorithm as n (ecx) changes. Look at the examples I gave.. how can you say their differences are linear?
P.S. here is an article that uses big-O the same way I do! http://en.wikipedia.org/wiki/Bubble_sort You should fix it, then less people will be confused. Because that is how I learned the big-O notation. From that page, and a bunch of other pages they have about different sorting algorithms. |
|||
04 Sep 2009, 08:03 |
|
revolution 04 Sep 2009, 08:12
Sorry, Azu your question is very vague and seems to change each time. You first wanted to argue that SIMD changes O(n) into O(1) in certain cases. It was shown to you that this was wrong and you agreed. Now you seem to be asking what is the notation for the algorithm (not the implementation) in which case the O(something) is the proper notation. But you cannot confuse all this with SIMD or 64 vs 32 bit registers or something. Your previous example with a 16-way SIMD vs 1-way loop showed linear speed up, so I answered based upon that. But using SIMD is not an algorithmic change.
|
|||
04 Sep 2009, 08:12 |
|
Azu 04 Sep 2009, 08:16
???
Yes it was If you gave it 0 or 1 inputs, it would have been slower in relation to the non-SIMD way posted. 2 to 16 would have been faster in relation to it. All 0-16 would run the exact same amount of stuff, once. No loop or anything. How is that linear speedup? Why isn't it O(1)? And when did I agree to any kind of wrongness about what I said in this thread? I made a different example because you asked for another one, not because I disagreed with the SIMD vs non-SIMD ones I posted. (damn, you're going to quote those last 10 words out of context, aren't you? Well I don't feel like re-wording it sorry, so I'll just leave this note here to call you on it if you do) |
|||
04 Sep 2009, 08:16 |
|
revolution 04 Sep 2009, 08:21
The big-O notation is not a measure of speed. O(1) is not necessarily faster than O(N), the speed depends upon the implementation details. SIMD has nothing to do with O(anything), it only confuses the issue.
1. Big-O measures algorithmic performance. 2. SIMD give you linear speed-up. Those two points above are completely unrelated to each other and neither affects the other. |
|||
04 Sep 2009, 08:21 |
|
Azu 04 Sep 2009, 08:24
I am not wanting to measure speed. I have mentioned earlier, and just now, that the SIMD way is slower in certain situations. But algebraically.. it only is ran one time as long as n<17.. so in that situation I think it should be O(1), where as the non-SIMD way is always O(n).. since it has to run as many times as there are inputs no matter what.. regardless of which is faster, of how much RAM they take, of how many clock cycles they complete it in, or any of that other linear stuff you keep bringing up.
I just don't understand how changing the algorithm from a loop to a single constant block of code ran once, is a linear speedup and not an algebraic one. Can you please explain to me why this is instead of just saying it is? |
|||
04 Sep 2009, 08:24 |
|
revolution 04 Sep 2009, 08:32
Azu: I don't know how to word any differently than the previous messages. So I will try to ask some basic questions to see where your understanding is going wrong.
Do you understand that SIMD does not change the algorithm? Do you understand that big-O does not measure implementation speed? |
|||
04 Sep 2009, 08:32 |
|
Azu 04 Sep 2009, 08:35
No and yes, respectively.
I think that this algorithm; Azu wrote:
Azu wrote:
I think that algorithmic efficiency = how many times it has to run based on how many inputs you give it and linear efficiency = how many clock cycles (or whatever) it takes to run |
|||
04 Sep 2009, 08:35 |
|
revolution 04 Sep 2009, 08:40
In your example the algorithm is the same. The algorithm is to take a byte, add it to another byte, store the result. But the SIMD has not changed the algorithm, it merely parallelised it. It still takes a byte, add it to another byte, store the result, but doing 16 elements at a times. The algorithm describes what to do (adding bytes) - not how to do it, or what order to do it in.
|
|||
04 Sep 2009, 08:40 |
|
Azu 04 Sep 2009, 08:42
So these should all have the same big-O notation then? Since they "all the same thing just in different ways"???
Azu wrote:
And these should have different ones? Since even though they run the same amount of times, they do different things? Code: ;func 1 loopy: inc eax loop loopy ;func 2 loopy: xor eax,eax loop loopy I don't like this.. I want to have a notation that I can use for saying how many iterations something must make to handle how many inputs.. I thought big-O was the one but you have proved me wrong.. |
|||
04 Sep 2009, 08:42 |
|
revolution 04 Sep 2009, 08:45
Azu wrote: So these should all have the same big-O notation then? Since they "all the same thing just in different ways"??? But you have shown three different algorithms. 1) Loop O(N), 2) Lookup (i assume?) O(1), 3) loop within loop O(N^2). But can you see that using SIMD in any of those three function would not change the algorithms. |
|||
04 Sep 2009, 08:45 |
|
Azu 04 Sep 2009, 08:48
Oh and also, all sorting algorithms do the same thing in the end, just using different number of iterations, so.. they should all have the same big-O notation?
And jump tables should have the same big-O as looping through a list of numbers, comparing them one by one, and jumping when finding one that matches? Since it does the same thing in the end just with different number of loops? I really really hate big-O now.. it is completely useless if it is how you describe... |
|||
04 Sep 2009, 08:48 |
|
revolution 04 Sep 2009, 08:52
Azu wrote: Oh and also, all sorting algorithms do the same thing in the end, just using different number of iterations, so.. they should all have the same big-O notation? |
|||
04 Sep 2009, 08:52 |
|
Azu 04 Sep 2009, 08:54
But my example of SIMD vs non-SIMD uses different operations. And different algorithms (one loops n times, one runs once).. but you said they have the same big-O notation since in the end they both do the same thing..
|
|||
04 Sep 2009, 08:54 |
|
revolution 04 Sep 2009, 09:01
Azu wrote: But my example of SIMD vs non-SIMD uses different operations. And different algorithms (one loops n times, one runs once).. but you said they have the same big-O notation since in the end they both do the same thing.. |
|||
04 Sep 2009, 09:01 |
|
Azu 04 Sep 2009, 09:04
So shouldn't the SIMD way be O(1) until n>16 ( at which point it would be O(ceiling(n/16)) )?
|
|||
04 Sep 2009, 09:04 |
|
Tomasz Grysztar 04 Sep 2009, 09:06
revolution wrote:
You cannot talk about O() notation when clearly your data size is limited. O() notation is the asimptotic one, and therefore requires algorithms working on data of arbitrary size. And even if you use lookup tables of 128x128 bit size, you'd have to do this for every 128-bit part of the input and thus get O(N) in the end, anyway. The same mistake Azu made, when he said "only needs ran once (as long as n<17)". Well, the point is, when you take assumption like "n<17", you can not say anything about O() complexity. The O() notation applies only to the case, when your algorithm is able to handle input of ANY size. And clearly, to process the hundred megabytes of data, you'd still have to run your SIMD instruction in a loop. |
|||
04 Sep 2009, 09:06 |
|
revolution 04 Sep 2009, 09:06
Azu wrote: So shouldn't the SIMD way be O(1) until n>16 ( at which point it would be O(ceiling(n/how many it does at once)) )? |
|||
04 Sep 2009, 09:06 |
|
Goto page Previous 1, 2, 3, 4, 5, 6, 7 Next < Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.