flat assembler
Message board for the users of flat assembler.

 Index > Windows > Disable SMT for my process? Goto page Previous  1, 2, 3, 4, 5, 6, 7  Next
Author
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 03 Sep 2009, 19:45
But that is how I mean..
03 Sep 2009, 19:45
Borsuc

Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 03 Sep 2009, 19:46
Azu, in math, when studying asymptotic functions, you use a partial derivative, which is a measure of how a function changes if you change a parameter.

So let's say we have 3 functions, one has 1 add, second has 10000 adds, and the third has 1000000000000 adds. (doesn't matter, just a really big number). All of these, though, do not have a parameter: the number of additions is always constant, in this example.

Function 3 will always have 1000000000000 because it has no parameter to adjust the function. The derivative is 0. Hence the function is constant, hence its complexity is O(1). Yes, function 1 and function 3 are both O(1) as is function 2, not O(1), O(10000), and O(1000000000000) as you might think.

If you have a function that has a parameter, like let's say, the "number of items", then it's not constant anymore. You analyze the derivative to see how the function's timing changes if you vary the parameter.

If it varies linearly, that is, doubling the "number of items" for any number of items makes the execution 2 times slower, then it is O(N) because it has a linear relationship.

If it actually needs 4 times the amount of time for twice the "number of items" (again, for any number of items, not "exceptions"), then it is O(N^2) because it is a quadratic relationship: you see, N^2 is the time required for any N number of items, mathematically speaking.

It has nothing to do with execution time on a given CPU, it has everything to do with measuring the derivative and the number of steps it has to perform compared to the previous parameter value (that's what the derivative does). If it's the same, the difference is 0, the derivative is 0, and the function is O(1).

This is why ALL constant functions are O(1), no matter how many operations they have.

_________________
Previously known as The_Grey_Beast
03 Sep 2009, 19:46
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 03 Sep 2009, 19:48
In my example the parameter is how many additions you are doing..
03 Sep 2009, 19:48
Borsuc

Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 03 Sep 2009, 19:49
Then it is O(N)

O(1) means that it is O(1), no matter the parameter (number of additions in this case), which is false, because it isn't constant.

Bubble Sort is O(N^2), guess why.

_________________
Previously known as The_Grey_Beast
03 Sep 2009, 19:49
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 03 Sep 2009, 19:51
Because that algorithm must run N*N times!

If I have a SIMD instruction that can do up to 16 additions at once, then it is running 1 time, as long as 16 or less additions need done, where as non-SIMD way would need ran 16 times to do 16 additions, or 2 times to do 2 additions, or whatever.....
03 Sep 2009, 19:51
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 19873
Location: In your JS exploiting you and your system
revolution 03 Sep 2009, 19:52
Azu wrote:
In my example the parameter is how many additions you are doing..
Okay, so SIMD still does the same number of additions as non-SIMD. Doing them in parallel is an implementation thing, it has nothing to do with mathematical complexity. In my simple example on the previous page the complexity is still 4 whether in parallel or series, it makes no difference to the complexity value.
03 Sep 2009, 19:52
Borsuc

Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 03 Sep 2009, 19:52
@Azu: More correct would be to say that, because as you increase N, the amount of "steps" (let's keep it at abstract mathematical level) required increases by N^2. Or in other words, doubling N will quadruple the number of steps required.

Azu, it is about the number of "steps" needed. Doing them in parallel doesn't change anything, there's still 2 steps whether you do both at the same time, or whether you do them twice as fast.

_________________
Previously known as The_Grey_Beast
03 Sep 2009, 19:52
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 03 Sep 2009, 19:56
No no no

The SIMD instruction will do the same amount of work whether you utilize the max amount of inputs it can work on at once or just give it 1

It's just that the excess will be wasted

If it can do up to 16 at once and you give it say 3, it will be the same as if you give it 16.. it will take same amount of time, use same amount of resources, same complexity and everything. Different input but the complexity is constant.
03 Sep 2009, 19:56
Borsuc

Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 03 Sep 2009, 23:13
O(N) is theoretical mathematical representation -- there is no such thing as "waste" in math!

But that's irrelevant anyway. O(N) is asymptotical analyzation. This means as N tends to infinity you take the limit (and derivative). It's the mathematical term for how it "tends".

You know, with limits (lim x->oo, stuff like that), which is x^2 for N^2, and 1 for O(1), not even using x! (no parameter change after all)

Asymptotes
03 Sep 2009, 23:13
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 03 Sep 2009, 23:17
What I mean is, SIMD instructions do the same amount of work whether you are making full use of how much they can do or not. It doesn't vary based on how much work you need done. It can either handle however many operations you need or not. Hence O(1) whenever one of them is enough.
03 Sep 2009, 23:17
Borsuc

Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 04 Sep 2009, 01:17
Yes you are correct, but that has nothing to do with Big-O notation. See my post above why

_________________
Previously known as The_Grey_Beast
04 Sep 2009, 01:17
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 04 Sep 2009, 01:21
Okay.. so what notation should I use instead?

I just want to express how many times an algorithm needs ran based on how much stuff it works on.

Like this;

loopy:
mov al,byte[edi]
sub ecx,1
mov byte[edi],al
lea edi,[edi+1]
lea esi,[esi+1]
je loopy

Needs ran ecx times where as this;

movdqa xmm0,dqword[esi]
movdqa xmm1,dqword[edi]
movdqa dqword[edi],xmm0

Needs ran 1 time..

What is the correct notation that I should use to compare these?
04 Sep 2009, 01:21
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 19873
Location: In your JS exploiting you and your system
revolution 04 Sep 2009, 05:04
AFAIK there is no notation for such a thing. Mathematicians are generally not interested in linear speed-ups so I doubt they have developed any special notation for it (if someone knows otherwise I would be happy to learn). Only computer programmers seem interested in linear speed-ups, so perhaps the measure of runtime, or clock counts, or instruction count, or byte count, or something, is the best you can get.
04 Sep 2009, 05:04
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 04 Sep 2009, 07:24
Linear speadups? I'm talking about how many times you have to run the algorithm when you increase the number of inputs to it. That has absolutely nothing to do with runtime, clock counts, or instruction count. Look at my last example (before you ask, it's in the post right above yours).
04 Sep 2009, 07:24
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 19873
Location: In your JS exploiting you and your system
revolution 04 Sep 2009, 07:33
Azu wrote:
Linear speadups? I'm talking about how many times you have to run the algorithm when you increase the number of inputs to it. That has absolutely nothing to do with runtime, clock counts, or instruction count. Look at my last example (before you ask, it's in the post right above yours).
Your example is purely a linear speed relation. Double the number of inputs and you double the number of required operations. For specific cases of a specific input size then just divide by the loop size and round up.
04 Sep 2009, 07:33
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 04 Sep 2009, 07:36
No.. the required operations is constant as long as less than 17 inputs. Actually, it won't even work for >16 inputs, since I didn't put in any logic to handle that... so it's going to be constant no matter how many, lol. Where as the number of times the non-SIMD one will run is exactly how many inputs.

What is the notation for this? I think that, like I said, it should be "O(n)" for the non-SIMD one, and O(1) for the SIMD one if less than 17 inputs. Or
x=16;how many it can do at once
O(ceil(n/x))
to handle cases all the way up to infinitely, like I said a few pages back. (okay actually I changed it a little just now to make it more compact, but same result)

You all keep saying that I am using the wrong notation so please tell me what is the right one to use for this.
04 Sep 2009, 07:36
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 19873
Location: In your JS exploiting you and your system
revolution 04 Sep 2009, 07:46
Okay, so you have 16 operations per loop. That means:

0=1 loops
1-16=1 loop
17-32=2 loops
33-48=3 loops
etc.

That is how linear works.
04 Sep 2009, 07:46
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 04 Sep 2009, 07:47
For 0 through 16 it is 1 "loop" (not really a loop when it is just ran once though), like I said. Hence the "when N<17" qualifier I added to all my posts.

Last edited by Azu on 04 Sep 2009, 07:48; edited 1 time in total
04 Sep 2009, 07:47
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 19873
Location: In your JS exploiting you and your system
revolution 04 Sep 2009, 07:48
Azu wrote:
You all keep saying that I am using the wrong notation so please tell me what is the right one to use for this.
Azu, please don't keep asking the same question over and over. I have already answered you above. If someone else has a better answer then I am sure they will see your question and will answer if they feel inclined.
04 Sep 2009, 07:48
Azu

Joined: 16 Dec 2008
Posts: 1159
Azu 04 Sep 2009, 07:49
revolution wrote:
Azu wrote:
You all keep saying that I am using the wrong notation so please tell me what is the right one to use for this.
Azu, please don't keep asking the same question over and over. I have already answered you above. If someone else has a better answer then I am sure they will see your question and will answer if they feel inclined.
You answered a question I never asked. I'm asking for the one for how many times the whole thing needs ran/looped/whatever based on the number of inputs. The "linear speed-up" you mentioned would be a comparison between these
Code:
```;func 1
loopy:
loop loopy
;func 2
loopy:
sub ecx,1
je loopy    ```
04 Sep 2009, 07:49
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First

 Jump to: Select a forum Official----------------AssemblyPeripheria General----------------MainTutorials and ExamplesDOSWindowsLinuxUnixMenuetOS Specific----------------MacroinstructionsOS ConstructionIDE DevelopmentProjects and IdeasNon-x86 architecturesHigh Level LanguagesProgramming Language DesignCompiler Internals Other----------------FeedbackHeapTest Area
Goto page Previous  1, 2, 3, 4, 5, 6, 7  Next

Forum Rules:
 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot vote in polls in this forumYou cannot attach files in this forumYou can download files in this forum