flat assembler
Message board for the users of flat assembler.
Weird code output by GCC, what's faster?
Daedalus 12 Jun 2008, 20:25
Hi guys,
I had some problems today with "speeding up" my algorithm. I figured you guys of all people would know best what the fastest code is, so I'm hoping you can help me. I had this C code:
Code:
for(i = 0; i < 8; i++){
    curPos[0] = knightPos[0] + horizontal[i];
    if(curPos[0] > 7){
        continue;
    }
    else{
        curPos[1] = knightPos[1] + vertical[i];
        if(curPos[1] > 7){
            continue;
        }
        else{
            if(board[curPos[0] + curPos[1]*8] == 0){
And I figured I could replace it with this:
Code:
for(i = 0; i < 8; i++){
    curPos[0] = knightPos[0] + horizontal[i];
    if(curPos[0] < 8){
        curPos[1] = knightPos[1] + vertical[i];
        if(curPos[1] < 8){
            if(board[curPos[0] + curPos[1]*8] == 0){
The second version is a lot slower (about 30% on my system), and the code GCC generates for it is very different from the code generated for the first version. I've attached the generated ASM (disassembly copy-pasted from OllyDbg). I was hoping you could take a look at it and explain to me why one piece of assembly is faster than the other, both in this specific case and in general.
I tried some googling, but found it quite hard to pinpoint information on speeding up algorithms. I still know a few things, but I'm not sure whether they are obsolete by now. One thing I can think of is branch prediction. How can I predict when branch prediction will be bad? And when will the code generated for one piece of C be slower than the code for another piece of C that looks a lot like it? I've also posted this question on a C forum, in case you're wondering.
In a nutshell: how can I be semi-sure that some code, ASM or C, is going to be fast?
Thanks In Advance,
Daedalus 12 Jun 2008, 21:13
Thank you for the reference to Agner Fog. I'll finish my C++ course and then take a look at his books for sure. Maybe the other ones as well. I'm afraid the best test is indeed compiling and checking, but it was rather frustrating to code for an hour and then notice that the code I have with caching is actually slower than the code without all the fancy "optimizations" I had coded. Life sucks.
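"Compiling and checking" can be made a little more systematic with a small timing harness. The following is only a sketch, not the original program: the function names, the int offset tables, and the use of unsigned coordinates (so that a negative result wraps around and fails the "> 7" test) are all assumptions, since the first post shows neither the declarations nor the loop body. Here each loop simply counts the empty target squares.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical knight offsets, named after the arrays in the first post. */
static const int horizontal[8] = {  1,  2, 2, 1, -1, -2, -2, -1 };
static const int vertical[8]   = { -2, -1, 1, 2,  2,  1, -1, -2 };

/* Variant A: early 'continue' when a coordinate is off the board. */
int count_moves_continue(const uint8_t board[64], unsigned x, unsigned y)
{
    int count = 0;
    for (int i = 0; i < 8; i++) {
        /* A negative offset wraps to a huge unsigned value, which is > 7. */
        unsigned cx = x + (unsigned)horizontal[i];
        if (cx > 7) continue;
        unsigned cy = y + (unsigned)vertical[i];
        if (cy > 7) continue;
        if (board[cx + cy * 8] == 0) count++;
    }
    return count;
}

/* Variant B: nested 'if' blocks, same logic. */
int count_moves_nested(const uint8_t board[64], unsigned x, unsigned y)
{
    int count = 0;
    for (int i = 0; i < 8; i++) {
        unsigned cx = x + (unsigned)horizontal[i];
        if (cx < 8) {
            unsigned cy = y + (unsigned)vertical[i];
            if (cy < 8) {
                if (board[cx + cy * 8] == 0) count++;
            }
        }
    }
    return count;
}

/* Run one variant over all 64 squares 'reps' times; return CPU seconds.
   The running total is written to *sink so the work cannot be discarded. */
double time_variant(int (*fn)(const uint8_t *, unsigned, unsigned),
                    const uint8_t board[64], long reps, long *sink)
{
    long total = 0;
    clock_t start = clock();
    for (long r = 0; r < reps; r++)
        for (unsigned sq = 0; sq < 64; sq++)
            total += fn(board, sq % 8, sq / 8);
    clock_t end = clock();
    *sink = total;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_variant once per variant with a large reps value, repeating the measurement a few times, and building with the same compiler flags as the real program gives a rough comparison. The numbers are only meaningful for the machine and compiler they were measured on.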
kohlrak 13 Jun 2008, 06:20
No more giving back...
Last edited by kohlrak on 07 Aug 2008, 14:34; edited 1 time in total
bitRAKE 13 Jun 2008, 15:17
Use a bit lookup table to reduce the code to a single branch and minimize memory usage. Complex unpredictable branching is very slow on the processor, and should be worked out of the design. Google for "bit board chess".
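As a rough illustration of the bit-lookup idea, here is a minimal sketch under several assumptions: the board is a 64-bit mask with bit (x + y*8) set for each occupied square, and the names knight_attacks, init_knight_attacks, knight_targets and count_bits are hypothetical, not taken from the thread. The bounds checks run once while the table is built, so the per-move work becomes one table load and one AND.

```c
#include <stdint.h>

/* One 64-bit mask per square: the squares a knight on it can reach. */
static uint64_t knight_attacks[64];

/* Build the table once at startup; all range checks happen here. */
void init_knight_attacks(void)
{
    static const int dx[8] = {  1,  2, 2, 1, -1, -2, -2, -1 };
    static const int dy[8] = { -2, -1, 1, 2,  2,  1, -1, -2 };
    for (int sq = 0; sq < 64; sq++) {
        uint64_t mask = 0;
        for (int i = 0; i < 8; i++) {
            int x = sq % 8 + dx[i];
            int y = sq / 8 + dy[i];
            if (x >= 0 && x < 8 && y >= 0 && y < 8)
                mask |= (uint64_t)1 << (x + y * 8);
        }
        knight_attacks[sq] = mask;
    }
}

/* Empty squares a knight on 'sq' can move to, given the occupied squares. */
uint64_t knight_targets(int sq, uint64_t occupied)
{
    return knight_attacks[sq] & ~occupied;
}

/* Portable population count: clears the lowest set bit on each pass. */
int count_bits(uint64_t b)
{
    int n = 0;
    while (b) {
        b &= b - 1;
        n++;
    }
    return n;
}
```

After init_knight_attacks() has run, knight_targets(sq, occupied) replaces the whole eight-iteration loop. The only data-dependent branch left is the loop that walks the set bits of the result, if the caller needs the individual squares.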
_________________ ¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup |
AlexP 14 Jun 2008, 17:21
The Agner Fog manuals and the Intel Optimization Manual have been a life-saver for me as well. How to get the most speed out of your code depends on which processor you're using.
dap 14 Jun 2008, 19:35
bitRAKE 15 Jun 2008, 01:23
In the video he says, "...family ten.." for the text "family 10h".
eskizo 02 Jul 2009, 06:33
at least 'curPos[1]*8' --> 'curPos[1] << 3'
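For reference, the two spellings side by side, as a small sketch with hypothetical function names. Note that + binds tighter than << in C, so the shift needs its own parentheses when it is substituted into the index expression. An optimizing compiler will usually emit the same instructions for both forms, so any difference is worth measuring rather than assuming.

```c
/* Index into an 8x8 board stored row by row, written two ways. */
unsigned index_mul(unsigned x, unsigned y)
{
    return x + y * 8;
}

unsigned index_shift(unsigned x, unsigned y)
{
    /* Parentheses are required: x + y << 3 would parse as (x + y) << 3. */
    return x + (y << 3);
}
```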
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.