flat assembler
Message board for the users of flat assembler.

Code alignment optimization

Fulgurance



Joined: 27 Nov 2017
Posts: 215
Hello, today I'm thinking about code alignment. I have read that some processors (I think this is true for Intel) execute code faster if it is aligned correctly.

What is the rule? Is it the same for all Intel machines? How do the most widely used OSes handle alignment?
Post 04 Feb 2020, 23:35
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17715
Location: In your JS exploiting you and your system
There isn't any fixed rule. :(

Sometimes, in some code, you can measure a speed improvement when you align procedure entries and loop entries to a multiple of some power of two. But it is not guaranteed to help. Sometimes it harms performance.
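To make "align to a power-of-two multiple" concrete, here is a small sketch of the arithmetic an assembler does when it pads up to an aligned address. The function names are made up for illustration, and Python is used only to show the calculation; it says nothing about whether the padding helps on your CPU.

```python
def padding_to_align(address: int, alignment: int) -> int:
    """Bytes of padding needed to make `address` a multiple of `alignment`."""
    # The bit trick below only works for powers of two (1, 2, 4, 8, 16, ...).
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return -address & (alignment - 1)


def align_up(address: int, alignment: int) -> int:
    """First address >= `address` that is a multiple of `alignment`."""
    return address + padding_to_align(address, alignment)
```

For example, `padding_to_align(0x1003, 16)` gives 13, so a loop entry at 0x1003 would move to 0x1010. Those 13 bytes are filler that still takes up space in the code cache, which is one reason aligning everything can cost more than it gains.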

Blindly aligning every entry point is almost certainly not going to benefit you for any non-trivial program. You will have to selectively target certain parts and have functioning performance measurement code running to see the effects.

Also, it is an iterative process. Changing one part of the code can affect other parts.

And worse, every system/CPU/mobo variant will show different results. Unfortunately you can't apply an optimisation on one system and expect it to have the same effect on all other systems. Each system needs its own testing procedure to see what works for it.
Post 05 Feb 2020, 05:37
Hrstka



Joined: 05 May 2008
Posts: 19
Location: Czech republic
I would say that code alignment can bring some benefit only when the code is executed many times. So it's not necessary to align the start of a function; it's more meaningful to align the start of a loop instead. Anyway, the OS doesn't manage any alignment, it just loads the code stored in the executable file. The alignment must be performed by the compiler (or assembler).
Post 05 Feb 2020, 12:53
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17715
Location: In your JS exploiting you and your system
Also note that the kind of improvement you might see is probably of the order of a few percent at most.

If this kind of improvement is significant to you then I recommend setting up a robust measuring system to make sure you aren't misled by transient effects and external events causing jitter.
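As a rough sketch of the statistics side of such a measuring system: assume you have already collected many raw timings (cycle counts or nanoseconds) for two builds of the same routine. The function names and the 1% threshold are hypothetical choices for illustration, not an established method. The idea is to compare medians and ignore any difference that is smaller than the measured noise.

```python
import statistics


def summarize(samples):
    """Reduce raw timing samples to statistics that tolerate outliers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    median = statistics.median(ordered)
    # Median absolute deviation: a spread estimate that a few large
    # spikes (interrupts, task switches) barely move.
    mad = statistics.median([abs(s - median) for s in ordered])
    return {"min": ordered[0], "median": median, "mad": mad}


def candidate_is_faster(baseline, candidate, min_gain=0.01):
    """True only if the candidate's median gain exceeds both the noise
    and `min_gain` (a fraction of the baseline median)."""
    b, c = summarize(baseline), summarize(candidate)
    gain = b["median"] - c["median"]
    noise = max(b["mad"], c["mad"])
    return gain > noise and gain > min_gain * b["median"]
```

With `baseline = [1000, 1002, 998, 1001, 5000]` and `candidate = [960, 962, 958, 961, 4000]`, the spikes at 5000 and 4000 do not affect the verdict and the candidate is reported as faster. Comparing a set of samples against itself reports no gain.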
Post 05 Feb 2020, 15:46
donn



Joined: 05 Mar 2010
Posts: 199
https://developer.amd.com/wp-content/resources/56305_SOG_3.00_PUB.pdf
Quote:
2.9 Instruction Fetch and Decode

"The pick window is 32 bytes, aligned on a 16-byte boundary. Having 16 byte aligned branch targets gets maximum picker throughput and avoids end-of-cacheline short op cache (OC) entries."


Is this 'pick window' just an internal structure in the processor, or does it relate to how the instruction bytes you generate from an .asm/.inc file are laid out?


Obviously this 'recommendation' is related to AMD and one family of processors, but there are alignment 'tips' sprinkled throughout this doc.

Some are pretty crazy:

Quote:
"Only the first pick slot (of 4) can pick instructions greater than eight bytes in length. Avoid having more than one instruction in a sequence of four that is greater than eight bytes in length."
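A rule like this could be checked mechanically against an assembler listing. Below is a minimal sketch, assuming you already have the encoded length in bytes of each instruction in program order. The function name is made up, and treating "a sequence of four" as every sliding window of four consecutive instructions is my own conservative reading, not something the AMD guide spells out.

```python
def long_instruction_clusters(lengths, window=4, limit=8):
    """Return the start index of every run of `window` consecutive
    instructions that holds more than one instruction longer than
    `limit` bytes."""
    hits = []
    # max(..., 1) makes sure sequences shorter than `window` are still
    # examined once as a single (short) window.
    for start in range(max(len(lengths) - window + 1, 1)):
        run = lengths[start:start + window]
        if sum(1 for n in run if n > limit) > 1:
            hits.append(start)
    return hits
```

For the lengths `[3, 9, 2, 10, 4, 1, 1, 11]` this flags the windows starting at instructions 0 and 1, because the 9-byte and 10-byte instructions fall within four instructions of each other.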
Post 05 Feb 2020, 16:58

Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
