flat assembler
Message board for the users of flat assembler.
Main > cost of branch mis-prediction
anbyte 03 Mar 2026, 12:45
Someone told me recently about the book “Algorithms for Modern Hardware” by Sergey Slotin, which has a chapter that covers branching. It seems very comprehensive, with information on a lot of other optimization concepts for modern architectures (and since we're talking about performance engineering, it's worth mentioning the Computer, Enhance! course by Casey Muratori).
These seem like good-quality resources, so I thought I'd share them for anyone interested.
AsmGuru62 03 Mar 2026, 13:05
I think it would be a good project to do: measure the performance of a correctly predicted path against a mispredicted one.
It will not be easy, though: how do you avoid the impact of the other instructions? The book by Slotin is indeed a very nice one.
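One possible way to keep the other instructions out of the result is to run exactly the same loop over exactly the same values twice, once in random order and once sorted, so that only the predictability of the branch changes. The C sketch below is a rough illustration of that idea, not a calibrated benchmark: the names `sum_above`, `time_passes` and `compare_sorted_vs_shuffled`, the threshold of 128 and the fixed seed are all assumptions made for the example, and `clock()` is a coarse timer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Sum of the elements >= threshold. The `if` is the branch under test.
   The volatile accumulator is meant to discourage the compiler from
   replacing the branch with a conditional move or SIMD code; check the
   generated assembly, this is not guaranteed. */
long long sum_above(const int *data, size_t n, int threshold)
{
    volatile long long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= threshold)
            sum += data[i];
    }
    return sum;
}

/* Three-way comparison for qsort without integer overflow. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Time `reps` passes over the array after one untimed warm-up pass.
   Returns elapsed CPU seconds; the accumulated sum goes to *out. */
static double time_passes(const int *data, size_t n, int reps, long long *out)
{
    long long total = 0;
    (void)sum_above(data, n, 128);          /* warm-up, result ignored */
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        total += sum_above(data, n, 128);
    clock_t t1 = clock();
    *out = total;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}

/* Same values, same instructions: only the element order differs.
   Returns 0 on success, -1 if the allocation fails. */
int compare_sorted_vs_shuffled(size_t n, int reps)
{
    int *data = malloc(n * sizeof *data);
    if (!data)
        return -1;

    srand(12345);                           /* fixed seed (assumption) */
    for (size_t i = 0; i < n; i++)
        data[i] = rand() % 256;             /* about half are >= 128 */

    long long s_rand, s_sort;
    double t_rand = time_passes(data, n, reps, &s_rand);
    qsort(data, n, sizeof *data, cmp_int);
    double t_sort = time_passes(data, n, reps, &s_sort);

    assert(s_rand == s_sort);               /* both runs did the same work */
    printf("random order: %.3f s, sorted: %.3f s\n", t_rand, t_sort);

    free(data);
    return 0;
}
```

Calling `compare_sorted_vs_shuffled` from `main` with a large array (say a million elements and a few hundred repetitions) should make any gap visible. Since roughly half of the branches in the random run are expected to be mispredicted, dividing the time difference by about `n * reps / 2` and multiplying by the clock frequency gives a very rough per-miss estimate in cycles, assuming the compiler really kept the branch.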
revolution 03 Mar 2026, 13:30
sylware wrote: What is the cost of a branch mis-prediction on modern micro-architectures?
AsmGuru62 wrote: ... how to avoid the impact of other instructions?
sylware 03 Mar 2026, 13:50
A bit off-topic:
Oooof! CPU power management is now, by default, driven very coarsely by the OS. There is a risk that an "efficient" code path does not trigger a CPU frequency increase while an "inefficient" one does, making the "inefficient" code run _faster_ than the "efficient" code! Scary.
Back on topic: measuring would be useful to improve the chance of detecting that something is really going wrong, but with all those hardware bugs (for instance SLS), measurements may show really weird things... (I guess this is how the speculative execution exploits were discovered).
That penalty of 20 clock cycles seems to be the norm on modern micro-architectures.
The bottom line: CPU designers made assumptions about the general shape of machine code programs, and the most important thing is to respect the assumptions common to all micro-architectures, for instance THE static branch prediction rule. Of course, one can push micro-architecture specialization of machine code at runtime or compile time (very true for RISC-V CPUs with very fine-grained hardware control extensions).
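As a rough illustration of respecting a static prediction rule, here is a small C sketch. It assumes the rule meant is the common heuristic "forward branches are predicted not taken, backward branches are predicted taken", which the post does not spell out. Under that assumption the hot path should be the fall-through path and the rare case should sit behind a forward branch. The `UNLIKELY` macro and the function `sum_until_error` are hypothetical names for this example; `__builtin_expect` is a GCC/Clang extension that only hints at code layout and guarantees nothing about the emitted branches.

```c
#include <stddef.h>

/* GCC/Clang hint that the condition is rarely true; on other compilers
   the macro falls back to the plain condition. */
#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

/* Hypothetical example: sum the samples, stopping at the first negative
   one, which is treated as a rare error.
   Returns the sum and sets *bad_index to n when every sample is valid;
   returns -1 and sets *bad_index to the offending position otherwise. */
long sum_until_error(const int *samples, size_t n, size_t *bad_index)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(samples[i] < 0)) {
            /* Cold path: the compiler is free to move this out of line,
               leaving the hot loop as straight-line code plus one
               backward branch. */
            *bad_index = i;
            return -1;
        }
        sum += samples[i];                  /* hot path, falls through */
    }
    *bad_index = n;
    return sum;
}
```

Whether the compiler actually produces that layout has to be checked in the generated assembly. In hand-written assembly the same arrangement is made directly, by placing the rare case after the hot loop and jumping forward to it.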
Copyright © 1999-2026, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.