flat assembler
Message board for the users of flat assembler.

speculative execution and unconditional branches

sylware



Joined: 23 Oct 2020
Posts: 543
Location: Marseille/France
sylware 01 Mar 2026, 12:16
This is a follow-up to my previous thread:

In the general case, does speculative execution have "barriers" (such as a second conditional branch that is "too close")?

By barriers, I mean unconditional branches.

What happens when a speculatively executed machine instruction is an unconditional branch?

Will the CPU speculatively execute it? That could imply some serious work if the branch target is far away: cache loading, a new instruction fetch window, etc. In other words, more is involved than a pipeline flush.

I ask because laying out code to please static branch prediction requires many more unconditional branches in some specific cases, for instance when checking the return value of an external call/syscall: basically, I would have to put the return-value check code before the external call/syscall.

(AMD has a manual for software developers in order to be friendly with speculative execution, but I cannot download it https://www.amd.com/system/files/documents/software-techniques-for-managing-speculation.pdf)

(XXX: This is unrelated to speculative-execution hardware vulnerabilities, i.e. not "SLS".)

EDIT:
After a lot of reading, it seems that on large CPU implementations, the code at the target of an unconditional branch does get speculatively executed, with everything that involves: cache loading, prefetching, etc.
Mike Gonta



Joined: 26 Dec 2010
Posts: 246
Mike Gonta 01 Mar 2026, 14:46
sylware wrote:
(AMD has a manual for software developers in order to be friendly with speculative execution

https://web.archive.org/web/20230127145939/https://www.amd.com/system/files/documents/software-techniques-for-managing-speculation.pdf

_________________
Mike Gonta
look and see - many look but few see

https://mikegonta.com



sylware 01 Mar 2026, 16:15
Mike Gonta wrote:
sylware wrote:
(AMD has a manual for software developers in order to be friendly with speculative execution

https://web.archive.org/web/20230127145939/https://www.amd.com/system/files/documents/software-techniques-for-managing-speculation.pdf


Thx! Unfortunately, this document only covers mitigations for the bazillions of speculative execution exploits.

I found more on the web (noscript/basic (x)html please): an explanation of why static branch prediction works the way it does, and even a recent Verilog implementation of a new branch predictor that dwarfs the efficiency of current neural branch prediction on "common" code (on _their_ corpus of common code, and for 32-bit RISC-V processors).

Now I am looking into how much address space the BHT and BTB span (I guess the instruction cache size or a code prefetch window, but I am probably wrong).

This is what I got for Zen4:
"Each BTB entry can hold up to two branches if the branches reside in the same 64-byte aligned cache line and the first branch is a conditional branch."

"... branch targets tracked when branches are spaced by 8 bytes."

Copyright © 1999-2026, Tomasz Grysztar. Also on GitHub, YouTube.