flat assembler
Message board for the users of flat assembler.

CPU load/store queues, pointer update and pipeline

sylware 06 Mar 2026, 12:49
Modern CPU micro-architectures have memory load/store queues. Can I update the register that held the memory address of a load/store while that very load/store is still "in flight"?

mov rax,[rbx]
inc rbx

Will rbx be updated while the load is in flight, without stalling the pipeline? Or should I delay the rbx update as long as I can to give the load time to complete? In other words, is the load operation tied to the address that was in rbx, or to rbx itself?

bitRAKE 06 Mar 2026, 14:07
Intel SDM, Volume 3A, Chapter 10, Section 10.2: Memory Ordering

"Writes are not reordered with older reads."

There is no stall - all reads of RBX are planned to be performed prior to the final write of RBX - nothing unexpected is happening that requires a change.

I reread the above chapter very often - it was chapter 8 in older manuals, iirc. Many of your present concerns are directly related to this content. Yes, the register file can be imagined as a very fast memory - it follows the same rules in general.

_________________
¯\(°_o)/¯ AI may [not] have aided with the above reply.

revolution 06 Mar 2026, 17:08
Most CPUs have register renaming. That means that rbx uses a new register from the pool each time it is written. The old value of rbx is still in the previous register in the pool.
Code:
mov rax,[rbx]   ; rbx = pool_reg[43], rax = pool_reg[42]
inc rbx         ; rbx = pool_reg[44]    
There are no instructions to select individual pool registers; they are used internally, and the programmer has no control over them. CPUs can have hundreds of pool registers available, but only 16 (rax, rbx, ...) are visible to the programmer at any one time.
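To make the renaming idea concrete, here is a minimal sketch of a loop that bumps the pointer right after each load. It assumes x86-64 Linux and fasm's ELF64 executable output; the label names (start, sum_loop, table) and the pool register numbers in the comments are made up for illustration, following the numbering style above. Real hardware does not expose which pool register is used.

```asm
format ELF64 executable 3
entry start

segment readable executable

start:
        lea     rbx, [table]        ; rbx = pointer to the first byte
        mov     ecx, 4              ; number of bytes to sum
        xor     edi, edi            ; running sum, later the exit status
sum_loop:
        movzx   eax, byte [rbx]     ; load reads the pool register currently mapped to rbx (say pool_reg[43])
        inc     rbx                 ; result goes to a fresh pool register (say pool_reg[44])
        add     edi, eax            ; accumulate the loaded byte
        dec     ecx
        jnz     sum_loop

        mov     eax, 60             ; Linux sys_exit, status in edi
        syscall

segment readable

table   db 1, 2, 3, 4
```

The program exits with status 10 (1+2+3+4). Architecturally the result is the same whether the inc sits directly after the load or later in the loop body; with renaming, the load keeps using the old mapping of rbx while the inc writes a new one.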

sylware 06 Mar 2026, 18:46
This is what I suspected (due to register renaming).

Then there is no pipeline stall if I update the pointer while the load is in flight.

Good!

revolution 07 Mar 2026, 18:21
BTW: inc can be a problematic instruction.

Because inc doesn't alter the carry flag, some CPUs create a false dependency on the carry flag, and some don't. This depends on whether the CPU implements separate carry flag renaming or not. Some CPUs only implement the full flags register renaming, and thus don't track the carry separately.

Something to be mindful of, and worth testing to see whether it affects performance. inc can be replaced with the longer add rbx,1 to explicitly break the carry flag dependency.

sylware 07 Mar 2026, 21:37
@revolution I heard about the flags register nightmare for large out-of-order CPU implementations... it seems RISC-V has no flags mostly for this very reason.

bitRAKE 08 Mar 2026, 21:00
... and if the flags aren't needed we can use lea rbx, [rbx+1], assuming this instruction was a performance problem because of the flags dependency.
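The three ways of bumping the pointer discussed in this thread differ only in what they do to the flags. A minimal sketch, again assuming x86-64 Linux and fasm's ELF64 executable output, that uses adc to observe whether the carry flag survived each form:

```asm
format ELF64 executable 3
entry start

segment readable executable

start:
        xor     ebx, ebx            ; rbx = 0
        stc                         ; CF = 1
        inc     rbx                 ; rbx = 1, writes OF/SF/ZF/AF/PF, leaves CF = 1
        lea     rbx, [rbx+1]        ; rbx = 2, touches no flags, CF still 1
        adc     rbx, 0              ; rbx = 3: the CF set by stc survived inc and lea
        stc                         ; CF = 1 again
        add     rbx, 1              ; rbx = 4, writes all flags, CF = 0 (no carry out)
        adc     rbx, 0              ; rbx = 4: add overwrote the old CF

        mov     edi, ebx            ; exit status = 4
        mov     eax, 60             ; Linux sys_exit
        syscall
```

The exit status is 4. Since add overwrites every arithmetic flag, it has nothing old to merge, and lea does not read or write flags at all. Whether either replacement is actually faster than inc on a given CPU is something only measurement can tell.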

_________________
¯\(°_o)/¯ AI may [not] have aided with the above reply.

sylware 09 Mar 2026, 13:41
Good catch. Feels convoluted ofc.

Now that I think of it, I wonder how many hardware bugs the tracking of flags dependencies has caused in modern hardware pipelines. It is said to be complex to the point that some ISAs exclude flags from the core design completely; maybe there are already CVEs about it.

Copyright © 1999-2026, Tomasz Grysztar. Also on GitHub, YouTube.
