flat assembler
Message board for the users of flat assembler.
packed and partially used cache line register loads
revolution 13 Jan 2026, 15:03
In general, on most modern CPUs, transfers between layers of the memory hierarchy (mem -> L3, L2 -> L1, etc.) are always done in whole cache lines, never anything smaller.
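As a rough illustration of the whole-line granularity, the following C sketch counts how many cache lines a single access touches. The 64-byte line size is an assumption (common on current x86-64 parts, but not guaranteed), and the names `CACHE_LINE_SIZE`, `cache_line_index` and `cache_lines_touched` are hypothetical helpers, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Check the target CPU before relying on it. */
#define CACHE_LINE_SIZE 64u

/* Index of the cache line that contains the byte at addr. */
uint64_t cache_line_index(uint64_t addr)
{
    return addr / CACHE_LINE_SIZE;
}

/* Number of cache lines touched by an access of `size` bytes starting at
 * `addr`. A zero-sized access touches no line. */
uint64_t cache_lines_touched(uint64_t addr, size_t size)
{
    if (size == 0)
        return 0;
    return cache_line_index(addr + size - 1) - cache_line_index(addr) + 1;
}
```

Under that assumption, an 8-byte load at offset 0 touches one line, while the same load at offset 60 straddles two, so both lines have to be brought in even though only 8 bytes are wanted.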
So using redundant instructions to load more data than needed into registers is most likely only wasting I-Cache space and wasting energy unnecessarily. However, the real way to find out is to code up each method (with vs. without) and see which is better for the app being measured. No amount of guessing based upon heuristics, or reading tea leaves, can give a definitive answer. Sometimes the results can be surprising.
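A minimal sketch of such a "with vs. without" comparison, written in C rather than assembly to keep it short. Everything here is an assumption for illustration: the 64-byte `record_t` layout, the function names, and the use of `volatile` reads to stop the compiler from deleting the unused loads. It only times the two variants with `clock()`; it says nothing about energy use, and a real test should run on the actual app and CPU.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical record: 8 x 8 bytes = 64 bytes. It fills exactly one cache
 * line only if the array is 64-byte aligned and the line size is 64 bytes. */
typedef struct { uint64_t field[8]; } record_t;

typedef uint64_t (*variant_fn)(const record_t *, size_t);

/* "Without": load only the one field that is actually needed. */
uint64_t sum_needed_only(const record_t *r, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        const volatile uint64_t *f = r[i].field;
        sum += f[0];
    }
    return sum;
}

/* "With": also load the other seven fields and throw them away.
 * The volatile reads force the compiler to keep the extra loads. */
uint64_t sum_with_extra_loads(const record_t *r, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        const volatile uint64_t *f = r[i].field;
        sum += f[0];
        for (int k = 1; k < 8; k++)
            (void)f[k];
    }
    return sum;
}

/* Run one variant `reps` times; return elapsed CPU seconds and store the
 * accumulated result in *checksum so the two variants can be compared. */
double time_variant(variant_fn fn, const record_t *r, size_t n, int reps,
                    uint64_t *checksum)
{
    uint64_t acc = 0;
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        acc += fn(r, n);
    clock_t end = clock();
    *checksum = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To use it, fill an array of `record_t` larger than the caches of interest, call `time_variant` once per variant with the same `reps`, check that both checksums match, and compare the two times over several runs.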
sylware 14 Jan 2026, 12:42
I know that the only real way is to code some close-to-real-life benchmark... but you need the CPUs. Still, some 'generic' optimizations are rather common across CPU implementations, or at least there is a 'way' to code which will be friendly to those optimizations.
revolution wrote: So using redundant instructions to load more data than needed into registers is most likely only wasting I-Cache space and wasting energy unnecessarily.
This is another topic, but I am really not convinced that, on modern and large CPU implementations, the I-Cache impact could be significant. For instance, I have doubts about the silicon cost of 'compressed' machine instructions on RISC-V and 'thumb' machine instructions on arm(erk!). There, we would need benchmarks (speed and energy consumption) on arm(erk!) or RISC-V, across desktop/mobile/server implementations. Hopefully not in too weird niche use cases. I have not started to implement 'compressed instructions' in my rv64onx64 interpreter for that reason (and they defeat the purpose of the 'R' in RISC).
I recall from some x64 optimization guide something about packing the loads/stores close to each other within a code fetch window, up to a few of them (I don't recall the number range).
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.