flat assembler
Message board for the users of flat assembler.

How do you track register usage in your source code?

sylware



Joined: 23 Oct 2020
Posts: 517
Location: Marseille/France
sylware 15 Jan 2026, 12:02
MeMeMe: I use a C pre-processor so that my code is not tied too closely to the pre-processor of a specific assembler (that's why I wonder whether RFC-izing fasm CALM/macro would not be a good end-game solution, since it is a sweet spot which allows generic assembler writing AND file format support up to the complexity of PE+ and ELF64).

It allows me to name registers and to perform simple recursive text transforms (nested namespaces/labels).

For each significant (very mood dependent) code section, I have a pre-processor prolog where I define what I have "globally" (from all "dominator" code sections), on "entry" (from each specific branch or fall-through into this code section), and finally the various register names for the code section itself. After the code section I have an epilog where I undefine everything. I build the "entry" definitions based on the related "dominator" code section definitions.
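
A minimal sketch of that prolog/epilog idea, written with fasm's own `equ` and `restore` directives rather than the C pre-processor described above. The names `src_ptr`, `n_left`, `sum` and the routine `sum_dwords` are hypothetical and only show the shape of a named code section:

```asm
use32

sum_dwords:
; --- prolog: names valid for this code section only ---
; on entry: esi = pointer to a dword array, ecx = element count
src_ptr equ esi                 ; walks through the array
n_left  equ ecx                 ; elements still to add
sum     equ eax                 ; running total, returned to the caller

        xor     sum, sum
        test    n_left, n_left
        jz      .done
.next:
        add     sum, [src_ptr]
        add     src_ptr, 4
        dec     n_left
        jnz     .next
.done:
        ret

; --- epilog: undefine everything ---
restore src_ptr, n_left, sum
```

After the `restore` line the three names are no longer defined, so a later code section that uses one of them by accident fails to assemble instead of silently picking up the wrong register.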


(I use vim folds for that, which naturally provide a one-liner visual boundary, in order to avoid too much scrolling.)

I only take care of the register mapping once I have written a satisfying version of a code section.


And you? How are you handling it?
AsmGuru62



Joined: 28 Jan 2004
Posts: 1762
Location: Toronto, Canada
AsmGuru62 15 Jan 2026, 12:47
So, the register renaming is done to improve readability of code?

I name my registers as EAX, EBX, etc. -- just as Intel named them.
I also design my functions to be short -- well, 95% of them.
So, it is not a big deal to see how a register is used.

In addition the IDE has the ability to track usage of a token:
- I right click on a register name, say: "r8d" and select a menu item "Find token in procedure".
- IDE looks if the caret line is within "proc"/"endp" lines, and if so:
- IDE lists (in a search panel) all lines where "r8", "r8w", "r8d" or "r8b" show up.
sylware 15 Jan 2026, 15:05
@AsmGuru62, you went the "very short" function way. In other words, the "a lot of functions" way. So you track what your program does mostly through function names and their parameter names, I see.

If I were to port my code from my side to yours, I guess most "code sections" would become "very short functions".

That said, do you have "your own internal ABI"? Because following an official ABI with a reasonably deep call graph will imply significantly more register saving on the stack. Or do you do things another way?
AsmGuru62 15 Jan 2026, 16:58
I do not have an ABI, wait... what is an ABI?
A lot of functions is okay, because everything is an object, and every object has functions to work with its internal data. Below is an example.
Imagine you have a structure like this:
Code:
; ---------------------------------------------------------------------------
; FILE: String.Inc
; DATE: January 15, 2026
; ---------------------------------------------------------------------------
struct CStr
    Flags       db ?
    Length      db ?
    Buffer      rb 254
ends
    

And then, you have a set of functions which are prefixed with a name identifying the object:
Code:
; ---------------------------------------------------------------------------
; FILE: String.Asm
; DATE: January 15, 2026
; ---------------------------------------------------------------------------
align 16
proc String_Init pChar:DWORD
; ---------------------------------------------------------------------------
; Input:
;   ebx = pointer to CStr object
; ---------------------------------------------------------------------------
    ret
endp

align 16
proc String_InitWithLen pChar:DWORD, Length:DWORD
; ---------------------------------------------------------------------------
; Input:
;   ebx = pointer to CStr object
; ---------------------------------------------------------------------------
    ret
endp

align 16
proc String_Concatenate uses esi edi, pCStr:DWORD
; ---------------------------------------------------------------------------
; Input:
;   ebx = pointer to CStr object
; ---------------------------------------------------------------------------
    ret
endp
    

And, then, you have another object:
Code:
; ---------------------------------------------------------------------------
; FILE: Vector.Inc
; DATE: January 15, 2026
; ---------------------------------------------------------------------------
struct CVector
    Count       dd ?    ; # of items stored into 'Array'
    Room        dd ?    ; # of items reserved in 'Array'
    GrowBy      dd ?    ; # of items to grow in 'Array' if 'Count' is about to exceed 'Room'
    Array       dd ?    ; array of DWORD values (pointers, values, etc.)
ends
    

And, now, I will reuse the function names used for the CStr object:
Code:
; ---------------------------------------------------------------------------
; FILE: Vector.Asm
; DATE: January 15, 2026
; ---------------------------------------------------------------------------
align 16
proc Vector_Init nGrow:DWORD
; ---------------------------------------------------------------------------
; Input:
;   ebx = pointer to CVector object
; ---------------------------------------------------------------------------
    ret
endp

align 16
proc Vector_Add item:DWORD
; ---------------------------------------------------------------------------
; Input:
;   ebx = pointer to CVector object
; ---------------------------------------------------------------------------
    ret
endp

align 16
proc Vector_Concatenate pCVector:DWORD
; ---------------------------------------------------------------------------
; Input:
;   ebx = pointer to CVector object
; ---------------------------------------------------------------------------
    ret
endp
    

Having a lot of small functions is a good thing if they are properly named.
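
To show how such a skeleton might be filled in and called, here is a small sketch that follows the same "ebx = pointer to object" convention. The helpers `Vector_GetCount` and `Vector_FreeRoom`, the `demo` routine and the `vec` data are hypothetical and not part of the code above; the field offsets are written out by hand so the sketch does not depend on the `struct` and `proc` macros:

```asm
use32

; Same layout as the CVector structure, as plain offsets.
CVector.Count  = 0
CVector.Room   = 4
CVector.GrowBy = 8
CVector.Array  = 12

; Input:  ebx = pointer to CVector object
; Output: eax = number of items stored
Vector_GetCount:
        mov     eax, [ebx + CVector.Count]
        ret

; Input:  ebx = pointer to CVector object
; Output: eax = number of free slots left before the array must grow
Vector_FreeRoom:
        mov     eax, [ebx + CVector.Room]
        sub     eax, [ebx + CVector.Count]
        ret

; Call site: ebx is loaded once and reused, because none of the
; Vector_* helpers above modifies it.
demo:
        push    ebx                     ; keep the caller's ebx intact
        mov     ebx, vec
        call    Vector_GetCount         ; eax = 3
        call    Vector_FreeRoom         ; eax = 16 - 3 = 13
        pop     ebx
        ret

vec:    dd 3, 16, 8, 0                  ; Count, Room, GrowBy, Array
```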
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20835
Location: In your JS exploiting you and your system
revolution 15 Jan 2026, 17:03
Quote:
Code:
ebx = pointer to CVector object    
That is the ABI. Defining which registers are used for what purposes.
sylware 16 Jan 2026, 09:13
Yep, since you would have many more functions, branching to those functions and the stack saving due to the ABI (call-preserved registers) will put a serious strain on your programs.

What about this?
bitRAKE



Joined: 21 Jul 2003
Posts: 4351
Location: vpcmpistri
bitRAKE 16 Jan 2026, 17:27
Most larger efforts create a scheme which the code must follow. The most basic is something like:
  • Leaf functions - no prologue/epilogue (regardless of ABI). Often can be converted to a macro for inlining. Can call other leaf functions.
  • ABI functions - prologue/epilogue which respect the ABI. Can call other ABI functions and leaf functions.
Sometimes performance dictates changes in the ABI layering. We can split ABI functions into two groups because using non-volatile registers is so useful for a family of functions, and preserving non-volatile registers can be expensive. This creates an expanded ABI and allows leaf functions to be more destructive in some cases.
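
As a rough illustration of the two layers, assuming a common 32-bit convention (arguments on the stack, caller cleans up, `esi` and `edi` non-volatile). The function names and the task are hypothetical:

```asm
use32

; Leaf function: no prologue/epilogue, touches only eax.
; Input:  eax = unsigned value
; Output: eax = the value limited to at most 255
clamp_max255:
        cmp     eax, 255
        jbe     .ok
        mov     eax, 255
.ok:
        ret

; ABI function: saves the non-volatile registers it uses and may
; call leaf functions.
; Input:  [esp+4] = pointer to dword array, [esp+8] = element count
clamp_dwords:
        push    esi
        push    edi
        mov     esi, [esp + 12]         ; argument 1 (two pushes moved it by 8)
        mov     edi, [esp + 16]         ; argument 2
        test    edi, edi
        jz      .done
.next:
        mov     eax, [esi]
        call    clamp_max255            ; leaf: only eax is modified
        mov     [esi], eax
        add     esi, 4
        dec     edi
        jnz     .next
.done:
        pop     edi
        pop     esi
        ret
```

The leaf is short enough that it could equally be turned into a macro and inlined into the loop.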

Sometimes safety dictates changes in the ABI layering. How are we using exceptions? What rollback points need to exist in the stack?

Register usage aligns between instructions and the high-level scheme. It is often this complexity that people struggle with - refactoring to perturbation in the high-level scheme. We engineer at a particular complexity of understanding - which changes the understanding.

I need to understand a function in a couple of minutes, and be able to use it as a black box in a larger hierarchy -- otherwise it's useless. Of course, that applies to my own code first! Domain understanding helps to accelerate that process.
AsmGuru62 17 Jan 2026, 00:31
People, if I may, we are talking about performance without actually measuring it or having real issues, such as some code being slow.
Talking about performance in theory is always a problem.
There are very few rules about theoretical performance:

- align your code and data.
- natural code flow is "fall down" instead of conditional jump forward (jumping back is good).
- prefer short and simple instructions: for example, replace MOV EAX,0 with XOR EAX,EAX.
- use SETxx/CMOVxx opcodes to avoid branches.
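
A short sketch of the last two rules in 32-bit code. The register choices are arbitrary, and whether any of this is measurably faster depends on the surrounding code:

```asm
use32

; Zeroing a register: 2 bytes instead of 5 for "mov eax, 0".
; Unlike mov, xor also changes the flags.
        xor     eax, eax

; SETcc: eax = 1 if ecx is zero, else 0, without a branch.
        xor     eax, eax                ; clear first, because xor changes the flags
        test    ecx, ecx
        setz    al

; CMOVcc: eax = max(eax, edx) for signed values, without a branch.
; CMOVcc needs a Pentium Pro class CPU or later.
        cmp     eax, edx
        cmovl   eax, edx
```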

Having an ABI or not having one will not impact performance in any way.
The same goes for a lot of small functions: they will not do any damage in any measurable way.

And finally, performance matters only for large data sets.
If your data is not large, then concentrate on making the code smaller.
revolution 17 Jan 2026, 05:13
I thought this topic was about ABIs and register usage, but somehow it has switched to performance?

Every instruction choice or ordering can affect performance, either negatively or positively. It is quite rash to claim that using, or not using, something "will not impact performance". Each situation is different.

It is all about what the code is doing, and when it does it. In some cases trying a different thing will make no difference, in other cases it can have a major impact.

Sometimes changing code in one place can affect performance of code in another place. There are so many internal interactions with our modern complex CPUs that statements like "X is always better than Y" or "don't do Z it is always bad" are problematic at best.

General guidelines are great. Using the caches as much as possible is often a great strategy. Aligning data is often the right thing to do. Putting related data near each other is usually a good thing. Long flows of linear instructions (i.e. loop unrolling) are generally fine (until the cache starts thrashing). Reducing the call graph and call depth is normally a net positive. Compressing data structures can relieve DRAM pressure. etc.

Follow those guidelines, but don't be afraid to break them when it makes sense for a specific situation. Experiment, play, mix things up, and see how it works out.
bitRAKE 17 Jan 2026, 08:39
ABI is probably the wrong concept for what I intended; and I certainly didn't mention performance to derail the discussion of register selection. By ABI I mean something more general than strictly interfacing with the system - all the policies and constraints imposed - from the rigid to the elective. My mention of performance was one of a number of reasoning forces that guide creation of policies.

_________________
¯\(°_o)/¯ AI may [not] have aided with the above reply.
sylware 17 Jan 2026, 11:54
Indeed, the thread has been about how you track register usage in your assembly code.

That said, the "ABI" is a critical part of significantly big modular projects. Often the "official" ABI for an ISA involves strong and generic "least bad" rules about register management, which involve saving some registers in memory (expensive). It means that having a rather large number of ABI functions implies much more register dance to load the argument registers and to save call-preserved registers in memory.
bitRAKE 18 Jan 2026, 13:44
Platform Contract or Execution Environment:
  • ISA: privilege (ring), valid opcodes
  • alignment: page, cacheline, type
  • memory: paging, segmentation, access
  • system: syscall/int
  • format: image, sections
  • security: DEP, ASLR, signing
  • ABI: frame, registers
  • error handling: exception, codes
  • cultural: elective / self-imposed

The Context-Pinned Register pattern is the secret weapon of language runtimes (like LuaJIT, Go, or Forth) for achieving performance that strictly compliant C++ code cannot easily match. This works because the Windows x64 ABI defines a large set of Non-Volatile (Callee-Saved) registers. If you "hijack" these registers for your own global state, the ABI works for you, not against you.
  • "VM" Context
  • Allocator
  • Exception
  • Thread Local Storage
... in assembly this can be conceptually more complex.
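
A minimal sketch of the pattern for the allocator case, assuming `r15` (non-volatile in the Windows x64 ABI) is chosen as the pinned register. The context layout, `runtime_entry` and `bump_alloc` are hypothetical, and the size addition is not checked for overflow:

```asm
use64

; Hypothetical context layout (offsets in bytes).
CTX.alloc_ptr = 0                       ; next free byte of a bump allocator
CTX.alloc_end = 8                       ; first byte past the arena

; Entry from ABI-compliant code: rcx = pointer to the context.
runtime_entry:
        push    r15                     ; non-volatile: save the caller's value
        mov     r15, rcx                ; pin the context for the whole runtime
        mov     ecx, 64                 ; request 64 bytes
        call    bump_alloc              ; internal call: no context argument
        pop     r15                     ; restore before returning outside
        ret

; Internal function that relies on the pinned register.
; Input:  rcx = size in bytes, r15 = context
; Output: rax = pointer to the block, or 0 if the arena is exhausted
bump_alloc:
        mov     rax, [r15 + CTX.alloc_ptr]
        lea     rdx, [rax + rcx]        ; proposed new allocation pointer
        cmp     rdx, [r15 + CTX.alloc_end]
        ja      .fail
        mov     [r15 + CTX.alloc_ptr], rdx
        ret
.fail:
        xor     eax, eax                ; writing eax also clears the upper half of rax
        ret
```

Because ABI-compliant external functions must preserve `r15`, the context survives calls out of the runtime without being reloaded.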

sylware 19 Jan 2026, 11:27
Well, in all ABIs (x64, risc-v, arm, etc.), callee-saved registers, which survive an ABI call, end up being used for some global state of a function (often the case in my code). That said, I still must track their usage in some way, because using such a register implies saving its content in memory (stack/global/thread/etc.), which is expensive.

In my code, in the pre-processor definitions for a code section, I do split the usage of callee-preserved registers from the "temporary" registers, but as I said above, I did notice that I often tend to brutally book a callee-preserved register for the whole function for a specific usage which is not actually global to the function, so I am probably "saving too much" in memory.

In the end, I'll try to re-use the callee-saved registers more instead of over-allocating them... but only if callee-saved register saving goes beyond the size of a cache line. In the case where I do not even fill one cache line, "not re-using the callee-preserved registers" or "overbooking callee-preserved registers for function globals" is probably benign.
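
A rough worked example of that cache-line reasoning, assuming a 64-byte cache line (typical for current x86-64 parts, but an assumption here) and 8-byte general-purpose registers. The constant names are hypothetical, and the check runs at assembly time only:

```asm
CACHE_LINE = 64                         ; assumed cache-line size in bytes
REG_SIZE   = 8                          ; one 64-bit general-purpose register
SAVED_REGS = 6                          ; e.g. rbx, rbp, r12, r13, r14, r15

SAVE_BYTES = SAVED_REGS * REG_SIZE      ; 6 * 8 = 48 bytes

if SAVE_BYTES > CACHE_LINE
        display 'callee-saved area is larger than one cache line', 13, 10
else
        display 'callee-saved area fits in one cache line', 13, 10
end if
```

With these numbers the break-even point is 8 saved registers. The save area only stays within a single line if it does not straddle a line boundary, which depends on where the stack pointer happens to be.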
bitRAKE 19 Jan 2026, 19:26
Back in the 8/16-bit days everything had a memory location. Some people carried that into their 32-bit and 64-bit code - only use registers if absolutely needed. This is not unlike the HLL perspective - it just doesn't get optimized away automatically. For me, that just moves the problem around - pun intended, lol.

You seem to have a good mindset about the topic - locality of access. Order operations to reduce the boundary crossing of variables and then everything left needs a memory location. Nowadays, there are usually sufficient registers - they can always be given names to make reading the code easier.

sylware 20 Jan 2026, 13:42
@bitRAKE this is exactly what is happening: naming registers is usually more than enough, and you end up with simple code that is very easy to read, as long as you have a clear mental picture of the memory layout, its usage, and the code control flow.


How do you manage the register name definitions? I use vim folds to "hide" their definitions in order to avoid having to scroll too much. Do you have all definitions at the start of a code section and then undefine them all at the end of the code section?
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
