flat assembler
Message board for the users of flat assembler.

How do you manage complexity in your assembly projects?

anbyte



Joined: 21 Jul 2024
Posts: 9
anbyte 05 Mar 2026, 14:13
I'm curious to hear what kind of methods people use to manage and debug larger projects. Of course, I'm sure a lot of your ideas will be the same as for any other non-asm project, but I'm curious to hear about things specific to assembly: do you prefer heavy macro usage (e.g. custom macros to manage string constants, PROC/ENDP, etc.)? Do you have any peculiar conventions for your project? Do you write assembly-based modules to be called from C code? And so on.

Or, do you avoid excessive boilerplate work, and solve problems more on a strictly case-by-case basis? Maybe that's the best way to reap the benefits of using assembly, after all.

I'm also curious to hear about people's solutions for debugging or profiling. A bit of a tangent: flat assembler exports symbol data to a custom format (.fas), but on Linux, debug information is generally expected to be in DWARF format embedded into ELF. Because of this, a lot of debuggers (like GDB) don't have a convenient way to load symbols from an external format, which makes a .fas converter impractical to write and maintain. My current solution is exporting .fas symbol information to a 'batch' script consumable by radare2's CLI. This worked for me because I already happen to be familiar with r2, and it's extensible by design (it integrates with other text-based tools at the command line, as opposed to more monolithic interfaces like IDA's/Ghidra's, so this is simple to do), but r2 is first and foremost a reverse-engineering framework, so it's maybe not an ideal solution. I think LLDB can accept a JSON-based format for external symbol information, so maybe I'll write a converter tool for that.
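For anyone curious about the r2 route: such a 'batch' script is just a list of flag commands, one per symbol, which a converter can emit and r2 can load with `r2 -i syms.r2 ./prog`. The symbol names and addresses below are purely hypothetical:

```
f sym.start       1 0x00401000
f sym.heap_init   1 0x00401040
f str.banner      8 0x00402000
```

Each `f name size addr` line defines a flag, so disassembly and breakpoint commands can then refer to the names instead of raw addresses.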

Anyway, if you have any noteworthy techniques, conventions or ideas about large assembly projects, I'd love to learn about them.
bitRAKE



Joined: 21 Jul 2003
Posts: 4392
Location: vpcmpistri
bitRAKE 05 Mar 2026, 20:39
There are as many types of assembly language programmers as there are processors.

Software can be like a pot-luck: you bring something to the party and mix it with a bunch of other code to make something very complex.

Software can be like fly-fishing: there is a very specific fish you want to catch. So, great effort is expended to create the environment which allows that specific performance advantage.

Personally, I like heavy macro use and building tools to create an ecosystem of support to solve many problems. There is no mystery - you can view much of my code. I make it a habit to post code as a way to keep discussions grounded. One can look at a piece of code and instantly say, "that's not what I'm talking about."

Bottom-up isn't a panacea though - larger projects require top-down direction - a well defined vision on the horizon. The two concepts can meet in any number of ways in the middle. The most important aspect is to keep moving forward - checking sub-goals off the list, reassessing trajectory periodically, etc.

For debugging, I read x86 and prefer to be intimately familiar with the code - combine that with runtime data collection/display and there is little that can't be run down.

With assembly it's easy to be mucking around in the weeds. Collapse the complexity by moving to a more data-centric model, or by writing a code generator.

_________________
¯\(°_o)/¯ AI may [not] have aided with the above reply.
AsmGuru62



Joined: 28 Jan 2004
Posts: 1781
Location: Toronto, Canada
AsmGuru62 05 Mar 2026, 20:51
I did some large projects in FASM (Windows, not Linux).
I do not use large macros; my macros serve only to keep the code readable.
I use only FASM, no integration with other languages.

It is unclear what "manage string constants" means.
I define a string in .data section and use its name in the code.

For debugging, I insert INT3 at the place I want to look at, and after the investigation is over -- just remove it.
I use nothing special, like debugging information, etc. -- just looking at registers and memory dumps.

I use a modular principle.
For example, I need a module named HeapManager.
I will create two files:

HeapManager.Inc: it will contain structures, macros and other definitions for the memory storage I need for HeapManager.
HeapManager.Asm: it will contain procedures in form of PROC/ENDP.

All the procedure names in the file will have HeapManager as a prefix:
Code:
align 16
proc HeapManager_AllocateBlock
endp

align 16
proc HeapManager_FreeBlock
endp

...

Then I will include these files into main FASM file.

I use nothing special for profiling.
I consider profiling to be needed only for large data.
Like if you have to parse a 20 GB text file, looking for something...
Or, you coded a custom DB Engine and want to test how fast you can update a field for a million records in the database.

Sometimes a function is sensitive to parameters, so I check the parameters in a code block, which is assembled only in DEBUG mode.
Basically, I invoke OutputDebugString with a line saying why and in which method this happened.
And after the invoke -- there is an INT3 opcode, so when the debugger stops, I can see the logged text.
When the program is done, all the conditionally assembled blocks are removed by commenting out just one line.
Logged strings are also assembled on the same condition.
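A minimal fasm sketch of that pattern (macro and names hypothetical, standard Win32/Win64 headers assumed); the single `DEBUG = 1` line is the one that gets commented out when the program is done:

```asm
DEBUG = 1                       ; comment out this one line for release builds

; assembles a parameter check only when DEBUG is defined
macro CheckNotZero reg, msg
{
    if defined DEBUG
        local ok, text
        test    reg, reg
        jnz     ok
        invoke  OutputDebugString, text
        int3                    ; debugger stops here; the text is already logged
        jmp     ok
        text db msg, 0          ; logged string, also assembled only in DEBUG mode
    ok:
    end if
}

align 16
proc HeapManager_FreeBlock
        CheckNotZero ecx, 'HeapManager_FreeBlock: NULL block pointer'
        ; ... actual work ...
        ret
endp
```

Note that `invoke` clobbers volatile registers; a real version of the macro would preserve them around the logging call.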
sylware



Joined: 23 Oct 2020
Posts: 543
Location: Marseille/France
sylware 06 Mar 2026, 12:35
Complexity mitigation: functions following the ABI (it makes code more flexible and comfy), and trying as hard as possible to avoid macros; I prefer a good text editor's cut and paste (vim) with comment-based tagging/code-section boundaries.

A directory tree, one function per file, and a "master" source file. But I plan to actually split the 'abnormal' ("cold") code from the 'normal' code and move the 'abnormal' code far away.

Debugging is manual with mini trace code. (I use my own file format embedded in legacy file formats, no debugging info anyway).

Code verbosity is mechanically much higher than in legacy computer languages: currently I cheat hard and use vim folds (emacs has something similar).

On the memory management complexity: I try to be as static as possible, everything a power of two. If data complexity is getting out of hand, I write mini memory allocators, usually based on cache lines.
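As an illustration of that kind of mini allocator (all names hypothetical), a fasm sketch for x86-64: a static, cache-line-aligned pool of 64-byte slots, with a bump pointer plus an intrusive free list (a freed slot stores the old list head in its first 8 bytes):

```asm
CACHE_LINE = 64
POOL_SLOTS = 4096                       ; power of two

align CACHE_LINE
pool       rb CACHE_LINE * POOL_SLOTS   ; untouched pages stay unmapped on Linux
free_head  dq 0                         ; top of the free list (0 = empty)
bump       dq 0                         ; count of never-used slots handed out

; rax <- pointer to a free 64-byte slot (no exhaustion check in this sketch)
pool_alloc:
        mov     rax, [free_head]
        test    rax, rax
        jz      .fresh
        mov     rdx, [rax]              ; next pointer lives inside the free slot
        mov     [free_head], rdx
        ret
    .fresh:
        mov     rax, [bump]
        inc     qword [bump]
        shl     rax, 6                  ; slot index * 64
        lea     rax, [pool + rax]
        ret

; rdi = slot to release; push it onto the free list
pool_free:
        mov     rax, [free_head]
        mov     [rdi], rax
        mov     [free_head], rdi
        ret
```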

Namespace: I use very long global identifiers, usually embedding some hierarchy, which I can shorten locally if really needed. But I discovered that keeping the long identifiers helps my "future self" a lot. I would need a really independent "module" or a really deep project to shorten the identifiers.
I use a basic C pre-processor, to the point that I have a mini x86_64 C pre-processor dialect I can assemble with fasm/gas/nasm with very few fixes (ofc, for simple code).
anbyte



Joined: 21 Jul 2024
Posts: 9
anbyte 06 Mar 2026, 12:59
Thanks for your thoughts bitRAKE, AsmGuru62, sylware.
bitRAKE wrote:
Software can be like a pot-luck: you bring something to the party and mix it with a bunch of other code to make something very complex.
Software can be like fly-fishing: there is a very specific fish you want to catch. So, great effort is expended to create the environment which allows that specific performance advantage.

That's a good way to put it. My background is in game development - perhaps an intersection between these two camps; large stateful systems interact with each other for the purpose of producing emergent gameplay mechanics, but there are also tight performance constraints; generalizing a system might mean a concession on performance, so you have to find a balance. I take it for granted that good debugging/profiling tools are critical, but what AsmGuru mentions about debugging/profiling needs probably holds true for the vast majority of situations.

This is why I'm wondering how people like to manage complexity in asm. Granted, game dev in x86 asm isn't really practical nowadays (I do think there are some practical benefits, but I don't think they outweigh the benefits of using an HLL like C; first and foremost cross-architecture compatibility and static analysis capabilities). Still, I think it's a great way to learn the internals of a game down to a very granular level. I also find that when I'm working in assembly, most problems feel tangible, as opposed to dealing with so many "glue" problems (e.g. you're dealing with two incompatible vector classes - do you have to write a super-class that overloads static_cast? why can I not just have the data? this feels like a completely artificial problem). I have a lot of ideas on this, but I'll spare you the rambling :p
I'd be interested in doing a game-dev project in asm to explore some of these ideas, but I'd have to worry about tooling.
sylware



Joined: 23 Oct 2020
Posts: 543
Location: Marseille/France
sylware 06 Mar 2026, 19:41
In your case, I would center the work around designing memory data structures properly "optimized" for their use cases, so as to make them "language agnostic".


The benchmark is that assembly and an HLL like C should both be able to work on those memory data structures.


For code, memory tables of functions for OS/platform abstraction and engine modules functionalities.


The opposite: you jail yourself into some specific HLL data structure/code, like c++ and similar. In reality, you end up jailed in one compiler. Those ultra-complex-syntax HLLs will be super hard on your mental biases: you will want very hard to create beyond-sane constructs using that syntax (hence one of the main reasons why c++ and similar are acutely toxic). Brain-damaged c++ templates...


Basically, memory data structures and memory tables of functions should be the core design and everything should build upon them. Keep in mind, this is much less comfy than any HLL...

While designing this core, think cache lines, memory pages, alignment, powers of 2, etc. Perfect does not exist: it is always a compromise, and the goal is to find some acceptable "sweet spot". So don't spin looking for perfection; force yourself to move forward, not to mention you'll only know at the "end" how everything you want/need fits together.

And last but not least: the development will be much slower and less comfy anyway. You have to accept that for good before starting. This means being in a hurry must never happen, but force yourself to really move forward.
anbyte



Joined: 21 Jul 2024
Posts: 9
anbyte 10 Mar 2026, 13:04
sylware wrote:
The opposite: you jail yourself into some specific HLL data structure/code, like c++ and similar. In reality, you end up jailed in one compiler. Those ultra-complex-syntax HLLs will be super hard on your mental biases: you will want very hard to create beyond-sane constructs using that syntax (hence one of the main reasons why c++ and similar are acutely toxic). Brain-damaged c++ templates...

Come to think of it, this was my impression of Haiku. It's really nice that it has native APIs for things like graphics (Linux doesn't have this), but it's all written in C++. Writing bindings for C or anything lower-level could get obtuse - not just because of the ABI, but also because so many things are tied to C++ language features. I think that's just the design of BeOS and Haiku, but it makes them non-ideal for anything that's not C++. I prefer there to be a good syscall API, with libraries for HLLs written on top of it (like Menuet, I'd imagine?).

I wonder if there are platform considerations for designing data structures. It might seem irrelevant, but whether I can use a lot of statically allocated memory or not might change how I design a data structure. For example: I know that on Linux, untouched pages don't tend to be mapped to physical memory, which is great for large pools of uninitialized memory. But I have no idea how virtual memory is handled on Windows (I'd hope similarly, but the kernel does have a say in it).
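For what it's worth, Windows behaves similarly: VirtualAlloc with MEM_RESERVE only claims address space, and even MEM_COMMIT'ed pages are demand-zero, so physical pages are only faulted in on first touch. A fasm sketch (Win64 headers assumed, sizes arbitrary, `pool_base` a hypothetical variable):

```asm
; reserve 1 GiB of address space; nothing committed, no physical memory yet
invoke  VirtualAlloc, 0, 0x40000000, MEM_RESERVE, PAGE_NOACCESS
mov     [pool_base], rax

; later, commit a 64 KiB window inside the reservation; physical pages
; still only materialize when the committed range is first touched
invoke  VirtualAlloc, [pool_base], 0x10000, MEM_COMMIT, PAGE_READWRITE
```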

sylware wrote:
Basically, memory data structures and memory tables of functions should be the core design and everything should build upon them. Keep in mind, this is much less comfy than any HLL...

When you say memory tables of functions, do you mean function pointers/vtables? Or do you mean a generally data and systems oriented design?
sylware



Joined: 23 Oct 2020
Posts: 543
Location: Marseille/France
sylware 11 Mar 2026, 09:42
Yep, c++ and similar are mothers of all evil in computer programming.

Yep, memory tables of function pointers (ABI or not). But not for everything (some would be static "at assembly time"). For instance, wayland->x11 runtime fallback would be abstracted away with such memory tables of function pointers.

The general way is 'tables of functions': some would be memory tables of function pointers, others 'assembly-time' tables of functions.
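To make that concrete, a fasm sketch of a runtime table of function pointers for the wayland->x11 fallback (all names hypothetical); the backend is chosen once at startup, and every call site goes through the table:

```asm
; runtime table of function pointers, filled once at startup
display_ops:
  .create_window  dq 0
  .present        dq 0

; hypothetical probe: rax = 1 if a wayland compositor is reachable
select_backend:
        call    wayland_available
        test    rax, rax
        jz      .x11
        lea     rax, [wl_create_window]
        mov     [display_ops.create_window], rax
        lea     rax, [wl_present]
        mov     [display_ops.present], rax
        ret
    .x11:
        lea     rax, [x11_create_window]
        mov     [display_ops.create_window], rax
        lea     rax, [x11_present]
        mov     [display_ops.present], rax
        ret

; call sites never know which backend is live:
;       call    qword [display_ops.present]
```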


Copyright © 1999-2026, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.