flat assembler
Message board for the users of flat assembler.

[fasmg, CALM] any tips for macros optimization?
zhak 04 May 2021, 14:30
Tomasz, can you please give some tips for writing more optimal calminstructions, and macros in general (if optimizing them can have any overall impact)?
For example:
- is it better to use long expressions for compute/check, or to split them into shorter computations across multiple commands, or does it not really matter?
- how heavy are 'elementsof' and the other poly operations? When parsing instruction operands, does it make sense to use elementsof to check whether the value is an immediate or belongs to the registers domain? Would it make sense, for example, to compute 'operand metadata 1' into a variable rather than repeat the operation twice, etc.?
Are there any general rules of thumb on how to write good calminstructions? Thanks in advance!
Tomasz Grysztar 04 May 2021, 16:14
First of all, the expression evaluator in fasmg is really not well tuned at the moment - before I introduced CALM it was not a problem, as its poor performance was generally overshadowed by preprocessing-related overhead, but with CALM I have now exposed its problems and it's a bit of a shame. I may try to improve some of the most critical paths in the future, but I do not promise anything at the moment.

Also, ALM compiler is currently not doing any expression optimization, so even if your expression contains only numeric literals and no variable references, it is still not going to be reduced and COMPUTE is going to perform all the operations. Which means that if you use an expression to compute a constant, it is better to calculate it separately at definition time. For example:
Code:
calminstruction masked value
        compute result, value and (1 shl 24 - 1)
end calminstruction    
vs
Code:
; this is going to be slightly faster:

include 'xcalm.inc'

calminstruction masked value
        local mask
        init mask, 1 shl 24 - 1
        compute result, value and mask
end calminstruction    
You can compare them by executing such a macro a million times or so. I generally recommend testing things this way whenever you need to find the best of the variants you are considering.
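A minimal sketch of such a timing test, assuming the second definition of masked from above and that result is an ordinary global variable. The argument value and the repeat count are arbitrary. fasmg prints the elapsed time when it finishes, so assembling each variant from its own file gives a rough comparison:
Code:
include 'xcalm.inc'

calminstruction masked value
        local mask
        init mask, 1 shl 24 - 1
        compute result, value and mask
end calminstruction

; run the instruction many times so that its cost dominates the total time
repeat 1000000
        masked 12345678h
end repeat
To time the other variant, swap in the first definition and assemble again with the same repeat count.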

On the other hand, when your expression contains multiple operations on variables, it may be slightly better to leave it as a single complex expression instead of splitting it into multiple COMPUTE commands, because there is a small overhead related to assigning results to variables. But this overhead is still likely smaller than the time needed for any actual computation, so if you can calculate a sub-expression that is common to multiple COMPUTE/CHECK commands and store it in a variable, it is most likely better to do so.
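To connect this with the operand parsing question, here is a sketch of keeping a shared sub-expression in a variable. The elements reg, r0 and r1 and the instruction name classify are made up for illustration, and result is assumed to be a global variable. It is not taken from any existing instruction set package:
Code:
element reg
element r0? : reg + 0
element r1? : reg + 1

calminstruction classify operand
        local meta
        compute operand, operand        ; evaluate the argument text once
        check elementsof operand = 1
        jno immediate
        compute meta, operand metadata 1        ; poly operation done once, reused below
        check meta relativeto reg
        jno immediate
        compute result, meta - reg      ; register number
        exit
    immediate:
        compute result, -1              ; not a register
end calminstruction
With these definitions, classify r1 should leave 1 in result and classify 5 should leave -1. Whether the cached variable actually wins over repeating 'operand metadata 1' is something to confirm with a timing test.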

All of the above could perhaps be summarized into a simple rule of thumb: first minimize the number of operations performed by your COMPUTE commands (including moving some of them to definition time if possible), and secondarily minimize the number of separate commands.
Tomasz Grysztar 04 May 2021, 17:05
One more thing: when a sub-expression is inside a symbolic variable, then its computation is going to be even slower, as it needs to be parsed before it can be evaluated. If you have an argument that you use multiple times in COMPUTE/CHECK commands, start by converting it into its evaluated numeric value:
Code:
        compute argument, argument    
You can also use this opportunity to enforce specific evaluation:
Code:
        compute argument, +argument ; if argument is a string, evaluate as number    
And this pattern also prevents some other problems, like evaluation with a wrong value of $, etc.
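As a sketch of this pattern in context, the instruction below converts its argument once and then uses it in two CHECK commands. The name checkbyte and the error message are hypothetical:
Code:
calminstruction checkbyte value
        compute value, +value           ; parse the argument text once, strings become numbers
        check value >= 0
        jno bad
        check value < 256
        jyes ok
    bad:
        err 'value does not fit in a byte'
    ok:
end calminstruction
Without the initial COMPUTE, each CHECK would have to parse the text of the argument again before evaluating it.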
Tomasz Grysztar 05 May 2021, 11:55
A little off-topic, but I have just realized that there is a little trick that makes it easy to create a mask spanning a given number of bytes. The same as:
Code:
1 shl 24 - 1    
can be done with:
Code:
(-1) bswap 3    
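A quick way to check that both forms act as the same mask, using only the assert directive. The test value 12345678h is arbitrary:
Code:
; both lines keep the low three bytes of the test value
assert 12345678h and (1 shl 24 - 1) = 345678h
assert 12345678h and ((-1) bswap 3) = 345678h
If either mask were different, assembly would stop with a failed assertion.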
bitRAKE 06 May 2021, 01:30
Tomasz Grysztar wrote:
Code:
(-1) bswap 3    
In one sense it seems cryptic. (the hidden truncation)
Yet, in another it seems more expository. (three bytes)

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
revolution 06 May 2021, 01:48
It feels weird to me that "((-1) bswap 3) bswap 3" doesn't return the original value of -1.
Tomasz Grysztar 06 May 2021, 06:26
The original -1 has arbitrary length, while "bswap N" introduces a fixed size. You get a -1 valid for that size, but obviously you cannot get anything longer (endianness switching does not work well for "infinite" length numbers).
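A small sketch of the round trip, again with assert. The "0 +" is there only to force the result of bswap into a numeric context before the comparison, which is an assumption about how to keep the check unambiguous rather than a requirement:
Code:
; the first swap fixes the size at 3 bytes
assert 0 + (-1) bswap 3 = 0FFFFFFh
; the second swap only sees those 3 bytes, so the result stays 0FFFFFFh and not -1
assert 0 + ((-1) bswap 3) bswap 3 = 0FFFFFFh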
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
