flat assembler
Message board for the users of flat assembler.

Thread: Odd struc behavior (Compiler Internals)
gmg1812
Consider the code:

struc QQQ a {.a db a }
a equ zz
a QQQ 3
mov al,[a.a]
jmp zz

Note that you get a.a, not zz.a. And where does the label zz come from?

The generated tokens give the answer:

;struc QQQ a{.a db a}
;a equ zz
zz:; a QQQ 3
a.a db 3
mov al,[a.a]
jmp zz

Now I understand what is happening (the preprocessor code is quite clear) but WHY? What use is defining an equ replacement name as a label yet retaining the original equ name in the struc definitions?
Post 11 Sep 2015, 23:25
revolution
gmg1812 wrote:
... but WHY? What use is defining an equ replacement name as a label yet retaining the original equ name in the struc definitions?
That is a good question. I wish I had a good answer. It is one of the reasons why I try to avoid EQUs whenever possible: their application seems inconsistent. On a technical level I can see in the source code how the behaviour comes about, but I've often wanted such things to be done differently.
Post 12 Sep 2015, 10:49
l_inc
gmg1812
In short:
2.3.7 Order of processing wrote:
the symbolic constants are usually only replaced in the lines, where no preprocessor directives nor macroinstructions has been found

In other words, macro/struc expansion has priority over symbolic constant substitution. The QQQ expansion produces a line "a:" (because no single-dot token is used in the struc definition body), and the a in that line is then replaced by symbolic constant substitution. See here for more details.

I find these rules very reasonable and well thought out, because exactly these rules prevent some very undesirable side effects you'd have with fix-based definitions. This is one of the reasons I try to use equ and avoid fix whenever possible... :)

_________________
Faith is a superposition of knowledge and fallacy
Post 12 Sep 2015, 13:37
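For contrast with the fix-based definitions mentioned above, here is a minimal sketch of the first post's example written with fix instead of equ. It assumes, following the manual's description of fix as being processed in a stage prior to the main preprocessing, that a is replaced before the struc line is recognized. The names QQQ, a and zz come from the first post; the expansions described in the comments are expectations, not verified assembler output.

```fasm
struc QQQ a {.a db a }

a fix zz            ; fixed constants are replaced before any other preprocessing

a QQQ 3             ; expected to reach the preprocessor as: zz QQQ 3
                    ; so both zz: and zz.a should be defined
mov al,[zz.a]       ; a.a is a single token and is never rewritten, so use zz.a here
jmp zz
```

With equ the same source yields zz: and a.a; with fix the label and the field prefix should agree, at the cost of a being replaced everywhere after the fix line.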
gmg1812
"The QQQ expansion results in a line "a:" "

Ah, but there IS no "a:", only "zz:". Section 2.3.7's "usually only" is unfortunately ambiguous, as in the case of a struc label it is not ignored. The preprocessor has a lot of code to make the EQU'd label in a struc act like a label, but to what advantage? Does anyone know of any current code that depends on that?

I can see enforcing the no-EQU replacement in a directive rule for struc, OR replacing the struc label with the EQU value (creating zz.a in the example) - either would be consistent, but not what it does now.
Post 12 Sep 2015, 19:12
gmg1812
I think I see the logic here: Given:

e1 equ aa
e2 equ bb
macro m a {db a}
struc s a {.a db a}

then

e1: m
e2 s 5

both generate the equ'd value of their label field: aa: and bb:. That's consistent, but since any reference to the "s" struc would always define a.a instead of e2.a or bb.a (or whatever is in the label field), the feature appears useless and even highly bug-prone, as the presence of the label cannot be seen unless you examine the tokens.
Post 12 Sep 2015, 19:48
gmg1812
Oops, sorry, bad example. Here's one problem with the way it is now:

struc s a {.a db a }

e1 equ aa
e1 s 1
e1 equ bb
e1 s 2 ; e1.a already defined (but bb: exists)
Post 12 Sep 2015, 20:00
JohnFound
gmg1812, there is a [code][/code] tag in the forum engine.
Post 12 Sep 2015, 20:07
l_inc
gmg1812
Quote:
Ah, but there IS no "a:", only "zz:"

Because a is then replaced by zz due to your symbolic constant definition.
Quote:
Section 2.3.7's "usually only" is unfortunately ambiguous

"Usually" is not ambiguous, because it's explained within the same sentence (read the section I linked).
Quote:
as in the case of a struc label it is not ignored

It's not ignored, because the line "a:" does not contain any preprocessor directives or macroinstructions.
Quote:
The preprocessor has a lot of code to make the EQU'd label in a struc act like a label, but to what advantage? Does anyone know of any current code that depends on that?

It's not about "EQU'd labels". It's just how struc macroinstructions are defined to work: they generate labels. And pretty much all the code that does not use the single dot token in the struc body depends on that behaviour.
Quote:
I can see enforcing the no-EQU replacement in a directive rule for struc

It doesn't make sense. A symbolic constant definition is there. It has to be applied. Otherwise don't define the constant.
Quote:
replacing the struc label with the EQU value (creating zz.a in the example)

It can't be that way, because macroinstruction expansion has precedence. This way a.a is generated first. It's a single token, so the a in it is no longer recognized and is not replaced.
Quote:
either would be consistent, but not what it does now

The current behaviour is documented well enough to understand and flexible enough to achieve whatever result you want. You just need to understand and follow the rules.
Quote:
the feature appears useless and even highly bug-prone as the presence of the label cannot be seen unless you examine the tokens

I fail to understand what you mean by that.
Quote:
Here's one problem with the way it is now

It's not a problem. It's defined behaviour. You know it and you act according to your knowledge. If you anticipate your struc being recklessly used with symbolic constants, you expand them first:
Code:
struc s a { .: match l,. \{ l\#\.a db a \} }    

Or you might be willing to prevent their expansion:
Code:
struc s a
{
    define . .
    .: .a db a
    restore .
}    

I would, however, suggest looking at this from the other side: symbolic constants should not be used for label definition. Could you provide an example of why you'd need that? Some macroinstructions defined by struc are, however, not meant to define labels (reequ is a classic example). So it doesn't mean you should never think about passing symbolic constants to them.

Post 12 Sep 2015, 21:30
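As a rough usage sketch, the first workaround above (the match-based struc) can be applied to the redefinition example from the earlier post. The names s, e1, aa and bb are taken from the thread. The results in the comments are what the explanation suggests should happen, on the assumption that match replaces symbolic constants in the matched text before matching; they are not verified assembler output.

```fasm
struc s a { .: match l,. \{ l\#\.a db a \} }

e1 equ aa
e1 s 1              ; expected: aa: and aa.a db 1
e1 equ bb
e1 s 2              ; expected: bb: and bb.a db 2, so nothing is defined twice
```

The second workaround (define . . inside the struc) goes the other way and should keep e1 as the label name, so a second use with the same e1 would presumably still report a redefinition.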
Tomasz Grysztar
gmg1812 wrote:
Section 2.3.7's "usually only" is unfortunately ambiguous, as in the case of a struc label it is not ignored. The preprocessor has a lot of code to make the EQU'd label in a struc act like a label, but to what advantage? Does anyone know of any current code that depends on that?

Code:
struc QQQ a {.a db a }    
is completely equivalent to
Code:
struc QQQ a {.: .a db a}    
as explained in section 2.3.4. And in turn these two constructions are equivalent:
Code:
struc QQQ a {.: .a db a}
a QQQ 3    
Code:
macro QQQ .,a {.: .#.a db a}
QQQ a,3    
STRUC is just a special variant of MACRO. The additional code in the preprocessor that looks for symbolic variables in the implicit label is there to ensure that preprocessing is consistent in all three cases.

The macroinstruction generates some new lines of source, which are in turn preprocessed. In each of the three variants mentioned, the text "a: a.a db 3" is produced and then preprocessed. And since this text contains no preprocessor directives or macroinstructions, the replacement of symbolic variables occurs.
Post 12 Sep 2015, 21:47
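To make the equivalence concrete, here is a small sketch that combines the macro form from the post above with the equ definition from the first post. The comments trace the expected steps as described in the thread; they are a reading of that explanation, not captured preprocessor output.

```fasm
macro QQQ .,a {.: .#.a db a}

a equ zz
QQQ a,3             ; the line starts with a macroinstruction, so a is passed unreplaced
                    ; generated text:                      a: a.a db 3
                    ; after symbolic constant replacement: zz: a.a db 3
mov al,[a.a]
jmp zz
```

Here the argument is written out explicitly, which may make it easier to see why the label token a is replaced while the single token a.a is left alone.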
revolution
The technical reason is clear. But I think it doesn't really follow the principle of least surprise. Many people expect a different result, and without knowing the internal details, I would agree that the actual result is unexpected.
Post 13 Sep 2015, 06:18
l_inc
revolution
Quote:
without knowing the internal details, I would agree that the actual result is unexpected

What do you mean by "internal details"? The behavior that is clearly declared in the documentation?

Post 13 Sep 2015, 15:35
gmg1812
I disagree. Section 2.3.4 states:
Quote:
This label will be also attached at the beginning of every name starting with dot in the contents of macroinstruction.

Since struc is a form of macro, section 2.3.3 should apply:
Quote:
macroinstructions are replaced with corresponding code even before the symbolic constants are replaced with their values.

While the action of struc and macro relative to names in the "label field" are identical as Tomasz states, there is a big difference between the two directives visually. In the macro case, the label field looks like a label (name:) whereas with struc it doesn't - it looks like part of the directive. Like the label in an equ directive.

This belief is compounded as a struc ref's label is required (like equ's) but optional with a macro ref.

Put an equ name as a struc label and that label suddenly takes on two meanings, one before and one after equ replacement. This, as the preprocessor code shows, is a unique "special case" in fasm; if it is documented anywhere, I've overlooked it.

Since strucs and macros are already different enough in what they do (though their processing shares a lot of code), why not remove the dichotomy by replacing the equ name in the label field of a struc with its value and then handling the struc "normally"? This brings back "what you see is what you get", always a good feature in any programming language.
Post 13 Sep 2015, 16:52
l_inc
gmg1812
Quote:
Put an equ name as a struc label and that label suddenly takes on two meanings, one before and one after equ replacement.

A symbolic constant is a stack. It always has all the meanings you've ever pushed onto it and haven't popped out. You should view it as a stack and work with it accordingly.
Quote:
This, as the preprocessor code shows, is a unique "special case" in fasm

I prefer to view a label passed to a struc-defined macroinstruction as a prefix argument. In that respect there's nothing special about it (except it defaults to a label name) compared to any other argument of a macroinstruction. No matter what argument you take, it could have these "two meanings" the same way as the prefix argument. Handling the prefix argument this way is very consistent.
Quote:
This brings back "what you see is what you get" always a good feature in any programming language.

:) "What you see is what you get" is for text and graphical editors. It's a bit unrelated, but in programming the priority is the opposite: hide what you write/get as much as possible. Show only clear interfaces.

Post 14 Sep 2015, 00:16
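The stack view of symbolic constants described above can be illustrated with a short sketch. The name v is hypothetical; equ and restore are the standard directives, and the comments describe the replacements the manual documents for them.

```fasm
v equ 1
v equ 2             ; pushes a second value; the first one is kept underneath
db v                ; v is replaced with 2
restore v           ; pops the latest value
db v                ; v is replaced with 1
```

Each equ adds a definition on top of the previous ones, and each restore removes only the most recent one.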
gmg1812
That equ names are a definition stack has nothing to do with the situation I've noted with struc. The example I've used only uses the latest (or only) definition.

Language features must have a purpose - they must exist to solve or facilitate solving a programming problem. In the case of an equ name in the label field of a struc, the generated equ-name label serves what purpose? What problem does it help to solve?

At the very least (given that fasm has a good bit of code in the preprocessor to deal with this situation) this "feature" should be documented more precisely.

Allowing an equ name in a struc label to be both a label and the root of the defined variable names instead of just the former would have a purpose (though, admittedly not one you'd be likely to use very often).

Historically, one of the big advantages of assembly language, dating from the mainframe days, is that it allows you complete flexibility when defining data structures and lets you make them clearer than is often possible using HLLs. After all, that's the whole reason for struc (in combination with macros): to make that programming problem easier and clearer. The current interaction between struc and equ names violates those objectives.

Since it appears no one has come up with a good use for the way it works now, why not change it? In examining the distributed code and several of the contributed applications, I have not found a single use of this "feature."

At any rate, I think this topic has been flogged to death. It is what it is.
Post 14 Sep 2015, 01:33
l_inc
gmg1812
Quote:
The example I've used only uses the latest (or only) definition.

My comment is not about your specific example only. It's about a general notion. There is nothing extraordinary about a symbolic constant having multiple meanings in general, and in your example in particular.
Quote:
In the case of an equ name in the label field of a struc, the generated equ-name label serves what purpose?

LOL. That sounds somewhat childish. Tights stretched over one's head serve what purpose? Well, you can find one, but then don't complain about not feeling comfortable enough.

To make a more related example. There's a directive times, and there's a directive virtual. So here's an example:
Code:
times 5 virtual
    db 'I''m inside a strange construct',13,10
times 5 end virtual    

What purpose does times in front of virtual serve? None. It's just a valid language construct that is correctly parsed and compiled, but it's not meant to be in that combination. Although I can really find a valid situation to use such a thing, it still has no purpose in the language design and represents a side effect of combining two different language features.

It's the same way with the struc. It has multiple applications and was meant to serve different purposes: one is to serve the second compilation stage (assembly) by generating those plain old structures (hence the implicit label creation) and another one is to create general-purpose second-token macros. The latter can be pure preprocessor-based processing. Just to show a few examples:

Code:
;Allows assigning a new value to a symbol without congesting its stack
;usage: mysym equ 1
;       mysym reequ mysym,2
struc reequ [val]
{
    common
        tmp equ val
            restore .
            . equ tmp
        restore tmp
}

;Allows calculations with preprocessor
;usage: result equcalc 15/2 mod result
struc equcalc expr
{
    rept 1 res:expr
    \{
        restore .
        . equ res
    \}
}

;Finds the maximum value
;usage: max equmax 10,-15
struc equmax val1*, val2*
{
    rept 1 diff : ((val1)-(val2))
    \{
        restore .
        . equ val1
        match - any,diff
        \\{
            restore .
            . equ val2          
        \\}
    \}
}

;Allows to replicate the specified symbol a number of times
;with an optional delimiter
;usage: replication equdup 'hello',10,<#',',>
;       display replication
struc equdup val*, count*, delim
{
    tmp equ
    tmp equmax 1,count
    
    restore .
    . equ
    rept tmp-1 \{ . reequ . val delim \}
    rept 1 i:count \{ match =i,tmp \\{ . reequ . val \\} \}
    
    restore tmp
}    


So the former application does not need and does not expect to have a symbolic constant in the label argument. The latter in contrast might be designed exactly for that and should be able to safely use tokens of kind .someproperty without the symbolic constant name being replaced.

Quote:
At the very least (given that fasm has a good bit of code in the preprocessor to deal with this situation) this "feature" should be documented more precisely.

Again, to make it clear. It's not a feature. It's a side effect of trying to stretch the tights over your head. You can do that, but remember to consider the consequences. As for the documentation, it's very clear about the order of preprocessing here:
2.3.7 Order of processing wrote:
The standard preprocessing that comes after, on each line begins with recognition of the first symbol. It starts with checking for the preprocessor directives, and when none of them is detected, preprocessor checks whether the first symbol is macroinstruction. If no macroinstruction is found, it moves to the second symbol of line, and again begins with checking for directives, which in this case is only the equ directive, as this is the only one that occurs as the second symbol in line. If there is no directive, the second symbol is checked for the case of structure macroinstruction and when none of those checks gives the positive result, the symbolic constants are replaced with their values and such line is passed to the assembler.

2.3.7 Order of processing wrote:
When the macroinstruction generates the new lines from its definition block, in every line it first scans for macroinstruction directives, and interpretes them accordingly. All the other content in the definition block is used to brew the new lines, replacing the parameters with their values and then processing the symbol escaping and # and ` operators. The conversion operator has the higher priority than concatenation and if any of them operates on the escaped symbol, the escaping is cancelled before finishing the operation. After this is completed, the newly generated line goes through the standard preprocessing, as described above.

So lines are first "brewed" according to the rules of the corresponding macroblock, and only then comes standard preprocessing, of which replacing symbolic constants is the last step. It doesn't matter what kind of macroblock it is (macro, struc, rept, match, etc.): each of these has its specific rules of "brewing", and prepending the prefix argument is just one rule specific to strucs.

Quote:
Since it appears no one has come up with a good use for the way it works now, why not change it?

Are you claiming to have checked all the existing code in the world? Changing current behaviour is not about discarding the misuse of tights. It's about breaking existing uses that rely on tights and heads separately.

Post 14 Sep 2015, 23:18
Copyright © 1999-2020, Tomasz Grysztar.