flat assembler
Message board for the users of flat assembler.

[2006] Better explanation of the match directive

Meor 15 Jan 2006, 22:41
Could someone explain the match directive for me a bit? I've read the documentation, but it just doesn't seem to answer my question well enough. What I imagine the directive doing is matching the ASCII characters before the comma against the ones after the comma, although I don't know whether this is correct or, if it is, how it defines a match. Should I not be observing the behaviour with "display"?

Code:
match hello,f
{
  display "yes"
}    

This displays yes

Code:
match hi,h
{
  display "yes"
}    

This displays yes

Code:
match h i,h
{
  display "yes"
}
    

This doesn't display yes. My understanding is that the preprocessor doesn't take spaces into account, so this should give the same result as the previous one.
Is there any way I could help with adding a bit to the match documentation? Once I understand a concept, I do well at explaining it.



Borsuc 15 Jan 2006, 23:03
Here's how I imagine it:

if you have a symbolic constant named 'xxx', like:

Code:
define xxx a:2    


Then, say you want to extract the 'a' symbol (the one before the colon) into one parameter and the symbol following the colon ('2' in our case) into another, call them p1 and p2:

Code:
match p1:p2, xxx    


will match p1 with 'a' and p2 with '2', since the ':' symbol is between them. (Notice that p1:p2 means the symbols before the colon are stored in p1, then it MUST find a colon ':', then ALL the remaining symbols are stored in p2.) If the match fails (i.e. it cannot find a symbol, then a colon, then other symbols), the block will NOT get processed.
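
Put together as a complete snippet, this could look like the sketch below. The body of the block is only an illustration (it is not from the original post): it uses the two captured parameters to define a numeric constant.

```fasm
define xxx a:2        ; symbolic constant with the value: a:2

match p1:p2, xxx      ; xxx is replaced with a:2 before the match is done
{
  p1 = p2             ; becomes: a = 2
}
```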

Say you want a macro which uses this syntax:

Code:
Mymacro ah:5, bl:2    


and this macro just moves the value 5 into ah, and 2 into bl, you could use:

Code:
macro Mymacro [params]
{
 forward
  match reg:const, params
  \{
    mov reg, const
    ; reg at this point is the 'symbol' before a colon ':'
    ; const is the symbol 'after' the colon
  \}
}    


then invoking the macro as above will assemble to:

Code:
mov ah, 5
mov bl, 2    



If you understood this, keep readin' ;)

Now, what happens if you want to test a symbol other than '+', '-', ':', etc? Say your macro's syntax would have been, for example:

Code:
Mymacro ah set 5, bl set 2    


and use the 'set' symbol instead of ':'. How can you do that? Simply using:

Code:
match reg set const, params    


will NOT do what you want, since here 'set' is just another parameter name that matches ANY second symbol, just as 'reg' is assigned the FIRST symbol. If you want to match a symbol literally, put the '=' character before it, like:

Code:
match reg =set const, params    


this will assign the FIRST symbol into 'reg', then it MUST find the 'set' symbol literally, then it assigns all the following symbols to 'const', and that's it.
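
For completeness, here is a sketch of the whole macro with the literal 'set'. It is just the earlier Mymacro with the pattern swapped, so treat it as an illustration rather than a tested listing.

```fasm
macro Mymacro [params]
{
 forward
  match reg =set const, params
  \{
    mov reg, const   ; reg = symbol before 'set', const = symbols after it
  \}
}

Mymacro ah set 5, bl set 2   ; expected to expand to: mov ah, 5 / mov bl, 2
```

An argument that does not contain the word 'set' simply fails the match, so nothing is generated for it.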

Note that to match the '=' symbol or the ',' symbol literally, you must use == and =, respectively.
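
A small sketch of both escapes, assuming the usual fasm 1 behaviour described in the manual. The names (name, value, first, second, x) are made up for the example.

```fasm
; '==' in the pattern stands for a literal '=' in the matched text
match name==value, x=5
{
  name = value        ; becomes: x = 5
}

; '=,' in the pattern stands for a literal ',' in the matched text
; (the plain comma after 'second' is the one that ends the pattern)
match first=,second, 1,2
{
  dw second, first    ; becomes: dw 2, 1
}
```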

See the documentation for details, or ask Tomasz.. anyway, he may explain this better.

PS: @Tomasz, when I first read the documentation, I also felt the match directive was rather under-documented. I mean, it includes all the stuff, but maybe add some more examples? Maybe some examples from this post? Just a hint, anyway ;)


hope it helps :)

regards



Tomasz Grysztar 15 Jan 2006, 23:15
OK, first let me explain a few sentences from the manual:
fasm manual, section 2.3.6 wrote:
There are the few rules for building the expression for matching, first is that any of symbol characters and any quoted string should be matched exactly as is. (...) To match any other symbol literally, it has to be preceded by = character in the pattern.

To understand this better let's get back to the section 1.2.1:
fasm manual, section 1.2.1 wrote:
Each line in source is the sequence of items, which may be one of the three types. One type are the symbol characters, which are the special characters that are individual items even when are not spaced from the other ones. Any of the +-/*=<>()[]{}:,|&~#` is the symbol character. The sequence of other characters, separated from other items with either blank spaces or symbol characters, is a symbol. If the first character of symbol is either a single or double quote, it integrates the any sequence of characters following it, even the special ones, into a quoted string, which should end with the same character, with which it began (the single or double quote) - however if there are two such characters in a row (without any other character between them), they are integrated into quoted string as just one of them and the quoted string continues then. The symbols other than symbol characters and quoted strings can be used as names, so are also called the name symbols.

In different terminology (the one used in the manual is actually carried over from the "ancient" times of fasm's design), it can be rephrased as follows: each line of source is tokenized, and there are three types of tokens. The special characters (also called the symbol characters in the manual) are one type, the quoted strings are another, and any other sequence of characters makes a "name symbol" token, which may also be thought of as a "word". For example this line:
Code:
some_label db "string",0    

is tokenized into sequence of five tokens:
  • name symbol: some_label
  • name symbol: db
  • quoted string: string
  • special character: ,
  • name symbol: 0


Now back to the "match". As the manual says, only tokens of two types can be matched as-is: the token must be either a special character or a quoted string. To match a name symbol token literally, it must be preceded by the = character. In fact, you can precede absolutely any token with = to match it literally, and this way you can also use the == and =, sequences to match the = and the comma, which are otherwise treated specially in "match".

Thus here are some samples where you've got the true match:
Code:
match +,+ ; special character is matched as-is
{}
match 'a','a' ; the same with quoted string
{}
match =a,a ; the name symbol must be preceded with =
{}
match =a=+='a' , a+'a' ; and = may be actually used with any token
{}    


Now what happens, when the name symbol is not preceded with = in the expression to be matched? Back to the manual:
fasm manual, section 2.3.6 wrote:
If some name symbol is placed in the pattern, it matches any sequence consisting of at least one symbol and then this name is replaced with the matched sequence everywhere inside the following block, analogously to the parameters of macroinstruction. For instance:
Code:
    match a-b, 0-7
     { dw a,b-a }    
will generate the dw 0,7-0 instruction.

The name symbol allows you to match any sequence of tokens, so it acts like a kind of wildcard. The sequence of tokens that was matched with it is then assigned to this name, much as with macro parameters. Now when you put just a single name symbol into the match expression, it will simply match all the tokens you put on the right side, thus your sample:
Code:
match hello,f    
matches everything that comes after the comma with the "hello" name, and if you used the "hello" name somewhere inside the "match" block, it would get replaced with that sequence of tokens ("f" in this case). And now with your other sample:
Code:
match h i,h    

Here we've got the two name symbols to be matched, but on the right side there's only one token - and each name symbol has to be matched with at least one token. So the match cannot be done, as one token is simply not enough to fill the two names.
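
To illustrate, here is a short sketch. It relies on the rule from section 2.3.6 of the manual that each name is matched with as few tokens as possible, leaving the rest for the names that follow. The names "op" and "args" are invented for this example.

```fasm
; only one token on the right: 'h' takes it and nothing is left for 'i',
; so this block is skipped
match h i, h
{
  display "not reached"
}

; more tokens on the right: 'op' takes the first one, 'args' takes the rest
match op args, mov al,1
{
  op args             ; becomes: mov al,1
}
```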

I hope this helps you understand the information from the manual. However, this is only the beginning. The "match" actually has many different uses; I may try to explain some of its advanced tricks here in another post.

And note that the manual contains some additional information on "match" in section 2.3.7; it's not only 2.3.6 that covers it.



Tomasz Grysztar 16 Jan 2006, 21:17
Another important thing (mentioned at the end of section 2.3.6) about "match" is that it evaluates the symbolic constants in the text it has to match. So any constant defined with "equ" (or its younger brother, "define") that occurs to the right of the comma (the one that separates the matching expression from the text to be matched) is replaced with the text of its value before the match is performed. Thus this:
Code:
my_const equ 2+2
match =2+=2,my_const
{}    

matches the "=2+=2" expression with the "2+2" text, and thus the match is successful. You can match any exact value this way, including the empty value, like:
Code:
my_const equ
match ,my_const
{}    

which is also a successful match, since the empty expression matches the empty text.

And if you want to check whether some symbolic constant has the value that is not empty, that is the value that contains at least one token, you can use the "named wildcard" in expression, like:
Code:
match any,my_const
{}    

The "any" can be actually any other name, as long as it is just the name token not preceded with =, so it acts like a wildcard in the matching expression. As it was said earlier, such name must be matched (and assigned with) at least one token from the text on right side, so if the value of "my_const" is empty, the match won't occur.
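
The two checks can be put side by side, as in this sketch. The messages are made up for illustration; with the empty definition shown, only the first block should be processed.

```fasm
my_const equ                 ; empty value

match , my_const             ; succeeds only when the value is empty
{
  display "my_const is empty",13,10
}
match any, my_const          ; succeeds only when there is at least one token
{
  display "my_const is not empty",13,10
}
```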

The example application of the above is making the lists of items with "match", originally first explained here: http://board.flatassembler.net/topic.php?t=3607
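
A sketch of such a list, along the lines of the example in section 2.3.7 of the manual. The final "match ... db" line is only a hypothetical use of the collected list.

```fasm
list equ                      ; start with an empty list

macro append item
{
  match any, list \{ list equ list,item \}   ; non-empty: add after a comma
  match , list \{ list equ item \}           ; empty: item becomes the list
}

append 1
append 2
append 3
; list should now have the value 1,2,3

match items, list { db items }   ; hypothetical use: db 1,2,3
```

The order of the two matches matters: if the empty check came first, a freshly initialized list would then also satisfy the "any" match and the item would be appended twice.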

Now another place where "match" comes in very useful is parsing the parameters of a macro. Imagine you would like to have a "let" macro that would allow you to write "let al=4" instead of "mov al,4". Normally macroinstructions accept parameters separated with commas, so "let" defined as a macro would see "al=4" as one parameter. And here is where "match" comes in useful: to recognize the structure of such a parameter and split it into the parts we need.

I assume now, that you already know how nesting of the macro blocks has to be done in fasm (as "match" is actually a special kind of macro). Here's how such "let" can be implemented:
Code:
macro let param
{
  match dest==src,param
  \{
     mov dest,src
  \}
}    

since the = has special meaning inside the matching expression, we have to make it double (well, actually to precede = with another =) to match it literally. Now "let al=4" will pass "al=4" as a value of "param" to the contents of "let" macro. This way the "match dest==src,al=4" gets evaluated: everything that precedes the = character is matched with the "dest" named wildcard, and everything that follows is matched with "src". Thus finally we get the "mov al,4" instruction assembled.

If there was no = character inside the value of "param", or there was nothing preceding or following it, the match would be unsuccessful and the above macro would simply do nothing. But we can add more possible syntaxes to it by putting more matches in a row, like:
Code:
macro let param
{
  match dest==src,param \{ mov dest,src \}
  match dest++,param \{ inc dest \}
}    

which additionally allows us to write "let al++" to generate the "inc al" instruction.

But now consider we wanted to allow the "let al+=4" to generate the "add al,4". We could simply add this line to the above macro:
Code:
  match dest+==src,param \{ add dest,src \}    

and it would correctly generate the "add" instruction if we wrote "let al+=4". Unfortunately, however, "dest==src" would also be matched in this case, with "al+" being assigned to "dest", and we would get the erroneous instruction "mov al+,4" as well.

To prevent such a problem, we need to somehow record that the syntax has already been recognized and prevent any more matches from happening in that case. Since, due to the specifics of this directive, we cannot do a "negative" match, it has to be done by employing some symbolic constant. The first solution that may come to mind is something like:
Code:
macro let param
{
  define status 0
  match dest+==src,param
  \{
     add dest,src
     define status 1
  \}
  match =0,status
  \{
     match dest==src,param
     \\{
        mov dest,src
        define status 1
     \\}
  \}
}    

(note that we still need to do the "match dest+==src" first, to make "al+=4" be recognized as "+=" syntax, not the "=" one).

But it can be done simpler. Those nested matches can be actually composed into single match:
Code:
  match =0 dest==src  ,  status param
  \{
     mov dest,src
     define status 1
  \}    

The comma that separates the matching expression from matched text is here spaced out, so you can see what is actually matched. Since the first token that has to be matched is the "0" literally, and the first token of text is either the "1" or "0", as those are only possible values of "status", the match can happen only when "status" is defined to be "0" and in such case the rest of text (which is just the value of "param" in this case) is matched to "dest==src".

So the complete macro may look like:
Code:
macro let param*
{
  define status 0
  match =0 dest+==src , status param
  \{
     add dest,src
     define status 1
  \}
  match =0 dest==src , status param
  \{
     mov dest,src
     define status 1
  \}
  match =0 dest++ , status param
  \{
     inc dest
     define status 1
  \}
  match =0 any , status param
  \{
     err "SYNTAX ERROR"
  \}
}    
Here each "match" contains the "=0" matched to "status", just to make the macro uniform. The last "match" checks for the case when the syntax was not recognized by any of the given rules, and signals an error in that case. Checking for an empty "param" value is not needed, since we ruled it out with the "*" following the name of the parameter in the definition of the macro. Otherwise we would have to add another (last) "match" to check for the empty value.
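
A quick usage sketch, assuming the complete "let" macro above has already been defined. The comments show what each line is expected to generate.

```fasm
let al=4      ; expected: mov al,4
let al+=4     ; expected: add al,4
let al++      ; expected: inc al
```

In each case only the first rule that fits gets processed, because it redefines "status" to 1 and every later "match" requires a leading 0.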

PS. Want some more? ;)



shutdownall 04 Aug 2013, 19:28
Hmm not sure, hope somebody can help me.

I use match in conditional preprocessing to define a video memory layout for the Sinclair ZX computers.

Until now I had two options checked with match (DFILETYPE), and I have now added AUTO, which chooses the memory layout depending on the available memory. This works as expected, so different macros are used depending on the option.

But now I wonder whether there is some way to handle a missing declaration of DFILETYPE and fall back to an internal default structure?

Code:
        DFILETYPE  EQU     AUTO            ;COLLAPSED, EXPANDED or AUTO

DFILE_ADDR:
match =COLLAPSED,DFILETYPE {
      collapsed_screen
}
match =EXPANDED,DFILETYPE {
      full_screen
}
match =AUTO,DFILETYPE {
      if MEMAVL>3000
                full_screen
      else
                collapsed_screen
      end if
}

VARS_ADDR:
    



l_inc 04 Aug 2013, 19:54
shutdownall
If I understood you correctly, depending on your needs you can either introduce a check for a match or just match the symbol with its name:
1)
Code:
define matched +
define matched -
match =COLLAPSED,DFILETYPE {
    restore matched
    collapsed_screen
} 
match =EXPANDED,DFILETYPE {
    restore matched
    full_screen
}
match =AUTO,DFILETYPE {
    restore matched
    if MEMAVL>3000
        full_screen
    else
        collapsed_screen
    end if
}
match -,matched {
    restore matched
    ;default case
}
restore matched    


2)
Code:
match =COLLAPSED,DFILETYPE {
    collapsed_screen
} 
match =EXPANDED,DFILETYPE {
    full_screen
}
match =AUTO,DFILETYPE {
    if MEMAVL>3000
        full_screen
    else
        collapsed_screen
    end if
}
match =DFILETYPE,DFILETYPE {
    ;default case
}    




shutdownall 04 Aug 2013, 20:17
Great!
Thanks, I preferred your second example.
Works as expected. :D


Code:
DFILE_ADDR:
match =COLLAPSED,DFILETYPE {
      collapsed_screen
}
match =EXPANDED,DFILETYPE {
      full_screen
}
match =AUTO,DFILETYPE {
      if MEMAVL>3000
                full_screen
      else
                collapsed_screen
      end if
}
match =DFILETYPE,DFILETYPE {
      collapsed_screen
}
    
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
