flat assembler
Message board for the users of flat assembler.

novice question regarding macros
Hugh Aguilar



Joined: 15 Nov 2011
Posts: 62
Location: Arizona
Hugh Aguilar 29 Dec 2011, 11:13
I'm still working on the Forth system that I mentioned earlier. I wrote quite a lot in HLA (about 1100 lines) but then I hit some error messages that I couldn't figure out, and there is no support available, so I am porting it over to FASM now. Here is a novice-level question to start with:

I want to write two macros that work together (the #terminator was used in HLA). The labels used in the first macro should be available in the second macro, but the second macro must then remove them from the symbol-table so the first macro can be executed again without causing a redefinition error. I can't use LOCAL in the first macro to define the labels because they won't be available when the second macro gets executed. I can't use the anonymous label @@ either, because there is only one in scope at any particular time (@b references it) and I need several labels.

One possibility is to use ordinary labels. In the second macro, however, I need to remove those labels from the symbol table so they can be redefined later on (the next time that the two macros are executed). Is it possible to use RESTORE to remove them from the symbol table? This would be the most straightforward solution if it works. The documentation seems to indicate that RESTORE works only with EQU symbols, though, and not with labels.

Another possibility is to define one unique label prior to each execution of the first macro and this label will be permanent. Then I could use the dot in front of the labels defined inside of the macro so they get concatenated with that permanent label so they don't conflict with themselves when they are defined the next time that the macro is executed. The problem is that, inside of the two macros, I don't know what the name of that permanent label is. Is there some generic surrogate for the last label defined?

How is this kind of thing usually done? I could experiment until I find something that works, but I would prefer that you tell me what the idiomatic solution is. I notice that you have .IF .ENDIF etc. available. Are these control-structures nestable? That is pretty much the same problem --- whatever technique was used to define those macro pairs should also work for my macro pairs.

Thanks in advance for your help. If none of this makes any sense, let me know and I will ask again showing you some code that doesn't work but looks like it should work.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 19275
Location: In your JS exploiting you and your system
revolution 29 Dec 2011, 11:23
Yes, restore only works with symbolic constants (the kind that equ defines), not with labels. But you can use equ to define aliases for unique labels and later remove those definitions, so that the same aliases can be used again for another definition.
Code:
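macro one ... {
 local a,b,c          ; a, b and c become unique names on every expansion
 label1 equ a         ; label1..label3 are fixed aliases for those names
 label2 equ b
 label3 equ c
}
macro two ... {
 restore label1, label2, label3   ; drop the aliases so "one" can set them again
}

one ...
label1: nop
label2: nop
label3: nop
two ...

one ...
label1: nop ;a new label name
label2: nop ;a new label name
label3: nop ;a new label name
two ...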
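On the question of whether such macro pairs can nest: each equ stacks a new value on top of the previous one and restore takes it off again, so the same pattern should nest. Below is a minimal sketch under that assumption. The macro names repeat_begin and repeat_until_zero and the alias blk_top are made up for illustration and are not part of any fasm package. The local name starts with two dots so that it does not disturb the scope of ordinary dot-labels.

```fasm
; Hypothetical nestable macro pair built on local + equ + restore.
macro repeat_begin {
  local ..top
  blk_top equ ..top       ; push a fresh unique name onto blk_top
  blk_top:                ; defines that unique label at this point
}

macro repeat_until_zero reg {
  dec reg
  jnz blk_top             ; jumps to the innermost open repeat_begin
  restore blk_top         ; pop: the enclosing block's label is visible again
}

        mov     ecx, 3
repeat_begin                    ; outer block
        mov     edx, 2
  repeat_begin                  ; inner block
        nop
  repeat_until_zero edx         ; closes the inner block
repeat_until_zero ecx           ; closes the outer block
```

After the inner repeat_until_zero has run its restore, blk_top refers to the outer label again, which is why the closing macro of the outer block still finds the right target.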
revolution 29 Dec 2011, 11:27
Moved to macroinstructions
Hugh Aguilar 30 Dec 2011, 01:55
Okay, now I'm confused! Doesn't EQU just do a text replacement, similar to #DEFINE in C? That doesn't make sense in the above context. You are using LABEL1 rather than A, but the effect is the same as if you had used A --- an error message because A is no longer defined (A was only defined while ONE was executing). I haven't actually tested your code yet --- I'll test it tonight --- maybe I'm misunderstanding what EQU does.

BTW, another thing that I'm confused about is what the difference is between DEFINE and EQU. The manual says: "The only difference between DEFINE and EQU is that DEFINE assigns the value as it is, it does not replace the symbolic constants with their values inside it." What???
revolution 30 Dec 2011, 02:27
Follow it through:

(one macro) local a ---> a = a?1b5f
(one macro) label1 equ a ---> label1 = a?1b5f
(code) label1: nop ---> a?1b5f: nop
(two macro) restore label1 ---> label1 = label1
(one macro) local a ---> a = a?1b7c
(one macro) label1 equ a ---> label1 = a?1b7c
(code) label1: nop ---> a?1b7c: nop
(two macro) restore label1 ---> label1 = label1


b equ a ;b=a
c equ b ;c=a
compare to:
define b a ;b=a
define c b ;c=b
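The trace above depends on two preprocessor behaviours: equ keeps the previous value underneath a new one until restore removes the newest, and equ substitutes already-known symbolic constants in its value at definition time while define does not. A small sketch of both follows; the names first, second, x, y and z are arbitrary and chosen only for illustration.

```fasm
; Each equ pushes a new value; restore pops back to the previous one.
label1 equ first        ; label1 -> first
label1 equ second       ; label1 -> second (first is kept underneath)

label1: nop             ; assembled as   second: nop
restore label1          ; label1 -> first again
label1: nop             ; assembled as   first: nop
restore label1          ; label1 is no longer a symbolic constant

; equ replaces known symbolic constants inside its value when it is defined;
; define stores the value exactly as written.
x equ 1
y equ x                 ; y -> 1
define z x              ; z -> x
```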
Hugh Aguilar 30 Dec 2011, 06:57
I think I understand. The LOCAL variable is effectively an EQU for a mangled variable that FASM has generated internally. The implication here is that every time that a macro gets executed it permanently allocates as much memory as needed for its local variables, and it assigns mangled names to those memory locations.

Are those locals always DWORD size?

Also, EQU does a recursive replacement, and DEFINE just does one level. In your example above, EQU was necessary and DEFINE couldn't be used.

BTW, I noticed in the example programs that there are .CODE and .DATA macros. What do these do exactly? I assume that they change the origin where we are assembling; that we have separate sections of memory for code and data. Why is this done? Why can't code and data be intermingled? The 32-bit x86 is a flat architecture; we don't have memory models like in the bad old days of MS-DOS (all of the segment registers are set to zero and left alone). In my compiler I am expecting to be able to copy code from pre-assembled functions into the code that I am generating. I am treating code as data. Is this going to work? I'm not going to get any memory-management fault am I?
revolution 30 Dec 2011, 07:02
Hugh Aguilar wrote:
Are those locals always DWORD size?
Local names have no size; they are just text replacements.
Hugh Aguilar wrote:
BTW, I noticed in the example programs that there are .CODE and .DATA macros. What do these do exactly? I assume that they change the origin where we are assembling; that we have separate sections of memory for code and data. Why is this done? Why can't code and data be intermingled? The 32-bit x86 is a flat architecture; we don't have memory models like in the bad old days of MS-DOS (all of the segment registers are set to zero and left alone). In my compiler I am expecting to be able to copy code from pre-assembled functions into the code that I am generating. I am treating code as data. Is this going to work? I'm not going to get any memory-management fault am I?
You can mix code and data together if you want to, but it is generally not considered good practice. Usually code is kept in a separate section that is marked as non-writeable. How you arrange your sections is up to you, though; you don't have to separate them.
revolution 30 Dec 2011, 07:06
Also equ replacement is not recursive. It is one level only.

a equ c ;a=c
c equ b ;c=b
d equ a ;d=c
Hugh Aguilar 31 Dec 2011, 03:04
revolution wrote:
Hugh Aguilar wrote:
Are those locals always DWORD size?
Local names have no size; they are just text replacements.


I was not thinking straight when I asked that question --- I understand that locals are essentially just text replacement of a mangled label that is guaranteed to be unique.

For an application program, I would separate code and data. I would even want the processor to throw a fault if my program wrote into code memory (certainly a bug, and likely a catastrophic bug if it were allowed to run).

I'm writing a compiler though, so treating code as data is pretty much the crux of what the program does. I need this to work under both Windows and Linux. If I just ignore .CODE and .DATA will I be okay?
revolution 31 Dec 2011, 03:33
Like I mentioned above you don't have to make independent sections if you don't want to. Just do what you feel is right for you. The CPU won't complain.

However, when you say "compiler", do you mean a JIT compiler? A normal compiler would have no need to mix code and data. And for a JIT compiler you would most probably still want to write-protect the generated code section before executing it.

The only other use for a writeable code section would be for SMC.
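For reference, here is a minimal sketch of what "not making independent sections" can look like in fasm for Linux: one segment that is readable, writeable and executable at once. The labels start and counter are placeholders. The Windows line in the trailing comment uses a made-up section name, and the remark that .code and .data in the extended Windows headers are thin wrappers around such section directives is my understanding rather than something confirmed in this thread.

```fasm
; Sketch: 32-bit Linux ELF executable with a single mixed segment.
format ELF executable 3
entry start

segment readable writeable executable

start:
        mov     dword [counter], 1      ; write into the segment that also holds the code
        mov     eax, 1                  ; sys_exit
        xor     ebx, ebx                ; exit status 0
        int     0x80

counter dd 0

; Rough Windows equivalent of the segment line, in a "format PE" source:
;   section '.flat' code readable writeable executable
```

Some hardened systems may refuse memory that is writeable and executable at the same time, so this should be treated as a starting point to test on the target platforms.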
Hugh Aguilar 03 Jan 2012, 23:49
revolution wrote:
However, when you say "compiler", do you mean a JIT compiler? A normal compiler would have no need to mix code and data. And for a JIT compiler you would most probably still want to write-protect the generated code section before executing it.


I asked previously about "incremental assembly" in this regard:
http://board.flatassembler.net/topic.php?t=13617

Forth is not a "normal compiler" (your term!) that generates an assembly-language source-file that is fed through an assembler and linker to create the executable, which is then executed in a debugger or ICE (single-stepping at the breakpoints, and all that jazz).

Forth is an interactive development system. The user can define a function, and that function is immediately available for execution. All of the data that the program has already generated is still there available for this newly minted function to access. The implication here is that the compiler (written in FASM in my case) has to assemble code at its run-time (which is compile-time from the user's perspective). Forth predates JIT by several decades, and it is not really the same thing, although there is similarity in that, in both cases, assembly is being done at run-time.

Unfortunately, there is no way (that I know of) to make FASM or HLA or any other assembler generate code that is tacked onto the end of the existing code. These assemblers just assemble obj files which are then given to a linker that generates an exe file.

A lot of Forth systems solve this problem by including an assembler written in Forth that does generate code that can be immediately executed.

What I am doing instead, is just pasting existing code together to make new code. I have functions already assembled into my compiler code. The compiler just does a string copy to paste this code into the code that it is generating. In some cases, the compiler must also poke data into the operands of some of the instructions. For example, the function that pushes a literal value onto the stack is defined like this:

push ebx
mov ebx, $deadbeaf

Note that ebx holds the top-of-stack (TOS) value, and the rest of the stack is held in memory with esp being the stack pointer. This code gets pasted into the code being generated. The compiler also has to store the actual literal value into the operand where the dummy value $deadbeaf is currently filling in. As another example, this is the function that pushes the address of a local variable onto the stack:

push ebx
lea ebx, [ebp+$78]

This code gets pasted into the code being generated. The compiler also has to store the actual offset value into the operand where the dummy value $78 is currently filling in.
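The paste-and-patch step described above could be sketched roughly as follows for the literal case. This is only an illustration under stated assumptions: 32-bit code, EDI pointing at the next free byte of generated code in memory that is both writeable and executable, and the names lit_template, lit_template_end, LIT_SIZE and compile_literal are hypothetical.

```fasm
; Template bytes; keep them somewhere the program never falls into.
lit_template:
        push    ebx                     ; 1 byte
        mov     ebx, $deadbeaf          ; 1 opcode byte + 4-byte immediate
lit_template_end:

LIT_SIZE = lit_template_end - lit_template

; in:  eax = literal value, edi = next free byte of generated code
; out: edi = first byte after the pasted code
compile_literal:
        push    esi
        push    ecx
        cld                             ; copy forwards
        mov     esi, lit_template
        mov     ecx, LIT_SIZE
        rep     movsb                   ; paste the template
        mov     [edi-4], eax            ; the immediate is the last 4 bytes pasted
        pop     ecx
        pop     esi
        ret
```

The same idea should carry over to the lea template, with one caveat: for a small offset such as $78 the assembler will normally pick a one-byte displacement, so the patched operand there would be a single byte and only offsets from -128 to 127 would fit.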

This business of pasting existing code together to form new code is a cheap and primitive way to generate code when there is no assembler available. There are severe limitations as to how much optimization can be done by such a crude compiler, but it is the best that I can muster right now --- I do get an interactive Forth system, and that is what matters to me. Later on I will write a cross-compiler (what you call a "normal compiler") that generates an assembly source-file, but that will be for the micro-controller (the ARM or TMS430 or whatever) --- what I'm writing at this time is for the x86, and it is interactive.

Quote:
The only other use for a writeable code section would be for SMC.


I don't know what the term SMC means.
aq83326



Joined: 25 Jun 2011
Posts: 21
aq83326 04 Jan 2012, 03:07
SMC = self-modifying code
Can you use the fasm # operator? Is this what you need?
Code:
macro one name
{
label name#LabelA ; generates unique labels unless that particular combination occurs elsewhere
...
name#LabelB:
}
macro two name, OnesName
{
mov rax, OnesName#LabelA
...
mov [OnesName#LabelB], rax
}
one MyOne
two MyTwo, MyOne ; macro arguments are separated by commas
Hugh Aguilar 04 Jan 2012, 04:12
No. Evaluation of macros is happening at compile-time for the program --- the program being my Forth compiler. I need to assemble code at run-time for the program --- what the user of my Forth compiler considers to be compile-time.

This business of the One and Two macros is done when FASM is assembling the Forth compiler program. I needed the macro pair for compiling dictionary entries. I think that Revolution has already described how to do that using EQU and RESTORE. What you are describing could be used, but it is more complicated than Revolution's method. Anyway, none of this has anything to do with incremental-assembly which is what the Forth compiler is doing using the paste-code method that I described earlier as a crude solution.

Coaxing FASM into assembling code during the run-time of a FASM program is a much more difficult problem, which is certainly impossible without rewriting FASM. Incremental assembly is a pretty major upgrade to FASM, which is definitely beyond me at this time --- I'm still learning how to use FASM --- I haven't even looked at FASM's source-code yet, so I'm definitely not prepared to upgrade FASM.
Copyright © 1999-2023, Tomasz Grysztar.