flat assembler
Message board for the users of flat assembler.
revolution 22 Jan 2010, 15:51
f0dder wrote: Do you (or even better, Tomasz

f0dder wrote: OK, let me see if I understand this correctly, then... without the inner nesting-slashes, would the preprocessor first expand "extrn '__imp__'\#`func\#'A@'\#\`p as func:dword" to "extrn '__imp__FooBarA@'`p as func:dword", before entering the match block? (Not sure how it'd expand `p when not being in the match block).

f0dder wrote: I suppose this leads to a question on how the preprocessor works (I didn't grok the explanation in the fasm docs fully, but I guess reading it at 4am didn't help

f0dder wrote: Btw, a "stringify" operator could be very helpful when debugging macros - this might sound similar to the `-operator, but not exactly. Consider the following example:

Code:
    macro DI [instr] {
      common
        local string,last
        last equ ','
        irps symbol,instr \{
            string equ
            match =+,symbol \\{string equ '+'\\}
            match =-,symbol \\{string equ '-'\\}
            match =*,symbol \\{string equ '*'\\}
            match =/,symbol \\{string equ '/'\\}
            match ==,symbol \\{string equ '='\\}
            match =<,symbol \\{string equ '<'\\}
            match =>,symbol \\{string equ '>'\\}
            match =(,symbol \\{string equ '('\\}
            match =),symbol \\{string equ ')'\\}
            match =[,symbol \\{string equ '['\\}
            match =],symbol \\{string equ ']'\\}
            match =:,symbol \\{string equ ':'\\}
            match =,,symbol \\{string equ ','\\}
            match =|,symbol \\{string equ '|'\\}
            match =&,symbol \\{string equ '&'\\}
            match =~,symbol \\{string equ '~'\\}
            match =RVA,symbol \\{string equ 'RVA'\\}
            match =rva,symbol \\{string equ 'rva'\\}
            ;match ={,symbol \\{string equ '{'\\}
            ;match =},symbol \\{string equ '}'\\}
            ;match =#,symbol \\{string equ '#'\\}
            ;match =`,symbol \\{string equ '`'\\}
            ;match =',symbol \\{string equ "'"\\}
            ;match =",symbol \\{string equ '"'\\}
            match ,string last \\{display ' '\\}
            last equ string
            match any,string \\{display string\\}
            match ,string \\{
                if symbol eqtype ''
                    display "'"\#\`symbol\#"'"
                else
                    display \`symbol
                end if
            \\}
        \}
    }

Code:
    macro ShowEquContents var{match x,var\{DI x\}}
f0dder 22 Jan 2010, 16:17
revolution wrote: No. The outer macro will do all ` and # things it sees (in this case just the `func) and strip one layer of backslashes and then pass on to the next layer in turn. So you get "extrn '__imp__'#'Function0'#'A@'#`p as Function0:dword" and then the match will see the #'s and the `p and do those.

revolution wrote: It is single pass. Just expand and generate code into a buffer. Once expanded, process the buffer and generate more code as each layer of macrodom is stripped off.

So, when processing a macro block, first the entire block is scanned for things the preprocessor handles, then inner macro invocations (including irp/match/...) are scanned for and expanded? I'd consider that multi-pass? This leads to yet another question

revolution wrote: Here is my DI macro I used when learning about all this stuff. I think I have posted it before but here it is again.
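To make the layering described in the first quote above a little more concrete, here is a minimal sketch. It is not from the thread: the names show_equ and value are made up for illustration, and it assumes fasm 1 macro syntax. It relies only on the behaviour stated above, namely that the outer macro handles the unescaped ` and # it sees and strips one layer of backslashes, leaving the escaped \` for the match block.

```asm
; Hypothetical helper, loosely modelled on ShowEquContents from this thread.
macro show_equ var {
    ; \{ , \} and \` each lose one backslash when show_equ is expanded,
    ; so the match block is the layer that finally applies ` to x.
    match x, var \{
        display \`x, 13, 10
    \}
}

value equ eax        ; symbolic constant used only for this demonstration
show_equ value       ; after expansion: match x, value { display `x, 13, 10 }
```

If the backtick were written without the backslash, the outer macro would act on it first, before x has been bound to anything by the match, which is the situation f0dder was asking about.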
revolution 22 Jan 2010, 16:29
Multi pass would mean it re-processes the same data again. But it does not do that. A macro is expanded when instantiated once only. After expansion, the resulting code is then processed as per normal back into the main program loop. If it so happens that there is an embedded macro it will be discovered only after the outer macro has been fully expanded and expansion is finalised.
And yes, any improvement in string handling would be a nice addition.
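The point about embedded macros being discovered only after the outer macro has been expanded can be sketched as follows. This is an illustrative example rather than code from the thread; define_loader and load_ten are hypothetical names, and fasm 1 syntax is assumed.

```asm
; Hypothetical names: define_loader and load_ten are not from the thread.
macro define_loader name, value {
    ; The inner definition is escaped, so while define_loader is being
    ; expanded it is just text: "name" and "value" are substituted and
    ; one layer of backslashes is stripped.
    macro name \{
        mov eax, value
    \}
}

define_loader load_ten, 10   ; leaves behind: macro load_ten { mov eax, 10 }
load_ten                     ; only now does the inner macro exist and get expanded
```

Nothing inside the escaped braces is acted on as a macro definition while define_loader itself is being expanded; it is only delayed until the generated text is processed.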
f0dder 22 Jan 2010, 17:17
revolution wrote: Multi pass would mean it re-processes the same data again. But it does not do that. A macro is expanded when instantiated once only. After expansion, the resulting code is then processed as per normal back into the main program loop. If it so happens that there is an embedded macro it will be discovered only after the outer macro has been fully expanded and expansion is finalised.

If I understand you correctly, doesn't this mean that upon seeing a macro instantiation, basically the following happens:
1. notice where macro instantiation is done.
2. expand macro but don't expand inner macro instantiations.
3. continue preprocessing from address saved in #1, not the line after the original invocation.

While this obviously isn't going to re-preprocess the entire source, I still consider it a multi-pass approach, since you're processing much of the same macro code multiple times (of course it's not THE same code being re-processed, since expansions have been done).

And it's probably easier implementing this way than "descending into" nested macro calls while instantiating... and it does work fine in practice, you just have to be careful to remember those nesting-slashes (wasn't obvious to me they'd be needed in rept/match etc, apart from the close-brackets, but it does make sense if fasm basically follows the method outlined above).

revolution wrote: And yes, any improvement in string handling would be a nice addition.

_________________
carpe noctem
revolution 22 Jan 2010, 17:37
f0dder wrote: If I understand you correctly, doesn't this mean that upon seeing a macro instantiation, basically the following happens:

f0dder wrote: While this obviously isn't going to re-preprocess the entire source, I still consider it a multi-pass approach, since you're processing much of the same macro code multiple times (of course it's not THE same code being re-processed, since expansions have been done

f0dder wrote: What's Tomasz' position, these days, on accepting code from others into the mainline FASM codebase? Better (as in, any
f0dder 22 Jan 2010, 23:31
revolution wrote: No, it doesn't reprocess anything. Each thing is processed once only. The expansion will process the things that are not backslashed and not process anything else. Subsequent backslashed things are processed later. So everything is processed only once, but at different times. So it is never reprocessed, just delayed until later.

Let me try a step-by-step example of how I perceive things at this stage (the "match" isn't really necessary in this case, just added it so you can comment on whether it's expanded this way).

Code:
    ;<--- step1 starts at beginning of input
    foobar:

    macro macro_level2 arg2 {
        push arg2
        call foobar
    }

    macro macro_level1 [arg] {
      reverse
        match any, arg \{
            mov eax, any
            macro_level2 any
            mov ebx, eax
        \}
    }

    insertion_point:
        macro_level1 0x10, 0x20 ; <--- causes instantiation
    ;***> input below not processed by step 1
    after_instantiation:
        ret

When preprocessor instantiates macro_level1 at insertion_point, I expect the following output (excluding my comments, of course). From here on, I won't include the stuff before insertion_point to cut down on post size, but the rest is obviously still part of the (current state of) pre-processed input.

Code:
    insertion_point:
    ;<--- step 2, preprocessor continues here.
    match any, 0x20 { ; <--- causes instantiation
        mov eax, any
        macro_level2 any
        mov ebx, eax
    }
    ;***> input below not processed by step 2
    match any, 0x10 {
        mov eax, any
        macro_level2 any
        mov ebx, eax
    }
    ;---> step 1 generated code above this line, and didn't touch/see anything below
    after_instantiation:
        ret

Above, we have the current pre-processed input generated by step 1, ready to be processed by step 2. The preprocessor has instantiated macro_level1: iterating backwards through the macro arguments, replacing 'arg' symbolically, stripping one level of nesting-backslashes. The previous step has generated the output between <--- --->, the current step starts processing at <---, and goes no further than ***>.
As can be seen, each step continues right where the previous step caused instantiation (obviously).

First thing that step 2 sees is the match block. Instantiate it, and we get the following:

Code:
    insertion_point:
    ;<--- step 3, preprocessor continues here
        mov eax, 0x20 ; <--- (re-)processed but doesn't trigger anything
        macro_level2 0x20 ; <--- causes instantiation
    ;***> input below not processed by step 3
        mov ebx, eax
    ;---> step 2 generated code above this line, and didn't touch/see anything below
    match any, 0x10 {
        mov eax, any
        macro_level2 any
        mov ebx, eax
    }
    after_instantiation:
        ret

In step 3, the preprocessor does process the "mov eax, 0x20", but that obviously doesn't result in any symbol substitution or macro instantiation. We hit the first macro_level2 instantiation:

Code:
    insertion_point:
        mov eax, 0x20
    ;<--- step 4, preprocessor continues here
        push 0x20
        call foobar
    ;---> step 3 generated code above this line, and didn't touch/see anything below
        mov ebx, eax
    match any, 0x10 { ; <--- causes instantiation
        mov eax, any
        macro_level2 any
        mov ebx, eax
    }
    ;***> input below not processed by step 4
    after_instantiation:
        ret

Step 4 is the first time the preprocessor doesn't start right after the insertion_point label - the input for the first argument to the macro_level1 is finally pre-processed, but the preprocessor still starts at the point where it instantiated macro_level2.
Continuing, expanding the second match block:

Code:
    insertion_point:
        mov eax, 0x20
        push 0x20
        call foobar
        mov ebx, eax
    ;<--- step 5, preprocessor continues here
        mov eax, 0x10
        macro_level2 0x10 ; <--- causes instantiation
    ;***> input below not processed by step 5
        mov ebx, eax
    ;---> step 4 generated code above this line, and didn't touch/see anything below
    after_instantiation:
        ret

Soldiering on,

Code:
    insertion_point:
        mov eax, 0x20
        push 0x20
        call foobar
        mov ebx, eax
        mov eax, 0x10
    ;<--- step 6, preprocessor continues here
        push 0x10
        call foobar
    ;---> step 5 generated code above this line, and didn't touch/see anything below
        mov ebx, eax
    after_instantiation:
        ret

Step 6 is the final step, since we (finally!) don't hit any macro instantiations. The above is assuming the preprocessor is going to do an "expand-and-reparse-from-insertion-point" for each of the match blocks. Perhaps it could handle match blocks directly without a reparse?

revolution wrote: Maybe this is just a semantics thing, but I do feel it is an important distinction that does not qualify it for being called multi-pass.

Btw, I've also tried to figure out what happens when the preprocessor sees a macro definition, as opposed to an instantiation. As far as I can tell (from playing with nested macros, overriding them, and purge), when a top-level macro is found it is "remembered" (I assume something along the lines of {name, startptr, endptr} is stored) - and a stack is used so we can purge and get back the old version. Inner macros aren't recognized in this step. Once a macro with nested macros is instantiated, the nested macro is part of the instantiation, and has one level of nesting-slashes removed. Because the preprocessor continues from the beginning of instantiated output, it now sees the inner macro, and treats it exactly like any other top-level macro.

...I think that's what I had to say/ask about the preprocessor for now

revolution wrote: Tomasz likes to implement changes in his own way. Even my while submission was rewritten by Tomasz before he included it into "his" source. Nothing wrong with that but it does show that even if you write a complete solution that Tomasz would likely want to spend time to make it his own before including it.

_________________
carpe noctem
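The stack-like behaviour of macro definitions that f0dder mentions (overriding a macro, then using purge to get the old version back) can be tried with a few lines. This is a minimal sketch with a made-up macro name, greet, assuming fasm 1 syntax; it shows only the redefinition and purge behaviour, not how the preprocessor stores definitions internally.

```asm
; Hypothetical macro name, chosen only for this illustration.
macro greet { display 'first definition', 13, 10 }
macro greet { display 'second definition', 13, 10 }

greet          ; uses the most recent definition
purge greet    ; drops the most recent definition
greet          ; the earlier definition is visible again
```

Whether the internal record really looks like {name, startptr, endptr} is f0dder's guess and is not something this sketch can confirm.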
ouadji 22 Jan 2010, 23:31
all definitions "extrn __imp__x@y" to "ntoskrnl.exe" and "hal.dll"

Last edited by ouadji on 21 Sep 2010, 20:21; edited 1 time in total
baldr 23 Jan 2010, 01:18
ouadji,
Include files like those can be generated almost automatically from the corresponding import libraries or the DLLs themselves. With a symbol server configured, DUMPBIN lists both names, undecorated and decorated:

Code:
    Microsoft (R) COFF/PE Dumper Version 9.00.21022.08
    Copyright (C) Microsoft Corporation. All rights reserved.

    Dump of file HAL.DLL

    File Type: DLL

      Section contains the following exports for HAL.dll

        00000000 characteristics
        41107628 time date stamp Wed Aug 04 08:37:44 2004
            0.00 version
               1 ordinal base
              92 number of functions
              92 number of names

        ordinal hint RVA      name

              1    0 00002744 ExAcquireFastMutex = @ExAcquireFastMutex@4
              2    1 00002778 ExReleaseFastMutex = @ExReleaseFastMutex@4
              3    2 000027A0 ExTryToAcquireFastMutex = @ExTryToAcquireFastMutex@4
             20    3 000074C6 HalAcquireDisplayOwnership = _HalAcquireDisplayOwnership@4
             21    4 00006B7C HalAdjustResourceList = @HalSystemVectorDispatchEntry@12
             22    5 0001D564 HalAllProcessorsStarted = _HalAllProcessorsStarted@0

Code:
    Dump of file hal.lib

    File Type: LIBRARY

      Exports

        ordinal    name

                   @ExAcquireFastMutex@4
                   @ExReleaseFastMutex@4
                   @ExTryToAcquireFastMutex@4
                   @HalClearSoftwareInterrupt@4
                   @HalRequestSoftwareInterrupt@4
                   @HalSystemVectorDispatchEntry@12
                   @KeAcquireInStackQueuedSpinLock@8
revolution 23 Jan 2010, 03:23
f0dder: Your description above is really good. And best of all I think it is correct also. You are now the forum's resident expert on macros.
f0dder 23 Jan 2010, 09:09
revolution wrote: f0dder: Your description above is really good. And best of all I think it is correct also.

revolution wrote: You are now the forum's resident expert on macros.

_________________
carpe noctem
ouadji 23 Jan 2010, 11:01
Quote:
In the ".obj" file, yes, all references are present... but after linking (MS linker), in the ".sys" file, only the used symbols are referenced (the others are discarded; I checked with IDA). In that case, why use "if used / end if"? Thank you.
baldr 23 Jan 2010, 16:32
ouadji,
"if used" is there to emulate MASM's externdef functionality (only the part regarding unused, undefined externdefs). A full NTDLL export (~1.3k functions) grows the .obj by ~62k. And there are DLLs with 9k+ exports. The include file is only read when the source changes; the linker reads the object file regardless of that.
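For readers who have not seen the include files in question, the pattern under discussion looks roughly like the sketch below. This is not ouadji's or baldr's actual code: the macro name import_if_used is an assumption, the two kernel32 imports are arbitrary examples, and the sketch assumes that fasm's used operator can guard an extrn declaration the way it guards ordinary labels.

```asm
format MS COFF

; Hypothetical wrapper; the name import_if_used is an assumption.
macro import_if_used name, decorated {
    if used name
        extrn decorated as name:dword
    end if
}

import_if_used ExitProcess, '__imp__ExitProcess@4'
import_if_used Sleep,       '__imp__Sleep@4'

section '.text' code readable executable

public start as '_start'
start:
    push 0
    call [ExitProcess]   ; only ExitProcess is referenced in this file
```

Under that assumption, the object file should carry an external reference for ExitProcess but none for Sleep, which is the size saving baldr describes.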
f0dder 24 Jan 2010, 15:34
baldr: I agree that you shouldn't emit the externs unless used, but out of curiosity: can you feel a link-time speed hit from always emitting everything?
baldr 24 Jan 2010, 18:40
f0dder,
I seldom compile to COFF from FASM sources (thus don't have sufficient experience with object files containing unused external references) and didn't dig into LINK code deep enough to estimate the slowdown. It's just a rule of thumb: eliminate unnecessary things early, you don't know how they will affect later.

Well, some rough tests could be done. Results follow:
1. LINK seems to check that even unused extrns are present in the libraries.
2. Here are some timings (inaccurate, I used %TIME%; the columns are start time, end time and elapsed time in hundredths of a second):

Code:
    With unused extrns
    20:32:58,98 20:32:59,34 36
    20:32:59,34 20:32:59,70 36
    20:32:59,71 20:33:00,06 35
    20:33:00,07 20:33:00,42 35
    20:33:00,43 20:33:00,79 36
    20:33:00,81 20:33:01,15 34
    20:33:01,17 20:33:01,51 34
    20:33:01,53 20:33:01,93 40
    20:33:01,95 20:33:02,29 34
    20:33:02,31 20:33:02,65 34

    Only used extrns
    20:33:02,70 20:33:02,75 5
    20:33:02,75 20:33:02,78 3
    20:33:02,79 20:33:02,82 3
    20:33:02,84 20:33:02,87 3
    20:33:02,89 20:33:02,92 3
    20:33:02,93 20:33:02,96 3
    20:33:02,98 20:33:03,01 3
    20:33:03,03 20:33:03,06 3
    20:33:03,06 20:33:03,10 4
    20:33:03,10 20:33:03,14 4
f0dder 25 Jan 2010, 00:25
baldr wrote: It's just a rule of thumb: eliminate unnecessary things early, you don't know how they will affect later.
bitRAKE 26 Jan 2010, 06:17
I've used a variety of solutions, but this one works best, imho:
Code:
    ; only the first use causes EXTRN definition
    macro __imp_ n {
      match =n,n \{
        display '__imp_' # `n,13,10
        extrn __imp_\#n :QWORD
        n equ __imp_\#n
      \}
      call [n]
    }

Another method involves the assembler passes. Using all the bells and whistles during build.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.