flat assembler
Message board for the users of flat assembler.
extrn definitions in ms coff object file
revolution 22 Jan 2010, 15:51
f0dder wrote: Do you (or even better, Tomasz) suppose such an evaluate-operator could be implemented in FASM-1.x? And can anybody think of any gotchas, possible inconsistencies, etc?

f0dder wrote: OK, let me see if I understand this correctly, then... without the inner nesting-slashes, would the preprocessor first expand "extrn '__imp__'\#`func\#'A@'\#\`p as func:dword" to "extrn '__imp__FooBarA@'`p as func:dword", before entering the match block? (Not sure how it'd expand `p when not being in the match block).

f0dder wrote: I suppose this leads to a question on how the preprocessor works (I didn't grok the explanation in the fasm docs fully, but I guess reading it at 4am didn't help). Does it work in multiple passes, until it has done a pass that didn't lead to any expansions? In my mind it would be doing... recursive descent(?) every time a macro call is done.

f0dder wrote: Btw, a "stringify" operator could be very helpful when debugging macros - this might sound similar to the `-operator, but not exactly.

Consider the following example:

Code:
    macro DI [instr] {
      common
        local string,last
        last equ ','
        irps symbol,instr \{
          string equ
          match =+,symbol \\{string equ '+'\\}
          match =-,symbol \\{string equ '-'\\}
          match =*,symbol \\{string equ '*'\\}
          match =/,symbol \\{string equ '/'\\}
          match ==,symbol \\{string equ '='\\}
          match =<,symbol \\{string equ '<'\\}
          match =>,symbol \\{string equ '>'\\}
          match =(,symbol \\{string equ '('\\}
          match =),symbol \\{string equ ')'\\}
          match =[,symbol \\{string equ '['\\}
          match =],symbol \\{string equ ']'\\}
          match =:,symbol \\{string equ ':'\\}
          match =,,symbol \\{string equ ','\\}
          match =|,symbol \\{string equ '|'\\}
          match =&,symbol \\{string equ '&'\\}
          match =~,symbol \\{string equ '~'\\}
          match =RVA,symbol \\{string equ 'RVA'\\}
          match =rva,symbol \\{string equ 'rva'\\}
          ;match ={,symbol \\{string equ '{'\\}
          ;match =},symbol \\{string equ '}'\\}
          ;match =#,symbol \\{string equ '#'\\}
          ;match =`,symbol \\{string equ '`'\\}
          ;match =',symbol \\{string equ "'"\\}
          ;match =",symbol \\{string equ '"'\\}
          match ,string last \\{display ' '\\}
          last equ string
          match any,string \\{display string\\}
          match ,string \\{
            if symbol eqtype ''
              display "'"\#\`symbol\#"'"
            else
              display \`symbol
            end if
          \\}
        \}
    }

Code:
    macro ShowEquContents var{match x,var\{DI x\}}

f0dder 22 Jan 2010, 16:17
revolution wrote: No. The outer macro will do all ` and # things it sees (in this case just the `func) and strip one layer of backslashes and then pass on to the next layer in turn. So you get "extrn '__imp__'#'Function0'#'A@'#`p as Function0:dword" and then the match will see the #'s and the `p and do those.

revolution wrote: It is single pass. Just expand and generate code into a buffer. Once expanded, process the buffer and generate more code as each layer of macrodom is stripped off.

So, when processing a macro block, first the entire block is scanned for things the preprocessor handles, then inner macro invocations (including irp/match/...) are scanned for and expanded? I'd consider that multi-pass. This leads to yet another question - when fasm sees a macro declaration, does it expand inner macros already then? Or does it defer that to macro instantiation time?

revolution wrote: Here is my DI macro I used when learning about all this stuff. I think I have posted it before but here it is again.

revolution 22 Jan 2010, 16:29
Multi pass would mean it re-processes the same data again. But it does not do that. A macro is expanded when instantiated once only. After expansion, the resulting code is then processed as per normal back into the main program loop. If it so happens that there is an embedded macro it will be discovered only after the outer macro has been fully expanded and expansion is finalised.
And yes, any improvement in string handling would be a nice addition.

f0dder 22 Jan 2010, 17:17
revolution wrote: Multi pass would mean it re-processes the same data again. But it does not do that. A macro is expanded when instantiated once only. After expansion, the resulting code is then processed as per normal back into the main program loop. If it so happens that there is an embedded macro it will be discovered only after the outer macro has been fully expanded and expansion is finalised.

If I understand you correctly, doesn't this mean that upon seeing a macro instantiation, basically the following happens:
1. notice where macro instantiation is done.
2. expand macro but don't expand inner macro instantiations.
3. continue preprocessing from address saved in #1, not the line after the original invocation.

While this obviously isn't going to re-preprocess the entire source, I still consider it a multi-pass approach, since you're processing much of the same macro code multiple times (of course it's not THE same code being re-processed, since expansions have been done). And it's probably easier to implement this way than "descending into" nested macro calls while instantiating... and it does work fine in practice, you just have to be careful to remember those nesting-slashes (it wasn't obvious to me they'd be needed in rept/match etc, apart from the close-brackets, but it does make sense if fasm basically follows the method outlined above).

revolution wrote: And yes, any improvement in string handling would be a nice addition.

_________________
- carpe noctem

revolution 22 Jan 2010, 17:37
f0dder wrote: If I understand you correctly, doesn't this mean that upon seeing a macro instantiation, basically the following happens:

f0dder wrote: While this obviously isn't going to re-preprocess the entire source, I still consider it a multi-pass approach, since you're processing much of the same macro code multiple times (of course it's not THE same code being re-processed, since expansions have been done).

f0dder wrote: What's Tomasz' position, these days, on accepting code from others into the mainline FASM codebase? Better (as in, any) string handling would be nice, but Tomasz probably has more important things to work on... I'm not volunteering though, the lack of documentation (or even comments) of the source is pretty daunting - and I didn't get a chance to copy Tomasz' black internals notebook back at the 2007 fasmcom

f0dder 22 Jan 2010, 23:31
revolution wrote: No, it doesn't reprocess anything. Each thing is processed once only. The expansion will process the things that are not backslashed and not process anything else. Subsequent backslashed things are processed later. So everything is processed only once, but at different times. So it is never reprocessed, just delayed until later.

Let me try a step-by-step example of how I perceive things at this stage (the "match" isn't really necessary in this case, just added it so you can comment on whether it's expanded this way):

Code:
    ;<--- step1 starts at beginning of input
    foobar:

    macro macro_level2 arg2
    {
        push arg2
        call foobar
    }

    macro macro_level1 [arg]
    {
      reverse
        match any, arg
        \{
            mov eax, any
            macro_level2 any
            mov ebx, eax
        \}
    }

    insertion_point:
        macro_level1 0x10, 0x20 ; <--- causes instantiation
    ;***> input below not processed by step 1
    after_instantiation:
        ret

When the preprocessor instantiates macro_level1 at insertion_point, I expect the following output (excluding my comments, of course). From here on, I won't include the stuff before insertion_point to cut down on post size, but the rest is obviously still part of the (current state of) pre-processed input.

Code:
    insertion_point:
    ;<--- step 2, preprocessor continues here.
        match any, 0x20
        { ; <--- causes instantiation
            mov eax, any
            macro_level2 any
            mov ebx, eax
        }
    ;***> input below not processed by step 2
        match any, 0x10
        {
            mov eax, any
            macro_level2 any
            mov ebx, eax
        }
    ;---> step 1 generated code above this line, and didn't touch/see anything below
    after_instantiation:
        ret

Above, we have the current pre-processed input generated by step 1, ready to be processed by step 2. The preprocessor has instantiated macro_level1: iterating backwards through the macro arguments, replacing 'arg' symbolically, stripping one level of nesting-backslashes. The previous step has generated the output between <--- --->, the current step starts processing at <---, and goes no further than ***>.
As can be seen, each step continues right where the previous step caused instantiation - (obviously) not after the instantiated block. This is why I say it's doing "multiple passes" - it does one level of expansion, then processes the expanded output. The first thing that step 2 sees is the match block. Instantiate it, and we get the following:

Code:
    insertion_point:
    ;<--- step 3, preprocessor continues here
        mov eax, 0x20 ; <--- (re-)processed but doesn't trigger anything
        macro_level2 0x20 ; <--- causes instantiation
    ;***> input below not processed by step 3
        mov ebx, eax
    ;---> step 2 generated code above this line, and didn't touch/see anything below
        match any, 0x10
        {
            mov eax, any
            macro_level2 any
            mov ebx, eax
        }
    after_instantiation:
        ret

In step 3, the preprocessor does process the "mov eax, 0x20", but that obviously doesn't result in any symbol substitution or macro instantiation. We hit the first macro_level2 instantiation:

Code:
    insertion_point:
        mov eax, 0x20
    ;<--- step 4, preprocessor continues here
        push 0x20
        call foobar
    ;---> step 3 generated code above this line, and didn't touch/see anything below
        mov ebx, eax
        match any, 0x10
        { ; <--- causes instantiation
            mov eax, any
            macro_level2 any
            mov ebx, eax
        }
    ;***> input below not processed by step 4
    after_instantiation:
        ret

Step 4 is the first time the preprocessor doesn't start right after the insertion_point label - the input for the first argument to the macro_level1 is finally pre-processed, but the preprocessor still starts at the point where it instantiated macro_level2.
Continuing, expanding the second match block:

Code:
    insertion_point:
        mov eax, 0x20
        push 0x20
        call foobar
        mov ebx, eax
    ;<--- step 5, preprocessor continues here
        mov eax, 0x10
        macro_level2 0x10 ; <--- causes instantiation
    ;***> input below not processed by step 5
        mov ebx, eax
    ;---> step 4 generated code above this line, and didn't touch/see anything below
    after_instantiation:
        ret

Soldiering on,

Code:
    insertion_point:
        mov eax, 0x20
        push 0x20
        call foobar
        mov ebx, eax
        mov eax, 0x10
    ;<--- step 6, preprocessor continues here
        push 0x10
        call foobar
    ;---> step 5 generated code above this line, and didn't touch/see anything below
        mov ebx, eax
    after_instantiation:
        ret

Step 6 is the final step, since we (finally!) don't hit any macro instantiations. The above is assuming the preprocessor is going to do an "expand-and-reparse-from-insertion-point" for each of the match blocks. Perhaps it could handle match blocks directly without a reparse?

revolution wrote: Maybe this is just a semantics thing, but I do feel it is an important distinction that does not qualify it for being called multi-pass.

Btw, I've also tried to figure out what happens when the preprocessor sees a macro definition, as opposed to an instantiation. As far as I can tell (from playing with nested macros, overriding them, and purge), when a top-level macro is found it is "remembered" (I assume something along the lines of {name, startptr, endptr} is stored) - and a stack is used so we can purge and get back the old version. Inner macros aren't recognized in this step. Once a macro with nested macros is instantiated, the nested macro is part of the instantiation, and has one level of nesting-slashes removed. Because the preprocessor continues from the beginning of the instantiated output, it now sees the inner macro, and treats it exactly like any other top-level macro.

...I think that's what I had to say/ask about the preprocessor for now.

revolution wrote: Tomasz likes to implement changes in his own way. Even my while submission was rewritten by Tomasz before he included it into "his" source. Nothing wrong with that but it does show that even if you write a complete solution that Tomasz would likely want to spend time to make it his own before including it.

_________________
- carpe noctem

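The expand-and-resume behaviour walked through in the post above can be modelled in a few lines of Python. This is only a toy sketch, not how fasm is implemented: the `MACROS` table, `expand` and `preprocess` are hypothetical names, macro bodies are hard-coded instead of parsed, and the `match` layer and backslash stripping are left out. It shows one forward scan in which each expansion is spliced in place and scanning resumes at the first expanded line, so nested invocations are found later without anything being read twice.

```python
import re

# Hypothetical toy macro table (not fasm internals):
# name -> (parameter name, body lines, repeat body per argument in reverse order?)
MACROS = {
    "macro_level2": ("arg2", ["push arg2", "call foobar"], False),
    "macro_level1": ("arg", ["mov eax, arg", "macro_level2 arg", "mov ebx, eax"], True),
}


def expand(name, args):
    """Expand one invocation; nested invocations in the body are left untouched."""
    param, body, reverse = MACROS[name]
    values = list(reversed(args)) if reverse else args[:1]
    pattern = re.compile(r"\b" + re.escape(param) + r"\b")
    out = []
    for value in values:
        for line in body:
            # Symbolic replacement of the parameter by the argument text.
            out.append(pattern.sub(lambda m: value, line))
    return out


def preprocess(lines):
    """One forward scan: splice each expansion in place, resume at its first line."""
    lines = list(lines)
    expansions = 0
    i = 0
    while i < len(lines):
        head, _, rest = lines[i].partition(" ")
        if head in MACROS:
            args = [a.strip() for a in rest.split(",")]
            lines[i:i + 1] = expand(head, args)
            expansions += 1
            continue  # do not advance: the next line examined is the first expanded one
        i += 1
    return lines, expansions


if __name__ == "__main__":
    source = [
        "insertion_point:",
        "macro_level1 0x10, 0x20",
        "after_instantiation:",
        "ret",
    ]
    result, count = preprocess(source)
    print("\n".join(result))
    print("expansions:", count)
```

In this model the cursor never moves backwards, which is the sense in which everything is processed once only; the expanded lines are new text that simply has not been scanned yet. Because the `match` blocks are omitted, the model performs three expansions where the walk-through counts five instantiations.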
ouadji 22 Jan 2010, 23:31
all definitions "extrn __imp__x@y" to "ntoskrnl.exe" and "hal.dll"

Last edited by ouadji on 21 Sep 2010, 20:21; edited 1 time in total

baldr 23 Jan 2010, 01:18
ouadji,
Include files like those can be generated almost automatically from the corresponding import libraries or the DLLs themselves. With a symbol server configured, DUMPBIN lists both names, undecorated and decorated:

Code:
    Microsoft (R) COFF/PE Dumper Version 9.00.21022.08
    Copyright (C) Microsoft Corporation. All rights reserved.

    Dump of file HAL.DLL

    File Type: DLL

      Section contains the following exports for HAL.dll

        00000000 characteristics
        41107628 time date stamp Wed Aug 04 08:37:44 2004
            0.00 version
               1 ordinal base
              92 number of functions
              92 number of names

        ordinal hint RVA      name

              1    0 00002744 ExAcquireFastMutex = @ExAcquireFastMutex@4
              2    1 00002778 ExReleaseFastMutex = @ExReleaseFastMutex@4
              3    2 000027A0 ExTryToAcquireFastMutex = @ExTryToAcquireFastMutex@4
             20    3 000074C6 HalAcquireDisplayOwnership = _HalAcquireDisplayOwnership@4
             21    4 00006B7C HalAdjustResourceList = @HalSystemVectorDispatchEntry@12
             22    5 0001D564 HalAllProcessorsStarted = _HalAllProcessorsStarted@0

Code:
    Dump of file hal.lib

    File Type: LIBRARY

      Exports

        ordinal name

                @ExAcquireFastMutex@4
                @ExReleaseFastMutex@4
                @ExTryToAcquireFastMutex@4
                @HalClearSoftwareInterrupt@4
                @HalRequestSoftwareInterrupt@4
                @HalSystemVectorDispatchEntry@12
                @KeAcquireInStackQueuedSpinLock@8

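As a rough illustration of "generated almost automatically", the Python sketch below turns export rows of the `name = decorated` form shown in the first listing into fasm `extrn` lines. Several things here are assumptions rather than anything stated in the post: the function name `extrn_lines` is made up, the import pointer symbol is taken to be `__imp_` followed by the decorated name, each entry is wrapped in an `if used` guard of the kind discussed in this thread, and the operand size is fixed at `dword` for 32-bit targets.

```python
import re

# One DUMPBIN export row that carries both names, for example:
#   "20    3 000074C6 HalAcquireDisplayOwnership = _HalAcquireDisplayOwnership@4"
# Groups: (undecorated name, decorated name). Rows without " = " are skipped.
ROW = re.compile(
    r"^\s*\d+\s+[0-9A-Fa-f]+\s+[0-9A-Fa-f]{8}\s+(\w+)\s+=\s+(\S+)\s*$"
)


def extrn_lines(dump_text):
    """Return fasm source lines declaring one guarded extrn per export row."""
    out = []
    for line in dump_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # banner, header, summary or blank line
        name, decorated = m.groups()
        # Assumption: the import pointer symbol is "__imp_" + decorated name.
        out.append("if used %s" % name)
        out.append("  extrn '__imp_%s' as %s:dword" % (decorated, name))
        out.append("end if")
    return out


if __name__ == "__main__":
    sample = """\
        ordinal hint RVA      name

              1    0 00002744 ExAcquireFastMutex = @ExAcquireFastMutex@4
             20    3 000074C6 HalAcquireDisplayOwnership = _HalAcquireDisplayOwnership@4
    """
    print("\n".join(extrn_lines(sample)))
```

The generated lines would still need checking by hand: as the row for ordinal 21 in the listing shows, the symbol server can pair an export with an unexpected decorated name.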
revolution 23 Jan 2010, 03:23
f0dder: Your description above is really good. And best of all I think it is correct also. You are now the forum's resident expert on macros.

f0dder 23 Jan 2010, 09:09
revolution wrote: f0dder: Your description above is really good. And best of all I think it is correct also.

revolution wrote: You are now the forum's resident expert on macros.

_________________
- carpe noctem

ouadji 23 Jan 2010, 11:01
Quote:
In the ".obj" file, yes, all references are present ... but after linking (MS linker), in the ".sys" file, only the used symbols are referenced (the others are dropped, I checked with IDA) ... in that case, why use "if used / end if"? Thank you.

baldr 23 Jan 2010, 16:32
ouadji,
if used is there to emulate MASM externdef functionality (only the part regarding unused undefined externdefs). A full NTDLL export (~1k3 functions) grows the .obj by ~62k. And there are DLLs with 9k+ exports. The include file is only read when the source changes; the linker reads the object file regardless of that.

f0dder 24 Jan 2010, 15:34
baldr: I agree that you shouldn't emit the externs unless used, but out of curiosity: can you feel a link-time speed hit from always emitting everything?

baldr 24 Jan 2010, 18:40
f0dder,
I seldom compile to COFF from FASM sources (thus don't have sufficient experience with object files containing unused external references) and didn't dig into the LINK code deep enough to estimate the slowdown. It's just a rule of thumb: eliminate unnecessary things early, you don't know how they will affect things later.

Well, some rough tests could be done. Results follow:
1. LINK seems to check that even unused extrns are present in the libraries.
2. Here are some timings (inaccurate, I used %TIME%; columns are start, end, and elapsed hundredths of a second):

Code:
    With unused extrns
    20:32:58,98 20:32:59,34 36
    20:32:59,34 20:32:59,70 36
    20:32:59,71 20:33:00,06 35
    20:33:00,07 20:33:00,42 35
    20:33:00,43 20:33:00,79 36
    20:33:00,81 20:33:01,15 34
    20:33:01,17 20:33:01,51 34
    20:33:01,53 20:33:01,93 40
    20:33:01,95 20:33:02,29 34
    20:33:02,31 20:33:02,65 34

    Only used extrns
    20:33:02,70 20:33:02,75 5
    20:33:02,75 20:33:02,78 3
    20:33:02,79 20:33:02,82 3
    20:33:02,84 20:33:02,87 3
    20:33:02,89 20:33:02,92 3
    20:33:02,93 20:33:02,96 3
    20:33:02,98 20:33:03,01 3
    20:33:03,03 20:33:03,06 3
    20:33:03,06 20:33:03,10 4
    20:33:03,10 20:33:03,14 4

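To put the two timing columns side by side, here is a small Python sketch that averages the elapsed values from the listing above. The variable names are made up, and the elapsed column is assumed to be in hundredths of a second, which is what the %TIME% timestamps imply.

```python
from statistics import mean

# Elapsed column copied from the timings above (assumed unit: 1/100 s).
with_unused_extrns = [36, 36, 35, 35, 36, 34, 34, 40, 34, 34]
only_used_extrns = [5, 3, 3, 3, 3, 3, 3, 3, 4, 4]

mean_unused = mean(with_unused_extrns)
mean_used = mean(only_used_extrns)
slowdown = mean_unused / mean_used

if __name__ == "__main__":
    print("with unused extrns: %.1f cs per link" % mean_unused)
    print("only used extrns:   %.1f cs per link" % mean_used)
    print("ratio:              %.1fx" % slowdown)
```

On these ten runs each, the averages come out at about 0.35 s against about 0.03 s per link, roughly a tenfold difference, though the post itself calls the measurements inaccurate.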
f0dder 25 Jan 2010, 00:25
baldr wrote: It's just a rule of thumb: eliminate unnecessary things early, you don't know how they will affect later.

bitRAKE 26 Jan 2010, 06:17
I've used a variety of solutions, but this one works best, imho:
Code:
    ; only the first use causes EXTRN definition
    macro __imp_ n {
      match =n,n \{
        display '__imp_' # `n,13,10
        extrn __imp_\#n :QWORD
        n equ __imp_\#n
      \}
      call [n]
    }

Another method involves the assembler passes. Using all the bells and whistles during build.

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.