flat assembler
Message board for the users of flat assembler.
High Level Languages > C++ and ASM routine
revolution 23 Oct 2010, 06:49
Well, we haven't seen your full code. But I assume that you did not use the return value, so the compiler thought it would be smart to not even bother to include the instruction mov eax,4. When you put volatile, you tell the compiler that "someone else needs this value so don't forget to set it properly".
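A minimal sketch of the effect described above, assuming a hypothetical test() that simply returns 4 (the names unused_result and forced_result are illustrative and do not come from the thread). Whether the plain version is actually removed depends on the compiler and its optimization settings.

```c
/* Hypothetical stand-in for the routine discussed in the thread. */
int test(void)
{
    return 4;
}

/* The result is assigned but never read, so an optimizing compiler
   is free to drop the assignment (and an inlined call along with it). */
void unused_result(void)
{
    int a = test();
    (void)a;
}

/* A write to a volatile object is an observable side effect,
   so the compiler has to keep the store of the value. */
int forced_result(void)
{
    volatile int a = test();
    return a;
}
```

Comparing the listings generated for unused_result and forced_result at the same optimization level should show the difference, if the compiler behaves as assumed here.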
|
vid 23 Oct 2010, 10:15
Quote: So, int a = test() is not enough you are saying? Let me see what happens if I do int a = test(); c = a * b;

Heh... and did you use "c"? By "use" I mean something that the compiler cannot discover to be an effectless part of your app.

The best way to test optimization of a procedure is to compile the procedure into a separate object file, and then disassemble that object file. But that can't be done with Olly, unfortunately.
|
DarkAlchemist 23 Oct 2010, 16:01
revolution wrote: Well, we haven't seen your full code. But I assume that you did not use the return value, so the compiler thought it would be smart to not even bother to include the instruction mov eax,4. When you put volatile, you tell the compiler that "someone else needs this value so don't forget to set it properly".

I just tried int c = a * 2; and the routine is still just RETN 10.
|
DarkAlchemist 23 Oct 2010, 16:03
vid wrote:
As far as using it goes, printf should have sufficed for that. I wish Olly could use the obj file.
|
vid 23 Oct 2010, 21:02
I think this is a communication problem; "I tried this" and "I changed that" might not be entirely clear. Please post the full C code (plus the command line used to compile it) for a case where the compiler outputs something other than what you would expect. That should be clear enough.
|
DarkAlchemist 23 Oct 2010, 21:40
Which one? Many different things have been tried in this thread, with the outcome always being the same (except when I used volatile).
As for the command line used to compile them, here ya go:
Code:
/O1 /Oi /GL /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /GF /FD /EHsc /MT /GS- /Fo"Release\\" /Fd"Release\vc90.pdb" /W3 /nologo /c /Zi /TP /errorReport:prompt
|
vid 23 Oct 2010, 22:26
Any one that demonstrates the compiler producing something other than what you expect. But please, post the full source (so I can just paste it into a file and compile it with your command line). Also, please describe specifically how the compiled executable differs from what you expected. With this info all in one place, I think we can quickly see what the source of this misunderstanding is.
|
Tyler 23 Oct 2010, 23:18
/O1 looks like an optimisation flag. Such flags cause compilers to cut corners and make possibly false assumptions about your intentions whenever possible. The use of volatile makes your intentions clear and stops the compiler from making assumptions in cases where they could result in incorrect behaviour.
What was the result of using volatile?
|
vid 23 Oct 2010, 23:41
The compiler can make "false assumptions", but only in the sense of choosing a suboptimal representation of the program entered by the user. It can never change what the code does. And I'd be much more worried about /GL than /O1.
|
DarkAlchemist 24 Oct 2010, 07:00
I think it was because I didn't really use it but the compiler was making a nasty assumption since I assigned the return value to a variable. If I used volatile it was forced to allow it otherwise it simply ignored it.
Code:
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int test()
{
    srand ( time(NULL) );
    return (rand() % 10 + 1);
}

int WINAPI WinMain(HINSTANCE hInstance,HINSTANCE hPrevInstance,LPSTR lpCmdLine,int nCmdShow)
{
    int a = test();
    printf("%d",a);
}

Code:
CPU Disasm
Address   Hex dump          Command                  Comments
00401000  /$  6A 00         PUSH 0                   ; test.00401000(guessed Arg1,Arg2,Arg3,Arg4)
00401002  |.  E8 1C010000   CALL 00401123
00401007  |.  50            PUSH EAX
00401008  |.  E8 E2000000   CALL 004010EF
0040100D  |.  E8 EF000000   CALL 00401101            ; [test.00401101
00401012  |.  6A 0A         PUSH 0A
00401014  |.  59            POP ECX
00401015  |.  99            CDQ
00401016  |.  F7F9          IDIV ECX
00401018  |.  42            INC EDX
00401019  |.  52            PUSH EDX
0040101A  |.  68 6CB34000   PUSH OFFSET 0040B36C     ; ASCII "%d"
0040101F  |.  E8 06000000   CALL 0040102A
00401024  |.  83C4 10       ADD ESP,10
00401027  \.  C2 1000       RETN 10

Code:
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int test()
{
    srand ( time(NULL) );
    return (rand() % 10 + 1);
}

int WINAPI WinMain(HINSTANCE hInstance,HINSTANCE hPrevInstance,LPSTR lpCmdLine,int nCmdShow)
{
    int a = test();
    //printf("%d",a);
}

Code:
CPU Disasm
Address   Hex dump          Command                  Comments
00401000  /$  6A 00         PUSH 0                   ; test.00401000(guessed Arg1,Arg2,Arg3,Arg4)
00401002  |.  E8 44000000   CALL 0040104B
00401007  |.  50            PUSH EAX                 ; /Arg1
00401008  |.  E8 0A000000   CALL 00401017            ; \test.00401017
0040100D  |.  59            POP ECX
0040100E  |.  59            POP ECX
0040100F  |.  E8 15000000   CALL 00401029
00401014  \.  C2 1000       RETN 10

All of the above is just the test section before the return with a value.
|
vid 24 Oct 2010, 11:10
Quote: I think it was because I didn't really use it but the compiler was making a nasty assumption since I assigned the return value to a variable. If I used volatile it was forced to allow it otherwise it simply ignored it.

There is nothing nasty about ignoring an unused variable. If the variable is used by something else that the compiler doesn't know about, it is your job to mark it "volatile", so that the value is always stored in memory. If it is not marked "volatile", the variable is only used by code that the compiler sees, so the compiler can safely decide to ignore it when possible. Imagine the performance hit if every variable access in C had to go to/from memory (not just a register). That's what you would get if every variable were assumed to be "volatile".

Quote: What catches my eye is all of those calls because there is nothing to call since this is the section that was called and printf was commented out.

Those calls are time(), srand() and rand(). time() is required for srand(), and srand() and rand() have side effects besides returning a value, so even though the compiler properly ignores their value, they still have to be called for the functionality to remain the same.

By the way, inlining the test() function into WinMain() was asked for by the /GL option, if that bugged you too; I am not sure now.

Anything else you don't understand about the output of the compiler?
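The point about side effects can be sketched roughly as follows. This is only an illustration: call_count is a hypothetical counter standing in for the hidden state that srand() and rand() update inside the C runtime, and next_value and run_ignoring_result are made-up names.

```c
/* Hypothetical counter standing in for the hidden state that
   srand() and rand() update inside the C runtime. */
int call_count = 0;

/* Has a side effect (the increment) in addition to its return value. */
int next_value(void)
{
    call_count++;
    return call_count % 10 + 1;
}

/* The returned value is never used, so the store to a may disappear,
   but the increment inside next_value() must still take place. */
int run_ignoring_result(void)
{
    int a = next_value();
    (void)a;
    return call_count;
}
```

Each call to run_ignoring_result() still advances the counter by one, which is the same reason the three CALL instructions remain in the second disassembly even with printf commented out.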
|
f0dder 24 Oct 2010, 15:48
DarkAlchemist: please DON'T mark your function as static, as that means the compiler knows you can only call it from within the current source code file, and can do a lot of optimizations it otherwise couldn't, like using knowledge of which code paths will be taken through a function based on its input parameters.
The same goes for LTCG, since it has a link-time full overview of your modules. Instead of compiling and linking to a full app, do a compile-only pass, and either use /FAs to generate an (ugly) asm listing, or disassemble the output .obj file.
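As a rough sketch of that advice, the file below holds a single function with external linkage and nothing else. The file name onefunc.c and the function one_to_ten are hypothetical, and the command in the comment is just one plausible compile-only invocation using the /c and /FAs options mentioned in the thread.

```c
/* onefunc.c (hypothetical file name): one non-static function, no callers.
   A possible compile-only invocation with MSVC, without /GL:
       cl /c /O1 /FAs onefunc.c
   /c stops before linking and /FAs writes an assembly listing with the
   source interleaved. Because the function has external linkage, the
   compiler cannot assume anything about who calls it or with what. */
int one_to_ten(int x)
{
    return x % 10 + 1;
}
```

Since no caller is visible, the listing should contain the full body of one_to_ten, including the division, rather than a version specialised or discarded for one particular call site.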
|
DarkAlchemist 24 Oct 2010, 18:39
Vid, yes, I did not understand why it seemed to be inlining my function.
As for the compile options, they are the ones I always use (except for forcing my exe to always load at the same address, which was just an experiment). For the fastest and tightest code, what should the compile options be that will still work with any FASM code I write? Remember that I do not do this via the command line; I do it inside Visual Studio, but that shouldn't matter.
|
f0dder 24 Oct 2010, 19:17
I'd suggest tweaking the options as little as possible, really: go for "optimize for size" or "optimize for speed", and stick with that across your VS solution. Make one release build target that does LTCG, and one that doesn't; LTCG can take a lot of time for larger projects, and for some bugs you need to do edit/compile/test cycles both in debug and release mode.
For specific modules that are speed-critical, you can play around with tweaking the settings and see what gains you can get, or you can write the stuff in assembly.
When looking at code generation, again, follow the advice I gave earlier: no LTCG, don't "static"... prefer a single function in a single source file with a compile-only build, and look at the listing or disassembly.
|
DarkAlchemist 24 Oct 2010, 19:30
When you say "don't static" which parameter did I use that did that? If it is the one forcing the exe to a specific location that was only for ease of looking at the code (the code never changed using this option) since it was going all over the place each time I compiled and is not something I normally do.
|
f0dder 24 Oct 2010, 19:41
My bad, I misread the following:
DarkAlchemist wrote: The subroutine resides at 401000 as I have it forced to be static for this testing

I thought you meant you added the static keyword to the function definition, which means it can only be used inside the current source module (and thus lets the compiler optimize more aggressively). I take it that what you meant is that you build a FIXED executable, i.e. one without relocations?

Btw, "going all over the place" should only happen at runtime, and is a decent security enhancement known as ASLR. It's another reason why you should look at disassemblies or generated assembly listings rather than inspecting runtime code.

As for the previous "1, 2, 3, 4 doesn't show up in the assembly output", can you show the minimal amount of C code plus compiler settings to make that happen? (Following the advice I gave earlier on only putting that function in your .c file, and doing a compile-only build.)
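One way to see the run-time relocation effect without a debugger is to print the address of something that lives inside the image. The sketch below is an assumption-laden illustration: marker, marker_address and print_marker_address are hypothetical names, and whether the printed value actually changes depends on the OS and on how the executable was linked.

```c
#include <stdio.h>

/* File-scope object: its address is fixed relative to the image base,
   so it moves whenever the loader places the image somewhere else. */
static int marker;

/* Returns the run-time address of marker. */
const void *marker_address(void)
{
    return &marker;
}

/* Prints the address. With randomization active the value may differ
   between runs or reboots; for a fixed-base image it should not. */
void print_marker_address(void)
{
    printf("marker is at %p\n", marker_address());
}
```

Within a single run the address is stable, which is why a listing or a disassembly of the .obj file is the more repeatable thing to compare across builds.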
|
vid 24 Oct 2010, 19:50
Quote: For the fastest and tightest code what should the compile options be that will still work with any FASM code I write? Remember I do not do this via command line I do it inside Visualstudio but that shouldn't matter.

ANY options should work with FASM-written code. The C compiler only optimizes the code it has, but it still accesses external .objs in the standard way.
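A small sketch of what "the standard way" looks like from the C side. The routine name asm_add and the HAVE_ASM_OBJ macro are hypothetical; the C definition is only a stand-in so the example links on its own, and would be left out when a real FASM-assembled object file provides the symbol.

```c
/* Declaration of a routine assumed to live in a separately assembled
   object file. With the default cdecl convention on 32-bit x86 the
   assembly side would export it as _asm_add; when the caller is built
   as C++ (for example with /TP), the declaration also needs extern "C". */
int asm_add(int a, int b);

#ifndef HAVE_ASM_OBJ
/* Stand-in definition so this sketch builds without the real .obj. */
int asm_add(int a, int b)
{
    return a + b;
}
#endif

/* The compiler sees only the declaration of an external routine,
   so it emits an ordinary call regardless of the optimization level. */
int call_asm_add(void)
{
    return asm_add(2, 3);
}
```

With the stand-in removed, the optimizer has no body to inline or analyse, so the call has to follow the declared calling convention whatever flags are used.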
|
DarkAlchemist 24 Oct 2010, 20:21
f0dder wrote: My bad, I misread the following:

Disable Image Randomization (/DYNAMICBASE:NO)
To me that is forcing a static load address. Most programs I load in Olly load at 400000, not randomized (dynamic) each time I load them in Olly as mine was.

f0dder wrote: Btw, "going all over the place" should only happen at runtime, and is a decent security enhancement known as ASLR. It's another reason why you should look at disassemblies or generated assembly listings rather than inspecting runtime code

Quote: Address space randomization hinders some types of security attacks by making it more difficult for an attacker to predict target addresses. For example, attackers trying to execute return-to-libc attacks must locate the code to be executed, while other attackers trying to execute shellcode injected on the stack have to first find the stack. In both cases, the related memory addresses are obscured from the attackers. These values have to be guessed, and a mistaken guess is not usually recoverable due to the application crashing.

f0dder wrote: As for the previous "1, 2, 3, 4 doesn't show up in the assembly output", can you show the minimal amount of C code plus compiler settings to make that happen? (following the advice I gave earlier on only putting that function in your .c file, and doing a compile-only build).
|
f0dder 24 Oct 2010, 20:57
DarkAlchemist wrote: I prefer to look at the loaded code to see exactly what is happening not what should, or could, be happening. As far as security goes

DarkAlchemist wrote: in Windows I can find where a program is at virtually and is actually at in memory so this relocation randomization stuff isn't much of a security enhancement.

DarkAlchemist wrote: I don't think it is needed, now, because I wasn't "really using it" because the compiler saw that I was assigning int a to the return value but that was not enough to force it to do it I actually had to use the integer variable 'A' before it threw out a piece of code that wasn't very optimized. The fact that it pushed ecx then stored 4 on the stack and pop ecx instead of just mov ecx, 4. Not very optimized at the minimum.
|
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.