flat assembler
Message board for the users of flat assembler.

Index > High Level Languages > C++ and ASM routine.

DarkAlchemist



Joined: 08 Oct 2010
Posts: 108
DarkAlchemist 25 Oct 2010, 01:15
Are you talking about this code?
Code:
#include <windows.h>

int test()
{
    return 4;
}

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
{
    int a = test();
}


As I said, int a was never used, so the compiler calls test, which is just one single line: "RETN 10".
Code:
CPU Disasm
Address   Hex dump          Command                                  Comments
00401000   $  C2 1000       RETN 10                                  ; test.00401000(guessed Arg1,Arg2,Arg3,Arg4)
    
Do it like this
Code:
#include <windows.h>

int test()
{
    return 4;
}

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
{
    volatile int a = test();
}
in asm
Code:
CPU Disasm
Address   Hex dump          Command                                  Comments
00401000  /$  51            PUSH ECX                                 ; test.00401000(guessed Arg1,Arg2,Arg3,Arg4)
00401001  |.  C70424 040000 MOV DWORD PTR SS:[LOCAL.0],4
00401008  |.  59            POP ECX
00401009  \.  C2 1000       RETN 10    
This is weird code: it PUSHes ECX, then writes 4 (the return value) to the exact same spot the PUSH just used, then POPs ECX, which now holds the 4 that overwrote the pushed value. I asked it to compile the smallest code, which should have been a simple "mov ecx, 4", right?
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 25 Oct 2010, 01:31
Actually because of the inlining the compiler generates exactly what is expected.
Code:
push ecx ;this is not saving the value of ecx, but is allocating stack space. Could also be "sub esp,4", but this is an optimised version.
mov dword [esp],4 ;our volatile variable MUST be saved to memory, we told it to do this by using volatile
pop ecx ;this is merely to release our local stack storage, the ecx value is not a return value. Could also be "add esp,4", but this is an optimised version.
retn 0x10 ;remove callers parameters from stack, stdcall    
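For anyone following along without MSVC, here is a minimal portable sketch of the two cases. It uses plain functions instead of WinMain so it builds without windows.h; the names plain_local, volatile_local and demo are made up for illustration. What a given compiler actually emits depends on the compiler and its flags, so treat the comments as what the language permits, not as a guarantee.

```cpp
#include <cassert>

static int test() { return 4; }

// The result is never read, so an optimising compiler is free to drop the
// store (and the inlined call) entirely.
int plain_local()
{
    int a = test();
    (void)a;      // silences "unused variable" warnings; generates no code
    return 0;
}

// Every access to a volatile object is observable behaviour, so the store
// of 4 has to be emitted, and the return has to reload it from memory.
int volatile_local()
{
    volatile int a = test();
    return a;
}

// Sanity check of the values only (it says nothing about the generated code).
void demo()
{
    assert(plain_local() == 0);
    assert(volatile_local() == 4);
}
```

Compiling this with optimisation and looking at the listing should show a memory store in volatile_local and none in plain_local.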
DarkAlchemist



Joined: 08 Oct 2010
Posts: 108
DarkAlchemist 25 Oct 2010, 02:53
That is surely not how any assembly language programmer would code it if they wrote the entire program in ASM. They would not use the stack, at least I would hope not, and would just return the 4 in eax or whatever. That would be smaller and faster than what MSVC is spitting out, for sure.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 25 Oct 2010, 03:44
Well, you forced it with the volatile keyword; the compiler HAD to do it because you said so. And the stack is the natural place to store local variables. In this instance, the compiler did exactly what you asked for, and did it in the most efficient way possible.

And without the volatile keyword the compiler also did exactly what was asked, as efficiently as possible: it made a simple 'retn 0x10'.
DarkAlchemist



Joined: 08 Oct 2010
Posts: 108
DarkAlchemist 25 Oct 2010, 04:12
I still do not call it efficient to store the 4 on the stack when I was asking for a return of 4. Storing it in EAX, for instance, and then RETN 10 would have been better from an efficiency point of view.

As far as the simple retn 10 I do agree with you there.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 25 Oct 2010, 04:24
Volatile forces the compiler to store the value in memory; that is its purpose. The compiler had no choice. There is little point in returning anything in EAX since the function has been inlined into your WinMain. The value 4 is only used within WinMain and never returned to the OS upon exit.

[edit] Volatile is meant for situations where the value must be passed to memory. These situations usually occur when the address is in some external chip (an IO chip) and we can't simply throw away the value or else the IO chip does not get the right commands and your driver fails to work. However you told the compiler that your local stack variable absolutely must be stored to memory, so it did it, just as you commanded.
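A rough sketch of the IO-chip case described here. There is no real device involved: fake_command_register is a hypothetical stand-in (an ordinary variable), and send_commands, last_command and demo are made-up names. On real hardware the register would live at a fixed address supplied by the platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a device command register.
static volatile std::uint32_t fake_command_register = 0;

// Sends three commands in order. Because the target is volatile, the
// compiler must emit all three stores, in this order. Through a plain
// (non-volatile) pointer it would be allowed to keep only the last one.
void send_commands(volatile std::uint32_t* reg)
{
    *reg = 1;
    *reg = 2;
    *reg = 3;
}

// Runs the sequence and reads the register back.
std::uint32_t last_command()
{
    send_commands(&fake_command_register);
    return fake_command_register;
}

void demo()
{
    assert(last_command() == 3);
}
```

The read-back only proves the last value arrived; the fact that the first two stores are kept is something to confirm in the disassembly.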


Last edited by revolution on 25 Oct 2010, 04:31; edited 1 time in total
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 25 Oct 2010, 04:28
Why is it storing it in ecx? I thought stdcall uses eax.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 25 Oct 2010, 04:33
Tyler wrote:
Why is it storing it in ecx? I thought stdcall uses eax.
ECX is merely used to adjust the stack pointer. The value 4 is not passed as a return value back to the caller (the OS). The original function was inlined, so the 'return' value is only used inside WinMain.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 25 Oct 2010, 04:39
Here is something you can try:
Code:
#include <windows.h>

int test()
{
    return 4;
}

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
{
    volatile int a = test();
    a = 1;
    a = 2;
    a = 3;
    a = 4;
    a = 5;
    return 6;
}
Now what does it assemble?
DarkAlchemist



Joined: 08 Oct 2010
Posts: 108
DarkAlchemist 25 Oct 2010, 04:58
revolution wrote:
Here is something you can try:
Code:
#include <windows.h>

int test()
{
    return 4;
}

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
{
    volatile int a = test();
    a = 1;
    a = 2;
    a = 3;
    a = 4;
    a = 5;
    return 6;
}
Now what does it assemble?

Code:
CPU Disasm
Address   Hex dump          Command                                  Comments
00401000  /$  55            PUSH EBP                                 ; test.00401000(guessed Arg1,Arg2,Arg3,Arg4)
00401001  |.  8BEC          MOV EBP,ESP
00401003  |.  51            PUSH ECX
00401004  |.  6A 04         PUSH 4
00401006  |.  58            POP EAX
00401007  |.  8945 FC       MOV DWORD PTR SS:[LOCAL.1],EAX
0040100A  |.  C745 FC 01000 MOV DWORD PTR SS:[LOCAL.1],1
00401011  |.  C745 FC 02000 MOV DWORD PTR SS:[LOCAL.1],2
00401018  |.  C745 FC 03000 MOV DWORD PTR SS:[LOCAL.1],3
0040101F  |.  8945 FC       MOV DWORD PTR SS:[LOCAL.1],EAX
00401022  |.  6A 06         PUSH 6
00401024  |.  C745 FC 05000 MOV DWORD PTR SS:[LOCAL.1],5
0040102B  |.  58            POP EAX
0040102C  |.  C9            LEAVE
0040102D  \.  C2 1000       RETN 10    
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 25 Oct 2010, 04:59
Looks good to me. Optimised for size also. And your return value (6) has been set.
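To make the "optimised for size" part concrete, here is a small tally of the listing above. The lengths are read off the address column of the disassembly; the names kLengths, total_bytes, kSavedPerRegisterStore and demo are only for this sketch.

```cpp
#include <array>
#include <cassert>
#include <numeric>

// Instruction lengths in bytes, taken from the address column of the
// listing (00401000 .. 0040102D, the final RETN 10 being 3 bytes long).
constexpr std::array<int, 15> kLengths = {
    1, // push ebp
    2, // mov ebp, esp
    1, // push ecx           (reserves the 4-byte local)
    2, // push 4
    1, // pop eax            (eax = 4)
    3, // mov [local], eax   (a = test(), inlined)
    7, // mov [local], 1
    7, // mov [local], 2
    7, // mov [local], 3
    3, // mov [local], eax   (a = 4, reusing eax)
    2, // push 6
    7, // mov [local], 5
    1, // pop eax            (eax = 6, the return value)
    1, // leave
    3, // retn 10
};

int total_bytes()
{
    return std::accumulate(kLengths.begin(), kLengths.end(), 0);
}

// A store from eax is 3 bytes, a store of a 32-bit immediate is 7 bytes.
constexpr int kSavedPerRegisterStore = 7 - 3;

void demo()
{
    assert(total_bytes() == 0x30);          // the function ends at 00401030
    assert(kSavedPerRegisterStore == 4);
}
```

So the 3 bytes spent on push 4 / pop eax pay for themselves: both stores of 4 go through eax, saving 4 bytes each.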
DarkAlchemist



Joined: 08 Oct 2010
Posts: 108
DarkAlchemist 25 Oct 2010, 05:46
revolution wrote:
Looks good to me. Optimised for size also. And your return value (6) has been set.
Now, if you were to write that by hand, for an entirely ASM project, how would you have written it?
baldr



Joined: 19 Mar 2008
Posts: 1651
baldr 25 Oct 2010, 06:00
DarkAlchemist wrote:
I still do not call it efficient to store the 4 on the stack when I was asking for a return of 4. Storing it in EAX, for instance, and then RETN 10 would have been better from an efficiency point of view.
Didn't retn 10h ring the bell? This disassembly was for WinMain(), not test() (its call was inlined as mov dword[esp], 4).
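The arithmetic behind that hint, as a small sketch. It assumes 32-bit x86 stdcall, where each WinMain argument takes one 4-byte stack slot and the callee pops them; the constant and function names are invented for the example.

```cpp
#include <cassert>

// Assumptions: 32-bit x86, stdcall, one 4-byte stack slot per argument.
constexpr unsigned kStackSlotBytes = 4;
constexpr unsigned kWinMainArgs    = 4; // hInstance, hPrevInstance, lpCmdLine, nCmdShow

// Number of bytes the callee removes from the stack when it returns.
constexpr unsigned callee_pop_bytes(unsigned arg_count)
{
    return arg_count * kStackSlotBytes;
}

// Four arguments give retn 10h, which is what the listing ends with.
static_assert(callee_pop_bytes(kWinMainArgs) == 0x10, "WinMain ends in retn 10h");

void demo()
{
    // test() takes no arguments, so on its own it would end in a plain ret.
    assert(callee_pop_bytes(0) == 0);
}
```

A function ending in retn 10h is therefore cleaning up four arguments, which matches WinMain and not the argument-less test().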
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 25 Oct 2010, 07:01
DarkAlchemist wrote:
Now, if you were to write that by hand, for an entirely ASM project, how would you have written it?
That is not really relevant. We already know HLL compilers are stupid. If you want to criticise the code it generates then go ahead, but we all already know that discussion. As far as HLL compilers go, that one did a pretty decent job. It certainly could have done much worse. But don't expect it to make perfect assembly, else you will be forever disappointed. ;)
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 25 Oct 2010, 07:56
Tyler wrote:
Why is it storing it in ecx? I thought stdcall uses eax.
Probably because the WinMain is supposed to return an int, but no return value was provided, and an implicit '0' will be returned... which means EAX can't be used for this purpose. If the code had done "return test()" instead of using the volatile variable, things would have been very different... like, "mov eax, 4" :)

_________________
carpe noctem
DarkAlchemist



Joined: 08 Oct 2010
Posts: 108
DarkAlchemist 25 Oct 2010, 15:21
revolution wrote:
DarkAlchemist wrote:
Now, if you were to write that by hand, for an entirely ASM project, how would you have written it?
That is not really relevant. We already know HLL compilers are stupid. If you want to criticise the code it generates then go ahead, but we all already know that discussion. As far as HLL compilers go, that one did a pretty decent job. It certainly could have done much worse. But don't expect it to make perfect assembly, else you will be forever disappointed. ;)
No, it is relevant for anyone who wants to learn: they can glean a lot from seeing the code the compiler made next to hand-written code that does the same thing slightly better.
DarkAlchemist



Joined: 08 Oct 2010
Posts: 108
DarkAlchemist 25 Oct 2010, 15:22
f0dder wrote:
Tyler wrote:
Why is it storing it in ecx? I thought stdcall uses eax.
Probably because the WinMain is supposed to return an int, but no return value was provided, and an implicit '0' will be returned... which means EAX can't be used for this purpose. If the code had done "return test()" instead of using the volatile variable, things would have been very different... like, "mov eax, 4" :)
Code:
#include <windows.h>

int test()
{
    return 4;
}

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
{
    return test();
}
became
Code:
CPU Disasm
Address   Hex dump          Command                                  Comments
00401000  /$  6A 04         PUSH 4                                   ; test.00401000(guessed Arg1,Arg2,Arg3,Arg4)
00401002  |.  58            POP EAX
00401003  \.  C2 1000       RETN 10    
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 25 Oct 2010, 15:37
Compiler: Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86 (VS2010)

Code snippet below is the entire contents of the file.
Code:
int main()
{
   return 42;
}    


Optimized for speed:
cl /Ox /FAsc /Faret4.asm /c ret4.c wrote:
PUBLIC _main
; Function compile flags: /Ogtpy
; File e:\temp\ret4.c
_TEXT SEGMENT
_main PROC

; 3 : return 42;

00000 b8 2a 00 00 00 mov eax, 42

; 4 : }

00005 c3 ret 0
_main ENDP
_TEXT ENDS
END


Optimized for size:
cl /O1 /FAsc /Faret4.asm /c ret4.c wrote:
PUBLIC _main
; Function compile flags: /Ogspy
; File e:\temp\ret4.c
; COMDAT _main
_TEXT SEGMENT
_main PROC

; 3 : return 42;

00000 6a 2a push 42
00002 58 pop eax

; 4 : }

00003 c3 ret 0
_main ENDP
_TEXT ENDS
END


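Points of interest: /Ox and /O1 are shorthands for groups of other options. Loading the constant takes 5 bytes when optimizing for speed (mov eax, 42) and 3 bytes when optimizing for size (push 42 / pop eax); the ret is the same in both.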
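The byte counts can be checked straight from the hex dumps in the two listings. The sketch below just copies those bytes into arrays (kSpeedForm, kSizeForm, fits_push_imm8 and demo are names made up here) and adds one caveat: the short push form carries a sign-extended 8-bit immediate, so the 3-byte sequence is only available for small constants.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Byte sequences copied from the two listings (the trailing ret is the
// same in both and is left out).
constexpr std::array<std::uint8_t, 5> kSpeedForm = {0xB8, 0x2A, 0x00, 0x00, 0x00}; // mov eax, 42
constexpr std::array<std::uint8_t, 3> kSizeForm  = {0x6A, 0x2A, 0x58};             // push 42 / pop eax

// push imm8 sign-extends its operand, so the short form only works for
// constants that fit in a signed byte.
constexpr bool fits_push_imm8(int value)
{
    return value >= -128 && value <= 127;
}

void demo()
{
    assert(kSpeedForm.size() - kSizeForm.size() == 2); // 2 bytes saved
    assert(fits_push_imm8(42));
    assert(!fits_push_imm8(1000));                     // would need push imm32
}
```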
DarkAlchemist



Joined: 08 Oct 2010
Posts: 108
DarkAlchemist 25 Oct 2010, 18:05
So, the sites that say optimizing for size normally produces faster code these days were full of bunk? Your example says, at least in this instance, that they were.

This has been a HUGE learning example for me because, like so many people, I only took what others said at face value, and not until I decided to launch back into ASM did I actually see the truth.

In this day and age does anyone compile for size?
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 25 Oct 2010, 19:36
DarkAlchemist wrote:
So, the sites that say optimizing for size normally produces faster code these days were full of bunk? Your example says, at least in this instance, that they were.
It kinda depends.

The majority of normal code isn't running speed-critical inner loops, and a lot of its time will be spent blocked, waiting on external resources. Outside of those tight inner loops, you're probably better off optimizing for size. It might result in slower code, but
1) it doesn't matter if your code takes 1ms or 2ms if the next action is waiting hundreds of ms for socket or file data.
2) reducing size means (slightly) smaller code. Might not matter much wrt. filesize or memory consumption, but it means less L1 cache consumption.
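A back-of-the-envelope version of point 1, as a sketch. The 200 ms wait is an assumed figure picked to stand in for "hundreds of ms", and relative_slowdown_percent and demo are made-up names.

```cpp
#include <cassert>

// How much longer the whole operation takes, in percent, when our own code
// needs slow_ms instead of fast_ms and is followed by wait_ms of blocking.
constexpr double relative_slowdown_percent(double fast_ms, double slow_ms, double wait_ms)
{
    return 100.0 * ((slow_ms + wait_ms) / (fast_ms + wait_ms) - 1.0);
}

void demo()
{
    // Code twice as slow (2 ms instead of 1 ms) in front of an assumed
    // 200 ms wait: 202 ms against 201 ms, a difference of under 0.5 %.
    assert(relative_slowdown_percent(1.0, 2.0, 200.0) < 0.5);
}
```

With no wait at all the same function gives 100 %, which is the tight-inner-loop case where optimizing for speed still matters.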

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.