flat assembler
Message board for the users of flat assembler.

Hello 64! - Any idea why this fails?

drobole



Joined: 03 Nov 2010
Posts: 67
Location: Norway
drobole 20 Nov 2010, 18:25
Windows 7 64 bit seems to close this program unexpectedly

Code:
format PE64 GUI

entry main

section '.code' code readable executable

main: 
  mov r9d, 0       ; uType = MB_OK
  lea r8,  [mytit] ; LPCSTR lpCaption
  lea rdx, [mymsg] ; LPCSTR lpText
  mov rcx, 0       ; hWnd = HWND_DESKTOP
  call MessageBoxA
  mov ecx, eax     ; uExitCode = MessageBox(...)
  call ExitProcess

section '.data' data readable writeable 

  mytit db 'The 64-bit world of Windows & assembler...', 0
  mymsg db 'Hello World!', 0

section '.idata' import data readable writeable

  dd 0,0,0,RVA kernel_name,RVA kernel_table
  dd 0,0,0,RVA user_name,RVA user_table
  dd 0,0,0,0,0

  kernel_table:
    ExitProcess dq RVA _ExitProcess
    dq 0
  user_table:
    MessageBoxA dq RVA _MessageBoxA
    dq 0

  kernel_name db 'KERNEL32.DLL',0
  user_name db 'USER32.DLL',0

  _ExitProcess dw 0
    db 'ExitProcess',0
  _MessageBoxA dw 0
    db 'MessageBoxA',0
    


Any idea what could be the issue here?
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid 20 Nov 2010, 18:53
My guess would be "call api" instead of "call [api]". If it still doesn't work, try including OriginalFirstThunk in imports.
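
A minimal sketch of both suggestions, reusing the labels from the listing above. In that listing MessageBoxA and ExitProcess are labels on 8-byte import table slots, not on code, so a direct call jumps into the table itself. The lookup-table labels kernel_lookup and user_lookup are made-up names for illustration; the first dword of each import directory entry is the OriginalFirstThunk field.
Code:
; the label marks a qword that the loader fills with the function address
call MessageBoxA        ; jumps to the slot and executes the pointer bytes as code
call [MessageBoxA]      ; reads the address from the slot and calls the function

; import directory with OriginalFirstThunk filled in (hypothetical label names)
  dd RVA kernel_lookup,0,0,RVA kernel_name,RVA kernel_table
  dd RVA user_lookup,0,0,RVA user_name,RVA user_table
  dd 0,0,0,0,0

  kernel_lookup:
    dq RVA _ExitProcess
    dq 0
  user_lookup:
    dq RVA _MessageBoxA
    dq 0
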
asmhack



Joined: 01 Feb 2008
Posts: 431
asmhack 20 Nov 2010, 18:53
Code:
 sub     rsp,8*5         ; reserve stack for API use and make stack dqword aligned      
    

and
Code:
call [api]
    


Last edited by asmhack on 21 Nov 2010, 00:49; edited 1 time in total
drobole



Joined: 03 Nov 2010
Posts: 67
Location: Norway
drobole 20 Nov 2010, 19:30
That actually worked, thanks.
Now I just have to figure out why...
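
As a sketch of why the two changes help, here is the original main with both applied and the stack arithmetic spelled out in comments. It assumes the usual Win64 situation in which the entry point is reached through a call, so rsp is 8 bytes off a 16-byte boundary on entry.
Code:
main:
  ; on entry: rsp = 16*k + 8 (the return address was pushed by the caller)
  sub rsp, 8*5          ; 8 bytes to realign + 32 bytes of shadow space
                        ; now rsp = 16*k + 8 - 40 = 16*(k-2), a multiple of 16
  mov r9d, 0            ; uType = MB_OK
  lea r8,  [mytit]      ; LPCSTR lpCaption
  lea rdx, [mymsg]      ; LPCSTR lpText
  mov rcx, 0            ; hWnd = HWND_DESKTOP
  call [MessageBoxA]    ; indirect call through the import table slot
  mov ecx, eax          ; uExitCode = MessageBox(...)
  call [ExitProcess]    ; does not return, so no add rsp is needed
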
drobole



Joined: 03 Nov 2010
Posts: 67
Location: Norway
drobole 20 Nov 2010, 23:36
Let me get this right.

sub rsp,8*5

This is what they call a prologue. Basically allocating stack for the function to be called, and implying add rsp,8*5 to be part of the epilogue?
asmhack



Joined: 01 Feb 2008
Posts: 431
asmhack 21 Nov 2010, 00:46
drobole



Joined: 03 Nov 2010
Posts: 67
Location: Norway
drobole 21 Nov 2010, 14:47
I have changed my main to this

Code:
main: 
  sub rsp, 8

  sub rsp, 8*4
  mov r9d, 0       ; uType = MB_OK
  lea r8,  [mytit] ; LPCSTR lpCaption
  lea rdx, [mymsg] ; LPCSTR lpText
  mov rcx, 0       ; hWnd = HWND_DESKTOP
  call [MessageBoxA]
  add rsp, 8*4

  sub rsp, 8*4
  mov ecx, eax     ; uExitCode = MessageBox(...)
  call [ExitProcess]
  add rsp, 8*4
  ret
    


Unless anyone has anything to say about it I will assume for now that the construct sub rsp, 8 has as its only purpose to align the stack pointer, and should be done once at the start of execution.

sub rsp, 8*4 is the Microsoft specific "Shadow space" and is required for all four registers, regardless of the number of arguments used in the forthcoming function call. The callee is responsible for cleaning up any stack usage below this.
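
To illustrate what the shadow space is for, here is a sketch of a callee that spills its register arguments into it. The function my_func is hypothetical; the offsets follow from the return address sitting at [rsp] on entry, with the caller's 32 bytes directly above it.
Code:
my_func:                  ; hypothetical function taking four integer arguments
  mov [rsp+8*1], rcx      ; home slot of argument 1
  mov [rsp+8*2], rdx      ; home slot of argument 2
  mov [rsp+8*3], r8       ; home slot of argument 3
  mov [rsp+8*4], r9       ; home slot of argument 4
  mov rax, [rsp+8*1]      ; arguments can now be reloaded from memory
  add rax, [rsp+8*2]
  ret
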

edit:
One more question. If the above is correct, is it a viable option to "re-use" the shadow space by simply allocating it at the beginning of execution and leaving it there, arriving at
Code:
main:
  sub rsp, 8*5
  ...
    

?

thanks
asmhack



Joined: 01 Feb 2008
Posts: 431
asmhack 21 Nov 2010, 16:12
drobole wrote:

edit:
One more question. If the above is correct, is it a viable option to "re-use" the shadow space by simply allocating it at the beginning of execution and leaving it there, arriving at
Code:
main:
  sub rsp, 8*5
  ...
    

?

thanks


Code:
sub rsp,  8
sub rsp,  32
=
sub rsp,  40
    


so why not ? nevertheless it runs.
but why you don't wanna use the macros ?
drobole



Joined: 03 Nov 2010
Posts: 67
Location: Norway
drobole 21 Nov 2010, 16:25
Quote:

so why not ? nevertheless it runs.

That's what I was thinking, but after reading this
http://msdn.microsoft.com/en-us/magazine/cc300794.aspx
In particular,
Quote:

Drilling into the calling convention a bit, even though an argument can be passed in a register, the compiler still reserves space on the stack for it by decrementing the RSP register. At a minimum, each function must reserve 32 bytes (four 64-bit values) on the stack. This space allows registers passed into the function to be easily copied to a well-known stack location. The callee function isn't required to spill the input register params to the stack, but the stack space reservation ensures that it can if needed. Of course, if more than four integer parameters are passed, the appropriate additional amount of stack space must be reserved.

Since several API functions take more than 4 arguments I guess I should be prepared to allocate more stack than the required "shadow space" of 32 bytes, and my suggestion of allocating 32 bytes and leaving it at that won't hold true.
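
A sketch of what that looks like for a five-argument call, using WriteFile as the example. The names hFile, buffer, buffer_size and written are assumptions and would have to be defined elsewhere. The frame is 8 bytes of alignment, 32 bytes of shadow space, 8 bytes for the fifth argument, and 8 bytes of padding so that rsp stays a multiple of 16.
Code:
main:
  sub rsp, 8*7              ; 8 + 32 + 8 + 8 = 56 bytes, reserved once
  mov rcx, [hFile]          ; hFile (assumed to hold an open handle)
  lea rdx, [buffer]         ; lpBuffer
  mov r8d, buffer_size      ; nNumberOfBytesToWrite
  lea r9, [written]         ; lpNumberOfBytesWritten
  mov qword [rsp+8*4], 0    ; lpOverlapped = NULL, first slot above the shadow space
  call [WriteFile]
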

Quote:

but why you don't wanna use the macros ?

I have been wondering that myself lately...
Macros seem like a good way to solve the above. I just wanted to get a hold of how things work.

Anyway, thanks for helping out
asmhack



Joined: 01 Feb 2008
Posts: 431
asmhack 21 Nov 2010, 17:00
drobole wrote:

Macros seem like a good way to solve the above. I just wanted to get a hold of how things work.

Code:
invoke  CreateWindowEx,0,_class,_title,WS_VISIBLE+WS_DLGFRAME+WS_SYSMENU,128,128,256,192,NULL,NULL,[wc.hInstance],NULL
    

Code:
sub rsp,$60
mov rcx,0
mov rdx,_class
mov r8,_title
mov r9,WS_VISIBLE+WS_DLGFRAME+WS_SYSMENU
mov qword[rsp+$20],128
mov qword[rsp+$28],128
mov qword[rsp+$30],256
mov qword[rsp+$38],192
mov qword[rsp+$40],NULL
mov qword[rsp+$48],NULL
mov rax,[wc.hInstance]
mov qword[rsp+$50],rax
mov qword[rsp+$58],NULL
call [CreateWindowEx]
add rsp,$60
    


Due to the nature of parameter passing on x64, the “push” instruction is seldom used for setting up arguments. Instead, the compiler allocates all space up front (like for local variables on x86) and uses the “mov” instruction to write stack parameters onto the stack for function calls. This also means that you typically will not see an “add rsp” (or equivalent) after each function call, despite the fact that the caller cleans the stack space.

http://www.nynaeve.net/?p=10


Last edited by asmhack on 21 Nov 2010, 17:20; edited 1 time in total
drobole



Joined: 03 Nov 2010
Posts: 67
Location: Norway
drobole 21 Nov 2010, 17:17
Code:
sub rsp,$60 
...
sub rsp,$60 ; <- Either this is a typo, or there is something I don't understand
    


Last edited by drobole on 15 Dec 2010, 04:17; edited 1 time in total
asmhack



Joined: 01 Feb 2008
Posts: 431
asmhack 21 Nov 2010, 17:20
typo
Alphonso



Joined: 16 Jan 2007
Posts: 295
Alphonso 26 Nov 2010, 06:05
asmhack wrote:

but why you don't wanna use the macros ?


Macros are great and can make code much easier to read but with 64 and default invoke for example
Code:
format PE64 GUI 4.0
entry start
include 'win64a.inc'

;----------------------------------------
section '.text' code readable executable
;----------------------------------------
  start:
                sub     rsp,5*8
                invoke  Sleep,20
                invoke  MessageBox,0,Mess,Title,0
                invoke  ExitProcess,0
;----------------------------------------
section '.data' data readable writeable
;----------------------------------------
  Title              db '64-bit',0
  Mess               db 'Hello World!',0
;----------------------------------------
section '.idata' import data readable writeable
;----------------------------------------

     library kernel32,'KERNEL32.DLL',\
             user32,'USER32.DLL'

             include 'api\kernel32.inc'
             include 'api\user32.inc'      


gives

Code:
        sub     rsp, 40                                 ;
        sub     rsp, 16                                 ; double stack op
        mov     rcx, 20
        call    near [rel imp_Sleep]
        add     rsp, 16                                 ;
        sub     rsp, 32                                 ; double stack op
        mov     rcx, 0
        mov     rdx, ?_003
        mov     r8, ?_002
        mov     r9, 0
        call    near [rel imp_MessageBoxA]
        add     rsp, 32                                 ; double stack op
        sub     rsp, 16                                 ;
        mov     rcx, 0
        call    near [rel imp_ExitProcess]
        add     rsp, 16                                 ; redundant
    


whereas it seems to be enough to use

Code:
...
  start:
                sub     rsp,5*8           ; max reservation needed + alignment
                mov     rcx,20
                call    [Sleep]
                xor     rcx,rcx
                mov     rdx,Mess
                mov     r8,Title
                mov     r9,rcx
                call    [MessageBox]
                xor     rcx,rcx
                call    [ExitProcess]
...    


which gives

Code:
        sub     rsp, 40                   
        mov     rcx, 20                    
        call    near [rel imp_Sleep]       
        xor     rcx, rcx                   
        mov     rdx, ?_003                 
        mov     r8, ?_002                  
        mov     r9, rcx                    
        call    near [rel imp_MessageBoxA] 
        xor     rcx, rcx                   
        call    near [rel imp_ExitProcess]     


As long as enough stack space is initially allocated for the biggest function would this work okay or am I missing something? Just trying to think why it was done like that, simplicity maybe. Proc and endp would need their special start and finish stack adjustments too.

It's probably been answered before but why are we handed over a wonky stack at the beginning of our code instead of one that's already aligned?
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 26 Nov 2010, 06:09
Alphonso wrote:
It's probably been answered before but why are we handed over a wonky stack at the beginning of our code instead of one that's already aligned?
Because call only pushes one qword.

Perhaps a more appropriate question is why we even need a 16-byte aligned stack anyway?


Last edited by revolution on 26 Nov 2010, 06:15; edited 1 time in total
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 26 Nov 2010, 06:14
Alphonso wrote:
As long as enough stack space is initially allocated for the biggest function would this work okay or am I missing something?
What about functions with more than four arguments? Using pre-allocation means that you can't use push arg, you would have to use mov [rsp+offset],arg instead. A classic trade-off situation: sometimes you will gain and sometimes you will lose. No single solution will be optimal for everything.
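
The two styles can be sketched side by side for a five-argument call. Api5 is a hypothetical imported function, and the argument values are placeholders. The first variant assumes a frame of at least 8*5 bytes is already in place and rsp is 16-byte aligned; the second assumes rsp is 16-byte aligned with no standing frame.
Code:
; pre-allocated frame: stack arguments are stored with mov
  mov ecx, 1
  mov edx, 2
  mov r8d, 3
  mov r9d, 4
  mov qword [rsp+8*4], 5    ; argument 5 goes just above the shadow space
  call [Api5]

; per-call allocation: stack arguments can be pushed
  sub rsp, 8                ; padding so rsp is 16-byte aligned at the call
  push 5                    ; argument 5
  sub rsp, 8*4              ; shadow space below it
  mov ecx, 1
  mov edx, 2
  mov r8d, 3
  mov r9d, 4
  call [Api5]
  add rsp, 8*6              ; release padding, argument and shadow space
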
bitRAKE



Joined: 21 Jul 2003
Posts: 4071
Location: vpcmpistri
bitRAKE 26 Nov 2010, 07:17
It's much better to allocate for biggest function (and align the stack). Then make the following assumption at all function entries: return address on top, and 32 bytes of temp storage. For small (leaf) functions the temp space is sufficient, larger functions need the frame anyway to call API or leaf functions -- couple instructions handle that.
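
A sketch of that convention for a non-leaf function, sized for its biggest call. The function name notify is made up; Sleep, MessageBox, Mess and Title are taken from Alphonso's listing above.
Code:
notify:                     ; hypothetical non-leaf function
  sub rsp, 8*5              ; realign + shadow space, enough for the 4-argument call
  mov ecx, 20
  call [Sleep]              ; 1 argument, reuses the same frame
  xor ecx, ecx
  lea rdx, [Mess]
  lea r8, [Title]
  xor r9d, r9d
  call [MessageBox]         ; 4 arguments, the biggest call in this function
  add rsp, 8*5              ; the single matching release
  ret
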
bitRAKE



Joined: 21 Jul 2003
Posts: 4071
Location: vpcmpistri
bitRAKE 26 Nov 2010, 07:20
revolution wrote:
Perhaps a more appropriate question is why we even need a 16-byte aligned stack anyway?
The API uses SSE2 which requires alignment.
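
As a general illustration of the alignment requirement (not tied to any particular API function), the aligned SSE2 store below faults when its memory operand is not on a 16-byte boundary, while the unaligned form accepts any address.
Code:
  movdqa [rsp], xmm0        ; aligned store: general protection fault unless rsp is a multiple of 16
  movdqu [rsp], xmm0        ; unaligned store: works for any rsp
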

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8358
Location: Kraków, Poland
Tomasz Grysztar 26 Nov 2010, 07:44
Alphonso wrote:
As long as enough stack space is initially allocated for the biggest function would this work okay or am I missing something? Just trying to think why it was done like that, simplicity maybe. Proc and endp would need their special start and finish stack adjustments too.
You have to use "frame" macro to reduce stack allocations to a single one. And please also check the static RSP prologue/epilogue macros.
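
A sketch of how that might look for Alphonso's example, assuming the frame and endf macros from the standard win64a.inc headers: the invokes inside the block share one stack reservation made by frame instead of each adjusting rsp.
Code:
  start:
                sub     rsp,8             ; align the stack once at entry
                frame                     ; reserve parameter space once for the whole block
                invoke  Sleep,20
                invoke  MessageBox,0,Mess,Title,0
                invoke  ExitProcess,0
                endf                      ; releases the space (not reached here, ExitProcess does not return)
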
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 26 Nov 2010, 07:49
bitRAKE wrote:
The API uses SSE2 which requires alignment.
But which API? Do you have an example?
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid 26 Nov 2010, 10:27
Quote:
But which API? Do you have an example?

I am sure Feryno has dealt with this in practice.

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.