flat assembler
Message board for the users of flat assembler.
Windows > Customizing the "proc"
Tomasz Grysztar 03 Aug 2009, 11:14
As a demonstration of the code that the static-RSP macros generate, here's the modified procedure from the OpenGL example, assembled with the static-RSP prologue/epilogue settings and then disassembled with BIEW (the disassembly follows the source):
Code:
proc WindowProc uses rbx rsi rdi, hwnd,wmsg,wparam,lparam
  locals
    rc RECT
    pfd PIXELFORMATDESCRIPTOR
  endl
        mov [hwnd],rcx
        cmp edx,WM_CREATE
        je .wmcreate
        cmp edx,WM_SIZE
        je .wmsize
        cmp edx,WM_PAINT
        je .wmpaint
        cmp edx,WM_KEYDOWN
        je .wmkeydown
        cmp edx,WM_DESTROY
        je .wmdestroy
  .defwndproc:
        invoke DefWindowProc,rcx,rdx,r8,r9
        jmp .finish
  .wmcreate:
        invoke GetDC,rcx
        mov [hdc],rax
        lea rdi,[pfd]
        mov rcx,sizeof.PIXELFORMATDESCRIPTOR shr 3
        xor eax,eax
        rep stosq
        mov [pfd.nSize],sizeof.PIXELFORMATDESCRIPTOR
        mov [pfd.nVersion],1
        mov [pfd.dwFlags],PFD_SUPPORT_OPENGL+PFD_DOUBLEBUFFER+PFD_DRAW_TO_WINDOW
        mov [pfd.iLayerType],PFD_MAIN_PLANE
        mov [pfd.iPixelType],PFD_TYPE_RGBA
        mov [pfd.cColorBits],16
        mov [pfd.cDepthBits],16
        mov [pfd.cAccumBits],0
        mov [pfd.cStencilBits],0
        invoke ChoosePixelFormat,[hdc],addr pfd
        invoke SetPixelFormat,[hdc],eax,addr pfd
        invoke wglCreateContext,[hdc]
        mov [hrc],rax
        invoke wglMakeCurrent,[hdc],[hrc]
        invoke GetClientRect,[hwnd],addr rc
        invoke glViewport,0,0,[rc.right],[rc.bottom]
        invoke GetTickCount
        mov [clock],eax
        xor eax,eax
        jmp .finish
  .wmsize:
        invoke GetClientRect,[hwnd],addr rc
        invoke glViewport,0,0,[rc.right],[rc.bottom]
        xor eax,eax
        jmp .finish
  .wmpaint:
        invoke GetTickCount
        sub eax,[clock]
        cmp eax,10
        jb .animation_ok
        add [clock],eax
        invoke glRotatef,float [theta],float dword 0.0,float dword 0.0,float dword 1.0
  .animation_ok:
        invoke glClear,GL_COLOR_BUFFER_BIT
        invoke glBegin,GL_QUADS
        invoke glColor3f,float dword 1.0,float dword 0.1,float dword 0.1
        invoke glVertex3d,float -0.6,float -0.6,float 0.0
        invoke glColor3f,float dword 0.1,float dword 0.1,float dword 0.1
        invoke glVertex3d,float 0.6,float -0.6,float 0.0
        invoke glColor3f,float dword 0.1,float dword 0.1,float dword 1.0
        invoke glVertex3d,float 0.6,float 0.6,float 0.0
        invoke glColor3f,float dword 1.0,float dword 0.1,float dword 1.0
        invoke glVertex3d,float -0.6,float 0.6,float 0.0
        invoke glEnd
        invoke SwapBuffers,[hdc]
        xor eax,eax
        jmp .finish
  .wmkeydown:
        cmp r8d,VK_ESCAPE
        jne .defwndproc
  .wmdestroy:
        invoke wglMakeCurrent,0,0
        invoke wglDeleteContext,[hrc]
        invoke ReleaseDC,[hwnd],[hdc]
        invoke PostQuitMessage,0
        xor eax,eax
  .finish:
        ret
endp
Code:
00000362:a4883EC78 sub (q) rsp,+78
00000366:a48895C2458 mov [rsp+58],rbx
0000036B:a4889742460 mov [rsp+60],rsi
00000370:a48897C2468 mov [rsp+68],rdi
00000375:a48898C2480000000 mov [rsp+00000080],rcx
0000037D:a83FA01 cmp (d) edx,+01
00000380:a7432 je file:000003B4
00000382:a83FA05 cmp (d) edx,+05
00000385:a0F840D010000 je file:00000498
0000038B:a83FA0F cmp (d) edx,+0F
0000038E:a0F843C010000 je file:000004D0
00000394:a81FA00010000 cmp edx,00000100
0000039A:a0F84E5020000 je file:00000685
000003A0:a83FA02 cmp (d) edx,+02
000003A3:a0F84E6020000 je file:0000068F
000003A9:aFF15D11F0000 call (q) [rip+00001FD1]
000003AF:aE920030000 jmpn file:000006D4
000003B4:aFF15FE1F0000 call (q) [rip+00001FFE]
000003BA:a488905A70E0000 mov [rip+00000EA7],rax
000003C1:a488D7C2430 lea rdi,[rsp+30]
000003C6:a48C7C105000000 mov rcx,00000005
000003CD:a31C0 xor eax,eax
000003CF:aF348AB rep; stosq
000003D2:a66C74424302800 mov [rsp+30],0028
000003D9:a66C74424320100 mov [rsp+32],0001
000003E0:aC744243425000000 mov [rsp+34],00000025
000003E8:aC644244A00 mov [rsp+4A],00
000003ED:aC644243800 mov [rsp+38],00
000003F2:aC644243910 mov [rsp+39],10
000003F7:aC644244710 mov [rsp+47],10
000003FC:aC644244200 mov [rsp+42],00
00000401:aC644244800 mov [rsp+48],00
00000406:a488B0D5B0E0000 mov rcx,[rip+00000E5B]
0000040D:a488D542430 lea rdx,[rsp+30]
00000412:aFF159C200000 call (q) [rip+0000209C]
00000418:a488B0D490E0000 mov rcx,[rip+00000E49]
0000041F:a89C2 mov edx,eax
00000421:a4C8D442430 lea r8,[rsp+30]
00000426:aFF1590200000 call (q) [rip+00002090]
0000042C:a488B0D350E0000 mov rcx,[rip+00000E35]
00000433:aFF155F210000 call (q) [rip+0000215F]
00000439:a488905300E0000 mov [rip+00000E30],rax
00000440:a488B0D210E0000 mov rcx,[rip+00000E21]
00000447:a488B15220E0000 mov rdx,[rip+00000E22]
0000044E:aFF1554210000 call (q) [rip+00002154]
00000454:a488B8C2480000000 mov rcx,[rsp+00000080]
0000045C:a488D542420 lea rdx,[rsp+20]
00000461:aFF15491F0000 call (q) [rip+00001F49]
00000467:a48C7C100000000 mov rcx,00000000
0000046E:a48C7C200000000 mov rdx,00000000
00000475:a448B442428 mov r8d,[rsp+28]
0000047A:a448B4C242C mov r9d,[rsp+2C]
0000047F:aFF150B210000 call (q) [rip+0000210B]
00000485:aFF15331E0000 call (q) [rip+00001E33]
0000048B:a8905130E0000 mov [rip+00000E13],eax
00000491:a31C0 xor eax,eax
00000493:aE93C020000 jmpn file:000006D4
00000498:a488B8C2480000000 mov rcx,[rsp+00000080]
000004A0:a488D542420 lea rdx,[rsp+20]
000004A5:aFF15051F0000 call (q) [rip+00001F05]
000004AB:a48C7C100000000 mov rcx,00000000
000004B2:a48C7C200000000 mov rdx,00000000
000004B9:a448B442428 mov r8d,[rsp+28]
000004BE:a448B4C242C mov r9d,[rsp+2C]
000004C3:aFF15C7200000 call (q) [rip+000020C7]
000004C9:a31C0 xor eax,eax
000004CB:aE904020000 jmpn file:000006D4
000004D0:aFF15E81D0000 call (q) [rip+00001DE8]
000004D6:a2B05C80D0000 sub eax,[rip+00000DC8]
000004DC:a83F80A cmp (d) eax,+0A
000004DF:a722F jc file:00000510
000004E1:a0105BD0D0000 add [rip+00000DBD],eax
000004E7:a660F6E052D0D0000 movd xmm0,[rip+00000D2D]
000004EF:aB800000000 mov eax,00000000
000004F4:a660F6EC8 movd xmm1,eax
000004F8:aB800000000 mov eax,00000000
000004FD:a660F6ED0 movd xmm2,eax
00000501:aB80000803F mov eax,3F800000
00000506:a660F6ED8 movd xmm3,eax
0000050A:aFF1570200000 call (q) [rip+00002070]
00000510:a48C7C100400000 mov rcx,00004000
00000517:aFF154B200000 call (q) [rip+0000204B]
0000051D:a48C7C107000000 mov rcx,00000007
00000524:aFF1536200000 call (q) [rip+00002036]
0000052A:aB80000803F mov eax,3F800000
0000052F:a660F6EC0 movd xmm0,eax
00000533:aB8CDCCCC3D mov eax,3DCCCCCD
00000538:a660F6EC8 movd xmm1,eax
0000053C:aB8CDCCCC3D mov eax,3DCCCCCD
00000541:a660F6ED0 movd xmm2,eax
00000545:aFF1525200000 call (q) [rip+00002025]
0000054B:a48B8333333333333E3BF mov rax,BFE3333333333333
00000555:a66480F6EC0 movd xmm0,rax
0000055A:a48B8333333333333E3BF mov rax,BFE3333333333333
00000564:a66480F6EC8 movd xmm1,rax
00000569:a48C7C000000000 mov rax,00000000
00000570:a66480F6ED0 movd xmm2,rax
00000575:aFF150D200000 call (q) [rip+0000200D]
0000057B:aB8CDCCCC3D mov eax,3DCCCCCD
00000580:a660F6EC0 movd xmm0,eax
00000584:aB8CDCCCC3D mov eax,3DCCCCCD
00000589:a660F6EC8 movd xmm1,eax
0000058D:aB8CDCCCC3D mov eax,3DCCCCCD
00000592:a660F6ED0 movd xmm2,eax
00000596:aFF15D41F0000 call (q) [rip+00001FD4]
0000059C:a48B8333333333333E33F mov rax,3FE3333333333333
000005A6:a66480F6EC0 movd xmm0,rax
000005AB:a48B8333333333333E3BF mov rax,BFE3333333333333
000005B5:a66480F6EC8 movd xmm1,rax
000005BA:a48C7C000000000 mov rax,00000000
000005C1:a66480F6ED0 movd xmm2,rax
000005C6:aFF15BC1F0000 call (q) [rip+00001FBC]
000005CC:aB8CDCCCC3D mov eax,3DCCCCCD
000005D1:a660F6EC0 movd xmm0,eax
000005D5:aB8CDCCCC3D mov eax,3DCCCCCD
000005DA:a660F6EC8 movd xmm1,eax
000005DE:aB80000803F mov eax,3F800000
000005E3:a660F6ED0 movd xmm2,eax
000005E7:aFF15831F0000 call (q) [rip+00001F83]
000005ED:a48B8333333333333E33F mov rax,3FE3333333333333
000005F7:a66480F6EC0 movd xmm0,rax
000005FC:a48B8333333333333E33F mov rax,3FE3333333333333
00000606:a66480F6EC8 movd xmm1,rax
0000060B:a48C7C000000000 mov rax,00000000
00000612:a66480F6ED0 movd xmm2,rax
00000617:aFF156B1F0000 call (q) [rip+00001F6B]
0000061D:aB80000803F mov eax,3F800000
00000622:a660F6EC0 movd xmm0,eax
00000626:aB8CDCCCC3D mov eax,3DCCCCCD
0000062B:a660F6EC8 movd xmm1,eax
0000062F:aB80000803F mov eax,3F800000
00000634:a660F6ED0 movd xmm2,eax
00000638:aFF15321F0000 call (q) [rip+00001F32]
0000063E:a48B8333333333333E3BF mov rax,BFE3333333333333
00000648:a66480F6EC0 movd xmm0,rax
0000064D:a48B8333333333333E33F mov rax,3FE3333333333333
00000657:a66480F6EC8 movd xmm1,rax
0000065C:a48C7C000000000 mov rax,00000000
00000663:a66480F6ED0 movd xmm2,rax
00000668:aFF151A1F0000 call (q) [rip+00001F1A]
0000066E:aFF15041F0000 call (q) [rip+00001F04]
00000674:a488B0DED0B0000 mov rcx,[rip+00000BED]
0000067B:aFF15431E0000 call (q) [rip+00001E43]
00000681:a31C0 xor eax,eax
00000683:aEB4F jmps file:000006D4
00000685:a4183F81B cmp (d) r8d,+1B
00000689:a0F851AFDFFFF jne file:000003A9
0000068F:a48C7C100000000 mov rcx,00000000
00000696:a48C7C200000000 mov rdx,00000000
0000069D:aFF15051F0000 call (q) [rip+00001F05]
000006A3:a488B0DC60B0000 mov rcx,[rip+00000BC6]
000006AA:aFF15F01E0000 call (q) [rip+00001EF0]
000006B0:a488B8C2480000000 mov rcx,[rsp+00000080]
000006B8:a488B15A90B0000 mov rdx,[rip+00000BA9]
000006BF:aFF15FB1C0000 call (q) [rip+00001CFB]
000006C5:a48C7C100000000 mov rcx,00000000
000006CC:aFF15F61C0000 call (q) [rip+00001CF6]
000006D2:a31C0 xor eax,eax
000006D4:a488B5C2458 mov rbx,[rsp+58]
000006D9:a488B742460 mov rsi,[rsp+60]
000006DE:a488B7C2468 mov rdi,[rsp+68]
000006E3:a4883C478 add (q) rsp,+78
000006E7:aC3 retn
I think that this "proc" variant should come as an easy-to-use setting in the standard Win64 headers; however, I haven't yet come up with an idea of how choosing this option should look. I'm open to any suggestions in this area.
madmatt 03 Aug 2009, 12:19
Not a macro question, but looking at the disassembly, are you passing parameters to OpenGL using XMM registers? I didn't know you could do that. Is this just for 64-bit coding?
Tomasz Grysztar 03 Aug 2009, 12:40
madmatt wrote: Not a macro question, but looking at the disassembly, are you passing parameters to OpenGL using XMM registers? I didn't know you could do that. Is this just for 64-bit coding?
Please check out this: http://msdn.microsoft.com/en-us/library/zthk2dkh.aspx
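For readers following the disassembly, here is a small Python sketch (not part of the original posts) of the two things involved: the Microsoft x64 convention assigns each of the first four arguments a register by position, with floating-point arguments going to XMM0-XMM3, and the immediates loaded before each `movd` are just the IEEE-754 bit patterns of the constants in the source. The function names `float32_bits`, `float64_bits` and `win64_arg_registers` are made up for this illustration.

```python
import struct

def float32_bits(x):
    """Bit pattern of x as an IEEE-754 single, as loaded via 'mov eax,imm32'."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def float64_bits(x):
    """Bit pattern of x as an IEEE-754 double, as loaded via 'mov rax,imm64'."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

# Register used for argument slot i depends only on the slot and the argument kind.
INT_REGS = ["rcx", "rdx", "r8", "r9"]
FP_REGS = ["xmm0", "xmm1", "xmm2", "xmm3"]

def win64_arg_registers(kinds):
    """Map a list of 'int'/'float' argument kinds to registers (or 'stack')."""
    result = []
    for slot, kind in enumerate(kinds):
        if slot >= 4:
            result.append("stack")      # fifth and later arguments go on the stack
        elif kind == "float":
            result.append(FP_REGS[slot])
        else:
            result.append(INT_REGS[slot])
    return result

if __name__ == "__main__":
    print(f"{float32_bits(0.1):08X}")      # single used by glColor3f
    print(f"{float64_bits(-0.6):016X}")    # double used by glVertex3d
    print(win64_arg_registers(["float"] * 4))  # e.g. glRotatef
```

The values can be compared against the `mov eax,3DCCCCCD` and `mov rax,BFE3333333333333` lines in the listing of the first post.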
ramguru 03 Aug 2009, 13:28
Nice addition.
Now I'm curious: would it be possible for static_rsp_* to check whether the declared proc has any registers to save, local variables, or sub-calls, and if it has none, then instead of:
Code:
proc X
        sub rsp, 8
        ; code
        add rsp, 8
        ret
endp
leave just:
Code:
proc X
        ; code
        ret
endp
Tomasz Grysztar 03 Aug 2009, 13:35
Do you need a "proc" macro in such a case at all?
Anyway, this modification to "static_rsp_prologue" should be enough:
Code:
macro static_rsp_prologue procname,flag,parmbytes,localbytes,reglist
 { local counter,loc,regs,frame,current
   loc = (localbytes+7) and (not 7)
   counter = 0
   irps reg, reglist \{ counter = counter+1 \}
   if loc | frame | counter
    regs = 8*( counter + (counter+loc shr 3+1) and 1 )
   else
    regs = 0
   end if
   totalbytes@proc equ frame+loc+regs
   if totalbytes@proc
    sub rsp,totalbytes@proc
   end if
   localbase@proc equ rsp+frame
   regsbase@proc equ rsp+frame+loc
   parmbase@proc equ rsp+frame+loc+regs+8
   current = 0
   current@frame equ current
   size@frame equ frame
   counter = 0
   irps reg, reglist \{ mov [regsbase@proc+8*counter],reg
                        counter = counter+1 \} }
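As a cross-check of the arithmetic in this prologue, here is a rough Python model (an illustration, not part of the headers). It assumes fasm's operator priorities, so that `counter+loc shr 3+1` reads as `counter + (loc shr 3) + 1` and `and` binds tighter than `+`, and it assumes `frame` is a multiple of 16. The inputs 56 (locals), 32 (frame) and 3 (saved registers) are read off the WindowProc disassembly in the first post: locals start at `rsp+20` and the saved registers at `rsp+58`. The name `static_rsp_total_bytes` and the `skip_when_empty` flag are hypothetical.

```python
def static_rsp_total_bytes(local_bytes, frame_bytes, saved_regs, skip_when_empty=True):
    """Model of the value the prologue subtracts from RSP (totalbytes@proc).

    skip_when_empty=True models the modified macro from this post;
    False models the earlier behaviour that always reserved padding.
    """
    loc = (local_bytes + 7) & ~7              # locals rounded up to 8 bytes
    counter = saved_regs                      # number of registers in 'uses'
    if skip_when_empty and not (loc or frame_bytes or counter):
        regs = 0                              # nothing to reserve at all
    else:
        # One extra 8-byte slot when needed, so that RSP ends up 16-byte
        # aligned once the 8-byte return address is counted.
        pad = (counter + (loc >> 3) + 1) & 1
        regs = 8 * (counter + pad)
    return frame_bytes + loc + regs

if __name__ == "__main__":
    print(hex(static_rsp_total_bytes(56, 32, 3)))           # WindowProc from the first post
    print(static_rsp_total_bytes(0, 0, 0))                  # empty proc, modified macro
    print(static_rsp_total_bytes(0, 0, 0, skip_when_empty=False))
```

With the WindowProc inputs the model reproduces the `sub rsp,+78` seen in the disassembly, and with the flag turned off it reproduces the `sub rsp, 8` that prompted the question.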
ramguru 03 Aug 2009, 13:43
Tomasz Grysztar wrote: Do you need a "proc" macro in such case at all?
Well, it's a matter of formality :} for better visual orientation. Thank you. Wow, and the app's size decreased by 0.5 KB.
Tomasz Grysztar 03 Aug 2009, 13:55
ramguru wrote: wow & app's size decreased by 0.5Kb
Well, if you wanted size optimization, you could change the macro to use PUSH/POP for registers instead of MOV. Something like this, I think:
Code:
macro static_rsp_prologue procname,flag,parmbytes,localbytes,reglist
 { local counter,loc,frame,current
   counter = 0
   irps reg, reglist \{ push reg
                        counter = counter+1 \}
   loc = (localbytes+7) and (not 7)
   if frame & (counter+loc shr 3+1) and 1
    loc = loc + 8
   end if
   framebytes@proc equ frame+loc
   if framebytes@proc
    sub rsp,framebytes@proc
   end if
   localbase@proc equ rsp+frame
   regsbase@proc equ rsp+frame+loc
   parmbase@proc equ rsp+frame+loc+counter*8+8
   current = 0
   current@frame equ current
   size@frame equ frame }

macro static_rsp_epilogue procname,flag,parmbytes,localbytes,reglist
 { if framebytes@proc
    add rsp,framebytes@proc
   end if
   irps reg, reglist \{ reverse pop reg \}
   retn }

macro static_rsp_close procname,flag,parmbytes,localbytes,reglist
 { size@frame = current@frame
   restore size@frame,current@frame }

prologue@proc equ static_rsp_prologue
epilogue@proc equ static_rsp_epilogue
close@proc equ static_rsp_close
I'm still not sure which variant I like more.
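To see how the alignment works out in the PUSH/POP variant, here is a rough Python model of the `framebytes@proc` computation (an illustration only, under the same assumptions about fasm operator priorities and about `frame` being a multiple of 16). The function name `push_pop_frame_bytes` is hypothetical, and the inputs 56, 32 and 3 are again the WindowProc values read off the disassembly in the first post; the result for this variant is computed from the macro, not taken from any listing in the thread.

```python
def push_pop_frame_bytes(local_bytes, frame_bytes, saved_regs):
    """Model of framebytes@proc: what 'sub rsp' reserves after the pushes."""
    loc = (local_bytes + 7) & ~7                  # locals rounded up to 8 bytes
    # Pushed registers, locals and the return address are all 8-byte units.
    # If their count is odd and the procedure has a call frame, add one
    # more 8-byte unit so that RSP is 16-byte aligned at call sites.
    if frame_bytes and ((saved_regs + (loc >> 3) + 1) & 1):
        loc += 8
    return frame_bytes + loc

def stack_used(local_bytes, frame_bytes, saved_regs):
    """Total bytes below the caller's RSP: return address, pushes, frame."""
    return 8 + 8 * saved_regs + push_pop_frame_bytes(local_bytes, frame_bytes, saved_regs)

if __name__ == "__main__":
    print(push_pop_frame_bytes(56, 32, 3))   # WindowProc inputs
    print(stack_used(56, 32, 3) % 16)        # alignment check
    print(push_pop_frame_bytes(0, 0, 0))     # empty proc reserves nothing
```

Whenever `frame_bytes` is nonzero, `stack_used` comes out as a multiple of 16, which is the property the `loc = loc + 8` adjustment is there to guarantee.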
ramguru 03 Aug 2009, 14:06
Yeah, I prefer the push/pop method; it will help to distinguish code parts in the disassembly.
madmatt 03 Aug 2009, 21:07
Tomasz Grysztar wrote:
Ahhhh, I see. This isn't just for OpenGL. Thanks.
_________________
Gimme a sledge hammer! I'LL FIX IT!
Tomasz Grysztar 05 Aug 2009, 12:27
I have included the "static RSP" macros in the official headers set. To enable them for your programs, you have to use these three lines:
Code:
prologue@proc equ static_rsp_prologue
epilogue@proc equ static_rsp_epilogue
close@proc equ static_rsp_close
If someone comes up with a nice name for a setting that would do this in a shorter form, I may add a shortcut macro as well.
r22 05 Aug 2009, 21:00
I vote for 3 proc macros:
proc = normal behavior
procrsp = RSP
procx = custom, using the prologue@proc etc.
maybe procopt = speed optimized, experimental
Tomasz Grysztar 05 Aug 2009, 21:54
I prefer to stay with the more standard way, that is, one "proc" but with custom prologue settings. A simpler way to set them up might come in handy, but it's not a big deal anyway.
|
r22 06 Aug 2009, 00:39
I was thinking procrsp could simply be an alias for ...
{ prologue@proc equ static_rsp_prologue
  epilogue@proc equ static_rsp_epilogue
  close@proc equ static_rsp_close
  proc }
For the custom pro/epilogues, the semi-verbose setup probably isn't an issue, but I think for the statically defined customizations (the ones you distribute) you should have a quick alias for ease of use.
As for a naming convention for uniquely FASM stuff like static_rsp_prologue, as long as it's consistent, whatever you choose would be fine: STC_RSP_EPI, FASM_RSP_EPI, _PROC_RSP_E