flat assembler
Message board for the users of flat assembler.
Windows > Stack problem with proc64
Azu 23 Aug 2009, 00:10
Borsuc wrote:
E.g.
rax could be qword[rsp]
eax could be dword[rsp]
ax could be word[rsp]
al could be byte[rsp]
rcx could be qword[rsp+8]
etc.
rsp, rip, EFLAGS and RFLAGS might have to be exceptions though.
P.S. It would always be in L1 cache, since almost every instruction would be using it.
Borsuc 23 Aug 2009, 00:17
But the stack gets pushed/popped while the registers remain the same, so it's still a different area. Yes, maybe it is a special "erb" (register pointer) as you described (obviously it actually isn't, but we can pretend it is), along with the encodings for it.
Azu 23 Aug 2009, 00:23
Yes, existing code would obviously need to be changed before it could run on such an architecture.
Borsuc 23 Aug 2009, 18:22
I meant that if you want a stack-like usage of registers, you should probably look at the FPU registers.
Azu 24 Aug 2009, 03:18
The FPU way sucks. You can only hold like 8 or 16 things in it and then its stack is full. Worst of both worlds.
madmatt 24 Aug 2009, 09:43
I converted the simple OpenGL example 'hello.c' to fasm. I also compiled the 'hello.c' example using Visual Studio Express (64-bit). The file below shows disassembled code from the fasm and C versions. As you can see, fasm is doing some strange things stack-wise. An example:
Code:
??? add rsp, 32 ; 00402030 _ 48: 83. C4, 20
??? sub rsp, 32 ; 00402034 _ 48: 83. EC, 20
lea rcx, [rbp-54H] ; 00402038 _ 48: 8D. 4D, AC
mov rdx, 0 ; 0040203C _ 48: C7. C2, 00000000
mov r8, 72 ; 00402043 _ 49: C7. C0, 00000048
; Note: Memory operand is misaligned. Performance penalty
call qword ptr [imp_memset] ; 0040204A _ FF. 15, 0000233C(rel)
??? add rsp, 32 ; 00402050 _ 48: 83. C4, 20
??? sub rsp, 32 ; 00402054 _ 48: 83. EC, 20
mov rcx, 0 ; 00402058 _ 48: C7. C1, 00000000
call qword ptr [imp_GetModuleHandleA] ; 0040205F _ FF. 15, 00002093(rel)
Also, according to objconv, there are some code alignment problems too.
revolution 24 Aug 2009, 09:47
The current fasm macros do not preallocate the stack at the procedure entry. Each function call will do its own stack allocation and deallocation. This was previously discussed at the time that 64-bit was first being introduced.
Tomasz Grysztar 24 Aug 2009, 09:56
revolution wrote: The current fasm macros do not preallocate the stack at the procedure entry.
They do if you choose the static RSP prologue/epilogue variant. See Customizing the "proc" thread.
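For anyone following along, selecting that variant might look roughly like the fragment below. The three macro names are recalled from fasm's PROC64.INC and are not quoted anywhere in this thread, so treat them as an assumption and check them against the headers in your own fasm package. `my_proc` is a hypothetical procedure, and the fragment assumes a program that already imports KERNEL32.DLL.

```asm
include 'win64a.inc'

; Assumed names (verify in PROC64.INC): switch "proc" to the variants
; that reserve stack space once, at procedure entry.
prologue@proc equ static_rsp_prologue
epilogue@proc equ static_rsp_epilogue
close@proc equ static_rsp_close

; Hypothetical procedure: with the static RSP variant selected, the
; calls inside it should no longer be wrapped in their own
; "sub rsp" / "add rsp" pairs.
proc my_proc
        invoke  GetModuleHandle, 0
        invoke  GetModuleHandle, 0
        ret
endp
```

Disassembling the result, as madmatt did above, is the easiest way to confirm whether the per-call RSP adjustments are gone.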
revolution 24 Aug 2009, 10:09
Tomasz Grysztar wrote: They do if you choose the static RSP prologue/epilogue variant. See Customizing the "proc" thread.
In that case it seems I left out two words from my previous post:
By default the current fasm macros do not preallocate the stack at the procedure entry.
madmatt 24 Aug 2009, 11:46
Tomasz Grysztar wrote:
But isn't this still an error? That code looks like do-nothing code, though in other cases it seems to work as intended.
[EDIT] And another question: is the proc64 macro passing floating point values (single precision?) correctly? If you look at the disassembly, Visual C/C++ is using the xmm0, xmm1, xmm2, etc. registers first, while fasm is using rcx, rdx, etc.
_________________
Gimme a sledge hammer! I'LL FIX IT!
Tomasz Grysztar 24 Aug 2009, 14:12
madmatt wrote: But isn't this still an error? That code looks like do-nothing code, though in other cases it seems to work as intended.
Do-nothing doesn't do any harm, does it? Why the default macro behavior was chosen to allocate the stack frame separately for each call is explained in that oldest thread.
madmatt wrote: And another question: is the proc64 macro passing floating point values (single precision?) correctly? If you look at the disassembly, Visual C/C++ is using the xmm0, xmm1, xmm2, etc. registers first, while fasm is using rcx, rdx, etc.
If you read that other thread carefully, you will notice that the "float" prefix should be used in such a case. Check out the "WIN64/OPENGL" example that comes with the fasmw package, too.
Tomasz Grysztar 24 Aug 2009, 14:18
Wait a second, madmatt, it was you who asked this question in the other thread, where I demonstrated a disassembly of a procedure from OpenGL example with static RSP frame enabled.
madmatt wrote: Not a macro question, but looking at the disassembly, are you passing parameters to OpenGL using xmm registers? I didn't know you could do that. Is this just for 64-bit coding?
Or is someone stealing your identity?
madmatt 24 Aug 2009, 15:14
Quote: Do-nothing doesn't do any harm, does it? Why it was chosen for the default macro behavior to allocate stack frame each time separately, is explained in that oldest thread.
No, it doesn't, but why allow it if it's not needed?
Quote: If you read that other thread carefully, you will notice that "float" prefix should be used in such case. Check out the "WIN64/OPENGL" example that comes with fasmw package, too.
Well, there's the problem right there: I didn't read it carefully. OK, I'll do that from now on.
Quote: Wait a second, madmatt, it was you who asked this question in the other thread, where I demonstrated a disassembly of a procedure from OpenGL example with static RSP frame enabled.
Nope! That was me, then and now. When I asked that question, my mind was still very much in the 32-bit way of doing things. I re-installed 64-bit Win7 and will do much more learning about 64-bit asm programming. I compiled the OpenGL example and it works well, so it must be a problem with my includes or something.
madmatt 24 Aug 2009, 15:59
OK, I got my OpenGL example working well now. You pass floats differently in the proc64 macro than with the proc32 macro: 'float dword' for single precision, just 'float' for double precision.
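As a rough illustration of that rule, the fragment below passes single-precision and double-precision arguments through `invoke`. It is only a sketch: it assumes a program that includes 'win64a.inc' and has an import table for OPENGL32.DLL providing `glRotatef` and `glRotated`, and `theta` is a hypothetical variable. The first call is modeled on the WIN64/OPENGL example mentioned earlier, as best I recall it.

```asm
; In the code section:

        ; glRotatef takes four GLfloat (single precision) arguments,
        ; so immediates are written as "float dword". For [theta] the
        ; size is taken from the variable's declaration (dd).
        invoke  glRotatef, float [theta], float dword 0.0, float dword 0.0, float dword 1.0

        ; glRotated takes four GLdouble (double precision) arguments,
        ; so plain "float" is used.
        invoke  glRotated, float 30.0, float 0.0, float 0.0, float 1.0

; In a data section:

  theta dd 0.6          ; hypothetical single-precision angle
```

Without the "float" prefix the macro has no way to know that an argument is a floating point value, which matches the rcx, rdx usage seen in the earlier disassembly.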
Tomasz Grysztar 24 Aug 2009, 16:25
And if you want to get rid of redundant RSP operations, just use the "frame" macro.
madmatt wrote: Ok, I got my opengl example working good now. You pass floats differently in the proc64 macro than with the proc32 macro, 'float dword' for single precision, just 'float' for double precision.
This is all going to be documented in the new manual on Win32/Win64 headers, but I haven't started it yet.
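A minimal sketch of the "frame" approach, assuming the standard fasm Win64 headers ('win64a.inc' and the 'api\*.inc' import lists shipped with fasmw). The labels `_title` and `_text` are made up for the illustration. As far as I understand the macros, `frame` emits a single `sub rsp` sized for the largest call inside it and `endf` emits the matching `add rsp`, so the calls in between carry no RSP adjustments of their own.

```asm
format PE64 GUI 5.0
entry start

include 'win64a.inc'

section '.text' code readable executable

start:
        sub     rsp, 8                  ; make the stack dqword aligned
  frame                                 ; one "sub rsp" covers all calls below
        invoke  GetModuleHandle, 0
        invoke  MessageBox, 0, _text, _title, MB_OK
        invoke  ExitProcess, 0
  endf                                  ; matching "add rsp" (not reached here,
                                        ; because ExitProcess does not return)

section '.data' data readable writeable

  _title db 'frame demo',0              ; hypothetical strings
  _text  db 'Calls inside frame share one stack reservation.',0

section '.idata' import data readable writeable

  library kernel32,'KERNEL32.DLL',\
          user32,'USER32.DLL'

  include 'api\kernel32.inc'
  include 'api\user32.inc'
```

Comparing the disassembly of this with a version that has `frame` and `endf` removed should show the add/sub pairs from the earlier listing disappearing.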
madmatt 24 Aug 2009, 16:38
Tomasz Grysztar wrote: And if you want to get rid of redundant RSP operations, just use "frame" macro.
All right, sounds good. Thanks for your help. Have you fixed the proc64 macro problem that I first mentioned in this post?
madmatt 27 Aug 2009, 00:36
It seems to be fixed now. Good work, Tomasz!
Borsuc 27 Aug 2009, 14:33
I have a question. Does the stack need to be aligned on 16 bytes for access, or is it just the stupid fastcall64 convention that needs that?
In other words, will a custom asm convention need that? (Why would it?) Just thought I'd ask, since I'm not programming x64 yet.
MazeGen 27 Aug 2009, 15:19
Jeremy explains it well I think:
The stack pointer (RSP) must be 16-byte aligned when making a call to an API. With some APIs this does not matter, but with other APIs wrong stack alignment will cause an exception. Some APIs will handle the exception themselves and align the stack as required (this will, however, cause performance to suffer). Other APIs (at least on early builds of x64) cannot handle the exception and unless you are running the application under debug control, it will exit.
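To make the requirement concrete, here is a small sketch without any of the call macros, in the style of fasm's plain PE64 demo. It relies on one assumption about the entry state: the CALL that reaches `start` has pushed an 8-byte return address onto a 16-byte aligned stack, so RSP is 8 bytes off alignment on entry.

```asm
format PE64 console
entry start

include 'win64a.inc'

section '.text' code readable executable

start:
        ; On entry RSP = 16*n + 8 (return address already pushed).
        ; 8*5 = 40 bytes: 32 bytes for the callee's register parameter
        ; area plus 8 bytes of padding, which leaves RSP a multiple of 16.
        sub     rsp, 8*5
        xor     ecx, ecx                ; first argument: exit code 0
        call    [ExitProcess]

section '.idata' import data readable writeable

  library kernel32,'KERNEL32.DLL'

  import  kernel32,\
          ExitProcess,'ExitProcess'
```

Code that never calls into the API or other fastcall code is free to keep RSP at any 8-byte boundary, but anything placed on the stack for aligned SSE accesses still needs 16-byte alignment for those accesses to work.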
Copyright © 1999-2025, Tomasz Grysztar.