flat assembler
Message board for the users of flat assembler.
Macroinstructions > requesting help with a stack touching macro
revolution 03 May 2016, 05:13
'repeat' should be just fine for the job.
Code:
    repeat (size + STACK_PAGE_SIZE - 1) / STACK_PAGE_SIZE
        cmp byte[esp - % * STACK_PAGE_SIZE],al
    end repeat
    if size < STACK_PAGE_SIZE
        cmp byte[esp - size],al
    end if
revolution 03 May 2016, 05:19
For the record, I just want to add that stack touching at procedure entry is a terrible idea. It is much better to expand the stack once at startup than to waste time touching a stack that already exists every time a function is subsequently called. Anyhow, I'm not saying this applies to the code above; it might well be that this is in your startup code, in which case, good job.
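A minimal sketch of what such a one-time expansion at startup might look like, assuming 64-bit code. The routine name `prefault_stack` and the figure `MAX_STACK_NEEDED` are hypothetical and do not come from the thread; the figure has to stay below the stack reserve of the executable.

```asm
use64

STACK_PAGE_SIZE  = 4096
MAX_STACK_NEEDED = 16 * STACK_PAGE_SIZE   ; hypothetical worst case, must fit in the stack reserve

; Hypothetical helper, meant to be called once from the entry point.
; It walks down from the current stack pointer one page at a time, so every
; access lands either on committed memory or on the guard page itself.
prefault_stack:
        mov     rax, rsp
        mov     ecx, MAX_STACK_NEEDED / STACK_PAGE_SIZE
  .next_page:
        sub     rax, STACK_PAGE_SIZE    ; step down exactly one page ...
        cmp     byte [rax], 0           ; ... and read from it; nothing is modified
        dec     ecx
        jnz     .next_page
        ret
```

After this has run, frames that stay within `MAX_STACK_NEEDED` bytes of the startup stack pointer would no longer need a probe in each function prologue.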
tthsqe 07 May 2016, 20:12
Thanks! I didn't realize it was this simple - I wanted to emulate the x86 instructions inside the macro. Anyways, there are some portions of code that you just want to work and other portions that you want to work fast. This is sufficient for the former.
Code:
    ; use this macro if you are too lazy to touch beforehand the required amount of stack
    ; for functions that need more than 4K of stack space
    ; here we assume that the current stack pointer is in the committed range
    ; if size > 4096, [rsp-size] might be past the guard page
    ; so touch the pages up to it
    STACK_PAGE_SIZE = 4096
    macro _chkstk_ms stackptr, size {
        repeat (size+8) / STACK_PAGE_SIZE
            cmp al, byte[stackptr - % * STACK_PAGE_SIZE]
        end repeat
    }
[Edited] commited->RESERVED
[Edited] RESERVED->committed
Last edited by tthsqe on 09 May 2016, 04:12; edited 2 times in total
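To make the page arithmetic concrete, here is a sketch of what `_chkstk_ms rsp, FRAME_SIZE` would assemble to for a 12 KiB frame, written out by hand. `FRAME_SIZE` and `big_frame_proc` are hypothetical names chosen for illustration.

```asm
use64

STACK_PAGE_SIZE = 4096
FRAME_SIZE      = 3 * STACK_PAGE_SIZE   ; hypothetical 12 KiB of locals

big_frame_proc:
        ; (FRAME_SIZE+8) / STACK_PAGE_SIZE = 12296 / 4096 = 3,
        ; so the repeat block emits three probes, for % = 1, 2, 3
        cmp     al, byte [rsp - 1 * STACK_PAGE_SIZE]
        cmp     al, byte [rsp - 2 * STACK_PAGE_SIZE]
        cmp     al, byte [rsp - 3 * STACK_PAGE_SIZE]
        sub     rsp, FRAME_SIZE         ; the new rsp was touched by the last probe
        ; ... body using the locals ...
        add     rsp, FRAME_SIZE
        ret
```

Each probe is exactly one page below the previous one, so as long as the incoming rsp is in the committed range, no probe can step over the guard page.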
revolution 08 May 2016, 12:33
tthsqe wrote:
l_inc 08 May 2016, 17:32
tthsqe, revolution
revolution wrote: For the record I just want to add that stack touching at the procedure entry is a terrible idea. Much better to expand the stack once at startup and don't waste time touching a stack that already exists every time a function is subsequently called

tthsqe wrote: use this macro if you are too lazy to touch beforehand the required amount of stack

I actually expected tthsqe to disagree, because revolution's suggestion makes absolutely no sense (to me). Stack probing must be done at the procedure entry only, because in most cases the stack frame address at function execution time is not known beforehand. Moreover, in most cases it's not even known whether that specific stack-greedy function is gonna be called at all, and therefore the suggestion contradicts revolution's own point here. And even for stack locations known for sure to be used at runtime (guaranteed minimal stack consumption), preliminary stack probing makes no sense either, because in this case one just needs to specify SizeOfStackCommit correctly and omit stack probing completely.

revolution wrote: I assume you actually mean the reserved range

No, he actually correctly means the committed range. Only in that case is the subsequent stack probing guaranteed not to miss the guard page.

tthsqe
As for your original function, it has a lot of optimization potential. The sub instruction directly followed by the cmp instruction looks especially disturbing. There are many other ways in which the function is suboptimal, so I'm just gonna summarize them by rewriting the function (stack probing and stack allocation combined):
Code:
    __allocstk:
        sub rsp,rax
        assert bsr PAGE_SIZE = bsf PAGE_SIZE
        and rax,-PAGE_SIZE
        jz .return
    .next_page:
        cmp [rsp+rax],edx ;3 bytes only, avoids any kinds of partial register access stall
        sub rax,PAGE_SIZE
        jae .next_page
    .return:
        ret
Note that you don't need a single stack access in case the frame needed is smaller than a page.

The function is OK in case you really don't know the stack frame size at compile time, which is quite uncommon. To differentiate between the known and unknown stack frame size cases, you might want to use the relativeto operator:
Code:
    macro m_allocstk stackptr, size {
        if size relativeto 0 ;statically sized stack frame
            sub rsp,size
            repeat size / STACK_PAGE_SIZE
                mov byte[rsp + size / STACK_PAGE_SIZE - % * STACK_PAGE_SIZE]
            end repeat
        else ;dynamically sized stack frame
            if ~ size eq rax
                mov rax,size
            end if
            call __allocstk
        else
    }
_________________
Faith is a superposition of knowledge and fallacy
l_inc 08 May 2016, 22:27
tthsqe
l_inc wrote: There are many other ways in which the function is suboptimal

That was so nice of me to replace the suboptimal implementation with a totally wrong one. ^_^ In my defense, I was in a hurry. Here's a little fix:
Code:
    __allocstk:
        pop r10
        sub rsp,rax
        assert bsr PAGE_SIZE = bsf PAGE_SIZE
        and rax,-PAGE_SIZE
        jz .return
    .next_page:
        cmp [rsp+rax],edx ;3 bytes only, avoids any kinds of partial register access stall
        sub rax, PAGE_SIZE
        ja .next_page
    .return:
        push r10
        ret
The macro m_allocstk became a victim of my reduced attention even more: senseless mov and else, unneeded argument. Additionally, I forgot to note that a loop might still be preferable, as the number of 7-byte-long probe instructions could become undesirably large. I wasn't going to take that into account, but as long as I screwed up and need to fix it anyway, here's the full version that tries to do a good job of size optimization:
Code:
    macro m_allocstk size* {
        local sz,off,..next_page
        if 0 relativeto size ;statically sized stack frame
            sz = ((size)+7) and (-8)
            assert bsr PAGE_SIZE = bsf PAGE_SIZE
            off = sz and (-PAGE_SIZE)
            if off < PAGE_SIZE*3 & ~sz = PAGE_SIZE*2
                sub rsp,sz
                times off/PAGE_SIZE+1 : cmp [rsp + off - (%-1)*PAGE_SIZE],edx
            else
                mov rax,off
                if sz = off
                    sub rsp,rax
                else
                    sub rsp,sz
                end if
                ..next_page:
                cmp [rsp+rax],edx
                sub rax,PAGE_SIZE
                jae ..next_page
            end if
        else ;dynamically sized stack frame
            if ~ rax eq size
                mov rax,size
            end if
            call __allocstk
        end if
    }
_________________
Faith is a superposition of knowledge and fallacy
revolution 08 May 2016, 23:10
l_inc wrote: I actually expected tthsqe to disagree, because revolution's suggestion makes absolutely no sense (to me). Stack probing must be done at the procedure entry only, because in most cases the stack frame address at function execution time is not known beforehand. Moreover in most cases it's not even known, whether that specific stack-greedy function is gonna be called at all, and therefore the suggestion contradicts revolution's own point here. And even for stack locations known for sure to be used at runtime (guaranteed minimal stack consumption) preliminary stack probing makes no sense as well, because in this case one just needs to specify SizeOfStackCommit correctly and omit stack probing completely.

l_inc wrote:
|
l_inc 08 May 2016, 23:33
revolution
Quote: If you can't predict how much stack it will use then you can't guarantee that the touching will not fall outside the reserved range.

For most real-world programs it's not possible to predict exactly how much stack they need, simply because the exact amount depends on runtime conditions. What you can predict is how much stack the program needs at least (that's SizeOfStackCommit) and how much it needs at most (that's SizeOfStackReserve). The latter gives the guarantee of not falling outside the reserved range.

Quote: I don't think your link is not relevant

I assume you mean you don't think it's relevant. Well, it is, for the exact same reason I mentioned above: you cannot know beforehand how much memory fasm will actually use during the compilation. For that reason you don't allocate all the requested memory at once, just as most programs don't allocate their whole stack at once but do it on demand instead by touching the guard page.

Quote: If it is already committed then there is no need to touch it

If you look more attentively at his code, you'll notice that he doesn't touch it. He starts touching one page below it.

Quote: And if it is reserved then you'll be fine committing new pages with touching as long as you don't fall outside the reserved range.

No, you won't. That's exactly why stack probing exists and is inserted by compilers as a regular part of function prologues. You'll only be fine if you don't miss the guard page. Otherwise the program will crash.
_________________
Faith is a superposition of knowledge and fallacy
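For reference, a sketch of how those two PE header fields can be set from fasm source with the `stack` directive of the PE formatter, which takes the reserve size and an optional commit size. The sizes and the tiny entry point are made-up examples, not recommendations.

```asm
format PE64 console
entry start

; reserve 1 MiB of address space for the stack (SizeOfStackReserve)
; and commit 64 KiB of it at load time (SizeOfStackCommit);
; both figures are illustrative only
stack 100000h, 10000h

section '.text' code readable executable

start:
        sub     rsp, 8000h              ; 32 KiB frame, inside the 64 KiB committed up front
        mov     byte [rsp], 0           ; no probing needed: this page is already committed
        add     rsp, 8000h
        xor     eax, eax
        ret
```

Frames that fit inside the committed part need no probing at all; anything that may reach further down into the reserve still has to touch the pages in order so the guard page is not skipped.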
tthsqe 09 May 2016, 00:27
revolution wrote:
Yes - the names are confusing for me. I mean that accessing [rsp] will not cause a page fault (and is not the guard page).
l_inc 09 May 2016, 00:32
tthsqe
Quote: Yes - the names are confusing for me. I mean that accessing [rsp] will not cause a page fault (and is not the guard page)

Oh my god. That means "No"! Because the reserved range will cause a page fault. The committed range won't. You were perfectly correct the first time.
_________________
Faith is a superposition of knowledge and fallacy
tthsqe 09 May 2016, 04:13
:O
|
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.