flat assembler
Message board for the users of flat assembler.
Stack Realignment "Techniques"
revolution 31 Aug 2017, 05:16
system error wrote: You really are CLUELESS ... |
system error 31 Aug 2017, 08:15
Furs wrote: The part where you use 2 functions when there should only be 1 working in both cases, MORON. ONE function.
Furs, calling people MORON won't do you much good. Please behave and watch your language. This is not your home -_-
Furs 31 Aug 2017, 11:51
^ ok, officially, dealing with an 8 year old.
And yes, I call some of the GCC maintainers retarded, because they are (only 2 of them btw). You know, I have to deal with their bs often (nothing about programming, they're retarded for management reasons so to speak, and in fact that's why I get so annoyed, cause I spend less time programming having to deal with their bullshit). That's one reason I won't give you link to the patch when it'll be ready, because I want to keep the anonymity. And not like I have to prove anything to an 8 year old. This thread was a question; I didn't want to waste my time with a possible patch if it had a reason. Question was long answered. You're the only one who thinks it's something else. |
system error 31 Aug 2017, 13:21
@Furs
Of course, you'll stay anonymous when sending the "patch". We all know what's coming xD
I am getting the impression that you're a very important person in GCC development and maintenance. You know, things like:
1 - "I OFTEN spend MOST of my time dealing with them rather than my own code". Sounds like an enormous amount of work, time and sacrifice!
2 - "Go over the top management of GCC and pointing fingers at them and calling them all retards". Sounds like a true boss of GCC! They deserve it, right? Wow!!
From those two PROOFS of strong words, we believe you. We are impressed! You are a very important person in GCC. Millions of programmers will depend on you!
Now get yourself a good book on assembly language and start practicing STACK PROGRAMMING like I asked you to. And don't forget to wash my car after that. Got it, Joe?
Furs 31 Aug 2017, 13:50
I'm not staying anonymous when sending the patch, since that's not even possible (man, you need to grow up for once). Clearly, you have never sent a single patch in your entire life, obvious as fuck. You even have to sign to the FSF that all contributions will be legally owned by FSF, otherwise they won't let you send any patches. (you know, legally sign it, a tough world out there for wannabes like you). Derp.
I'm staying anonymous on this board so the connection (between my real identity with GCC) and here is not linked. Because then I wouldn't be able to call them freely retarded, as it could backfire. Use your brain, it helps. I think people are perfectly able to read what I post, no need to edit it and twist it based on what you're personally thinking. I said I spend less time programming, not LESS THAN. |
system error 31 Aug 2017, 15:18
Furs wrote: First, in the worst case it will use 64 bytes, however in the best case, it will use only 36. The first one uses at least 64 bytes, and that's its best case.
No, Furs. There's no worst-case/best-case scenario. Are you playing with statistical theory here? If you know basic stack programming, you can allocate exactly 36 bytes for both data on the stack. This is where you fail big time in interpreting low-level code. Your incompetence is exposed wide open by the way you described and interpreted it, no matter how hard you try to hide it from people with your "impression". Caught you red-handed so many times, pal!
And what's the chance that you would call Tomasz Grysztar a retard like you do to those brilliant GCC maintainers? Judging from the way you were "teaching" him about those "asm thingies" in other threads, I guess it's just a matter of time before you show the same attitude towards the FASM/FASMG maintainer as well.
Furs 31 Aug 2017, 15:49
Huh? 36 is best case when it doesn't require realignment. 64 is when it requires realignment. You can't control that, because you can't control the incoming alignment. Otherwise, you wouldn't need to realign the stack, so what the hell is your point?
Anyway, back to the serious topic (and no, not at you, feel free to ignore, I don't care). There's another problem I discovered with GCC. When it saves the XMM registers on the stack (the bullshit x64 MS ABI being the only one in existence requiring so, no other ABI (x64 or 32-bit) needs it, I fucking hate it)... it emits shit code like (translated from AT&T):
Code:
    push rbp
    mov rbp, rsp
    sub rsp, 64        ; space for XMM regs saved (4 of them in this case)
    and rsp, -32       ; realign stack to 32 bytes (yes, regs are saved unaligned!)
    sub rsp, 32        ; stack frame (random stuff, one AVX2 vector in this case)
This seems stupid, right? Well, maybe it's a missed optimization, I thought, which is easily fixable with an extra pass in the gen_prologue insn hook. So I thought, why can't it be like this:
Code:
    push rbp
    mov rbp, rsp
    sub rsp, 96        ; space for XMM regs saved and stack frame
    and rsp, -32       ; realign stack to 32 bytes
Code:
    /* The computation of the size of the re-aligned stack frame means
       that we must allocate the size of the register save area before
       performing the actual alignment. Otherwise we cannot guarantee
       that there's enough storage above the realignment point. */
Last edited by Furs on 31 Aug 2017, 15:51; edited 1 time in total
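Editor's note: the two prologues above can be compared with a small arithmetic model. This is only a sketch in Python, not GCC's logic: the function names, the starting address, and the default sizes (64-byte save area, 32-byte frame, 32-byte alignment, taken from the example) are assumptions made for illustration. It models nothing about SEH unwind data, which is the open question in the thread.

```python
MASK64 = (1 << 64) - 1  # model 64-bit register wraparound


def gcc_prologue(rsp_after_push, save=64, frame=32, align=32):
    """Model: mov rbp,rsp / sub rsp,save / and rsp,-align / sub rsp,frame."""
    rbp = rsp_after_push
    rsp = (rbp - save) & MASK64
    rsp &= (-align) & MASK64
    rsp = (rsp - frame) & MASK64
    return rbp, rsp


def merged_prologue(rsp_after_push, save=64, frame=32, align=32):
    """Model: mov rbp,rsp / sub rsp,save+frame / and rsp,-align."""
    rbp = rsp_after_push
    rsp = (rbp - (save + frame)) & MASK64
    rsp &= (-align) & MASK64
    return rbp, rsp


def check_all_alignments(base=0x7FFF_0000_1000):
    """Try every incoming misalignment from 0 to 63 bytes (hypothetical base)."""
    for offset in range(64):
        entry = base + offset
        rbp_a, rsp_a = gcc_prologue(entry)
        rbp_b, rsp_b = merged_prologue(entry)
        assert rsp_a == rsp_b            # both sequences end at the same rsp
        assert rsp_a % 32 == 0           # frame is 32-byte aligned
        assert rbp_a - 64 >= rsp_a + 32  # rbp-relative save area sits above the frame
    return True
```

Under these assumptions the two sequences always end at the same rsp, because the frame size (32) is a multiple of the alignment, so subtracting it before or after the `and` makes no difference. That matches the intuition in the post, but it says nothing about why GCC keeps them separate.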
system error 31 Aug 2017, 15:50
After reviewing Furs' "expert" comment on the first page, I am pretty sure that Furs is trying to apply some sort of statistical theorem (best case, worst case etc.) against an innocent and naive stack. This is unprecedented and a true discovery indeed! I am crying in joy!
revolution 31 Aug 2017, 20:12
Furs wrote:
Furs 31 Aug 2017, 20:40
Well I forgot to mention, they use rbp to save the XMM regs (that's what I meant by "unaligned") -- i.e. they're relative to the saved frame pointer, not the "aligned" stack pointer (rbp's value is before any alignment, obviously).
Since rbp's value does not change, the comment doesn't make much sense to me to be honest. I mean, that comment is in the same "if block" and its sub (the XMM save reg size, 64 in my example) is exactly before the alignment operation emitted. I mean, why does it have to be enough storage *above* the realignment spot? It's as if they treat the "frame" (i.e. below the alignment spot) and the "stuff above the frame" (saved regs, etc) as totally different memory pages that can't coexist together and are in totally separate parts of memory. What the hell, lol. (this is even from the x86-specific hooks, so it's machine-specific, not generic code to cater to other machines also) |
revolution 31 Aug 2017, 20:45
Is using RBP and unaligned stores necessary for the stack unwinding to be correct? I'd guess that the debugger expects certain things in certain places. But even so I still don't see any reason to adjust RSP twice.
Furs 31 Aug 2017, 21:23
Hmm now that you mention it, it might be related to the stupid way x64 Windows does SEH (exception handling), does it even need to restore the registers when doing unwinding? However it still doesn't make sense since that space, well, exists even if the subtraction is done for the frame. (the XMM registers would be placed at the exact same offsets).
But since x64 SEH is so wacky it might be that it probably needs to keep a "record" of saved registers at exact instruction boundaries, and it has to do this before allocating the frame (or something like that). If the tables describe the frame allocation as "after" the saved registers (in the instruction stream), then the instructions have to be that way. I've never implemented it manually, only seen GCC use some ".cfi" directives to implement it, so I don't know exact specifics.
The thing is, GCC doesn't even support SEH with MinGW (at least not with mine), only seems to support it when built via Cygmin. And I am 100% sure I disabled exceptions (because if I need them, I'll make it manually) since it doesn't emit any .cfi directives for this test-case. But they probably just use a generic version without optimizing for specific settings/cases, to "keep the compiler simpler", bleh.
Thankfully GCC supports plugins and the gen_prologue is an insn hook so I can redirect it to my function in the plugin without patching GCC for personal use, cause I'm sure they wouldn't accept a patch that "complicates the compiler" unless it provides "obvious benefits" (benefit like idk, not using SEH when it's fucking disabled). [see, this is why I hate 2 of the maintainers, the only cool guy there in this respect seems to be Uros]
system error 01 Sep 2017, 03:39
Why don't u smarty pants leave GCC alone. There must be some valid reason why they do it that way. Reasons that are beyond your shallow understanding of how GCC works. They could place some specific registers via RBP or things that we don't know. For example, GCC interrupt handlers are all callee-saved. And as I recall it, the volatiles must reside exactly at function prologue prior to any locals. They're not used to save the XMM or YMM registers. So WTF they need alignment for saving volatiles?
I am not interested in GCC's internals so I may be wrong, but GCC people are not stupid. They are just as good or even better at machine instructions than some "master" I know xD Why don't you people help Tomasz improve FASM's internals instead of GCC? Hmmm.?? hmmmmm????
Furs 01 Sep 2017, 14:24
system error wrote: And as I recall it, the volatiles must reside exactly at function prologue prior to any locals. They're not used to save the XMM or YMM registers. So WTF they need alignment for saving volatiles?
(MS ABI, being "old", only requires the XMM regs to be saved, as in the 128-bit lower part, since that's what existed when it was designed; this is why any ABI reliant on piece of shit SSE is garbage and not future-proof; Linux x64 ABI has no problems at all here)
I know you were totally clueless about it (when we argued about MS ABI long ago and you proved you don't know a single thing)... but man, after all this time, you still don't know the ABI. No hope for you I guess. I'm seriously starting to think you're a chatbot who learns to parrot from random posts online.
Here's the "full" prologue of the example (GCC generated):
Code:
    push rbp
    mov rbp, rsp
    sub rsp, 64
    and rsp, -32
    sub rsp, 32
    vmovups [rbp-64], xmm6
    vmovups [rbp-48], xmm7
    vmovups [rbp-32], xmm8
    vmovups [rbp-16], xmm9
system error wrote: but GCC people are not stupid.
Also, stop telling me what to do. I don't care, especially coming from you. And why would I help improve FASM? FASM is a good enough tool for me. I'm not happy with the obvious crap code produced by GCC in some cases. FASM, on the other hand, is perfectly fine. I do it out of "necessity".
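Editor's note: a small sketch of why the rbp-relative saves in this prologue use unaligned moves when the incoming stack alignment cannot be assumed. The offsets come from the listing above; the function names and example addresses are hypothetical. It assumes the usual x64 rule that a conforming caller has rsp at a multiple of 16 just before the call, so rsp is 8 past a 16-byte boundary at function entry.

```python
# rbp-relative save slots, copied from the prologue in the post
SAVE_OFFSETS = {"xmm6": -64, "xmm7": -48, "xmm8": -32, "xmm9": -16}


def rbp_for_entry_rsp(entry_rsp):
    """rbp after `push rbp` / `mov rbp, rsp`, given rsp at function entry."""
    return entry_rsp - 8


def slot_addresses(rbp):
    """Address of each XMM save slot for a given rbp."""
    return {reg: rbp + off for reg, off in SAVE_OFFSETS.items()}


def slots_are_16_byte_aligned(rbp):
    """True if every slot could be written with an aligned 16-byte store."""
    return all(addr % 16 == 0 for addr in slot_addresses(rbp).values())


# Conforming caller: entry rsp is 8 mod 16, so rbp is 16-byte aligned.
conforming = slots_are_16_byte_aligned(rbp_for_entry_rsp(0x1008))

# Caller that broke the alignment rule: slots land 8 bytes off a boundary.
misaligned = slots_are_16_byte_aligned(rbp_for_entry_rsp(0x1000))
```

Because the slots are addressed from rbp, which is captured before the `and rsp, -32`, their alignment depends entirely on the caller. With no guarantee about the caller, only unaligned stores such as `vmovups` are safe for them.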
revolution 01 Sep 2017, 15:49
Furs wrote: ... (GCC generated):
Code:
    ;...
    sub rsp, 64+32
    and rsp, -32
    ;...
system error 01 Sep 2017, 16:57
Quote: ??? XMM6+ are required to be saved due to MS x64 ABI. In fact, that's the entire reason for Intel adding the "vzeroupper" crap and complicating themselves with it -- instead of just doing it, by default, zero-extension when they added ymm registers.
There is no requirement to save the XMM registers in 64-bit ABI. Your INCOMPETENCY is showing right from your first sentence! Just because they are volatiles / non-volatiles, that doesn't mean it is a requirement to save their states. You really are a genuine INCOMPETENT, aren't you? xD
Code:
    vmovups [rbp-64], xmm6
    vmovups [rbp-48], xmm7
    vmovups [rbp-32], xmm8
    vmovups [rbp-16], xmm9
Also btw, MS calls these regs NON volatile.... (garbage)
MS 64-bit ABI does not involve the AVX instruction set, you dumbfcuk! You are so caught this time, with your pants down! hahahaha xD
revolution 01 Sep 2017, 17:03
system error wrote: ... you dumbfcuk! |
Furs 01 Sep 2017, 17:21
system error wrote: There is no requirement to save the XMM registers in 64-bit ABI.
Microsoft wrote: Integer arguments are passed in registers RCX, RDX, R8, and R9. Floating point arguments are passed in XMM0L, XMM1L, XMM2L, and XMM3L. 16-byte arguments are passed by reference. Parameter passing is described in detail in Parameter Passing. In addition to these registers, RAX, R10, R11, XMM4, and XMM5 are considered volatile. All other registers are non-volatile.
Microsoft wrote: XMM6:XMM15, YMM6:YMM15 Nonvolatile (XMM), Volatile (upper half of YMM)
Furthermore, GCC saves them if they are used, so no matter what you say, it does it. I thought you said they aren't idiots...? So they must have a reason for saving them then? Either way, you shot yourself in the foot with this one. (in the example it only saved XMM6-XMM9 cause those were the only ones used in the function)
@revolution: Yeah, I think it's 2 cases here:
1) GCC insists that the top of the stack frame be aligned to the same alignment as [rsp]. I don't know why, maybe because of stupid machines where the stack grows upward (since the thing that does this is in reload, not in the x86 prologue stuff -- the sub rsp, 64 in my first case, 64 comes from LRA/reload which is a "generic" register allocator for all machines, using hooks for specific regs)
2) It's something related to x64 SEH. I mean, they *on purpose* save all the XMM regs before the alignment using rbp, for some reason. It emits "unaligned" moves, because I told it to not assume anything about the stack. So it could have a non-16-byte boundary on entry (for example, Wine needs this in Windows API functions, because some applications break the ABI and don't align the stack before an API call, can't blame them since the ABI sucks -- apparently Windows is also tolerant, or not using SSE, or realigns the stack maybe just to be compatible with those apps, since they work).
So it emits unaligned moves because it insists on putting them "up there" instead of addressing them from rsp (which is why the two subs must be there for some stupid reason). Of course this doesn't answer why the two subs can't be combined -- I think it's either SEH, or they just don't care.
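Editor's note: the register classification in the Microsoft text quoted above can be written out as a short sketch. The sets mirror the quote (XMM0 to XMM5 volatile, XMM6 to XMM15 nonvolatile); the function names are hypothetical, and the 16 bytes per register reflects only the low 128-bit part that the quote marks as nonvolatile.

```python
# Classification taken from the quoted Microsoft text
VOLATILE_XMM = {f"xmm{i}" for i in range(0, 6)}      # XMM0-XMM5
NONVOLATILE_XMM = {f"xmm{i}" for i in range(6, 16)}  # XMM6-XMM15


def callee_saved_xmm(used):
    """Registers a function must preserve, given the XMM registers it writes."""
    saved = [reg for reg in used if reg in NONVOLATILE_XMM]
    return sorted(saved, key=lambda reg: int(reg[3:]))


def save_area_bytes(used):
    """Size of the save area: 16 bytes (low 128 bits) per preserved register."""
    return 16 * len(callee_saved_xmm(used))


# The example function in the thread used XMM6-XMM9 (plus, say, XMM0).
example = {"xmm0", "xmm6", "xmm7", "xmm8", "xmm9"}
to_save = callee_saved_xmm(example)
area = save_area_bytes(example)
```

For the thread's example this gives four registers and a 64-byte save area, which lines up with the `sub rsp, 64` and the four `vmovups` stores in the GCC output.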
system error 01 Sep 2017, 18:21
@Furs
1. I suspect that you've been reading the manual upside down. No wonder....hahahaha xD
Non-volatiles simply means the API makes no use of them! They are saved on an as-needed basis only. It is not compulsory or a requirement of the ABI. These are the USER's own registers for the USER's own use and responsibility. The ABIs don't give a fcuk what, when and how you wanna use them xD
Volatiles = clobbered / scratch registers. It is UP TO YOU whether you want to save them or not before calling an API. It is not an MS 64-bit requirement! The API simply doesn't care. APIs use them as they please.
So in both cases, preserving them is never part of the MS 64-bit requirements. It is up to the users, IF such requirements exist. You are confusing ABI requirements vs user's requirements! See, this is how you should read technical documentation xD
3. And NO. MS 64-bit ABI does not involve any AVX instructions like VMOVUPS because Windows runs on non-AVX computers too!
Please stop embarrassing yourself, Furs! Is this too technical for you??? I bet it is! hahaha xD
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.