flat assembler
Message board for the users of flat assembler.
Linux > Linux ABI stack alignment
Furs 13 Apr 2017, 17:40
fasmnewbie wrote: Who needs the ABI/API convention compliance if I can always use "and rsp,-16" in my code? All alignment issues can be solved using and rsp,-16/-32/-64 regardless of calling conventions.

You're missing the point of a calling convention/ABI. It doesn't mandate what you have to use; it says what compliant functions use and expect (functions that follow it, such as library functions outside your control). It deals with the state of the stack, registers, etc. after the function call. It doesn't deal with YOUR code, it deals with functions that abide by it. Your code has to "set the state properly" for the functions it calls (i.e. register parameters, stack alignment, and so on). If you don't call any shared object library, or you use a library that doesn't use the AMD64 ABI, you do not HAVE to align the stack at all.

So what exactly is your issue here? You said it yourself: Linux doesn't align the stack to 16 bytes for shared objects. And that's true. Even libc aligns the stack for executables at the entry point, because it clearly thinks the kernel won't align it to 16 bytes. The fact that your executables were 16-byte aligned on entry was just luck/coincidence and you shouldn't rely on it.

What are you actually after, then? You simply need the stack aligned to 16 bytes at the point where you call an external library function that uses the AMD64 ABI. Syscalls don't need it, though. It doesn't matter how you end up with the stack aligned to 16 bytes. You can propagate the alignment across the entire call chain, so you won't have to use 'and rsp, -16' at all except at the entry point. Or you could just use 'and rsp, -16' at the one specific external function you call that uses the AMD64 ABI. It's up to you. Or you could never align the stack, if all you do is syscalls or whatever in your library/program.
fasmnewbie 13 Apr 2017, 17:48
Furs wrote: Does the kernel even have callbacks?

If I were to use and rsp,-16 as the magic instruction, why should I bother about the ABI in the first place? There's no point. I should just set up a proper stack frame in every routine and do the alignment from within. That becomes even more obvious when you are creating binaries (DLL, SO, LIB, etc.), because if Linux is this inconsistent in allocating stacks, then routines are better off, and much safer, completely disregarding the ABI and aligning manually from the inside, without relying on the inconsistent stack policy of the caller's environment. That literally means going back to standard call / 32-bit fastcall, minus the stack alignment, plus the complete stack frame setup.

Of course you can use and rsp,-16 in the caller prior to calling a routine, but then the caller needs to maintain its own stack frame, because it is not a leaf function. You can do away with that in "main", but not in other non-leaf functions. This thing is contagious. That is the very reason every ABI avoids "and rsp,-16", i.e. moving the stack. The OS needs to be consistent in maintaining an aligned stack ecosystem, or else people will have no option but to disregard it completely.
fasmnewbie 13 Apr 2017, 18:23
Furs wrote:
I know syscalls don't observe the alignment, as demonstrated in BASELIB. But there will be problems when you're mixing syscalls with other third-party things that DO observe the ABI, most notoriously C (glibc, gcc, libm), because they share the same toolchain. The problems get even worse if you consider the use of the ld linker because, yet again, ld does align the stack. If you're mixing them up, what alignment policy should you employ?

But there's a remedy if we can be sure that the kernel consistently allocates an aligned stack to everybody upon loading, because from that point on we can manually maintain an aligned stack ecosystem throughout, including for use with binaries such as SO. This is what my question is all about. Not about syscalls at all. It's about the Linux kernel's stack / memory allocation policy, because what puzzles me is that it does consistently allocate aligned heap memory for sys_brk. So why not the stack? Why two different policies over the same memory space?
Furs 13 Apr 2017, 19:19
fasmnewbie wrote: If I were to use and rsp,-16 as the magic instruction, why should I bother about the ABI in the first place? There's no point.

Does your code call into any external library function that uses the ABI? That's the important question. If not, then don't bother. If yes, then that function requires 16-byte alignment before you issue the 'call' instruction. However you get that 16-byte alignment is up to you (issue and rsp, -16, or propagate the alignment, etc.).

and rsp is not magic; it simply aligns the stack as the ABI requires. How is this any different from placing a value in a register that the ABI says is a parameter? Is the mov instruction that places the parameter also magic? Think about it. You issue those instructions because the respective function requires things to be that way (parameter in a specific register, stack aligned to 16 bytes, etc.). If you don't call such a function, would you place the non-existent parameter into the ABI register? Then why would you align the stack? Same thing.

fasmnewbie wrote: But there's a remedy if we can be sure that the kernel consistently allocates an aligned stack to everbody upon loading, because from that point on we can manually maintain an aligned stack ecosystem throughout, including for use with the binaries such as SO. This is what my question is all about.

And who said you can't do that if the kernel doesn't align it? Just do this:

Code:
    entry_point:
        and rsp, -16    ; rsp is now 16-byte aligned, DOESN'T MATTER what kernel does, you only do this ONCE at start

Why does the kernel matter so much? You only have to issue the and rsp once at the beginning.
fasmnewbie 14 Apr 2017, 08:00
Furs wrote: The answer is that the kernel does not guarantee 16-byte alignment upon "entry point"

Btw, you don't need stack alignment for libc. You do need alignment for libm.
fasmnewbie 14 Apr 2017, 08:06
Maybe you're having difficulty using gdb or finding the right tool to test this. Just download BASELIB here: https://board.flatassembler.net/topic.php?p=184548
Use base64x.asm to test the executable case (call dumpreg or stackview). Use the base6.so.1 / base6.o binaries for the linked-object case (call dumpreg or stackview). Tell me what the RSP value is during entry. Thanks for the confirmation efforts.
revolution 14 Apr 2017, 08:48
fasmnewbie wrote: Tell me what the RSP value is during entry.
fasmnewbie 14 Apr 2017, 08:58
The code, using base6.o + ld linker (Pay attention to RSP)

Code:
    ;ld prog.o base6.o -o prog
    format elf64
    public _start
    extrn dumpreg
    extrn exitx
    _start:
        call dumpreg
        call exitx

The output after 3 runs

Code:
    RAX|0000000000000000 RBX|0000000000000000 RCX|0000000000000000
    RDX|0000000000000000 RSI|0000000000000000 RDI|0000000000000000
    R8 |0000000000000000 R9 |0000000000000000 R10|0000000000000000
    R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000
    RSP|00007FFE474D7EF0 RIP|0000000000600078

    RAX|0000000000000000 RBX|0000000000000000 RCX|0000000000000000
    RDX|0000000000000000 RSI|0000000000000000 RDI|0000000000000000
    R8 |0000000000000000 R9 |0000000000000000 R10|0000000000000000
    R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000
    RSP|00007FFFAFC6A750 RIP|0000000000600078

    RAX|0000000000000000 RBX|0000000000000000 RCX|0000000000000000
    RDX|0000000000000000 RSI|0000000000000000 RDI|0000000000000000
    R8 |0000000000000000 R9 |0000000000000000 R10|0000000000000000
    R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000
    RSP|00007FFF57128050 RIP|0000000000600078

Linking with GCC

Code:
    ;gcc -m64 prog.o base6.o -o prog
    format elf64
    public main
    extrn dumpreg
    main:
        call dumpreg
        ret

Output after 3 runs

Code:
    RAX|0000000000601030 RBX|0000000000000000 RCX|0000000000000000
    RDX|00007FFF0ECA3DC8 RSI|00007FFF0ECA3DB8 RDI|0000000000000001
    R8 |0000000000400550 R9 |00007F83F2CA98E0 R10|0000000000000846
    R11|00007F83F28E9740 R12|00000000004003E0 R13|00007FFF0ECA3DB0
    R14|0000000000000000 R15|0000000000000000 RBP|00000000004004E0
    RSP|00007FFF0ECA3CD8 RIP|0000000000601030

    RAX|0000000000601030 RBX|0000000000000000 RCX|0000000000000000
    RDX|00007FFC1CE2CBF8 RSI|00007FFC1CE2CBE8 RDI|0000000000000001
    R8 |0000000000400550 R9 |00007FB3138418E0 R10|0000000000000846
    R11|00007FB313481740 R12|00000000004003E0 R13|00007FFC1CE2CBE0
    R14|0000000000000000 R15|0000000000000000 RBP|00000000004004E0
    RSP|00007FFC1CE2CB08 RIP|0000000000601030

    RAX|0000000000601030 RBX|0000000000000000 RCX|0000000000000000
    RDX|00007FFFBE128F08 RSI|00007FFFBE128EF8 RDI|0000000000000001
    R8 |0000000000400550 R9 |00007F9917F418E0 R10|0000000000000846
    R11|00007F9917B81740 R12|00000000004003E0 R13|00007FFFBE128EF0
    R14|0000000000000000 R15|0000000000000000 RBP|00000000004004E0
    RSP|00007FFFBE128E18 RIP|0000000000601030

Now using EXECUTABLE (base64x.asm)

Code:
    ;compile: fasm base64x.asm
    ;run: ./base64x
    format ELF64 executable 3
        call dumpreg
        call exitx

Output after 3 runs

Code:
    RAX|0000000000000000 RBX|0000000000000000 RCX|0000000000000000
    RDX|0000000000000000 RSI|0000000000000000 RDI|0000000000000000
    R8 |0000000000000000 R9 |0000000000000000 R10|0000000000000000
    R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000
    RSP|00007FFC996B1090 RIP|0000000000400078

    RAX|0000000000000000 RBX|0000000000000000 RCX|0000000000000000
    RDX|0000000000000000 RSI|0000000000000000 RDI|0000000000000000
    R8 |0000000000000000 R9 |0000000000000000 R10|0000000000000000
    R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000
    RSP|00007FFF05CD5F80 RIP|0000000000400078

    RAX|0000000000000000 RBX|0000000000000000 RCX|0000000000000000
    RDX|0000000000000000 RSI|0000000000000000 RDI|0000000000000000
    R8 |0000000000000000 R9 |0000000000000000 R10|0000000000000000
    R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000
    RSP|00007FFEA9DDB180 RIP|0000000000400078

Three different codes, 2 linkers. So based on the output of RSP, linux does consistently align the stack to 16 for EXECUTABLEs. But doesn't guarantee that with non-executable (probably because of the linkers, although "ld" does align it to 16).
My question: would this behaviour be any different on other distros / PCs? I have only one Linux machine right now, so I need help with confirmation from others. Thanks
revolution 14 Apr 2017, 09:02
What do you consider "aligned"? In Windows using fastcall, entering with 8 mod 16 is the correct value.
fasmnewbie wrote: So based on the output of RSP, linux does consistently align the stack to 16 for EXECUTABLEs
fasmnewbie 14 Apr 2017, 09:06
Here I included both base6.o and base64x.asm for your testing convenience. They are from BASELIB
Furs 14 Apr 2017, 11:20
fasmnewbie wrote: It does when it comes to ELF64 executable. It doesn't when it comes to format ELF64. Now we are going back to my original question in Page #1. Is this consistent behaviour across distro? My question is that easy.

It doesn't mean it is random. The behavior is consistent on one kernel version (I doubt the distro matters), but who says it will be the same in another kernel version? That is why the specification is important.

What you do is akin to hardcoding decisions, like what plagued old Windows games that won't run on Windows NT/XP+ because they relied on unguaranteed behavior of Windows 98: "hacks" that were simply never GUARANTEED by the Windows API specifications. They worked in Windows 98 because of how it happened to be implemented, but ANY implementation of the Windows API that adheres to the spec is VALID, so Windows XP isn't broken; it was the game that was.

Stop coding by what works on your machine and code by what is specified in the specification, because then you know it will work in the future on other machines, kernels, etc. In this case, the fact that libc realigns the stack on entry should tell you what the specification is. If libc thinks the kernel doesn't guarantee 16-byte alignment on entry, why would you?

fasmnewbie wrote: Btw, you don't need stack alignment for libc. You do need alignment for libm.

ANY C program (or C++ program) in Linux will link to libc, at least for the entry point. So any such program on Linux will align the stack on entry, because it uses libc's entry point as its entry, which in turn calls your "main" sometime later down the line. You understand that libc runs before your code, right? It hijacks the entry point in any HLL, even if you don't call a single function from it yourself.

But seriously, is one instruction at the beginning of the entire program going to bite you that hard? That's too much fuss over avoiding one simple instruction, executed ONCE at the start of your program.

Here's a tip, since you still don't get it: disassemble any Linux program written in C and check the entry point yourself. Tell me if you see

Code:
    and rsp, -16
fasmnewbie 14 Apr 2017, 12:18
revolution wrote: For one test on one OS. Sample size is too small to know if this behaviour is intended or just a coincidence.

Furs wrote: The behavior is consistent on one kernel version (I doubt distro matters), but who says it will be the same in another kernel version? That is why the specification is important.

Yes, I know that. It probably works only on my machine (repeat 5 times). So test it on your machine and give me the feedback. I've been asking the same question quite a few times in this thread, ever since post #1. Which part of "help me test it / confirm it on your machine" do you not understand? You two have reading comprehension difficulties?
revolution 14 Apr 2017, 13:23
Well, I think the point here is that testing/measuring this is not the proper approach. Yes, everyone here could run the code and post results. And even if one million people ran it and all got the same results, it still wouldn't tell you what you need to know. Because if it isn't part of the spec/docs, then whatever results you see now might change tomorrow in the next update.
|
fasmnewbie 14 Apr 2017, 13:53
revolution wrote: Well I think the point here is that testing/measuring this is not the proper approach. Yes, everyone here could run the code and post results. And even if one million people all run it all gave the same results it still won't tell you what you need to know. Because if it isn't part of the spec/docs then whatever results you see now might change tomorrow in the next update.

That means you're accusing Linux of not being reliably consistent, though. That's a scary thought, because it touches part of Linux's memory allocation policy. Undocumented, yes, but inconsistent? I don't know... and rsp,-16 is like sending an offending line to Linus saying, "hey bro, we are not confident here! That's why we use "and rsp", just to make sure we are safe from your inconsistency!" I don't know.
Furs 14 Apr 2017, 13:54
fasmnewbie wrote: I've been asking the same question for quite a few times in this thread even from post #1. Which part of "help me test it / confirm it on your machine" that you do not understand?

fasmnewbie wrote: I have no way of telling whether this is intended by design, by random or unattended behavior since I have only one Linux machine to run my tests.

More importantly, what exactly do you wish to accomplish here, even with testing? Even assuming there's no specification, why not just go the safe route and align the stack yourself with one instruction?

Another thing to note: since the kernel is open-source, anyone can modify it, so what now? Of course, if the modification adheres to the specification then it is still a Linux kernel, just not an official one. If it doesn't, then it's no longer a Linux kernel but a derivative kernel.

FYI, my current kernel is 4.4.0-57-generic from Linux Mint (so it is patched, not official), and running your code always gave a 16-byte aligned RSP on this version. Happy? (I ran it like 50 times.) I'm not going to boot into the older kernels I have just to test this on them because you're too stubborn, sorry.
fasmnewbie 14 Apr 2017, 14:06
Furs wrote:
You talk garbage way too much and too far off.
revolution 14 Apr 2017, 14:17
fasmnewbie wrote:
And, yes, "we are not confident here". And until we find a canonical spec that tells us what is to be expected, we can never be confident. There is nothing wrong with defensive programming. It doesn't mean you are offending someone. No one is going to read it and feel slighted.
Furs 14 Apr 2017, 14:33
This guy is a lost cause.
Linus would probably find your entire tirade absurd unless you can actually show where he said he guarantees the stack is aligned to 16 bytes on entry. Testing something shows inconsistency in design. Proper design does not need tests because it is part of the design. If you do tests, it shows you are *UNSURE* of the design, which is beyond dumb, so please don't talk in Linus' name; he's not a crappy programmer as you imply. If something is not specified, assume it isn't guaranteed (i.e. if no alignment is specified, then don't assume any alignment of 16, etc.).

Just to show you why it is so dumb, here's the manual page for a Linux syscall called "uname". I picked this one because it is an easy example, since the API actually changed (the struct changed). Of course, Linux keeps backwards compatibility and renamed the old one to "old_uname" and so on, so old apps don't break. Internally, the old one has the exact same syscall number ID, while the new one is an entirely new API, but it has the same interface to the programmer (unless you code in asm, of course).

First, here is the spec: http://man7.org/linux/man-pages/man2/uname.2.html

Key parts:

Quote: The length of the arrays in a struct utsname is unspecified (see

Quote: The length of the fields in the struct varies. Some operating

Now sizeof(...) is a compile-time constant in C, which changes depending on the header files used when you compile the application (it uses a different "internal" uname call, even if it's called just "uname" in C sources). But what if it were another API function, something like "get_uname_size"? Would you still hardcode it just because it was a specific size on YOUR machine, or on the machines of others who tested it? Maybe it was like that on all machines, but how about in 10 years, hmm? Better not use the function, let's just hardcode everything based on "tests". Sounds like great programming practice. In fact, when the first uname was conceived, nobody knew there would be a different one in the future. Keep being shortsighted.

Things are left UNSPECIFIED on purpose to allow future changes and expansion as needed, i.e. to allow the designers freedom. Far from being a flaw, it means they are smart. Relying on testing is for people who want their code broken in the future. In fact, as you can see, they DELIBERATELY left the size UNSPECIFIED. What, are you going to say "but it IS specified, look, on my machine it is X bytes!!!"? You're missing the entire point of a specification.

Whatever, there's no point talking sense into someone who's so adamant about bad coding practice. Just don't hold your breath that anyone else will even answer your "Test" question. Hey, at least I did and ran your program. It's still something.

EDIT: this reminds me of people using "unspecified" or "undocumented" opcodes in asm, and then whining when they broke on newer CPUs. It's not the CPU design that's bad, it's you using "but on my tests it worked!" instructions. Same thing with ABIs/APIs.
fasmnewbie 14 Apr 2017, 15:44
revolution wrote: It doesn't mean I am accusing anyone of anything. All it means is that whatever you see now, might change tomorrow if it isn't part of the spec. If it is part of the spec then everything is fine, but so far the spec is in question. Consistency doesn't mean things like alignment will never change, it means that the spec is followed in a consistent manner. No spec means there is nothing to be consistent about.

Many parts of the Linux kernel are still undocumented too. But people still have great confidence in its consistency. One example is sys_brk / sbrk. The man page does not explicitly state whether the result will be aligned to a page boundary or not, but it is. Now, should we re-align the returned pointer to some aligned address just because it's not documented, or because we are applying 'defensive programming'? I don't know.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.