flat assembler
Message board for the users of flat assembler.

X86 Stack Alignment Problem

fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie
Hello people. I'm just a hobbyist in asm and FASM, but I'd love to learn more.

My confusion is about the stack architecture of x86. I read the x86 manuals on stack alignment and misalignment, but it is still not very clear to me.

A stack is either 16 or 32 bits wide. That width also defines the boundary, and hence the addresses. Let's assume the stack is 16 bits wide. If I PUSH a byte onto the stack, say PUSH myByte, it goes into one 16-bit slot of the stack segment, and a misalignment occurs. My questions:

1. If myByte occupies the LSB half of that slot, what happens to the MSB half at that address? Is it padded with zeros?

2. If it is not padded with zeros, can the next PUSH myWord claim the remaining (MSB) half of myByte's slot, with the rest of myWord stored at the next address?

To rephrase: are pushed items of different sizes packed together, or does each one claim a whole stack slot (16 bits in this case) for itself, which clearly wastes space for smaller data?

3. In a PUSH instruction, is SP decremented by the item's size or by the stack width? And in, for example, SUB SP, 13, what does the '13' refer to? Is it the total number of bytes allocated for locals, or 13 stack slots, which would no longer mean bytes if each slot is 16 bits (2 bytes) wide?

4. Are there any FASM directives that can be used to optimize stack alignment?

Pardon my poor English. If these are the wrong questions to ask, I could use a more elaborate explanation, because I'm really confused.

Thanks.
Post 13 Mar 2011, 22:22
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler
How do you push a byte?

[e]sp is nothing more than a pointer. You're overthinking it.

What manual chapter and section are you referring to?
Post 13 Mar 2011, 22:40
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie
Tyler wrote:
How do you push a byte?

[e]sp is nothing more than a pointer. You're overthinking it.

What manual chapter and section are you referring to?



Hi Tyler. Thanks for the reply.

1. Here is how I see the 16-bit wide stack segment

|--------|--------|
|--------|--------|
|--------|--------|
|--------|--------|
MSB<----->LSB, just for reference


If I push two items onto it:

PUSH myByte ; push a byte, it becomes

|--------|myByte|
|--------|--------|
|--------|--------|
|--------|--------|
|--------|--------|

PUSH myWord ; push a word. What is this going to look like?

Is it like this:

|myWord|myByte|
|--------|myWord|
|--------|--------|
|--------|--------|

or like this:

|--------|myByte|
|myWord|myWord|
|--------|--------|
|--------|--------|


2. I referred to the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1, Chapter 6, Section 6.2.2, under "Stack Alignment".

Intel wrote:
6.2.2 Stack Alignment
The stack pointer for a stack segment should be aligned on 16-bit (word) or 32-bit
(double-word) boundaries, depending on the width of the stack segment. The D flag
in the segment descriptor for the current code segment sets the stack-segment width
(see “Segment Descriptors” in Chapter 3, “Protected-Mode Memory Management,” of
the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A).
The PUSH and POP instructions use the D flag to determine how much to decrement
or increment the stack pointer on a push or pop operation, respectively. When the
stack width is 16 bits, the stack pointer is incremented or decremented in 16-bit
increments; when the width is 32 bits, the stack pointer is incremented or decremented
in 32-bit increments.
Pushing a 16-bit value onto a 32-bit wide stack can
result in stack misaligned (that is, the stack pointer is not aligned on a doubleword
boundary). One exception to this rule is when the contents of a segment register (a
16-bit segment selector) are pushed onto a 32-bit wide stack. Here, the processor
automatically aligns the stack pointer to the next 32-bit boundary.
The processor does not check stack pointer alignment. It is the responsibility of the
programs, tasks, and system procedures running on the processor to maintain
proper alignment of stack pointers. Misaligning a stack pointer can cause serious
performance degradation and in some instances program failures.


The bold part of the quote also relates to my question #3, which I believe has something to do with how SP gets incremented and decremented, as in sub sp, 13.

Thanks.
Post 14 Mar 2011, 01:03
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie
5. If we have a double-precision value (8 bytes) to push onto a 32-bit wide stack, is it true that the CPU needs two memory read cycles, since it has to read such data from two memory addresses? Is this related to "stack misalignment", and does it affect performance? For example:

PUSH myDouble ;8-byte double precision on 32-bit wide stack

|myDouble|myDouble|myDouble|myDouble| address n
|myDouble|myDouble|myDouble|myDouble| address n-1
|----------|-----------|----------|----------|
|----------|-----------|----------|----------|
|----------|-----------|----------|----------|

Thanks in advance.
Post 14 Mar 2011, 01:37
revo1ution



Joined: 04 Mar 2010
Posts: 34
Location: somewhere, twiddling something
revo1ution
fasmnewbie, there is no X86 instruction to push a byte.

You must push a word, doubleword etc. - always an even number of bytes.

Then, if SP is word-aligned (an even address), there is no misalignment problem.
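
For a byte variable in 32-bit code, the usual workaround is to widen the byte in a register and push the full dword. The following is only a minimal sketch of that idea. It assumes 32-bit Linux and fasm's ELF executable output, neither of which comes from this thread, and myByte with its value 0x41 is made up for illustration. The exit status is the number of bytes the push took from the stack.

```asm
format ELF executable 3
entry start

segment readable executable

start:
        mov     ebp, esp                ; remember ESP before the push
        movzx   eax, byte [myByte]      ; widen the byte to 32 bits, upper 24 bits become 0
        push    eax                     ; one dword slot: ESP = ESP - 4
        mov     ebx, ebp
        sub     ebx, esp                ; EBX = bytes the push consumed (4)
        mov     esp, ebp                ; discard the pushed dword
        mov     eax, 1                  ; sys_exit, status in EBX
        int     0x80

segment readable writeable

myByte  db 0x41                         ; hypothetical one-byte variable
```

So the upper bytes of the slot hold whatever was in the upper part of the register, which is zero here only because of the MOVZX. Items are not packed together: each push takes its own slot.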
Post 15 Mar 2011, 04:27
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen
fasmnewbie, it is hard to learn about x86 from the manuals. Get a debugger and start stepping your code to see what's happening.

I assume you're using a 32-bit version of Windows or *nix. That means your stack is 32 bits wide. At program startup the stack is aligned to 32 bits. In the source code, PUSH 0x12 actually means PUSH 0x00000012, and PUSH 0x1234 means PUSH 0x00001234, so the stack keeps its alignment. You can explicitly declare that you want to push a 16-bit constant, but that is rarely done. You can also PUSH AX, which misaligns the stack. Again, there is no point in pushing 16-bit registers onto a 32-bit stack in everyday programming. (You can PUSH WORD [EAX], too.)

Note that you can't do PUSH AL or push an 8-byte operand. These instructions do not exist on x86 (in "32-bit mode").
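
If you want to see this without a debugger, here is a small sketch that measures how far ESP moves. It assumes 32-bit Linux and fasm's ELF executable output (an assumption for the sake of a complete program, not something from this thread) and returns the number of stack bytes used as the exit status.

```asm
format ELF executable 3
entry start

segment readable executable

start:
        mov     ebp, esp        ; remember ESP before the pushes
        push    0x12            ; stored as the dword 0x00000012: ESP = ESP - 4
        push    ax              ; 16-bit push: ESP = ESP - 2, no longer dword-aligned
        mov     ebx, ebp
        sub     ebx, esp        ; EBX = 4 + 2 = 6 bytes used
        mov     esp, ebp        ; discard both items
        mov     eax, 1          ; sys_exit, status in EBX
        int     0x80
```

The exit status should be 6: SP moves by the operand size of each push, counted in bytes, and the 16-bit push is what leaves ESP off a dword boundary.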
Post 15 Mar 2011, 07:50
revo1ution



Joined: 04 Mar 2010
Posts: 34
Location: somewhere, twiddling something
revo1ution
MazeGen wrote:
Note that you can't do PUSH AL or push an 8-byte operand. These instructions do not exist on x86 (in "32-bit mode").

They don't exist in "16-bit mode", real or protected, either. They just don't exist ;)
Post 15 Mar 2011, 07:55
sinsi



Joined: 10 Aug 2007
Posts: 709
Location: Adelaide
sinsi
As far as the processor is concerned, the stack itself can be misaligned with no problems apart from the performance hit.*
Operating systems, on the other hand, prefer that the stack is aligned to the stack word size, and in some cases demand it (e.g. x64, where the calling conventions require 16-byte alignment at a call).

*Except for 'pop reg16' if SP=FFFF (et al).
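
When the stack does end up misaligned, the common fix is to mask the low bits of the stack pointer. A minimal sketch, again assuming 32-bit Linux and fasm's ELF executable output (my assumption, not from the posts above); the exit status is the low two bits of ESP after realigning.

```asm
format ELF executable 3
entry start

segment readable executable

start:
        mov     ebp, esp        ; saved so the original ESP can be restored
        push    ax              ; 16-bit push moves ESP by only 2 bytes
        and     esp, -4         ; round ESP down to a multiple of 4
        mov     ebx, esp
        and     ebx, 3          ; low two bits of ESP: 0 means dword-aligned
        mov     esp, ebp        ; restore the original stack pointer
        mov     eax, 1          ; sys_exit, status in EBX
        int     0x80
```

The exit status should be 0. Note that fasm's align directive pads code or data at assembly time and has no effect on ESP, so alignment of the stack has to be done with instructions like the AND above.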
Post 15 Mar 2011, 08:35
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen
revo1ution wrote:
MazeGen wrote:
Note that you can't do PUSH AL or push an 8-byte operand. These instructions do not exist on x86 (in "32-bit mode").

They don't exist in "16-bit mode", real or protected, either. They just don't exist ;)

Pushing an 8-byte operand does exist in 64-bit mode.
Post 15 Mar 2011, 08:45
sinsi



Joined: 10 Aug 2007
Posts: 709
Location: Adelaide
sinsi
>Push 8-byte operand exists in 64-bit mode.
Yes, as an immediate value, but it is sign-extended to the stack size.
You still can't push an 8-bit register, but then again why would you?
Post 15 Mar 2011, 08:55
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen
The (in "32-bit mode") part was there to prevent nitpicking like this. Is it really so unclear? I'm trying to simplify things for fasmnewbie. Let me correct myself:

Neither PUSH AL nor PUSH <8-byte operand> exists (for nitpickers: the 8-byte case does not apply to 64-bit mode).
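
For fasmnewbie's 8-byte case, 32-bit code simply uses two dword pushes. A minimal sketch, assuming 32-bit Linux and fasm's ELF executable output (assumptions of mine); myQword and its value are made up for illustration. The exit status is the top byte of the copy on the stack, to show that the layout is preserved.

```asm
format ELF executable 3
entry start

segment readable executable

start:
        push    dword [myQword+4]       ; high dword first, it ends up at the higher address
        push    dword [myQword]         ; low dword second, now at [esp]
        mov     ebx, [esp+4]            ; read back the high dword of the stack copy
        shr     ebx, 24                 ; keep its top byte (0x11) as the exit status
        add     esp, 8                  ; remove the 8-byte copy
        mov     eax, 1                  ; sys_exit, status in EBX
        int     0x80

segment readable writeable

myQword dq 0x1122334455667788           ; hypothetical 8-byte value
```

The exit status should be 17 (0x11). The two pushes move ESP by 8 in total, so a dword-aligned stack stays dword-aligned.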
Post 15 Mar 2011, 09:56
sinsi



Joined: 10 Aug 2007
Posts: 709
Location: Adelaide
sinsi
8-bit <> 8-byte
Sorry, what an idiot.

reading <> comprehension
Post 15 Mar 2011, 10:18
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie
revo1ution wrote:
fasmnewbie, there is no X86 instruction to push a byte.

You must push a word, doubleword etc. - always an even number of bytes.

Then, if SP is word-aligned (an even address), there is no misalignment problem.


LOL. Sorry for the idiocy. I gave people the wrong example. Thanks for the explanation.
Post 15 Mar 2011, 13:47
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie
MazeGen wrote:
fasmnewbie, it is hard to learn about x86 from the manuals. Get a debugger and start stepping your code to see what's happening.

I assume you're using a 32-bit version of Windows or *nix. That means your stack is 32 bits wide. At program startup the stack is aligned to 32 bits. In the source code, PUSH 0x12 actually means PUSH 0x00000012, and PUSH 0x1234 means PUSH 0x00001234, so the stack keeps its alignment. You can explicitly declare that you want to push a 16-bit constant, but that is rarely done. You can also PUSH AX, which misaligns the stack. Again, there is no point in pushing 16-bit registers onto a 32-bit stack in everyday programming. (You can PUSH WORD [EAX], too.)

Note that you can't do PUSH AL or push an 8-byte operand. These instructions do not exist on x86 (in "32-bit mode").


See, your explanation is much clearer than the manuals and my own 'test' code. Yes, I am on a 32-bit/64-bit AMD Turion on Windows :D

Only, not all data are doubleword-sized. If, let's say, I push 3 word-sized items onto the 32-bit wide stack, as much memory again is wasted on padding. I see your point that there is no practical reason to push a word onto a doubleword stack. But the inefficiency is still there when smaller data are pushed.

For example, what if my program makes heavy use of individual chars rather than strings? The chars would have to be pushed onto the stack anyway, and padding each 8-bit char out to a full stack slot seems wasteful.

Or did I get this wrong?

Thanks.
Post 15 Mar 2011, 14:13
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie
Or maybe my questions are a bit 'architectural' rather than practical.
Post 15 Mar 2011, 14:23
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen
fasmnewbie wrote:
Only, not all data are doubleword-sized. If, let's say, I push 3 word-sized items onto the 32-bit wide stack, as much memory again is wasted on padding. I see your point that there is no practical reason to push a word onto a doubleword stack. But the inefficiency is still there when smaller data are pushed.

For example, what if my program makes heavy use of individual chars rather than strings? The chars would have to be pushed onto the stack anyway, and padding each 8-bit char out to a full stack slot seems wasteful.

Or did I get this wrong?

You're right.

The thing is that PUSH is mostly used to pass function arguments on the stack. If you have 16-bit arguments, you can pass them as 16-bit values, but that is unusual, because widening them to 32 bits wastes only a few 16-bit chunks of memory. It is also unusual to misalign the stack at function entry, so with an odd number of 16-bit arguments a stack adjustment would be needed.

If you have 8-bit arguments, you would additionally have to pack them somehow, since you can't PUSH them directly. Well, you could do
Code:
SUB ESP, 1 ; reserve one byte (ESP is no longer dword-aligned)
MOV BYTE [ESP], <8-bit register/constant> ; store the value in it

Again, this would be a very unusual way to pass an 8-bit argument.

On the other hand, it is usual to allocate stack space with SUB ESP, x, for example for local (automatic) variables in a function prolog. But that's another story.
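
To give a rough idea of that other story, here is a minimal sketch of a prolog that needs 13 bytes of locals and rounds the amount up to 16, so that ESP stays dword-aligned. It assumes 32-bit Linux and fasm's ELF executable output; the names LOCALS_RAW, LOCALS_SIZE and demo are made up for illustration. The operand of SUB ESP is always a byte count, not a number of stack slots.

```asm
format ELF executable 3
entry start

LOCALS_RAW  = 13                            ; bytes of locals actually needed (hypothetical)
LOCALS_SIZE = ((LOCALS_RAW + 3) / 4) * 4    ; rounded up to a multiple of 4, here 16

segment readable executable

start:
        call    demo
        mov     ebx, eax                ; exit status = value returned by demo
        mov     eax, 1                  ; sys_exit
        int     0x80

demo:
        push    ebp                     ; prolog: save the caller's frame pointer
        mov     ebp, esp
        sub     esp, LOCALS_SIZE        ; reserve 16 bytes, [ebp-16] to [ebp-1]
        mov     byte [ebp-1], 'A'       ; a one-byte local
        mov     dword [ebp-8], 1234     ; a dword local
        movzx   eax, byte [ebp-1]       ; return the byte local, zero-extended
        mov     esp, ebp                ; epilog: release the locals
        pop     ebp
        ret
```

The exit status should be 65 (the code of 'A'). Byte-sized locals can sit next to each other inside the reserved block, so here nothing forces one slot per char; only the total is rounded up.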
Post 15 Mar 2011, 14:54
Madis731



Joined: 25 Sep 2003
Posts: 2140
Location: Estonia
Madis731
@fasmnewbie: If you're thinking about pushing "Hello World!" onto the stack, then quickly forget that idea :)
The most efficient way to use the stack is to push the address of the string, not the string itself character by character (push "H" "e" "l" ... "d" "!").
Code:
push dword myHelloSample ; push the address of the string, not its characters

; using it later in the code:
mov eax,[esp] ; if you want the address to remain on the stack
; or
pop  eax ; finally balance the stack

myHelloSample db "Hello World!",0
Post 16 Mar 2011, 20:05
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie
MazeGen wrote:
fasmnewbie wrote:
Only, not all data are doubleword-sized. If, let's say, I push 3 word-sized items onto the 32-bit wide stack, as much memory again is wasted on padding. I see your point that there is no practical reason to push a word onto a doubleword stack. But the inefficiency is still there when smaller data are pushed.

For example, what if my program makes heavy use of individual chars rather than strings? The chars would have to be pushed onto the stack anyway, and padding each 8-bit char out to a full stack slot seems wasteful.

Or did I get this wrong?

You're right.

The thing is that PUSH is mostly used to pass function arguments on the stack. If you have 16-bit arguments, you can pass them as 16-bit values, but that is unusual, because widening them to 32 bits wastes only a few 16-bit chunks of memory. It is also unusual to misalign the stack at function entry, so with an odd number of 16-bit arguments a stack adjustment would be needed.

If you have 8-bit arguments, you would additionally have to pack them somehow, since you can't PUSH them directly. Well, you could do
Code:
SUB ESP, 1 ; reserve one byte (ESP is no longer dword-aligned)
MOV BYTE [ESP], <8-bit register/constant> ; store the value in it

Again, this would be a very unusual way to pass an 8-bit argument.

On the other hand, it is usual to allocate stack space with SUB ESP, x, for example for local (automatic) variables in a function prolog. But that's another story.


Thank you for the explanation.
Post 26 Mar 2011, 08:50
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie
Madis731 wrote:
@fasmnewbie: If you're thinking about pushing "Hello World!" onto the stack, then quickly forget that idea :)
The most efficient way to use the stack is to push the address of the string, not the string itself character by character (push "H" "e" "l" ... "d" "!").
Code:
push dword myHelloSample ; push the address of the string, not its characters

; using it later in the code:
mov eax,[esp] ; if you want the address to remain on the stack
; or
pop  eax ; finally balance the stack

myHelloSample db "Hello World!",0


LOL. I still can't get rid of that 'C++ linked-list' hangover.
Thank you.
Post 26 Mar 2011, 08:56
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
