flat assembler
Message board for the users of flat assembler.
PUSH and POP
Tomasz Grysztar 25 Jan 2007, 10:34
There's a short paragraph about the size settings for PUSH at the end of section 1.2.6 of the manual.
DOS386 25 Jan 2007, 10:46
Quote: There's a short paragraph about the size settings
OK, there is one ...
Code: Immediate value as an operand for push instruction without a size operator is by default treated as a word value if assembler is in 16-bit mode and as a double word value if assembler is in 32-bit mode, shorter 8-bit form of this instruction is used if possible, word or dword size operator forces the push instruction to be generated in longer form for specified size. pushw and pushd mnemonics force assembler to generate 16-bit or 32-bit code without forcing it to use the longer form of instruction.
It DOES relate to my issue ... but I am still NOT much wiser from it
Quote:
Is it useful at all to generate mismatching code (16-bit when the other code is 32-bit and the CPU is in 32-bit PM, or 32-bit when the other code is 16-bit)? The text does NOT mention the amount of pushed data ... and also does not say anything on pushing 1 byte only
_________________
Bug Nr.: 12345 Title: Hello World program compiles to 100 KB !!! Status: Closed: NOT a Bug
MCD 25 Jan 2007, 11:34
NTOSKRNL_VXE wrote:
Well, if you code stuff like DOS extenders (flat/unreal mode) where you switch from 16-bit to 32-bit mode, for example
NTOSKRNL_VXE wrote:
Code:
use16
;not allowed
push/pop -2^31...-1-2^15 | 2^15..-1+2^31
pushw/popw -2^31...-1-2^15 | 2^15..-1+2^31
;BUT allowed, pushes/pops dword onto/from stack, 4 bytes in machine code
pushd/popd -2^31...-1-2^15 | 2^15..-1+2^31
;pushes/pops word onto/from stack, 2 bytes in machine code
push/pop -2^15...-1-2^7 | 2^7..-1+2^15
pushw/popw -2^15...-1-2^7 | 2^7..-1+2^15
;BUT this pushes/pops dword onto/from stack, 4 bytes in machine code
pushd/popd -2^15...-1-2^7 | 2^7..-1+2^15
;pushes/pops word onto/from stack, 1 byte in machine code
push/pop -2^7..-1+2^7
;BUT this pushes/pops word onto/from stack, 2 bytes in machine code
pushw/popw -2^7..-1+2^7
;BUT this pushes/pops dword onto/from stack, 4 bytes in machine code
pushd/popd -2^7..-1+2^7
use32
;pushes/pops dword onto/from stack, 4 bytes in machine code
push/pop -2^31...-1-2^15 | 2^15..-1+2^31
pushd/popd -2^31...-1-2^15 | 2^15..-1+2^31
;BUT not allowed
pushw/popw -2^31...-1-2^15 | 2^15..-1+2^31
;pushes/pops dword onto/from stack, 4 bytes in machine code
push/pop -2^15...-1-2^7 | 2^7..-1+2^15
pushd/popd -2^15...-1-2^7 | 2^7..-1+2^15
;BUT this pushes/pops word onto/from stack, 2 bytes in machine code
pushw/popw -2^15...-1-2^7 | 2^7..-1+2^15
;pushes/pops dword onto/from stack, 1 byte in machine code
push/pop -2^7..-1+2^7
;BUT this pushes/pops dword onto/from stack, 4 bytes in machine code
pushd/popd -2^7..-1+2^7
;BUT this pushes/pops word onto/from stack, 2 bytes in machine code
pushw/popw -2^7..-1+2^7
;sorry, no use64 yet
So you see that there is no push/pop instruction that operates on only 1 byte. You would have to do this with other instructions, like writing directly to the stack, which is problematic to achieve. Also I strongly advise against 1-byte pushes/pops because of stack misalignment issues and poor performance; they are very dangerous or completely prohibited in most modern OSes.
_________________
MCD - the inevitable return of the Mad Computer Doggy
Tomasz Grysztar 25 Jan 2007, 11:36
NTOSKRNL_VXE wrote: The text does NOT mention the amount of pushed data ... and also does not say anything on pushing 1 byte only
The size of the operand is the amount of pushed data as well (by the very definition of the PUSH operation). So it is either a word or a double word, as stated. No byte push is possible. The 16-bit variant of the instruction stores 16 bits on the stack, the 32-bit variant stores 32 bits. These two are the only options (not counting long mode here).
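As a rough illustration of how the manual paragraph quoted earlier in the thread and the point about operand size fit together, here is a small Python sketch. It is a model written for this discussion, not fasm's implementation: the function name `encode_push_imm` and its parameters are hypothetical, it covers only 16-bit and 32-bit modes, and it assumes the standard x86 encodings (`6A ib` for the short form, `68 iw/id` for the long form, `66` as the operand-size prefix).

```python
PREFIX_OPSIZE = 0x66  # operand-size override prefix


def encode_push_imm(value, mode_bits, mnemonic="push", size_op=None):
    """Model a push of an immediate (hypothetical helper, not fasm code).

    mode_bits: 16 or 32, standing in for use16 / use32
    mnemonic:  "push", "pushw" or "pushd"
    size_op:   None, "word" or "dword" (an explicit size operator)
    Returns (machine_code, bytes_stored_on_stack).
    """
    if mode_bits not in (16, 32):
        raise ValueError("only 16-bit and 32-bit modes are modeled")

    # The operand size decides how much data lands on the stack.
    if mnemonic == "pushw" or size_op == "word":
        op_bits = 16
    elif mnemonic == "pushd" or size_op == "dword":
        op_bits = 32
    else:
        op_bits = mode_bits

    mask = (1 << op_bits) - 1
    if not -(1 << (op_bits - 1)) <= value <= mask:
        raise ValueError("value does not fit the operand size")

    # Reinterpret the value as a signed number of op_bits width.
    unsigned = value & mask
    signed = unsigned - (1 << op_bits) if unsigned >> (op_bits - 1) else unsigned

    code = bytearray()
    if op_bits != mode_bits:
        code.append(PREFIX_OPSIZE)

    if size_op is None and -128 <= signed <= 127:
        # Short form: one immediate byte, sign-extended by the CPU.
        code += bytes([0x6A, signed & 0xFF])
    else:
        # Long form: full-width immediate (forced by a size operator).
        code.append(0x68)
        code += unsigned.to_bytes(op_bits // 8, "little")

    return bytes(code), op_bits // 8
```

Under these assumptions, `encode_push_imm(1, 32)` gives a 2-byte instruction that still stores 4 bytes on the stack, while `encode_push_imm(1, 32, mnemonic="pushw")` adds the prefix and stores 2 bytes. The encoding length and the amount of pushed data are independent of each other.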
MCD 25 Jan 2007, 12:06
Or to make it even more clear: "the shorter 8-bit form" is only usable for values from -128 to 127, and it means that the generated machine code contains only a single immediate byte, but this byte is nevertheless pushed as a word/dword by sign-extending it.
For even more details, refer to the Intel/AMD docs.
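The sign extension mentioned above can be sketched in a few lines of Python. This is only an arithmetic model of what the CPU does with the immediate byte; the helper name `sign_extend_imm8` is made up for the example.

```python
def sign_extend_imm8(imm8, operand_bits):
    """Return the word/dword value that ends up on the stack for one encoded byte."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must be a single byte")
    if operand_bits not in (16, 32):
        raise ValueError("operand size must be 16 or 32 bits")
    # Bit 7 of the byte is the sign; copy it into all the upper bits.
    signed = imm8 - 0x100 if imm8 & 0x80 else imm8
    return signed & ((1 << operand_bits) - 1)


print(hex(sign_extend_imm8(0x7F, 16)))  # 0x7f
print(hex(sign_extend_imm8(0x80, 16)))  # 0xff80
print(hex(sign_extend_imm8(0x80, 32)))  # 0xffffff80
```

So a single byte in the instruction stream always becomes a full word or dword on the stack, never a lone byte.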
Tomasz Grysztar 25 Jan 2007, 12:18
MCD wrote: Or to make it even more clear: "the shorter 8-bit form" is only usable for values from -128 to 127 (...)
This is a bit tricky. In 16-bit mode "push 65408" will also generate the short form. fasm follows the philosophy of assembly language being an abstract layer over machine code, which focuses on the functionality of the instruction. You write "push 65408" in 16-bit mode to push the 16-bit value on the stack, and choosing the nicest encoding for this instruction is then a task for the assembler.
DOS386 25 Jan 2007, 12:21
Thanks.
Quote: No byte push is possible.
OK. 8080 was the last supporting this
Quote: Also I strongly advise against 1-byte pushes/pops
Then MCD was wrong ... or meant "emulated" PUSH/POPing
Quote: Well, if you code stuff like DOS extenders (flat/unreal mode) where you switch from 16-bit to 32-bit mode, for example
That's what I do: I do have 16-bit and 32-bit blocks in the same executable, but I still don't see a reason for having 16-bit code in a 32-bit block or vice versa
Quote: The size of the operand is the amount of pushed data as well
OK, but this operand is in the source only and its size is NOT the size in the output code. It's confusing.
Code: Immediate value as an operand for push instruction without a size operator results by default in pushing 16 bits if assembler is in 16-bit mode and 32 bits if assembler is in 32-bit mode, shorter 8-bit form of this instruction is used if possible, word or dword size operator forces the push instruction to be generated in longer form for specified size and also to push this size. pushw and pushd mnemonics force assembler to generate instructions pushing 16 bits or 32 bits without forcing it to use the longer form of instruction. Pushing 1 byte only is not possible.
Maybe the paragraph should be fixed to something like this ^^^ ?
Tomasz Grysztar 25 Jan 2007, 12:24
NTOSKRNL_VXE wrote: OK, but this operand is in source only and its size is NOT the size in output code
As I said in the post above: when you write assembly, you usually focus on what the instruction does, not how it is encoded.
DOS386 25 Jan 2007, 12:37
Quote: This is a bit tricky.
YES.
Quote: In 16-bit mode "push 65408" will also generate the short form
OK, 65408 = $FF80, so push $FF80 could be encoded as push -$80, which uses the single immediate byte $80 and fits into 2 bytes; the CPU then extends $80 back to $FF80 ... and in "use32" even push $FFFFFF80 fits into 2 bytes
Quote:
but not always, see XOR EAX,EAX issue
Tomasz Grysztar 25 Jan 2007, 12:54
NTOSKRNL_VXE wrote: but not always, see XOR EAX,EAX issue
Yes, the optimization that changes the type of operation is left to be done by the programmer, especially when there's a real difference in operation (XOR changes the flags, MOV does not). However, for the given EXACT operation I recommend leaving it to the assembler to find the best form. That's what fasm is for. (Though here too there are some controversies, like whether the assembler should optimize LEA to MOV, etc., but that's another story...)
MCD 25 Jan 2007, 17:15
Tomasz Grysztar wrote:
I guess there is an analogous case with "push 4294967168" in 32bit mode
Tomasz Grysztar 25 Jan 2007, 17:28
MCD wrote: I guess there is an analogous case with "push 4294967168" in 32bit mode
Oh well, even with the 64-bit "push 18446744073709551488"
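The three values mentioned in the thread (65408, 4294967168 and 18446744073709551488) all follow the same arithmetic: each equals 2^n - 128 for its operand size, so it wraps around to -128 and can be rebuilt from one sign-extended byte. A minimal Python check, with the hypothetical helper name `fits_short_form`, might look like this:

```python
def fits_short_form(value, operand_bits):
    """True if value, taken modulo 2**operand_bits, equals some sign-extended byte."""
    modulus = 1 << operand_bits
    if not -(modulus // 2) <= value < modulus:
        raise ValueError("value does not fit the operand size")
    wrapped = value % modulus
    # Reinterpret the wrapped value as a signed number of operand_bits width.
    signed = wrapped - modulus if wrapped >= modulus // 2 else wrapped
    return -128 <= signed <= 127


print(fits_short_form(65408, 16))                 # True
print(fits_short_form(65408, 32))                 # False
print(fits_short_form(4294967168, 32))            # True
print(fits_short_form(18446744073709551488, 64))  # True
```

Note that the same number can qualify in one mode and not in another: 65408 fits the short form as a 16-bit operand but needs the long form as a 32-bit one.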
FrozenKnight 25 Jan 2007, 23:38
To push a 0 byte:
Code:
mov BYTE [esp-1], 0 ;used esp-1 to avoid address interlock
dec esp
To zero a register and push a byte (this code may be slow because it has an address interlock after esp is adjusted, so it will take a minimum of 2 cycles):
Code:
sub al, al
dec esp
mov BYTE [esp], al
I haven't speed-tested this method, but it should be faster than the previous code (on some processors it may be as fast as only one cycle):
Code:
mov BYTE [esp-1], 0
dec esp
sub al, al
To pop a byte:
Code:
mov al, BYTE [esp]
inc esp
While push/pop don't do the job, you can emulate the behavior. Note: use these after you do any preceding push/pops. Also note that, because of the stack adjustment, if you do this just before a call or ret then you may lose two cycles due to address misalignment and an address interlock. Also, to force a 16-bit push/pop use pushw/popw, and for 32-bit use pushd/popd.
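The emulated byte push and pop described above can be modeled with a toy stack in Python, which also makes the alignment side effect visible. This is a sketch only: the `ByteStack` class, its 64-byte size and its method names are assumptions made for the example, and it says nothing about timing.

```python
class ByteStack:
    """Toy model of a downward-growing stack (hypothetical, for illustration)."""

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.esp = size  # the stack pointer starts just past the last byte

    def push_byte(self, value):
        # Mirrors: mov BYTE [esp-1], value / dec esp
        self.mem[self.esp - 1] = value & 0xFF
        self.esp -= 1

    def pop_byte(self):
        # Mirrors: mov al, BYTE [esp] / inc esp
        value = self.mem[self.esp]
        self.esp += 1
        return value

    def is_aligned(self, width=4):
        # A 32-bit stack is normally kept on a 4-byte boundary.
        return self.esp % width == 0


stack = ByteStack()
stack.push_byte(0x41)
print(stack.esp, stack.is_aligned())  # 63 False
print(hex(stack.pop_byte()))          # 0x41
print(stack.esp, stack.is_aligned())  # 64 True
```

After a single byte push the stack pointer is no longer a multiple of 4, which is the misalignment the posts above warn about; it only returns to a boundary once every pushed byte has been popped again.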
MCD 26 Jan 2007, 05:13
@FrozenKnight: that's exactly what I meant earlier with push/pop of 1 byte. They may work on your own OS or in DOS, but very probably not in most modern OSes (stack misaligned fault or something), even if you put two 1-byte pushes immediately one after another. Performance considerations left aside.
DOS386 26 Jan 2007, 11:27
Thanks. PUSHing clarified.
Quote: lose two cycles due to address misalignment and an address interlock.
But what is this "address interlock"? You mentioned this serious (?) problem in your "Mersenne" thread
FrozenKnight 30 Jan 2007, 10:14
MCD, I've tested code similar to that in Windows XP (by pushing entire strings onto the stack without aligning). The only time you run into problems is if you don't keep track of how many bytes you pushed and make sure to pop them off correctly. However, you can lose performance from using such methods: because of buffers on modern processors, if you misalign the stack then the processor has to waste an extra cycle just to cache the rest of any addresses that are misaligned (so it's not always a good idea to do this). It also misaligned data in OllyDbg, making debugging much harder for me.
As for the question about address interlocks: an address interlock is where the processor has to wait a cycle because you just modified data in a register that you are about to use. So basically, instead of
Code:
mov eax, [edx]
add eax, 4
inc ecx
use
Code:
mov eax, [edx]
inc ecx
add eax, 4
Also, another optimization note: it's usually faster to use math operations over bit operations. So prefer
Code:
sub eax, eax
over
Code:
xor eax, eax
rugxulo 31 Jan 2007, 13:48
http://board.flatassembler.net/topic.php?t=4485&start=40
revolution wrote:
FrozenKnight 31 Jan 2007, 18:34
yes, but xor is a bit manipulation instruction so it can only run in the main pipe. sub can (if placed correctly) be called for free. so i guess both have advantages and disadvantages.
asmfan 31 Jan 2007, 19:27
[Intel® 64 and IA-32 Architectures Optimization Reference Manual]-248966.pdf
Quote: The Pentium 4 processor provides special support for XOR, SUB, and PXOR opera-
_________________
Any offers?
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.