flat assembler
Message board for the users of flat assembler.

pop qword [rdi+rsp]
BAiC



Joined: 22 Mar 2011
Posts: 272
Location: California
BAiC 16 May 2011, 20:29
OK, so I'm writing an engine that generates data on the stack and then blind-writes it out to memory in one pass. While I worked around the problem by replacing
Code:
sub rdi, rsp
@@:
pop qword [rdi+rsp]
loop @b    

with
Code:
@@:
pop rax
stosq
loop @b    

I'd still like to determine why the first code doesn't work. The opcodes for the pop qword are the same regardless of the operand order (rdi+rsp or rsp+rdi).

while the pointers obviously differ in Segments (one is the stack whilst the other is DS), both Segments start at 0, so the initial RSP value should cancel out of the pointer arithmetic, leaving RDI as the base address, while the bytes added to RSP by each pop effectively increment RDI.
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4623
Location: Argentina
LocoDelAssembly 17 May 2011, 05:59
Does "pop qword [ds:rdi+rsp]" work? As for fasm producing exactly the same encoding for "rdi+rsp" and "rsp+rdi", that is because (besides the fact that fasm's algebra engine does not strictly follow your ordering) there is no encoding for RSP as an index (or is there?).
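
For reference, a hand-encoded sketch of why the ordering cannot matter. In the SIB byte the index value 100b means "no index", so RSP can only be the base and RDI has to be the index, whichever way the operand is written. The byte values in the comments are worked out by hand from the ModRM/SIB layout, not taken from fasm's output, so treat them as an assumption to check against a listing:
Code:
use64
; 8F /0 = POP r/m64 (operand size defaults to 64 bits in long mode)
; ModRM 04h = mod 00, reg 000 (/0), r/m 100 (SIB byte follows)
; SIB   3Ch = scale 00 (*1), index 111 (rdi), base 100 (rsp)
pop qword [rdi+rsp]     ; expected bytes: 8F 04 3C
pop qword [rsp+rdi]     ; same base/index assignment, so the same bytes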

When you tested your second code, did you make sure the direction flag was cleared? Can't see other sources of differences between the first and second codes.

BAiC 17 May 2011, 14:20
you obviously have my question wrong in your mind.

I didn't need the ds note, I had already expressed that issue in my original post.
Quote:
while the pointers obviously differ in Segments (one is the stack whilst the other is DS)

the second code snippet worked, so the Direction Flag issue is moot.
typedef



Joined: 25 Jul 2010
Posts: 2893
Location: 0x77760000
typedef 17 May 2011, 15:23
Don't you have to increment or decrement rdi?
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen 17 May 2011, 18:16
BAiC, the problem might be that POP with RSP base computes the effective address of the operand after it increments the RSP register. (Weird but true, PUSH and POP with rSP base are tricky).
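A hand trace of the first loop under that rule, as a sketch. D (the destination address originally in RDI) and S (RSP on entry) are names introduced here for illustration only:
Code:
; on entry: rdi = D, rsp = S, rcx = number of qwords (nonzero)
sub rdi, rsp            ; rdi = D - S, constant from here on
@@:
pop qword [rdi+rsp]     ; pass 1: loads [S], rsp becomes S+8,
                        ;   then stores to (D-S)+(S+8) = D+8, not D
                        ; pass 2: loads [S+8], rsp becomes S+16,
                        ;   then stores to (D-S)+(S+16) = D+16
loop @b

If that reading is right, every qword lands 8 bytes higher than intended: the slot at D is never written and the last store goes 8 bytes past the end of the buffer.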

LocoDelAssembly 17 May 2011, 19:36
Quote:

I didn't need the ds note, I had already expressed that issue in my original post.
Since you originally posted this in Compiler Internals as if it were a fasm bug, I had to make the comment just in case...

Quote:

the second code snippet worked, so the Direction Flag issue is moot.
Not if RDI is pointing to the end of the buffer, where having DF=1 would obviously give it a big advantage over your first code.

Anyway, the problem is probably what MazeGen says. I just assumed the processor always uses the RSP value from before incrementing/decrementing it, but it seems things are different when RSP is used for addressing (PUSH and POP even handle the situation differently from each other):

Intel's PUSH documentation wrote:
The PUSH ESP instruction pushes the value of the ESP register as it existed before the instruction was executed. Thus if a PUSH instruction uses a memory operand in which the ESP register is used for computing the operand address, the address of the operand is computed before the ESP register is decremented.
Intel's POP documentation wrote:
If the ESP register is used as a base register for addressing a destination operand in memory, the POP instruction computes the effective address of the operand after it increments the ESP register.
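Given the POP rule quoted above, one possible fix for the original loop (an untested sketch, not something posted in this thread) is to fold the extra 8 bytes into the displacement:
Code:
; assumes rdi = destination, rcx = number of qwords on the stack (nonzero)
sub rdi, rsp
@@:
pop qword [rdi+rsp-8]   ; the address uses the already incremented RSP,
                        ; so -8 puts the store back at the intended slot
loop @b

Subtracting 8 from RDI once before the loop (sub rdi, 8) should have the same effect and keeps the displacement byte out of the instruction.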

BAiC 18 May 2011, 08:54
typedef wrote:
Don't you have to increment or decrement rdi?

RSP is being incremented. It's a bit complicated (this is a somewhat advanced topic in general): the negated initial RSP that "sub rdi, rsp" folds into RDI (which stays constant within the loop) cancels only the initial RSP value out of the sum, leaving the original RDI plus the multiple of 8 that the pops have added to RSP so far.
MazeGen wrote:
BAiC, the problem might be that POP with RSP base computes the effective address of the operand after it increments the RSP register. (Weird but true, PUSH and POP with rSP base are tricky).

not "might".. this is it. Thanks MazeGen.
LocoDelAssembly wrote:
Since you originally posted this in Compiler Internals as if it were a fasm bug, I had to make the comment just in case...

I posted it to Compiler Internals because at the time operand-size override prefixes, segment override prefixes, and general instruction encoding were running through my head as possible causes. There is no section that deals with actual encodings more specifically than Compiler Internals.
LocoDelAssembly wrote:
Not if RDI is pointing to the end of the buffer, where having DF=1 would obviously give it a big advantage over your first code.

In the pop block the Direction Flag is irrelevant. As for the stosq code, what you're referring to would require changing the code that generates the stack data: the data that "pop rax" places into rax must be written starting at "low" memory and moving to increasing addresses. Using DF=1 would place the data into memory backwards.
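For the stosq variant, a sketch that makes the direction flag explicit instead of relying on its state on entry (the cld is an addition here, not part of the original snippet):
Code:
; assumes rdi = start of the destination, rcx = number of qwords on the stack (nonzero)
cld                     ; DF=0, so stosq advances rdi by 8 after each store
@@:
pop rax                 ; next qword from the stack
stosq                   ; store rax at [rdi], then rdi = rdi + 8
loop @b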

LocoDelAssembly 18 May 2011, 17:42
BAiC, first of all, I apologize for the DS confusion. I was a little biased about what you meant because of the [EBP+EBP] vs. [EBP*2] problem, in which fasm always picks the encoding of the former even though the two forms default to SS and DS respectively.

About the second code: yes, having DF=1 would reverse the data in your code, but if RDI were at the end of the buffer, it would still explain why STOSQ works while POP mem faults (e.g. in the case where the POP code fails after very few or zero iterations, as opposed to your real problem, which must be failing in the last iteration). Without knowing what MazeGen posted, and not being able to spot any error in either code, I just tried to find other explanations of why your second code didn't fault (even if they break your memory copy needs).
Copyright © 1999-2025, Tomasz Grysztar.