flat assembler
Message board for the users of flat assembler.

64 bit why not do call dword [ebx-4] ?

Roman
Joined: 21 Apr 2012
Posts: 989
I am trying to rewrite my 32-bit example for 64-bit mode and get a fasm error:
call dword [ebx-4] is an illegal instruction!

How can I fix this?

I plan to do this:
Code:
sub rdx,rdx
mov edx,[ebx-4]
Call rdx
    

Or should I go through the stack (RSP)?
push dword [ebx-4]
Post 20 Dec 2020, 17:39

bitRAKE
Joined: 21 Jul 2003
Posts: 3067
Location: vpcmipstrm
Direct quote from AMD manual:
Quote:
No prefix is available to encode a 32-bit operand size in 64-bit mode.

_________________
¯\(°_o)/¯ unlicense.org
Post 20 Dec 2020, 18:32

revolution
When all else fails, read the source
Joined: 24 Aug 2004
Posts: 17876
Location: In your JS exploiting you and your system
Roman: You don't need the "sub rdx,rdx" instruction there. All writes to edx (and all 32-bit registers) will zero the upper 32 bits automatically.

Also we discussed push previously. In 64-bit mode you can't push 32-bit values. It simply can't be done.
Post 20 Dec 2020, 22:37
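
For the push case, a minimal sketch of the usual workaround, assuming the value at [rbx-4] should be treated as unsigned and that an 8-byte stack slot is acceptable. The choice of rax as scratch register is arbitrary and does not come from the thread:

```asm
use64
        ; 32-bit code was:  push dword [ebx-4]
        ; 64-bit mode has no 32-bit push, so load first, then push 64 bits.
        mov     eax, [rbx-4]    ; 32-bit load, upper 32 bits of rax become zero
        push    rax             ; stores 8 bytes and lowers rsp by 8
        ; ...
        pop     rax             ; eax holds the original 32-bit value again
```

Any code that later reads the slot has to account for it being 8 bytes wide rather than 4.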

Furs
Joined: 04 Mar 2016
Posts: 1564
Code:
mov edx,[rbx-4]    
Make sure the address stored there fits in 32 bits...
Post 21 Dec 2020, 14:06
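
Putting the replies together, a minimal sketch of a 64-bit replacement for "call dword [ebx-4]". It assumes x86-64 Linux and fasm's ELF64 executable format (the thread does not say which OS is targeted); the names target, table and table_end are made up for the example:

```asm
format ELF64 executable 3
entry start

segment readable executable

start:
        lea     rbx, [table_end]        ; rbx points just past the 32-bit slot
        mov     eax, [rbx-4]            ; load the 32-bit pointer, zero-extended into rax
        call    rax                     ; 64-bit indirect call through the register
        mov     edi, eax                ; exit status = value returned by target
        mov     eax, 60                 ; sys_exit on x86-64 Linux
        syscall

target:
        mov     eax, 42
        ret

segment readable writeable

table:
        dd      target                  ; only valid because target lies below 4 GiB
table_end:
```

When assembled with fasm and run, the process should exit with status 42. The "dd target" line is the part to be careful with: it silently relies on the code being loaded below 4 GiB, which is the condition Furs points out.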

Roman
revolution wrote:
Roman: You don't need the "sub rdx,rdx" instruction there. All writes to edx (and all 32-bit registers) will zero the upper 32 bits automatically.


Oh! I didn't know this!

I tested it and looked at the result in IDA Pro:
mov rax,-1
mov eax,0xff00ff00 ; this zeroes the high part of rax

Why is this needed? I mean the zeroing of the high part of rax.
It would be very handy to store some values in the high part.
And MOVSXD exists anyway.
Post 21 Dec 2020, 16:56
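
A small sketch contrasting the cases mentioned so far: a 32-bit write (zero-extends), 16-bit and 8-bit writes (leave the rest of the register alone), and MOVSXD (sign-extends). The register choices and constants are arbitrary; the last three lines show one possible way to keep a value in the high half, which then takes explicit code:

```asm
use64
        mov     rax, -1                 ; rax = 0xFFFFFFFFFFFFFFFF
        mov     eax, 0xFF00FF00         ; rax = 0x00000000FF00FF00 (upper half zeroed)

        mov     rdx, -1
        mov     dx, 0x1234              ; rdx = 0xFFFFFFFFFFFF1234 (16-bit write keeps the rest)
        mov     dl, 0x56                ; rdx = 0xFFFFFFFFFFFF1256 (8-bit write keeps the rest)

        mov     ecx, 0xFF00FF00
        movsxd  rcx, ecx                ; rcx = 0xFFFFFFFFFF00FF00 (sign-extended)

        mov     r8, 0x1122334400000000  ; value kept in the high half, low half empty
        mov     r9d, 0xAABBCCDD         ; new low half, zero-extended into r9
        or      r8, r9                  ; r8 = 0x11223344AABBCCDD
```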

revolution
Roman wrote:
Why is this needed? I mean the zeroing of the high part of rax.
Ask AMD. They decided it.
Post 22 Dec 2020, 09:23

Roman
Quote:
Ask AMD. They decided it.

Where was Intel? :)
Smoking in the bathroom?
Post 22 Dec 2020, 09:38

revolution
Roman wrote:
Where was Intel? :)
They were sleeping after creating the Itanium.
Post 22 Dec 2020, 09:40

Roman
Intel was in Italy and created the Itanium.
Post 22 Dec 2020, 09:47

Furs
Roman wrote:
Why is this needed?
So they can rename the register and avoid false dependencies. I've talked to you about register renaming before, but you wouldn't listen. ;)

Code:
mov rax, -1
; do some stuff
mov eax, 1234
; eax is now a different physical register, all computations on it are done in parallel to above    
If the upper 4 bytes were not cleared, the CPU wouldn't be able to parallelize it: the new value would still contain the old upper 4 bytes, so it would have to wait for whatever earlier code produced them in case you use them afterwards. It can't afford to create a wrong result.

Because it implicitly knows the upper bytes are 0, it can use a totally different register and do the calculations at the same time without waiting.
Post 22 Dec 2020, 14:31
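
A rough sketch of the difference Furs describes, contrasting a full 32-bit overwrite with a 16-bit partial write. The surrounding instructions and the use of rsi, rdi and edx are invented for illustration, and how a given CPU handles partial register writes varies by microarchitecture, so the comments describe the general idea only:

```asm
use64
        mov     rax, [rsi]              ; possibly slow load into rax
        imul    rax, rax                ; work that depends on that load
        mov     [rdi], rax

        mov     eax, 1234               ; full overwrite: nothing of the old rax survives
        add     eax, edx                ; can start before the load above has finished

        mov     ax, 1234                ; partial write: bits 16..63 must be kept, so the
                                        ; result has to be merged with the previous rax
```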

Roman
Quote:

mov rax, -1
mov eax, 1234

eax is now a different physical register, all computations on it are done in parallel to above

And how do I use this in practice?
Show a simple example.
Post 22 Dec 2020, 14:56

Furs
Roman wrote:
And how do I use this in practice?
Show a simple example.
You don't "use" it, it's automatic. I don't think you understand how long the OoO pipeline is in modern CPUs.

Here's an example with a loop:
Code:
@@:
  mov rax, [some_var + rcx*8]
  ; do something with rax
  mov eax, [some_other_var + rcx*4]
  xor eax, edx
  ; do some other stuff
  mov [some_output + rax*4], esi
  loop @b    
Note how the second load ("mov eax") doesn't depend on the first one at all (it zeroes the upper part), so the CPU will do it in parallel on a different physical register. The write doesn't depend on the first "mov rax" either, so it runs in parallel too.

Keep in mind this isn't just two things running in parallel. Each iteration of the loop runs in parallel, since it's independent of the others (except for rcx, which is dirt cheap to compute: just a decrement on each iteration).

How much the CPU can execute in parallel at this point varies by CPU design. You don't need to concern yourself with that; all you do is allow it to run things in parallel, nothing more. It could execute 2 things at the same time, or 100.

That's why some CPUs are wildly faster than others, even with similar clock speeds. They execute more stuff in parallel. But obviously, to do this they need to be allowed to execute things in parallel. Register renaming is a big deal for this.

Now it can do both memory reads of an iteration at the same time, instead of having to wait for the first memory read and the computations on it to finish.


Keep in mind that the CPU's OoO execution doesn't "stop" at a jump or loop. Branch prediction is a huge thing. For loops it will predict correctly most of the time.

So the CPU sees a lot more of these parallel instructions than just 2: it sees the next loop iteration at the same time, then the one after that, and as many as it can fit in its pipeline. It executes all of them at the same time, as far as dependencies and execution units allow.
Post 23 Dec 2020, 14:55
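
For anyone who wants to assemble the loop above, here is one possible self-contained version. It assumes x86-64 Linux and fasm's ELF64 executable format; COUNT, some_copy, the array contents, the "shr" standing in for "do something with rax", and the -8/-4 offsets (so that rcx = COUNT..1 addresses elements COUNT-1..0) are all additions for the sake of a complete program, not part of Furs's example:

```asm
format ELF64 executable 3
entry start

COUNT = 4                                       ; hypothetical element count

segment readable executable

start:
        mov     ecx, COUNT                      ; also zeroes the upper half of rcx
        xor     edx, edx                        ; mask of 0 leaves the indices as stored
        mov     esi, 7                          ; value written on every iteration
@@:
        mov     rax, [some_var + rcx*8 - 8]     ; first load
        shr     rax, 1                          ; "do something with rax"
        mov     [some_copy + rcx*8 - 8], rax
        mov     eax, [some_other_var + rcx*4 - 4] ; second load, no dependency on the first
        xor     eax, edx
        mov     [some_output + rax*4], esi      ; address depends only on the second load
        loop    @b                              ; rcx = rcx - 1, repeat while rcx <> 0

        xor     edi, edi                        ; exit status 0
        mov     eax, 60                         ; sys_exit on x86-64 Linux
        syscall

segment readable writeable

some_var        dq 10, 20, 30, 40
some_other_var  dd 3, 2, 1, 0                   ; indices into some_output, all < COUNT
some_copy       rq COUNT
some_output     rd COUNT
```

The program has no visible output; it only demonstrates that the loop assembles and runs with in-bounds indices. Any speed difference would have to be measured on a particular CPU.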

Roman
Furs wrote:
mov rax, [some_var + rcx*8] ; I think any registers can work in parallel here, not only rax.
; do something with rax
mov eax, [some_other_var + rcx*4]
xor eax, edx

You said the CPU does this in parallel:
rax is in one CPU register and eax is in another CPU register.
Physically, these are two registers in the CPU.
Is that right?

Code:
  mov ecx, 300
@@:
  mov rax, [some_var + rcx*8]
  ; do something with rax
  mov eax, [some_other_var + rcx*4]
  xor eax, edx
  ; do some other stuff
  mov [some_output + rax*4], esi
  dec ecx        ; dec already sets ZF, so no separate test is needed
  jnz @b

You said that in this case the CPU runs two loops in parallel.
Is that right?
Post 23 Dec 2020, 15:07

Furs
Roman wrote:
You said the CPU does this in parallel:
rax is in one CPU register and eax is in another CPU register.
Physically, these are two registers in the CPU.
Is that right?
Exactly. It varies by CPU, but most current desktop CPUs have at least 128 physical GP registers per core (while you can only address 8 in the ISA outside long mode, and 16 in long mode).

The registers you access in code do not map directly to physical registers; the CPU abstracts that and uses a lot more internally due to register renaming.

Roman wrote:
You said that in this case the CPU runs two loops in parallel.
Is that right?
Or more than two, depending on how deep its capabilities are. Unroll the loop in your head, that's what the CPU "sees".

As long as it has idle execution units available, it will try to fill them up to do work at the same time, with things that don't depend on previous results.
Post 24 Dec 2020, 13:47
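
To make "unroll the loop in your head" concrete, here is a sketch of two iterations written out back to back. The physical register names P1..P8 in the comments are purely hypothetical labels (real CPUs do not expose them), and the reserved arrays at the end exist only so that the fragment assembles; it is meant to be read, not run:

```asm
use64
        ; iteration 1
        mov     rax, [some_var + rcx*8]         ; rax renamed to P1
        mov     eax, [some_other_var + rcx*4]   ; rax renamed to P2, does not wait for P1
        xor     eax, edx                        ; rax renamed to P3, waits for P2
        mov     [some_output + rax*4], esi      ; reads P3
        dec     rcx                             ; rcx renamed to P4
        ; iteration 2, can start as soon as P4 is known
        mov     rax, [some_var + rcx*8]         ; rax renamed to P5
        mov     eax, [some_other_var + rcx*4]   ; rax renamed to P6, does not wait for P5
        xor     eax, edx                        ; rax renamed to P7, waits for P6
        mov     [some_output + rax*4], esi      ; reads P7
        dec     rcx                             ; rcx renamed to P8
        ret

some_var        rq 4
some_other_var  rd 4
some_output     rd 4
```

The only value carried from one iteration to the next is the counter, which is why the loads of iteration 2 need not wait for the loads of iteration 1.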
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
