flat assembler
Message board for the users of flat assembler.

how to load unsigned integer into fpu?

bitRAKE



Joined: 21 Jul 2003
Posts: 3884
Location: vpcmipstrm
bitRAKE 22 Dec 2008, 01:46
Well, that sucks. Bit 63 must be set.

A branchless version is still possible, just not as pretty:
Code:
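mov rax,197300000090000005 ; value to load; must be nonzero (BSR result is undefined for 0)
push $3FFF                 ; exponent bias, stored as a qword
bsr rcx,rax                ; rcx = index of the highest set bit
add qword [rsp],rcx        ; biased exponent = $3FFF + index
sub ecx,63
neg ecx                    ; ecx = 63 - index
shl rax,cl                 ; normalize: move the highest set bit up to bit 63
push rax                   ; significand at [rsp], sign/exponent at [rsp+8]
fld tbyte [rsp]            ; load the 80-bit extended value
add rsp,16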
(not 32-bit friendly either)
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 19869
Location: In your JS exploiting you and your system
revolution 22 Dec 2008, 01:51
You can also use FCMOVcc. Since you are using 64bit code then FCMOV will be available.
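For reference, a minimal sketch of how FCMOVcc could fit in, assuming the usual approach of loading the value as signed with FILD and then adding 2^64 when bit 63 was set. FCMOVcc only tests CF, ZF and PF, so BT is used here to copy bit 63 into CF. The stack layout and the choice of FCMOVNB are illustrative assumptions, not taken from the earlier posts:

```asm
; in:  rax = unsigned 64-bit integer
; out: st0 = the same value as a floating-point number
        push    rax                 ; the value, later read at [rsp+8]
        push    $5F800000           ; 2^64 encoded as a single-precision float
        fldz                        ; st0 = 0.0
        fld     dword [rsp]         ; st0 = 2^64, st1 = 0.0
        bt      rax,63              ; CF = bit 63 of the value
        fcmovnb st0,st1             ; CF = 0: replace 2^64 with 0.0
        fild    qword [rsp+8]       ; st0 = value read as signed, st1 = correction
        faddp   st1,st0             ; st0 = signed value + correction
        fstp    st1                 ; drop the spare 0.0, result stays in st0
        add     rsp,16
```

The final FADDP needs a 64-bit significand to be exact, so this relies on the precision control being set to extended precision.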
bitRAKE 22 Dec 2008, 01:54
How does it look penalty-wise though?
Almost as bad as a branch, iirc.
(Off through the Fog I go...)

Edit: doc says 2 cycles! There is a winner.
revolution 22 Dec 2008, 01:59
Penalties would depend upon the CPU being used. You need to define the parameters of the test more precisely.
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 22 Dec 2008, 02:15
Quote:

You can also use FCMOVcc. Since you are using 64bit code then FCMOV will be available.

This assumption should be normally safe, but the documentation says:
AMD wrote:
Use the CPUID instruction to determine if this instruction is supported on a
particular x86-64 implementation. It is supported if both the CMOV and FPU bits are
set to 1.
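A sketch of that check, assuming the standard feature flags returned by CPUID function 1 (EDX bit 0 = FPU, EDX bit 15 = CMOV). The constant names are made up for readability:

```asm
FPU_BIT  = 1 shl 0                      ; CPUID.1:EDX bit 0
CMOV_BIT = 1 shl 15                     ; CPUID.1:EDX bit 15

        mov     eax,1
        cpuid                           ; overwrites eax, ebx, ecx and edx
        and     edx,FPU_BIT or CMOV_BIT
        cmp     edx,FPU_BIT or CMOV_BIT
        sete    al                      ; al = 1 if FCMOVcc is supported, else 0
```

Since CPUID overwrites rbx, which most calling conventions expect to be preserved, a real routine would save and restore it around the check.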
revolution 22 Dec 2008, 02:21
Hehe, okay:

You can also use FCMOVcc. Since you are using 64bit code then FCMOV will most likely be available.
mattst88



Joined: 12 May 2006
Posts: 260
Location: South Carolina
mattst88 23 Dec 2008, 02:05
I don't believe any 64-bit x86 CPUs exist without a floating-point unit, much less FCMOVcc instructions.
revolution 23 Dec 2008, 02:38
mattst88 wrote:
I don't believe any 64-bit x86 CPUs exist without a floating-point unit, much less FCMOVcc instructions.
Not today at least, but what about tomorrow when <insert new CPU company here> introduces the hyper-cheap no-frills no-FPU 64bit x86-compatible CPU (where the OS will emulate the FPU). Ah, you forgot about that one!

;)
bitRAKE 02 Feb 2009, 07:10
Windows defaults to 53-bit precision mode, so anyone (myself, for example) who attempts to use any of this code should be sure to change to extended precision first.
Code:
push 0
fnstcw [rsp]
and word [rsp],-1 - 1100000000b
or word [rsp],      1100000000b ; double extended precision
fldcw [rsp]
pop rax    
...lest ye be caught by an embarrassingly tricky bug.
LocoDelAssembly 02 Feb 2009, 18:22
bitRAKE, is the AND instruction really necessary for something?
bitRAKE 02 Feb 2009, 20:10
No, just the crust from some cut-n-paste. As you noticed, it doesn't do anything, since all the affected bits are just being set; OR is sufficient. Could use PUSH R?? as well to reduce the size. Or maybe gamble with this (or with a 00 03 byte pair somewhere in the code):
Code:
push 11b
fldcw [rsp-1] 
pop rax    
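Written out, the OR-only variant with a register push could look like the sketch below. The pushed register only reserves a stack slot. Note that the final POP leaves the control word in the low 16 bits of rax, so rax is clobbered just as in the longer version:

```asm
        push    rax                     ; reserve 8 bytes; the contents do not matter
        fnstcw  word [rsp]              ; store the current control word
        or      word [rsp],1100000000b  ; PC field (bits 8-9) = 11b: double extended precision
        fldcw   word [rsp]              ; load the modified control word
        pop     rax                     ; release the slot (low word of rax = control word)
```

Unlike the FLDCW [rsp-1] gamble, this keeps the exception masks and rounding mode in the rest of the control word unchanged.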
revolution 02 Feb 2009, 20:40
Changing the precision may not be needed. As far as I can tell, it only seems to affect div & sqrt. Perhaps also the transcendental functions?

FADDs, FSUBs, FMULs and most others won't be affected.
bitRAKE 02 Feb 2009, 21:52
Intel 8.1.5.2 wrote:
The precision-control bits only affect the results of the following floating-point instructions: FADD, FADDP, FIADD, FSUB, FSUBP, FISUB, FSUBR, FSUBRP, FISUBR, FMUL, FMULP, FIMUL, FDIV, FDIVP, FIDIV, FDIVR, FDIVRP, FIDIVR, and FSQRT.
All of them I tried do indeed seem to be affected.
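A quick way to see the effect on FADD, sketched under the assumption of the default round-to-nearest mode: add 1 to 2^53 and store the sum back as an integer. The operands are picked for illustration only; 2^53 + 1 is the smallest positive integer that does not fit in a 53-bit significand:

```asm
        mov     rax,1 shl 53
        push    rax
        fild    qword [rsp]             ; st0 = 2^53 (the load itself is exact)
        fld1                            ; st0 = 1.0, st1 = 2^53
        faddp   st1,st0                 ; st0 = 2^53 + 1, rounded according to PC
        fistp   qword [rsp]             ; store as integer and pop the FPU stack
        pop     rax                     ; 53-bit PC: rax = 2^53
                                        ; extended PC: rax = 2^53 + 1
```

With 53-bit precision the sum is a tie between 2^53 and 2^53 + 2 and rounds to the even neighbour, 2^53; with extended precision it is exact.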
revolution 03 Feb 2009, 00:00
bitRAKE wrote:
Intel 8.1.5.2 wrote:
The precision-control bits only affect the results of the following floating-point instructions: FADD, FADDP, FIADD, FSUB, FSUBP, FISUB, FSUBR, FSUBRP, FISUBR, FMUL, FMULP, FIMUL, FDIV, FDIVP, FIDIV, FDIVR, FDIVRP, FIDIVR, and FSQRT.
All of them I tried do indeed seem to be affected.
I must be confused with something else. Perhaps I was thinking of the instruction timing that is not affected by the precision setting.

Copyright © 1999-2023, Tomasz Grysztar. Also on GitHub, YouTube.
