flat assembler
Message board for the users of flat assembler.
Matrix 21 Jun 2005, 14:30
Hi MCD,
single and double are floating point numbers with different precisions... but if you can use instructions that are one byte smaller, you can optimize 256-byte intros, for example.
|||
Madis731 21 Jun 2005, 22:33
If they were integers - there would be no difference between 128 bits at a time or 1 bit at a time.
But like Matrix said, these work on floating point values, so there is a precision difference. Also, if you have doubles, for example, you don't need to convert them to single first, because the appropriate instructions already exist.
|||
MCD 22 Jun 2005, 11:22
Madis731 wrote: If they were integers - there would be no difference between 128 bits at a time or 1 bit at a time.
I just verified this with OllyDbg v1.1 on a Pentium 4 machine. Since Olly doesn't support SSE2/3, I worked around the "double precision" versions of the instructions by putting a "db 66h" in front of the single precision ones.
_________________
MCD - the inevitable return of the Mad Computer Doggy
-||__/ .|+-~ .|| ||
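The "db 66h" trick and Matrix's "one byte smaller" remark both come down to how these instructions are encoded. Here is a minimal Python sketch of the register-to-register forms, assuming the opcode bytes from the Intel instruction set reference (0F 54 to 0F 57 for ANDPS, ANDNPS, ORPS, XORPS). The helper names `modrm_reg_reg`, `encode_ps` and `encode_pd` are made up for illustration and are not part of any assembler.

```python
# Register-to-register encodings of the SSE packed logical instructions.
# The helper names below are hypothetical; only the opcode bytes are real.

PS_OPCODES = {
    "andps": 0x54,
    "andnps": 0x55,
    "orps": 0x56,
    "xorps": 0x57,
}

def modrm_reg_reg(dst, src):
    """ModRM byte for 'op xmm<dst>, xmm<src>' (mod=11b, reg=dst, rm=src)."""
    assert 0 <= dst < 8 and 0 <= src < 8
    return 0xC0 | (dst << 3) | src

def encode_ps(mnemonic, dst, src):
    """Single precision form: 0F <opcode> <modrm>."""
    return bytes([0x0F, PS_OPCODES[mnemonic], modrm_reg_reg(dst, src)])

def encode_pd(mnemonic, dst, src):
    """Double precision form: the PS encoding behind a 66h prefix."""
    return bytes([0x66]) + encode_ps(mnemonic, dst, src)

def hexdump(code):
    """Format machine code bytes as space-separated hex."""
    return " ".join(f"{b:02x}" for b in code)

if __name__ == "__main__":
    for name in PS_OPCODES:
        ps = encode_ps(name, 0, 1)
        pd = encode_pd(name, 0, 1)
        print(f"{name:7s} xmm0, xmm1: {hexdump(ps)}   PD form: {hexdump(pd)}")
```

So the PS form is always one byte shorter than the matching PD form, which is what matters in a size-limited intro.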
|||
r22 22 Jun 2005, 17:30
If ANDPS, ORPS, XORPS and ANDNPS do the same as their PD-suffixed counterparts, then they are redundant.
I first thought maybe they affect SIMD FP exceptions, but after looking at the documentation I found this was not the case. NaNs don't seem to be a factor with these instructions. MAYBE the PD-suffixed instructions are faster than the PS-suffixed ones.
|||
MCD 23 Jun 2005, 12:02
r22 wrote: I first thought maybe they affect SIMD FP exceptions but after looking at documentation I found this was not the case.
Well, actually, my TSCBENCW program showed 1 clock cycle for both versions of those instructions. And it's unlikely that different binary logical instructions have different timings, because they are among the simplest to implement in the ALU.
|||
SDragon 14 Sep 2005, 14:04
From Intel manuals, vol. 1, chapter 11.6.9:
... In this example, XORPS or PXOR can be used instead of XORPD and yield the same correct result. However, because of type mismatch between the operand data type and the instruction data type, a latency penalty will be incurred due to implementations of the instructions at the microarchitecture level.
So the logical instruction pairs XORPS/XORPD and ANDPS/ANDPD are functionally equivalent, but using a single precision instruction on double values is slower than using the double precision instruction on double values. At least, Intel says so.
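The functional equivalence quoted above can be checked at bit level. This is a rough Python sketch, not the real instructions: it XORs the same 16 bytes once as four 32-bit lanes (the way XORPS sees them) and once as two 64-bit lanes (the way XORPD sees them). The function names and the sign-flip mask are my own choices for illustration, and the sketch says nothing about the latency penalty Intel mentions.

```python
import struct

def pack_doubles(lo, hi):
    """Two doubles -> 16 raw bytes, little-endian like an XMM register in memory."""
    return struct.pack("<2d", lo, hi)

def xor_lanes(a, b, lane_fmt):
    """XOR two 16-byte values lane by lane.

    "<4I" mimics XORPS (four 32-bit lanes),
    "<2Q" mimics XORPD (two 64-bit lanes).
    """
    lanes_a = struct.unpack(lane_fmt, a)
    lanes_b = struct.unpack(lane_fmt, b)
    return struct.pack(lane_fmt, *(x ^ y for x, y in zip(lanes_a, lanes_b)))

values = pack_doubles(1.5, -2.25)
# Mask with only the sign bit of each double set.
sign_mask = struct.pack("<2Q", 1 << 63, 1 << 63)

as_ps = xor_lanes(values, sign_mask, "<4I")  # "single precision" view
as_pd = xor_lanes(values, sign_mask, "<2Q")  # "double precision" view

print(as_ps == as_pd)               # True: same 128 bits either way
print(struct.unpack("<2d", as_pd))  # (-1.5, 2.25): both signs flipped
```

A bitwise operation has no carries between bits, so the lane width cannot change the result. Only the encoding and, per Intel, the timing differ.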
|||
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.