flat assembler
Message board for the users of flat assembler.

SSE Random Numbers
pal

Joined: 26 Aug 2008
Posts: 227
pal 11 Jul 2009, 23:18
Basically, I converted some C++ code into assembly language, and I decided to use SSE for it because it involves some large (64-bit) integers. The C++ code is:

Code:
```
struct Ran {
    Ullong u, v, w;
    Ran(Ullong j) : v(4101842887655102017LL), w(1) {
        // Constructor. Call with any integer seed (except value of v above).
        u = j ^ v; int64();
        v = u; int64();
        w = v; int64();
    }
    inline Ullong int64() {
        // Return 64-bit random integer. See text for explanation of method.
        u = u * 2862933555777941757LL + 7046029254386353087LL;
        v ^= v >> 17; v ^= v << 31; v ^= v >> 8;
        w = 4294957665U*(w & 0xffffffff) + (w >> 32);
        Ullong x = u ^ (u << 21); x ^= x >> 35; x ^= x << 4;
        return (x + v) ^ w;
    }
    inline Doub doub() { return 5.42101086242752217E-20 * int64(); }
    // Return random double-precision floating value in the range 0. to 1.
    inline Uint int32() { return (Uint)int64(); }
    // Return 32-bit random integer.
};
```
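For anyone who wants a reference output to compare an assembly port against, here is a minimal self-contained sketch of the same generator. The three `using` lines are assumed stand-ins for the book's `Ullong`, `Doub` and `Uint` types; everything else follows the listing above, with unsigned literal suffixes added.

```cpp
#include <cstdint>

// Assumed stand-ins for the book's type names (not part of the listing above).
using Ullong = std::uint64_t;
using Doub   = double;
using Uint   = std::uint32_t;

struct Ran {
    Ullong u, v, w;

    // Call with any integer seed except the initial value of v.
    explicit Ran(Ullong j) : v(4101842887655102017ULL), w(1) {
        u = j ^ v; int64();
        v = u;     int64();
        w = v;     int64();
    }

    // 64-bit random integer: LCG (u), xorshift (v) and multiply-with-carry (w) combined.
    Ullong int64() {
        u = u * 2862933555777941757ULL + 7046029254386353087ULL;
        v ^= v >> 17; v ^= v << 31; v ^= v >> 8;
        w = 4294957665U * (w & 0xffffffff) + (w >> 32);
        Ullong x = u ^ (u << 21); x ^= x >> 35; x ^= x << 4;
        return (x + v) ^ w;
    }

    // Random double between 0 and 1 (the constant is 2^-64).
    Doub doub() { return 5.42101086242752217E-20 * int64(); }

    // 32-bit random integer: the low half of int64().
    Uint int32() { return static_cast<Uint>(int64()); }
};
```

Two `Ran` objects built from the same seed produce the same sequence, so printing the first few `int64()` values gives something concrete to check the assembly against.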

The listing as printed needs slight modification before it will run. It is taken from Numerical Recipes, 3rd Edition. The assembly code I came up with was:

Code:
```
RandomNumber:
    push    ebp
    mov     ebp,esp
    mov     eax,[ebp+8]
    movd    xmm0,eax
    mov     eax,4101842887
    movd    xmm7,eax
    pslldq  xmm7,4
    mov     eax,655102017
    movd    xmm2,eax
    pxor    xmm0,xmm7 ; u = j ^ v
    call    produceRandom
    movq    xmm7,xmm0 ; v = u
    call    produceRandom
    movq    xmm5,xmm7 ; w = v
    call    produceRandom
    mov     esp,ebp
    pop     ebp
    retn    4

produceRandom:
    ; u = u * 2862933555777941757LL + 7046029254386353087LL;
    mov     eax,2862933555
    movd    xmm2,eax
    pslldq  xmm2,4
    mov     eax,777941757
    movd    xmm3,eax
    pmuludq xmm0,xmm2
    mov     eax,704602925
    movd    xmm2,eax
    pslldq  xmm2,4
    mov     eax,4000000000
    movd    xmm3,eax
    mov     eax,386353087
    movd    xmm3,eax

    ; v ^= v >> 17; v ^= v << 31; v ^= v >> 8
    ; v = xmm7
    movq    xmm6,xmm7
    psrldq  xmm6,17 ; v >> 17
    pxor    xmm7,xmm6 ; v ^= v
    movq    xmm6,xmm7
    pslldq  xmm6,31 ; v << 31
    pxor    xmm7,xmm6 ; v ^= v
    movq    xmm6,xmm7
    psrldq  xmm6,8 ; v >> 8
    pxor    xmm7,xmm6 ; v ^= v
    pxor    xmm6,xmm6

    ; w = 4294957665 * (w & 0xFFFFFFFF) + (w >> 32)
    ; w = xmm5
    movq    xmm6,xmm5
    mov     eax,4294957665
    movd    xmm4,eax
    mov     eax,0xFFFFFFFF
    movd    xmm3,eax
    pand    xmm6,xmm3
    pmuludq xmm6,xmm4
    psrldq  xmm5,32
    pxor    xmm3,xmm3 ; Cleanup
    pxor    xmm4,xmm4
    pxor    xmm6,xmm6

    ; x = u ^ (u << 21); x ^= x >> 35; x ^= x << 4;
    ; x = xmm6
    movq    xmm6,xmm0
    movq    xmm4,xmm0
    pslldq  xmm4,31 ; x << 31
    pxor    xmm6,xmm4 ; x ^= x
    movq    xmm4,xmm6
    psrldq  xmm4,35 ; x >> 35
    pxor    xmm6,xmm4 ; x ^= x
    movq    xmm4,xmm6
    pslldq  xmm4,4 ; x << 4
    pxor    xmm6,xmm4 ; x ^= x

    ; return (x + v) ^ w;
    movq    xmm4,xmm6
    paddq   xmm4,xmm7 ; x + v
    pxor    xmm4,xmm5
    movd    eax,xmm4
    ret
```

I may be quite a bit off on accuracy, but it does produce pseudo-random numbers. I know that some of the pxor cleanups are redundant, since I then overwrite the same register with movd anyway. Apparently the C++ method is really good. I have a few questions that I could not find answers to in the manuals:
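On the accuracy point, it may help to check individual steps against plain 64-bit arithmetic. Two instruction-set facts matter here: pmuludq multiplies only the low 32 bits of each 64-bit lane, and pslldq/psrldq shift by whole bytes (psllq/psrlq are the ones that take bit counts). The sketch below, in C++ to match the listing above, models how a full 64-bit product can be assembled from the 32x32 products that pmuludq provides. Both function names are made up for illustration.

```cpp
#include <cstdint>

// Models what pmuludq computes in one 64-bit lane:
// the 64-bit product of the two LOW dwords; the high dwords are ignored.
std::uint64_t pmuludq_lane(std::uint64_t a, std::uint64_t b) {
    return (a & 0xFFFFFFFFu) * (b & 0xFFFFFFFFu);
}

// Low 64 bits of a * b, built from three 32x32 products.
// The a_hi * b_hi term would land entirely above bit 63, so it is dropped.
std::uint64_t mul64_from_pmuludq(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t lo_lo = pmuludq_lane(a, b);
    const std::uint64_t cross = pmuludq_lane(a, b >> 32)   // a_lo * b_hi
                              + pmuludq_lane(a >> 32, b);  // a_hi * b_lo
    return lo_lo + (cross << 32);  // only the low 32 bits of cross survive
}
```

A single pmuludq therefore matches `u * constant` only when both operands fit in 32 bits. That happens to be the case for the `w` update, where one factor is `w & 0xffffffff` and the other is a 32-bit constant.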

1. When loading a value like 2862933555777941757 into an XMM register, am I doing it the correct way? That is: loading 2862933555 into a GPR, moving it into an XMM register, shifting it (pslldq) by a dword, loading 777941757 into a different XMM register, then using something (paddd, por, pxor etc.) to join the two halves.

2. When wanting to subtract 1 from an XMM register (I don't do this here, but I did need it the other day), is there a quicker way to do it than:

Code:
```
mov     eax,1
movd    xmm1,eax
psubb   xmm0,xmm1
```

psubb doesn't allow an immediate value, and I couldn't find a pdec instruction mnemonic.

3. How do I tell which version of SSE the computer supports? The other day I was writing some SSE code and needed to check whether xmm0 was zero, so I was going to use ptest xmm0,xmm0, but apparently my CPU doesn't have SSE4.1 or later, so that didn't work. In the end I didn't find a way to check what value was in xmm0.
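On the second half of question 3: one SSE2-only way to check a register for zero without ptest is to compare every byte against zero with pcmpeqb and collect the results with pmovmskb. The sketch below shows the idea with C++ intrinsics, to match the C++ listing earlier in the post. The function name is invented for illustration, and it falls back to a plain loop when SSE2 intrinsics are not available at compile time.

```cpp
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Returns true when all 128 bits pointed to by `bytes` are zero.
// Rough fasm equivalent of the SSE2 path (register choice is arbitrary):
//   pxor     xmm1,xmm1
//   pcmpeqb  xmm1,xmm0    ; 0FFh in every byte position where xmm0 is zero
//   pmovmskb eax,xmm1     ; one bit per byte
//   cmp      eax,0FFFFh   ; equal means xmm0 is all zero
bool is_all_zero_128(const std::uint8_t bytes[16]) {
#if SKETCH_HAVE_SSE2
    const __m128i v  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(bytes));
    const __m128i eq = _mm_cmpeq_epi8(v, _mm_setzero_si128());
    return _mm_movemask_epi8(eq) == 0xFFFF;
#else
    for (int i = 0; i < 16; ++i) {
        if (bytes[i] != 0) return false;
    }
    return true;
#endif
}
```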

P.S. My bad for all the questions
windwakr

Joined: 30 Jun 2004
Posts: 827
windwakr 11 Jul 2009, 23:41
3. Use CPUID; it's documented in the CPUID manual.

Call CPUID with EAX = 1.

EDX reports SSE and SSE2 support:
bit 25 is SSE, bit 26 is SSE2.

ECX reports SSE3, SSSE3, SSE4.1 and SSE4.2 support:
bit 0 is SSE3, bit 9 is SSSE3, bit 19 is SSE4.1, bit 20 is SSE4.2.

EDIT: *FACEPALM*, I stupidly forgot that Windows API calls modify the registers, so the CPUID results have to be saved to memory before the first call. Boy do I feel dumb... It works fine now.

Code:
```
;SSE detection, by ---- aka windwakr
;Saturday, July 11, 2009  7/11/09
;Seriously, why does the rest of the world do dates the "logical" way? mm/dd/yy is so much better...
include "win32ax.inc"

.data
    supported1 dd ?     ; EDX from CPUID leaf 1
    supported2 dd ?     ; ECX from CPUID leaf 1
    sse db ' SSE',0
    sse2 db ', SSE2',0
    sse3 db ', SSE3',0
    sse33 db ', SSSE3',0
    sse41 db ', SSE4.1',0
    sse42 db ', SSE4.2',0

    title db 'SSE support',0
    support db 'Your machine has support for these SSE versions:',13,10,0
    buffer rb 256       ; room for the names appended to 'support'

.code
start:
    mov eax,1
    cpuid
    ; Save the results first: the API calls below modify the registers.
    mov [supported1],edx
    mov [supported2],ecx
    test [supported1],00000010000000000000000000000000b ; EDX bit 25: SSE
    jz @f
    invoke lstrcat,support,sse
@@:
    test [supported1],00000100000000000000000000000000b ; EDX bit 26: SSE2
    jz @f
    invoke lstrcat,support,sse2
@@:
    test [supported2],00000000000000000000000000000001b ; ECX bit 0: SSE3
    jz @f
    invoke lstrcat,support,sse3
@@:
    test [supported2],00000000000000000000001000000000b ; ECX bit 9: SSSE3
    jz @f
    invoke lstrcat,support,sse33
@@:
    test [supported2],00000000000010000000000000000000b ; ECX bit 19: SSE4.1
    jz @f
    invoke lstrcat,support,sse41
@@:
    test [supported2],00000000000100000000000000000000b ; ECX bit 20: SSE4.2
    jz @f
    invoke lstrcat,support,sse42
@@:

    invoke MessageBox,0,support,title,MB_OK
    invoke ExitProcess,0
.end start
```
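The same bit tests can also be written as a small table-driven function, which is easier to check without a particular CPU to hand. This is only a sketch: the function names are hypothetical, the bit positions are the ones listed above, and the actual CPUID query assumes GCC or Clang on x86 (it uses `__get_cpuid` from `<cpuid.h>`) and returns an empty string elsewhere.

```cpp
#include <cstdint>
#include <string>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

// Turns the EDX/ECX values returned by CPUID leaf 1 into a readable list.
std::string decode_sse_support(std::uint32_t edx, std::uint32_t ecx) {
    struct Flag { bool set; const char* name; };
    const Flag flags[] = {
        { ((edx >> 25) & 1u) != 0, "SSE"    },
        { ((edx >> 26) & 1u) != 0, "SSE2"   },
        { ((ecx >>  0) & 1u) != 0, "SSE3"   },
        { ((ecx >>  9) & 1u) != 0, "SSSE3"  },
        { ((ecx >> 19) & 1u) != 0, "SSE4.1" },
        { ((ecx >> 20) & 1u) != 0, "SSE4.2" },
    };
    std::string out;
    for (const Flag& f : flags) {
        if (!f.set) continue;
        if (!out.empty()) out += ", ";
        out += f.name;
    }
    return out;
}

// Queries the running CPU where the compiler provides <cpuid.h>.
std::string query_sse_support() {
#if SKETCH_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return decode_sse_support(edx, ecx);
    }
#endif
    return std::string();
}
```

Keeping the decoding separate from the query means the bit table can be exercised with made-up register values.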


Last edited by windwakr on 15 Jul 2012, 21:02; edited 7 times in total
r22

Joined: 27 Dec 2004
Posts: 805
r22 12 Jul 2009, 00:10
@pal

1. Loading data into an XMM register is best done all at once, from a 16-byte-aligned data source, with the MOVDQA instruction.
Code:
```
align 16
bignum dq 1234567812345678h, 11111111ffffffffh
...
movdqa xmm0, dqword[bignum]
```
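If a constant does have to be built from two dwords, the halves need to be the binary high and low 32 bits, not the leading and trailing decimal digits. The C++ sketch below (helper names invented for illustration) checks both cases at compile time for the multiplier from the first post, and shows a 16-byte-aligned pair of constants of the kind movdqa expects.

```cpp
#include <cstdint>

constexpr std::uint64_t kMultiplier = 2862933555777941757ULL;
constexpr std::uint64_t kIncrement  = 7046029254386353087ULL;

constexpr std::uint32_t high_dword(std::uint64_t x) { return static_cast<std::uint32_t>(x >> 32); }
constexpr std::uint32_t low_dword(std::uint64_t x)  { return static_cast<std::uint32_t>(x); }

// Equivalent of "shift the high half left by 32 bits, then OR in the low half".
constexpr std::uint64_t join_dwords(std::uint32_t hi, std::uint32_t lo) {
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Binary halves round-trip back to the original constant...
static_assert(join_dwords(high_dword(kMultiplier), low_dword(kMultiplier)) == kMultiplier,
              "binary halves must reproduce the constant");

// ...but the decimal digit groups 2862933555 and 777941757 do not.
static_assert(join_dwords(2862933555u, 777941757u) != kMultiplier,
              "decimal digit groups are not the binary halves");

// Two qwords with 16-byte alignment, laid out like the dq pair above.
alignas(16) constexpr std::uint64_t kLcgConstants[2] = { kMultiplier, kIncrement };
```

In fasm the simpler route is the one shown above: write the full 64-bit value after `dq` and let the assembler do the conversion.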

2. Subtracting one can be done in a similar way, using an aligned data source.
Code:
```
align 16
sub1fromallbytes dq 0101010101010101h, 0101010101010101h
sub1fromallqwords dq 0000000000000001h, 0000000000000001h
...
psubb XMM0,dqword[sub1fromallbytes]
psubq XMM0,dqword[sub1fromallqwords]
```
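The choice between psubb and psubq matters because each instruction treats the register as independent lanes, with no borrow from one lane into the next. The portable C++ model below works on one 64-bit half of the register to show the difference; the function names are made up for illustration and no SSE instructions are actually executed.

```cpp
#include <cstdint>

// Models psubb on one 64-bit half: each of the 8 bytes is subtracted
// independently and wraps on its own, with no borrow into the next byte.
std::uint64_t sub_bytes(std::uint64_t a, std::uint64_t b) {
    std::uint64_t result = 0;
    for (int i = 0; i < 8; ++i) {
        const std::uint8_t x = static_cast<std::uint8_t>(a >> (8 * i));
        const std::uint8_t y = static_cast<std::uint8_t>(b >> (8 * i));
        const std::uint8_t d = static_cast<std::uint8_t>(x - y);
        result |= static_cast<std::uint64_t>(d) << (8 * i);
    }
    return result;
}

// Models psubq on one 64-bit half: an ordinary 64-bit subtraction.
std::uint64_t sub_qword(std::uint64_t a, std::uint64_t b) {
    return a - b;
}
```

For example, with 0x100 in the low qword and a subtrahend of 1, the psubb model gives 0x1FF (the low byte wraps from 00 to FF and the byte above is untouched), while the psubq model gives 0xFF. To decrement a 64-bit value, psubq with the `sub1fromallqwords` constant is the form to use.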

I find encryption and PRNGs very interesting topics. In fact, I wrote my college honors thesis on the subject and documented my work on this board: http://board.flatassembler.net/topic.php?t=6518&postdays=0&postorder=asc&start=0
You may want to read it if you get a chance.
The PROE algorithm is used for encryption, but the PRNG portion can easily be stripped out and analyzed.

Using the NIST (National Institute of Standards and Technology) test suite as well as ENT (a program used for testing randomness), I was able to verify that my PRNG is cryptographically secure (extremely random).
pal

Joined: 26 Aug 2008
Posts: 227
pal 12 Jul 2009, 12:02
windwakr: I thought it might have something to do with CPUID. Stupidly, I didn't check the manual, even though I was reading it the other day. Personally, I would have used bt to test the bits, but from what I have read it takes 4 clocks for the reg,imm8 form.

r22: Again, stupidly, I completely forgot about just putting the data in a memory location. I haven't come across movdqa yet, so I'll have a play with that. Your PRNG looks very interesting and I will be sure to take a proper look at it. Cheers for linking me to it.

Cheers to both of you for your help, by the way.