flat assembler
Message board for the users of flat assembler.

 Index > Projects and Ideas > [IDEA] Abusing PCLMULQDQ to improve algorithms
Author

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 25 Aug 2008, 09:05
Intro:
There's a new AVX coming out in erm...2010? AVX
Okay its all fine to move/shuffle/add 256 bits of data at once, but I was thinking about something completely new.
Now to the point:
Lets take this idea of magic numbers and develop it (Terje's algorithm)
There are other algorithm samples here on the board that I care not to search for now. What is important is to see
the carryless multiplication (PCLMULQDQ) as a "supershuffling agent under cover". When a regular multiplication
shifts the result far from the original location, the carryless variant might save us a few shifts or rotations (or might not?).

Some examples (updating from time to time):
Code:
```; 1) http://www.jjj.de/fxt/ the pdf chapters 1.3 and 1.4 and related.
; Detecting bitstream transitions 0>1, 1>0 when multiplying carrylessly with
; 3 (11b) the answer will tell you the edges where the stream of bits toggle.
movdqa xmm0, const_111100001101101000b
pclmulqdq xmm0,const_11b
; xmm0= 1000100010110111000b

; 2) Using carryless multiply to square the number (multiply with itself) the
; answer will be a bitwise unpack doubling the number of bits.
; (Compare with PUNPCKBW/WD/DQ (x)mm,(x)mm/m64/128)
movdqa xmm0, const_111100001101101000b
pclmulqdq xmm0,const_11b
;xmm0= 010101010000000001010001010001000000b

; 3) The parity check can be achieved by (carryless-) multiplying with all the
; interesting bits set to one and then shifting:
movdqa xmm0, const_111100001101101000b
pclmulqdq xmm0,const_11111111b
psrldq xmm0,8-1 ; interesting bits - 1
;if this instruction used GPRs then it would be possible to shift into carry!
;counting only lower 8 bits which are 01101000b
;xmm0[0] = 0 xor 1 xor 1 xor 0 xor 1 xor 0 xor 0 xor 0 = 1

```

_________________
My updated idol http://www.agner.org/optimize/

Last edited by Madis731 on 28 Aug 2008, 10:37; edited 2 times in total
25 Aug 2008, 09:05
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20132
Location: In your JS exploiting you and your system
revolution 25 Aug 2008, 11:18
Carryless means it is used for polynomial multiplications. That means each individual bit has no carry to the next bit. Good for CRC and Base-2 Elliptic curve computations. I doubt it can be used for a replacement of regular multiplication. But if I am wrong then I would be happy to learn how to use it for general multiply also.
25 Aug 2008, 11:18

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 25 Aug 2008, 11:32
The problem with its use is that I can't find any description of how to use it or how its used other than "for doing some encryption & stuff".
I'd like to see some pseudocode or some VHDL to describe it. I haven't seen carryless multiply mentioned ever before Intel
25 Aug 2008, 11:32
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20132
Location: In your JS exploiting you and your system
revolution 25 Aug 2008, 11:43
Carryless means no carry, yes, really, it is simpler than normal multiply. All the bits are just XORed since the carry is not needed. You can write out the normal long hand multiply but just "forget" to carry over anything into the next column and then you have carryless multiply, or base-2 polynomial multiply for the fancy name.
25 Aug 2008, 11:43

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 25 Aug 2008, 16:07
I will give it a try:
Code:
```   *101
1001
======
+   101
000
000
101
======
101101 ;normal and carryless are the same

*101
1101
======
+   101
000
101
101
======
1000001 ;normal
111101 ;carryless?
```

Wow, from APRIL
Sorry about "no info on the topic"

EDIT2:
This image was created by applying "display if CLMUL(A,B) != MUL(A,B)"

EDIT3:
I won't show you the binary output here, but it will reveal some magic already. For example:
1) When you CLMUL with 3 (11b) then the answer will tell you the edges where the stream of bits will toggle 0=>1 or 1=>0.
2) CLMUL with the number itself yields a bitwise unpack
Compare ~ PUNPCKbit-twobit xmm0,xmm0
Code:
```;Ex.1:
111100001101101000b CLMUL 11b=
1000100010110111000b
;Ex.2:
1111010111b CLMUL 1111010111b=
01010101000100010101b
;ASCII
;   1101 CLMUL 1101
;  / || \
;  | | | |
; 01010001
```

Last edited by Madis731 on 25 Aug 2008, 18:46; edited 1 time in total
25 Aug 2008, 16:07
LocoDelAssembly
Your code has a bug

Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 25 Aug 2008, 18:18
Code:
```xor  eax, eax
mov  edx, multiplicand
mov  ecx, multiplier

.loop:
shr  ecx, 1
sbb  ebx, ebx
and  ebx, edx
xor  eax, ebx
shl  edx, 1

test ecx, ecx
jnz  .loop

; EAX = Result (mod 2^32)
```
25 Aug 2008, 18:18
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20132
Location: In your JS exploiting you and your system
revolution 25 Aug 2008, 20:19
Madis731 & LocoDelAssembly: Both of your suggestions look perfectly fine to me. You both seem to have grasped the concept quickly.

Also, you might like to read about Galios Field (GF), since that is what you two have implemented above.

Once the fancy names like GF are added many people seem to switch off the brain and think "ooh, this is getting too hard". But once the basic idea is grasped the fancy names are no longer scary.
25 Aug 2008, 20:19
edfed

Joined: 20 Feb 2006
Posts: 4325
Location: Now
edfed 26 Aug 2008, 19:48
one more communication security algorithm?
26 Aug 2008, 19:48
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20132
Location: In your JS exploiting you and your system
revolution 26 Aug 2008, 19:54
edfed wrote:
one more communication security algorithm?
What do you mean?
26 Aug 2008, 19:54
edfed

Joined: 20 Feb 2006
Posts: 4325
Location: Now
edfed 26 Aug 2008, 20:01
CRC, MD5, and CLMUL.

as it will signal the edges of a frame.

nowadays, communication is always made with serial signals.
theses signals can be securized using a machester code.
or a CLMUL3?

i don't know more about, but it limits the possible errors.

MANCHESTER CODE
Quote:

00=0101
01=0110
10=1001
11=1010
26 Aug 2008, 20:01
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20132
Location: In your JS exploiting you and your system
revolution 26 Aug 2008, 20:15
I thought Manchester Code was primarily used to increase the reliability of the signal, at the cost of bandwidth. AFAIK there is no error check defined as part of Manchester Code. But there are many types of error codes that can be added to a data stream. Depending upon the characteristics of the communication channel different schemes would be used to maximise the efficiency of signal recovery.
26 Aug 2008, 20:15

Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 27 Aug 2008, 07:23
edfed wrote:
CRC, MD5, and CLMUL.

as it will signal the edges of a frame.

nowadays, communication is always made with serial signals.
theses signals can be securized using a machester code.
or a CLMUL3?

i don't know more about, but it limits the possible errors.

MANCHESTER CODE
Quote:

00=0101
01=0110
10=1001
11=1010

I don't get where does the security come from and where does error-tolerancy come from?

You can't even definitely say upto one bit changed which one of these four it should be. 0111 - is it 0101 or is it 0110?
Usually redundancy is applied through streams of uneven number of the same bit (like 000 or 0 or 111 for 1). 010 and 001 can be corrected to 0.

EDIT: Will update first post from now on!

_________________
My updated idol http://www.agner.org/optimize/
27 Aug 2008, 07:23
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First

 Jump to: Select a forum Official----------------AssemblyPeripheria General----------------MainTutorials and ExamplesDOSWindowsLinuxUnixMenuetOS Specific----------------MacroinstructionsOS ConstructionIDE DevelopmentProjects and IdeasNon-x86 architecturesHigh Level LanguagesProgramming Language DesignCompiler Internals Other----------------FeedbackHeapTest Area

Forum Rules:
 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot vote in polls in this forumYou cannot attach files in this forumYou can download files in this forum