flat assembler
Message board for the users of flat assembler.
Index
> Projects and Ideas > [IDEA] Abusing PCLMULQDQ to improve algorithms 
Author 

Madis731
Intro:
There's a new AVX coming out in erm...2010? AVX Okay its all fine to move/shuffle/add 256 bits of data at once, but I was thinking about something completely new. Now to the point: Lets take this idea of magic numbers and develop it (Terje's algorithm) There are other algorithm samples here on the board that I care not to search for now. What is important is to see the carryless multiplication (PCLMULQDQ) as a "supershuffling agent under cover". When a regular multiplication shifts the result far from the original location, the carryless variant might save us a few shifts or rotations (or might not?). Some examples (updating from time to time): Code: ; 1) http://www.jjj.de/fxt/ the pdf chapters 1.3 and 1.4 and related. ; Detecting bitstream transitions 0>1, 1>0 when multiplying carrylessly with ; 3 (11b) the answer will tell you the edges where the stream of bits toggle. movdqa xmm0, const_111100001101101000b pclmulqdq xmm0,const_11b ; xmm0= 1000100010110111000b ; 2) Using carryless multiply to square the number (multiply with itself) the ; answer will be a bitwise unpack doubling the number of bits. ; (Compare with PUNPCKBW/WD/DQ (x)mm,(x)mm/m64/128) movdqa xmm0, const_111100001101101000b pclmulqdq xmm0,const_11b ;xmm0= 010101010000000001010001010001000000b ; 3) The parity check can be achieved by (carryless) multiplying with all the ; interesting bits set to one and then shifting: movdqa xmm0, const_111100001101101000b pclmulqdq xmm0,const_11111111b psrldq xmm0,81 ; interesting bits  1 ;if this instruction used GPRs then it would be possible to shift into carry! ;counting only lower 8 bits which are 01101000b ;xmm0[0] = 0 xor 1 xor 1 xor 0 xor 1 xor 0 xor 0 xor 0 = 1 Last edited by Madis731 on 28 Aug 2008, 10:37; edited 2 times in total 

25 Aug 2008, 09:05 

Madis731
The problem with its use is that I can't find any description of how to use it or how its used other than "for doing some encryption & stuff".
I'd like to see some pseudocode or some VHDL to describe it. I haven't seen carryless multiply mentioned ever before Intel 

25 Aug 2008, 11:32 

revolution
Carryless means no carry, yes, really, it is simpler than normal multiply. All the bits are just XORed since the carry is not needed. You can write out the normal long hand multiply but just "forget" to carry over anything into the next column and then you have carryless multiply, or base2 polynomial multiply for the fancy name.


25 Aug 2008, 11:43 

Madis731
I will give it a try:
Code: *101 1001 ====== + 101 000 000 101 ====== 101101 ;normal and carryless are the same *101 1101 ====== + 101 000 101 101 ====== 1000001 ;normal 111101 ;carryless? Wow, from APRIL link Sorry about "no info on the topic" EDIT2: This image was created by applying "display if CLMUL(A,B) != MUL(A,B)" EDIT3: I won't show you the binary output here, but it will reveal some magic already. For example: 1) When you CLMUL with 3 (11b) then the answer will tell you the edges where the stream of bits will toggle 0=>1 or 1=>0. 2) CLMUL with the number itself yields a bitwise unpack Compare ~ PUNPCKbittwobit xmm0,xmm0 Code: ;Ex.1: 111100001101101000b CLMUL 11b= 1000100010110111000b ;Ex.2: 1111010111b CLMUL 1111010111b= 01010101000100010101b ;ASCII ; 1101 CLMUL 1101 ; /  \ ;     ; 01010001 Last edited by Madis731 on 25 Aug 2008, 18:46; edited 1 time in total 

25 Aug 2008, 16:07 

LocoDelAssembly
Code: xor eax, eax mov edx, multiplicand mov ecx, multiplier .loop: shr ecx, 1 sbb ebx, ebx and ebx, edx xor eax, ebx shl edx, 1 test ecx, ecx jnz .loop ; EAX = Result (mod 2^32) 

25 Aug 2008, 18:18 

revolution
Madis731 & LocoDelAssembly: Both of your suggestions look perfectly fine to me. You both seem to have grasped the concept quickly.
Also, you might like to read about Galios Field (GF), since that is what you two have implemented above. Once the fancy names like GF are added many people seem to switch off the brain and think "ooh, this is getting too hard". But once the basic idea is grasped the fancy names are no longer scary. 

25 Aug 2008, 20:19 

edfed
one more communication security algorithm?


26 Aug 2008, 19:48 

revolution
edfed wrote: one more communication security algorithm? 

26 Aug 2008, 19:54 

edfed
CRC, MD5, and CLMUL.
as it will signal the edges of a frame. nowadays, communication is always made with serial signals. theses signals can be securized using a machester code. or a CLMUL3? i don't know more about, but it limits the possible errors. MANCHESTER CODE Quote:


26 Aug 2008, 20:01 

revolution
I thought Manchester Code was primarily used to increase the reliability of the signal, at the cost of bandwidth. AFAIK there is no error check defined as part of Manchester Code. But there are many types of error codes that can be added to a data stream. Depending upon the characteristics of the communication channel different schemes would be used to maximise the efficiency of signal recovery.


26 Aug 2008, 20:15 

Madis731
edfed wrote: CRC, MD5, and CLMUL. I don't get where does the security come from and where does errortolerancy come from? You can't even definitely say upto one bit changed which one of these four it should be. 0111  is it 0101 or is it 0110? Usually redundancy is applied through streams of uneven number of the same bit (like 000 or 0 or 111 for 1). 010 and 001 can be corrected to 0. EDIT: Will update first post from now on! 

27 Aug 2008, 07:23 

< Last Thread  Next Thread > 
Forum Rules:

Copyright © 19992020, Tomasz Grysztar.
Powered by rwasa.