flat assembler
Message board for the users of flat assembler.
Index
> Main > Count bits 
Author 

macomics 15 Nov 2022, 12:34
Code: ; x = (x & 0x55555555) + ((x >> 1) & 0x55555555); Divide the number into 16 groups of 2 bits and sum them all with one command. Each of the groups can have either 0, 1, or 2 ones. Thus, there will be no transfers from one group to another. x = 0BC637EFF & 055555555 = 000010111100011000110111111011111111 & 01010101010101010101010101010101 = 00`00`00`01`01`00`01`00`00`01`01`01`01`00`01`01`01`01 = 014415455 x = 0BC637EFF >> 1 & 055555555 = 000001011110001100011011111101111111 & 01010101010101010101010101010101 = 00`00`01`01`01`00`00`01`00`01`00`01`01`01`01`01`01`01 = 054111555 x = 014415455 + 054111555 = 0685269AA ; x = (x & 0x33333333) + ((x >> 2) & 0x33333333); And now divide the number into 8 groups of 4 bits and repeat the addition. Each group will already calculate the sum of 4 ones max (i.e. there will be no transfers again) x = 0685269AA & 033333333 = 000001101000010100100110100110101010 & 000000110011001100110011001100110011 = 0000`0010`0000`0001`0010`0010`0001`0010`0010 = 020122122 x = 0685269AA >> 1 & 033333333 = 000000011010000101001001101001101010 & 000000110011001100110011001100110011 = 0000`0001`0010`0001`0000`0001`0010`0010`0010 = 012101222 x = 020122122 + 012101222 = 032223344 ; x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F); Now 4 groups of 8 bits x = 032223344 & 00F0F0F0F = 000000110010001000100011001101000100 & 000000001111000011110000111100001111 = 00000010`00000010`00000011`00000100 = 02020304 x = 032223344 >> 4 & 00F0F0F0F = 000000000011001000100010001100110100 & 000000001111000011110000111100001111 = 00000011`00000010`00000011`00000100 = 03020304 x = 02020304 + 03020304 = 05040608 ; x = (x & 0x00FF00FF) + ((x >>Cool & 0x00FF00FF); ... x = 05040608 & 00FF00FF = 00000101000001000000011000001000 & 00000000111111110000000011111111 = 00000000`00000100`00000000`00001000 = 00040008 x = 05040608 >> 8 & 00FF00FF = 00000000000001010000010000000110 & 00000000111111110000000011111111 = 00000000`00000101`00000000`00000110 = 00050006 x = 00040008 + 00050006 = 0009000E ; x = (x & 0x0000FFFF) + ((x >>16) & 0x0000FFFF); ... x = 0009000E & 0000FFFF = 00000000000010010000000000001110 & 00000000000000001111111111111111 = 00000000000000000000000000001110 = 0000000E x = 0009000E >> 16 & 0000FFFF = 00000000000000000000000000001001 & 00000000000000001111111111111111 = 00000000000000000000000000001001 = 00000009 x = 0000000E + 00000009 = 00000017 = 23.0 

15 Nov 2022, 12:34 

Tomasz Grysztar 15 Nov 2022, 12:48
(x and 0x55555555) leaves only the bits at even positions (because 5=0101b), and ((x shr 1) & 0x55555555) puts the bits from odd positions into even ones and mask them similarly. If you add these two numbers, you end up adding at most 1+1 within every bit pair, so now you have a sequence of 2bit numbers, each containing a sum of two bits. If all bits of x were set, this is 2*0x55555555=0xAAAAAAAA, every 2bit subvalue within that result has value 2=10b.
Now it's time to similarly fold pairs of 2bit values into 4bit sums. (x and 0x33333333) selects every other 2bit value, ((x shr 2) and 0x33333333) puts the remaining ones at the same positions. After you add them, you get a sequence of 4bit numbers, each being the sum of previously adjacent 2bit numbers. If you had 0xAAAAAAAA, now you get 2*0x22222222=0x44444444. Next we, sum each pair of 4bit sums into 8bit sums. Masking 0x44444444 and its shifted version we get 2*0x04040404=0x08080808. Now we sum each pair of bytes to get 16bit sums. From 0x08080808 we get to 0x00100010. Finally we sum the two 16bit numbers, (x and 0xFFFF) + ((x shr 16) and 0xFFFF), if we were at 0x00100010, we get 0x00000020. 

15 Nov 2022, 12:48 

Andy 15 Nov 2022, 13:01
Thanks guys. This is a clever approach.


15 Nov 2022, 13:01 

revolution 15 Nov 2022, 13:22
Here is a thread from 2007.


15 Nov 2022, 13:22 

Furs 15 Nov 2022, 14:22
Another useful bitwise trick is multiplication. Think of it as duplicating the source number at each '1' bit in the multiplier (i.e. shifting it into that place), and adding them together. e.g. to duplicate an 8bit value (in eax) into each byte of the resulting 32bit value:
Code: imul eax, 0x01010101 

15 Nov 2022, 14:22 

< Last Thread  Next Thread > 
Forum Rules:

Copyright © 19992024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.