flat assembler
Message board for the users of flat assembler.
Index
> Main > Count bits |
Author |
|
macomics 15 Nov 2022, 12:34
Code: ; x = (x & 0x55555555) + ((x >> 1) & 0x55555555); Divide the number into 16 groups of 2 bits and sum them all with one command. Each of the groups can have either 0, 1, or 2 ones. Thus, there will be no transfers from one group to another. x = 0BC637EFF & 055555555 = 000010111100011000110111111011111111 & 01010101010101010101010101010101 = 00`00`00`01`01`00`01`00`00`01`01`01`01`00`01`01`01`01 = 014415455 x = 0BC637EFF >> 1 & 055555555 = 000001011110001100011011111101111111 & 01010101010101010101010101010101 = 00`00`01`01`01`00`00`01`00`01`00`01`01`01`01`01`01`01 = 054111555 x = 014415455 + 054111555 = 0685269AA ; x = (x & 0x33333333) + ((x >> 2) & 0x33333333); And now divide the number into 8 groups of 4 bits and repeat the addition. Each group will already calculate the sum of 4 ones max (i.e. there will be no transfers again) x = 0685269AA & 033333333 = 000001101000010100100110100110101010 & 000000110011001100110011001100110011 = 0000`0010`0000`0001`0010`0010`0001`0010`0010 = 020122122 x = 0685269AA >> 1 & 033333333 = 000000011010000101001001101001101010 & 000000110011001100110011001100110011 = 0000`0001`0010`0001`0000`0001`0010`0010`0010 = 012101222 x = 020122122 + 012101222 = 032223344 ; x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F); Now 4 groups of 8 bits x = 032223344 & 00F0F0F0F = 000000110010001000100011001101000100 & 000000001111000011110000111100001111 = 00000010`00000010`00000011`00000100 = 02020304 x = 032223344 >> 4 & 00F0F0F0F = 000000000011001000100010001100110100 & 000000001111000011110000111100001111 = 00000011`00000010`00000011`00000100 = 03020304 x = 02020304 + 03020304 = 05040608 ; x = (x & 0x00FF00FF) + ((x >>Cool & 0x00FF00FF); ... x = 05040608 & 00FF00FF = 00000101000001000000011000001000 & 00000000111111110000000011111111 = 00000000`00000100`00000000`00001000 = 00040008 x = 05040608 >> 8 & 00FF00FF = 00000000000001010000010000000110 & 00000000111111110000000011111111 = 00000000`00000101`00000000`00000110 = 00050006 x = 00040008 + 00050006 = 0009000E ; x = (x & 0x0000FFFF) + ((x >>16) & 0x0000FFFF); ... x = 0009000E & 0000FFFF = 00000000000010010000000000001110 & 00000000000000001111111111111111 = 00000000000000000000000000001110 = 0000000E x = 0009000E >> 16 & 0000FFFF = 00000000000000000000000000001001 & 00000000000000001111111111111111 = 00000000000000000000000000001001 = 00000009 x = 0000000E + 00000009 = 00000017 = 23.0 |
|||
15 Nov 2022, 12:34 |
|
Tomasz Grysztar 15 Nov 2022, 12:48
(x and 0x55555555) leaves only the bits at even positions (because 5=0101b), and ((x shr 1) & 0x55555555) puts the bits from odd positions into even ones and mask them similarly. If you add these two numbers, you end up adding at most 1+1 within every bit pair, so now you have a sequence of 2-bit numbers, each containing a sum of two bits. If all bits of x were set, this is 2*0x55555555=0xAAAAAAAA, every 2-bit sub-value within that result has value 2=10b.
Now it's time to similarly fold pairs of 2-bit values into 4-bit sums. (x and 0x33333333) selects every other 2-bit value, ((x shr 2) and 0x33333333) puts the remaining ones at the same positions. After you add them, you get a sequence of 4-bit numbers, each being the sum of previously adjacent 2-bit numbers. If you had 0xAAAAAAAA, now you get 2*0x22222222=0x44444444. Next we, sum each pair of 4-bit sums into 8-bit sums. Masking 0x44444444 and its shifted version we get 2*0x04040404=0x08080808. Now we sum each pair of bytes to get 16-bit sums. From 0x08080808 we get to 0x00100010. Finally we sum the two 16-bit numbers, (x and 0xFFFF) + ((x shr 16) and 0xFFFF), if we were at 0x00100010, we get 0x00000020. |
|||
15 Nov 2022, 12:48 |
|
Andy 15 Nov 2022, 13:01
Thanks guys. This is a clever approach.
|
|||
15 Nov 2022, 13:01 |
|
revolution 15 Nov 2022, 13:22
Here is a thread from 2007.
|
|||
15 Nov 2022, 13:22 |
|
Furs 15 Nov 2022, 14:22
Another useful bitwise trick is multiplication. Think of it as duplicating the source number at each '1' bit in the multiplier (i.e. shifting it into that place), and adding them together. e.g. to duplicate an 8-bit value (in eax) into each byte of the resulting 32-bit value:
Code: imul eax, 0x01010101 |
|||
15 Nov 2022, 14:22 |
|
< Last Thread | Next Thread > |
Forum Rules:
|
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.