flat assembler
Message board for the users of flat assembler.

 Index > Main > Count bits
Author
Andy

Joined: 17 Oct 2011
Posts: 55
Andy 15 Nov 2022, 11:58
First of all I am aware of popcnt instruction and my question is pure educative. I have this basic implementation to count 1's bits for a number in eax with the result in ecx and works quite good (probably not the best implementation)

Code:
```mov eax, 0BC637EFFh
xor ecx, ecx
next:
jnz next
mov eax, ecx
ret    ```

but I saw some implementations without a loop and using basic arithmetics and well crafted constants. I have a vague idea how it works but can someone please explain this better?

Quote:
x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
x = (x & 0x00FF00FF) + ((x >> & 0x00FF00FF);
x = (x & 0x0000FFFF) + ((x >>16) & 0x0000FFFF);
15 Nov 2022, 11:58
macomics

Joined: 26 Jan 2021
Posts: 915
Location: Russia
macomics 15 Nov 2022, 12:34
Code:
```; x = (x & 0x55555555) + ((x >> 1) & 0x55555555);

Divide the number into 16 groups of 2 bits and sum them all with one command. Each of the groups can have either 0, 1, or 2 ones. Thus, there will be no transfers from one group to another.

x = 0BC637EFF & 055555555 = 000010111100011000110111111011111111 & 01010101010101010101010101010101 = 00`00`00`01`01`00`01`00`00`01`01`01`01`00`01`01`01`01 = 014415455
x = 0BC637EFF >> 1 & 055555555 = 000001011110001100011011111101111111 & 01010101010101010101010101010101 = 00`00`01`01`01`00`00`01`00`01`00`01`01`01`01`01`01`01 = 054111555

x = 014415455 + 054111555 = 0685269AA

; x = (x & 0x33333333) + ((x >> 2) & 0x33333333);

And now divide the number into 8 groups of 4 bits and repeat the addition. Each group will already calculate the sum of 4 ones max (i.e. there will be no transfers again)

x = 0685269AA & 033333333 = 000001101000010100100110100110101010 & 000000110011001100110011001100110011 = 0000`0010`0000`0001`0010`0010`0001`0010`0010 = 020122122
x = 0685269AA >> 1 & 033333333 = 000000011010000101001001101001101010 & 000000110011001100110011001100110011 = 0000`0001`0010`0001`0000`0001`0010`0010`0010 = 012101222

x = 020122122 + 012101222 = 032223344

; x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);

Now 4 groups of 8 bits

x = 032223344 & 00F0F0F0F = 000000110010001000100011001101000100 & 000000001111000011110000111100001111 = 00000010`00000010`00000011`00000100 = 02020304
x = 032223344 >> 4 & 00F0F0F0F = 000000000011001000100010001100110100 & 000000001111000011110000111100001111 = 00000011`00000010`00000011`00000100 = 03020304

x = 02020304 + 03020304 = 05040608

; x = (x & 0x00FF00FF) + ((x >>Cool & 0x00FF00FF);

...

x = 05040608 & 00FF00FF = 00000101000001000000011000001000 & 00000000111111110000000011111111 = 00000000`00000100`00000000`00001000 = 00040008
x = 05040608 >> 8 & 00FF00FF = 00000000000001010000010000000110 & 00000000111111110000000011111111 = 00000000`00000101`00000000`00000110 = 00050006

x = 00040008 + 00050006 = 0009000E

; x = (x & 0x0000FFFF) + ((x >>16) & 0x0000FFFF);

...

x = 0009000E & 0000FFFF = 00000000000010010000000000001110 & 00000000000000001111111111111111 = 00000000000000000000000000001110 = 0000000E
x = 0009000E >> 16 & 0000FFFF = 00000000000000000000000000001001 & 00000000000000001111111111111111 = 00000000000000000000000000001001 = 00000009

x = 0000000E + 00000009 = 00000017 = 23.0    ```
15 Nov 2022, 12:34
Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8344
Location: Kraków, Poland
Tomasz Grysztar 15 Nov 2022, 12:48
(x and 0x55555555) leaves only the bits at even positions (because 5=0101b), and ((x shr 1) & 0x55555555) puts the bits from odd positions into even ones and mask them similarly. If you add these two numbers, you end up adding at most 1+1 within every bit pair, so now you have a sequence of 2-bit numbers, each containing a sum of two bits. If all bits of x were set, this is 2*0x55555555=0xAAAAAAAA, every 2-bit sub-value within that result has value 2=10b.

Now it's time to similarly fold pairs of 2-bit values into 4-bit sums. (x and 0x33333333) selects every other 2-bit value, ((x shr 2) and 0x33333333) puts the remaining ones at the same positions. After you add them, you get a sequence of 4-bit numbers, each being the sum of previously adjacent 2-bit numbers. If you had 0xAAAAAAAA, now you get 2*0x22222222=0x44444444.

Next we, sum each pair of 4-bit sums into 8-bit sums. Masking 0x44444444 and its shifted version we get 2*0x04040404=0x08080808.

Now we sum each pair of bytes to get 16-bit sums. From 0x08080808 we get to 0x00100010.

Finally we sum the two 16-bit numbers, (x and 0xFFFF) + ((x shr 16) and 0xFFFF), if we were at 0x00100010, we get 0x00000020.
15 Nov 2022, 12:48
Andy

Joined: 17 Oct 2011
Posts: 55
Andy 15 Nov 2022, 13:01
Thanks guys. This is a clever approach.
15 Nov 2022, 13:01
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20146
revolution 15 Nov 2022, 13:22
Here is a thread from 2007.
15 Nov 2022, 13:22
Furs

Joined: 04 Mar 2016
Posts: 2454
Furs 15 Nov 2022, 14:22
Another useful bitwise trick is multiplication. Think of it as duplicating the source number at each '1' bit in the multiplier (i.e. shifting it into that place), and adding them together. e.g. to duplicate an 8-bit value (in eax) into each byte of the resulting 32-bit value:
Code:
`imul eax, 0x01010101    `
It's obvious when you know how multiplication works, but in bitwise tricks it might seem weird at first glance, so easy to keep this trick in mind.
15 Nov 2022, 14:22
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First

 Jump to: Select a forum Official----------------AssemblyPeripheria General----------------MainTutorials and ExamplesDOSWindowsLinuxUnixMenuetOS Specific----------------MacroinstructionsOS ConstructionIDE DevelopmentProjects and IdeasNon-x86 architecturesHigh Level LanguagesProgramming Language DesignCompiler Internals Other----------------FeedbackHeapTest Area

Forum Rules:
 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot vote in polls in this forumYou cannot attach files in this forumYou can download files in this forum