flat assembler
Message board for the users of flat assembler.

Index > Main > Rounding down to power of 2?

Author
Thread Post new topic Reply to topic
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 03 Jul 2006, 18:14
I just can't get it to work. I don't understand why finding the right-most "set" bit is MUCH simpler than finding the left-most "set" bit (e.g rounding down to power of 2).

Finding the right-most "set" bit only takes:
Code:
C:    -x & x
FASM: (not x + 1) and x    

And for an example in assembly:
Code:
; eax is the "x"
mov ecx, eax
neg eax
and eax, ecx    
Ok, that was for the right-most "set" bit.

But I want to find the left-most "set" bit (aka rounding down to power of 2).
I don't understand why it's so much harder, the only method I know:

Code:
In C:
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
x = (x >> 1) + 1;    
Code:
In FASM:
x = x or (x shr 1)
x = x or (x shr 2)
x = x or (x shr 4)
x = x or (x shr 8)
x = x or (x shr 16)
x = (x shr 1) + 1    
What I am asking is a "simpler" method, which isn't really dependent on the size of the variable/register. And I also think this method is not so clean, though if it's the only one I know...

if anyone got any ideas for other tricks, please post them so I might find some insight in this mystery. thanks Smile

_________________
Previously known as The_Grey_Beast


Last edited by Borsuc on 03 Jul 2006, 18:55; edited 1 time in total
Post 03 Jul 2006, 18:14
View user's profile Send private message Reply with quote
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 03 Jul 2006, 18:41
BSF/BSR instructions?

EDIT: Too fast and to loose thinking, I remove the proposed (and not working) algorithm.


Last edited by Tomasz Grysztar on 03 Jul 2006, 23:13; edited 5 times in total
Post 03 Jul 2006, 18:41
View user's profile Send private message Visit poster's website Reply with quote
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 03 Jul 2006, 18:42
Tomasz Grysztar wrote:
BSF/BSR instructions?
I know, but I don't want the index, I just want the power of 2.. and doing a BSR and then shifting to the left is slower than what I want.. I know it may be impossible, but I wanted to know if someone knows how.

Tomasz Grysztar wrote:
Anyway, the algoritm "(not x + 1) and x" can be easily translated into the reversed bits case. The NOT and AND operator give exactly the same result no matter whether you interprete bits as reversed or not. It's only the "+1" operation that is dependent on the order of bits. However the equivalent of "+1" on 32-bit number for the reversed bits order is "-80000000h" (you can check it by doing simple bit arithmetics). Thus the dual algorithm for 32-bit numbers is:

(not x - 80000000h) and x

However, unfortunately, this is still dependent on the size of register.
It's not "dependent" in the number of instructions, it's perfect, but... it doesn't work Confused

Unfortunately, I was thinking of it too Wink. But when I tested, it didn't work out. Maybe because, when I did a lot of analysis, I discovered that the "borrow" works in the same direction as the "carry", though there has to be some trick out there.

Thanks for the reply Very Happy


Last edited by Borsuc on 03 Jul 2006, 18:53; edited 1 time in total
Post 03 Jul 2006, 18:42
View user's profile Send private message Reply with quote
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 03 Jul 2006, 18:56
Sorry, that was stupid, I mixed the "left" with "right" somewhere in the middle, I did test the algorithm but didn't notice it still finds the rightmost bit in case when it works (and it fails for powers of two anyway)...
And you're right - it's the direction of borrows that makes the duality fail.
I'm afraid there is no such simple method then, as there are no instruction that would modify lower bits depending on the state of higher ones (as this would be discontinuous 2-adic operation, so for sure no arithmetic one).
Post 03 Jul 2006, 18:56
View user's profile Send private message Visit poster's website Reply with quote
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 03 Jul 2006, 20:16
Tomasz Grysztar wrote:
as this would be discontinuous 2-adic operation, so for sure no arithmetic one.
Can you give some links to explain those "operations", 'cause I'm interested and usually like to play with math and numerical things Smile

and besides, is there a way to emulate or a trick for that discontinuous 2-adic operation?

thanks
Post 03 Jul 2006, 20:16
View user's profile Send private message Reply with quote
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 03 Jul 2006, 21:54
You may try this: http://www.pucuch.cl/docume/100304120442.pdf
I haven't yet read it but it seems to be simpler than other books I've been using (I will check it in detail later to judge how good is it).

As for the "trick" - as I said, since the low bits of the result depend on the higher bits of the input, standard logical operations (where value of bit on n-th position of result depends only on the value on the n-th positions of input) and arithmetic ADD/SUB/MUL (where value of n-th bith depends only on the values of the lower bits) cannot do that in any combination. The DIV is an operation where some lower bit may depend on a higher one, but only by limited amount (from bit 2n to n at most), and it appears to be of little use here.

So the possible solutions may need to involve more some bit reordering (shifts/rotations) to make the higher bits work on lower. Since shift is able to break the order of bits only locally (if we had, for example, a bit-order-reversing instruction, it would solve our problem), there are needed many of them, perhaps one for each bit, just like in your code in first post above.

Some other such algorithm, with simple loop, may look like:
Code:
; input in EAX, must be <> 0
mov edx,1
@@:
ror edx,1
shl eax,1
jnc @b
; output in EDX    

However I guess it's better to just use bit scan instruction in such case.

Sorry if it appears to be an unreadable mess - it's just a recording of my free-style thinking. Now a bit more precisely...

In general, a 2-adic function (which operates on infinite sequences of bits) is uniformly continuous, if you can find some constant, let say M, such that N lowest bits of value of function depend only on MN lowest bits of argument . In other words, if two 2-adic numbers differ only at bits higher than MN - which in 2-adic metric means they are relatively near to each other - the values of function differ only at bits higher than N - so are also near to each other.

Addition, multiplication, logical operations and shifts - are all uniformly continuous as 2-adic functions. Any combination of uniformly continuous functions is also uniformly continuous.

As for the "round down to power of two" function, it is not uniformly continuous, not even continuous at all. 1000...(thousands of zeros)...001 and 1 differ only at very high bit, however the values of function differ at the very lowest bit.

Thus the conclusion is that you cannot generate this function by a simple (not branching) combination of arithmetical/logical operations.
Post 03 Jul 2006, 21:54
View user's profile Send private message Visit poster's website Reply with quote
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 04 Jul 2006, 12:58
Thanks for the link Smile

Damn, I should've known that there has to be some logical operations involved.. maybe it can be done with lookup table, or another (quite bloated) method (see below).

Tomasz Grysztar wrote:
Thus the conclusion is that you cannot generate this function by a simple (not branching) combination of arithmetical/logical operations.
It seems that (of course, nasty) tricks with both logical and arithmetical operations can achieve the result, though in an ugly fashion. Of course, there must be a finite number of bits. I know that a combination of arithmetical/logical doesn't work for infinite number of bits, but tricks can be done on finite bits Wink My example which depends on the size of the registers (that means, it has more or less instructions depending on it) is a trick that works (of course, only on finite number of bits Wink )

Another ugly method I know is to reverse the bits of a byte:
Code:
C:    x = ((x * 0x0802LU & 0x22110LU) | (x * 0x8020LU & 0x88440LU)) * 0x10101LU >> 16

FASM: x = ((x * 0x0802 and 0x22110) | (x * 0x8020 and 0x88440)) * 0x10101 shr 16    
Of course, you'll have to do this for each byte (32-bit register meaning 4 times). And then, simple do:
Code:
C:    x = -x & x

FASM: x = (not x + 1) and x  ; dunno if x = -x and x works in FASM    

Then again, you'll have to reverse them back. This method is very brute-force and slow as hell, but the point is that it works with arithmetical and logical operations. Of course, it needs a finite number of bits, as previously noted Wink


Thanks again for the replies Smile
Post 04 Jul 2006, 12:58
View user's profile Send private message Reply with quote
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8356
Location: Kraków, Poland
Tomasz Grysztar 04 Jul 2006, 13:27
The_Grey_Beast wrote:
Of course, it needs a finite number of bits, as previously noted Wink

Not only finite, but also fixed (the algorithm depends on how exactly many bits you're working on). My reasoning was to show there's not such general formula like "-x & x" for it.
Post 04 Jul 2006, 13:27
View user's profile Send private message Visit poster's website Reply with quote
Madis731



Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 04 Jul 2006, 20:03
Maybe you could do a
BSWAP
and then find the lowest bit set upto 8 times. Then you have to look in which byte it occurred and decide the power of two from that.
Example:
Code:
mov   rax,002048C4ED72800Fh
bswap rax ; 0F8072EDC4482000
    

What happened here is the highest bits were moved closer about an average of 8 times.
Using the algo (-x)&x one can verify that 2 is the digit that holds our highest bit (which is also the lowet bit in the reversed one).
After that its obligatory to check if bits 4 and 8 of that byte are set.

Now we can say that the power of two is 2^64 / 2^8 / 2^3
2^64 is the maximum possible which also equals zero Very Happy
2^8 is one empty byte and 2^3 is the third bit set from that second byte.

Erm, might be too tedious and lengthy in clocks to assemble, but might even work.

==============================

There's even an option to make use of RCR/RCL, but I haven't figured it out yet so I won't post the idea just yet.
Post 04 Jul 2006, 20:03
View user's profile Send private message Visit poster's website Yahoo Messenger MSN Messenger Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  


< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.