flat assembler
Message board for the users of flat assembler.
Index
> Main > Rounding down to power of 2? 
Author 

Tomasz Grysztar
BSF/BSR instructions?
EDIT: Too fast and to loose thinking, I remove the proposed (and not working) algorithm. Last edited by Tomasz Grysztar on 03 Jul 2006, 23:13; edited 5 times in total 

03 Jul 2006, 18:41 

Borsuc
Tomasz Grysztar wrote: BSF/BSR instructions? Tomasz Grysztar wrote: Anyway, the algoritm "(not x + 1) and x" can be easily translated into the reversed bits case. The NOT and AND operator give exactly the same result no matter whether you interprete bits as reversed or not. It's only the "+1" operation that is dependent on the order of bits. However the equivalent of "+1" on 32bit number for the reversed bits order is "80000000h" (you can check it by doing simple bit arithmetics). Thus the dual algorithm for 32bit numbers is: Unfortunately, I was thinking of it too . But when I tested, it didn't work out. Maybe because, when I did a lot of analysis, I discovered that the "borrow" works in the same direction as the "carry", though there has to be some trick out there. Thanks for the reply Last edited by Borsuc on 03 Jul 2006, 18:53; edited 1 time in total 

03 Jul 2006, 18:42 

Tomasz Grysztar
Sorry, that was stupid, I mixed the "left" with "right" somewhere in the middle, I did test the algorithm but didn't notice it still finds the rightmost bit in case when it works (and it fails for powers of two anyway)...
And you're right  it's the direction of borrows that makes the duality fail. I'm afraid there is no such simple method then, as there are no instruction that would modify lower bits depending on the state of higher ones (as this would be discontinuous 2adic operation, so for sure no arithmetic one). 

03 Jul 2006, 18:56 

Borsuc
Tomasz Grysztar wrote: as this would be discontinuous 2adic operation, so for sure no arithmetic one. and besides, is there a way to emulate or a trick for that discontinuous 2adic operation? thanks 

03 Jul 2006, 20:16 

Tomasz Grysztar
You may try this: http://www.pucuch.cl/docume/100304120442.pdf
I haven't yet read it but it seems to be simpler than other books I've been using (I will check it in detail later to judge how good is it). As for the "trick"  as I said, since the low bits of the result depend on the higher bits of the input, standard logical operations (where value of bit on nth position of result depends only on the value on the nth positions of input) and arithmetic ADD/SUB/MUL (where value of nth bith depends only on the values of the lower bits) cannot do that in any combination. The DIV is an operation where some lower bit may depend on a higher one, but only by limited amount (from bit 2n to n at most), and it appears to be of little use here. So the possible solutions may need to involve more some bit reordering (shifts/rotations) to make the higher bits work on lower. Since shift is able to break the order of bits only locally (if we had, for example, a bitorderreversing instruction, it would solve our problem), there are needed many of them, perhaps one for each bit, just like in your code in first post above. Some other such algorithm, with simple loop, may look like: Code: ; input in EAX, must be <> 0 mov edx,1 @@: ror edx,1 shl eax,1 jnc @b ; output in EDX However I guess it's better to just use bit scan instruction in such case. Sorry if it appears to be an unreadable mess  it's just a recording of my freestyle thinking. Now a bit more precisely... In general, a 2adic function (which operates on infinite sequences of bits) is uniformly continuous, if you can find some constant, let say M, such that N lowest bits of value of function depend only on MN lowest bits of argument . In other words, if two 2adic numbers differ only at bits higher than MN  which in 2adic metric means they are relatively near to each other  the values of function differ only at bits higher than N  so are also near to each other. Addition, multiplication, logical operations and shifts  are all uniformly continuous as 2adic functions. Any combination of uniformly continuous functions is also uniformly continuous. As for the "round down to power of two" function, it is not uniformly continuous, not even continuous at all. 1000...(thousands of zeros)...001 and 1 differ only at very high bit, however the values of function differ at the very lowest bit. Thus the conclusion is that you cannot generate this function by a simple (not branching) combination of arithmetical/logical operations. 

03 Jul 2006, 21:54 

Borsuc
Thanks for the link
Damn, I should've known that there has to be some logical operations involved.. maybe it can be done with lookup table, or another (quite bloated) method (see below). Tomasz Grysztar wrote: Thus the conclusion is that you cannot generate this function by a simple (not branching) combination of arithmetical/logical operations. Another ugly method I know is to reverse the bits of a byte: Code: C: x = ((x * 0x0802LU & 0x22110LU)  (x * 0x8020LU & 0x88440LU)) * 0x10101LU >> 16 FASM: x = ((x * 0x0802 and 0x22110)  (x * 0x8020 and 0x88440)) * 0x10101 shr 16 Code: C: x = x & x FASM: x = (not x + 1) and x ; dunno if x = x and x works in FASM Then again, you'll have to reverse them back. This method is very bruteforce and slow as hell, but the point is that it works with arithmetical and logical operations. Of course, it needs a finite number of bits, as previously noted Thanks again for the replies 

04 Jul 2006, 12:58 

Tomasz Grysztar
The_Grey_Beast wrote: Of course, it needs a finite number of bits, as previously noted Not only finite, but also fixed (the algorithm depends on how exactly many bits you're working on). My reasoning was to show there's not such general formula like "x & x" for it. 

04 Jul 2006, 13:27 

Madis731
Maybe you could do a
BSWAP and then find the lowest bit set upto 8 times. Then you have to look in which byte it occurred and decide the power of two from that. Example: Code: mov rax,002048C4ED72800Fh bswap rax ; 0F8072EDC4482000 What happened here is the highest bits were moved closer about an average of 8 times. Using the algo (x)&x one can verify that 2 is the digit that holds our highest bit (which is also the lowet bit in the reversed one). After that its obligatory to check if bits 4 and 8 of that byte are set. Now we can say that the power of two is 2^64 / 2^8 / 2^3 2^64 is the maximum possible which also equals zero 2^8 is one empty byte and 2^3 is the third bit set from that second byte. Erm, might be too tedious and lengthy in clocks to assemble, but might even work. ============================== There's even an option to make use of RCR/RCL, but I haven't figured it out yet so I won't post the idea just yet. 

04 Jul 2006, 20:03 

< Last Thread  Next Thread > 
Forum Rules:

Copyright © 19992020, Tomasz Grysztar.
Powered by rwasa.