flat assembler
Message board for the users of flat assembler.

How to compute x^6 with SSE or AVX?

Roman 16 Oct 2025, 22:10
I'm looking for a simple way to compute x*x*x*x*x*x.
I don't like doing mulss xmm1,xmm1 five times.
I'm looking for a couple of SSE or AVX instructions for this task.

I found this:
Code:
pow(base,power) = exp( power * log(base) )    

Compared with that, doing mulss xmm1,xmm1 five times is not a bad solution.

AVX has vhaddps for horizontal addition.
Is there perhaps a horizontal multiplication too?
ymm0 could hold 8 float numbers to multiply.

revolution 16 Oct 2025, 23:42
A power of 6 can be done with three multiplies.

a = x*x ; x^2
b = a*x ; x^3
c = b*b ; x^6
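
The three-multiply chain above can be sketched in C as follows. This is only an illustration: the function name `pow6` is made up here and does not come from the thread. In scalar SSE code each of the three multiplications would typically become one mulss.

```c
/* Illustrative sketch: x^6 with three multiplications.
   pow6 is a hypothetical name, not taken from the thread. */
float pow6(float x)
{
    float a = x * x;   /* x^2 */
    float b = a * x;   /* x^3 */
    float c = b * b;   /* x^6 */
    return c;
}
```

For x = 3.0 the intermediate values are 9, 27 and 729.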

Roman 17 Oct 2025, 00:08
Sorry, my mistake.
Code:
;x^8
align 32
x dd 1f,2f,3f,4f,5f,6f,7f,8f

vmovaps ymm0, yword [x]
vmulps ymm0, ymm0, ymm0    ; x^2
vmulps ymm0, ymm0, ymm0    ; x^4
vmulps ymm0, ymm0, ymm0    ; x^8

; afterwards ymm0 holds 1f^8, 2f^8, 3f^8, 4f^8, 5f^8, 6f^8, 7f^8, 8f^8

Code:
x dd 3.0

movss xmm1,[x]
movss xmm0,xmm1
mulss xmm0,xmm0 ;x^2
mulss xmm0,xmm0 ;x^4
mulss xmm1,xmm0 ;x^5 xmm1=3^5=243
    

revolution 17 Oct 2025, 06:30
In general x-to-power-of-n can be computed in no more than [ bsr(n) + popcnt(n) - 1 ] multiplies.
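
As a rough illustration of this bound, the C sketch below evaluates bsr(n) + popcnt(n) - 1 for a given exponent. The helper names (`bsr_u32`, `popcnt_u32`, `max_multiplies`) are assumptions made for this example; the two helpers are plain portable loops that mimic what the x86 bsr and popcnt instructions return, not the instructions themselves.

```c
#include <assert.h>

/* Index of the highest set bit, mimicking x86 bsr. n must be nonzero. */
unsigned bsr_u32(unsigned n)
{
    unsigned pos = 0;
    while (n >>= 1)
        pos++;
    return pos;
}

/* Number of set bits, mimicking x86 popcnt. */
unsigned popcnt_u32(unsigned n)
{
    unsigned count = 0;
    while (n) {
        count += n & 1u;
        n >>= 1;
    }
    return count;
}

/* Upper bound on multiplies needed for x^n, for n >= 1. */
unsigned max_multiplies(unsigned n)
{
    assert(n != 0);
    return bsr_u32(n) + popcnt_u32(n) - 1;
}
```

For n = 6 this gives 2 + 2 - 1 = 3, and for n = 8 it gives 3 + 1 - 1 = 3, which agrees with the multiply counts already shown in the thread.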

Roman 17 Oct 2025, 14:37
Quote:

[ bsr(n) + popcnt(n) - 1 ]

Very interesting. I'd like to see an example of how to do this.

Tomasz Grysztar 17 Oct 2025, 17:14
Roman wrote:
Quote:

[ bsr(n) + popcnt(n) - 1 ]

Very interesting. I'd like to see an example of how to do this.
Take the binary representation of the number, for example 13 = 1101b:
$$13 = 1\cdot 2^3 + 1\cdot 2^2 + 0\cdot 2^1 + 1\cdot 2^0$$

$$x^{13} = x^{2^3+2^2+2^0} = x^{2^3}\cdot x^{2^2}\cdot x$$

So if we have the square powers of x ready, we only need to perform 2 more multiplications - this is the "popcnt(n) - 1" portion of revolution's formula.

Now, taking into consideration that $x^{2^i}=(x^{2^{i-1}})^2$, it follows that we can compute $x^{2^i}$ by squaring i times, and we get all the intermediate squares on the way. This is the "bsr(n)" part of the formula.

For n = 13 we get this sequence of multiplications:
a = x*x
b = a*a
c = b*b
x^13 = c*b*x
For the final calculation, take the product of all the consecutive squares and remove the ones that correspond to zeros in the binary representation of n. Here, because 13 = 1101b, this means removing "a" from "c*b*a*x" (as "a" corresponds to the only zero in 1101b).
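
The procedure described above can be sketched as a loop over the bits of n. The C version below is a minimal illustration under a few assumptions: the name `pow_uint` and the optional `muls` counter are invented for this example, n must be at least 1, and the factors are multiplied from the lowest bit upwards (x*b*c rather than c*b*x), which gives the same product.

```c
#include <assert.h>

/* Hypothetical helper: x^n for n >= 1 by repeated squaring.
   If muls is non-null, it receives the number of multiplications used. */
float pow_uint(float x, unsigned n, unsigned *muls)
{
    assert(n != 0);

    float square = x;      /* x^(2^i) for the bit currently examined */
    float result = 0.0f;   /* product of the squares selected so far */
    int have_result = 0;   /* avoids a wasted multiply by 1 */
    unsigned count = 0;

    for (;;) {
        if (n & 1u) {                  /* this bit is 1: keep the square */
            if (have_result) {
                result *= square;
                count++;
            } else {
                result = square;
                have_result = 1;
            }
        }
        n >>= 1;
        if (n == 0)
            break;                     /* no higher bits: stop squaring */
        square *= square;              /* next square, x^(2^(i+1)) */
        count++;
    }

    if (muls)
        *muls = count;
    return result;
}
```

For n = 13 the loop performs three squarings and two combining multiplies, five in total, which matches bsr(13) + popcnt(13) - 1 = 3 + 3 - 1.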

Roman 18 Oct 2025, 08:13
Quote:

[ bsr(n) + popcnt(n) - 1 ]

I thought this formula would help get it down to two multiplications.
But if not, it only confuses and complicates things, and does not help.