flat assembler
Message board for the users of flat assembler.

How to rewrite this code for 32 bits?




Roman 29 Oct 2024, 20:29
I want a 32-bit version of this code that returns the hash in EAX:EDX.
Code:
pText             db "tasking", 0 ;out hash is 0x6431452F21E4F77A
FNV1Hash:
dqSizeText =7
        mov rax,14695981039346656037 ;rax = offset_basis - set to 14695981039346656037 for FNV-1
        mov rcx, dqSizeText
        mov r8, pText

        mov r9, 0100000001B3h     ;r9 = FNV_64_PRIME = 1099511628211
        xor rbx, rbx              ;rbx = 0
nextbyte:
        mul r9                    ;rax = rax * FNV_64_PRIME
        mov bl, [r8]              ;bl = byte from r8
        xor rax, rbx              ;rax = rax xor rbx (only al changes, upper bits of rbx are 0)
        inc r8                    ;inc buffer pos
        dec rcx                   ;rcx = rcx - 1 (counter)
        jnz nextbyte              ;if rcx != 0, jmp to nextbyte
        ret                       ;rax = fnv1 hash
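
For cross-checking the assembly, here is a minimal reference sketch of the same loop in Python. It follows the multiply-then-XOR order of the code above (FNV-1). The name `fnv1_64` and the constant names are mine, not something from the thread.

```python
FNV64_OFFSET_BASIS = 14695981039346656037  # 0xCBF29CE484222325, the value loaded into rax
FNV64_PRIME = 0x100000001B3                # the value loaded into r9
MASK64 = (1 << 64) - 1


def fnv1_64(data: bytes) -> int:
    """FNV-1, 64-bit: multiply by the prime first, then XOR in the byte."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h = (h * FNV64_PRIME) & MASK64  # mul r9: only the low 64 bits stay in rax
        h ^= byte                       # xor rax, rbx with rbx = zero-extended byte
    return h


if __name__ == "__main__":
    # Length 7, i.e. without the terminating zero, as dqSizeText above.
    print(hex(fnv1_64(b"tasking")))
```

Running the script prints the value the assembly should leave in rax for the same seven bytes, so the two can be compared directly.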
    


Last edited by Roman on 31 Oct 2024, 18:11; edited 3 times in total


revolution 29 Oct 2024, 20:32



macomics 29 Oct 2024, 22:52
Please check whether I got the 64-bit by 64-bit multiplication right. I typed this in the browser and did not test it.
Code:
Hash:
  label .result.low  dword at esp + 0
  label .result.high dword at esp + 4
  label .lpszText    dword at esp + 24
  label .cchText     dword at esp + 28
    push ebx
    push esi
    push edi
    push [.hash.high]
    push [.hash.low]
    mov  ebx, [.mult.low]
    mov  edi, [.mult.high]
    mov  esi, [.lpszText]
    mov  ecx, [.cchText]

  .loop:
    mov  eax, [.result.high]
    mul  edi
    xchg eax, [.result.high]
    mul  ebx
    add  edx, [.result.high]
    xchg eax, [.result.low]
    push eax
    mul  edi
    mov  edx, eax
    pop  eax
    add  [.result.low], edx
    mul  ebx
    add  [.result.low], eax
    adc  [.result.high], edx
    xor  al, byte [esi]
    inc  esi
    loop .loop
    pop  eax ; mov  eax, [.result.low]
    pop  edx ; mov  edx, [.result.high]
             ; add  esp, 8
    pop  edi
    pop  esi
    pop  ebx
    retn 8
label .hash.low  dword
label .hash.high dword at .hash.low + 4
.hash         dq 14695981039346656037
label .mult.low  dword
label .mult.high dword at .mult.low + 4
.mult         dq 0100000001B3h    



Roman 31 Oct 2024, 11:26
How about VPMULLQ?
Multiply the packed qword signed integers in xmm2 and xmm3/m128/m64bcst and store the low 64 bits of each product in xmm1 under writemask k1.



macomics 31 Oct 2024, 11:48
Quote:
How about VPMULLQ ?
Multiply the packed qword signed integers in xmm2 and xmm3/m128/m64bcst and store the low 64 bits of each product in xmm1 under writemask k1.
In this case, a 32-bit program needs to check whether the processor supports SSE4.1/AVX512DQ.

For a 64-bit program this check may be skipped, because most processors capable of running in 64-bit mode support SSE4.1/AVX512DQ.
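
As a rough sketch of what such a check looks at: the feature bits come from CPUID, with SSE4.1 in leaf 1 ECX bit 19, and AVX512F, AVX512DQ and AVX512VL in leaf 7 (subleaf 0) EBX bits 16, 17 and 31. The Python below only decodes register values that are passed in. The function names are made up for illustration, and it does not cover the OS-support check (OSXSAVE/XGETBV) that AVX-512 also needs.

```python
# Bit positions as documented for the CPUID instruction.
SSE41_BIT = 19      # CPUID.01H:ECX
AVX512F_BIT = 16    # CPUID.(EAX=07H,ECX=0):EBX
AVX512DQ_BIT = 17   # CPUID.(EAX=07H,ECX=0):EBX
AVX512VL_BIT = 31   # CPUID.(EAX=07H,ECX=0):EBX


def has_bit(value: int, bit: int) -> bool:
    """True if the given bit is set in a 32-bit register value."""
    return (value >> bit) & 1 == 1


def supports_sse41(leaf1_ecx: int) -> bool:
    return has_bit(leaf1_ecx, SSE41_BIT)


def supports_vpmullq_xmm(leaf7_ebx: int) -> bool:
    """The xmm form of VPMULLQ needs AVX512F, AVX512DQ and AVX512VL together."""
    needed = (AVX512F_BIT, AVX512DQ_BIT, AVX512VL_BIT)
    return all(has_bit(leaf7_ebx, bit) for bit in needed)
```

In assembly the same test is a `cpuid` followed by `bt` or `test` on the returned register, and the scalar routine stays as the fallback path.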


Last edited by macomics on 31 Oct 2024, 11:57; edited 2 times in total


revolution 31 Oct 2024, 11:51
macomics wrote:
For a 64-bit program this check may be skipped, because most processors capable of running in 64-bit mode support SSE4.1
My 64-bit CPU doesn't have SSE4.



macomics 31 Oct 2024, 11:56
revolution wrote:
My 64-bit CPU doesn't have SSE4.
All the more reason. If it requires duplicating or complicating the code, why bother? It only makes the program harder to test and its actual performance harder to measure.



Roman 31 Oct 2024, 12:21
Quote:

My 64-bit CPU doesn't have SSE4.

Buy an AMD or an Intel 285K (to study AVX 10.2) :)



Roman 31 Oct 2024, 12:57
How about this?
Code:
MUL64_MEMORY:
     mov eax, [val1high]
     mov esi, [val1low]
     mov ecx, [val2high]
     mov ebx, [val2low]

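     mul ebx        ; eax = low 32 bits of val1high * val2low
     xchg eax, ebx  ; ebx = that partial product, eax = val2low
     mul esi        ; edx:eax = val2low * val1low
     xchg esi, eax  ; esi = low 32 bits of the result, eax = val1low
     add ebx, edx   ; add the upper half of the low product
     mul ecx        ; eax = low 32 bits of val1low * val2high
     add ebx, eax   ; final upper 32 bits
; answer here in EBX:ESI (low 64 bits of the product)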
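
To see why three 32-bit multiplies are enough: only the low 64 bits of the product are needed, so the high*high term and the upper halves of the two cross terms fall outside the result. A small Python model of the same steps can be compared against a plain 64-bit multiply. The name `mul64_low` is hypothetical; the variables mirror the memory operands and register roles above.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def mul64_low(val1: int, val2: int) -> int:
    """Low 64 bits of val1 * val2 using only 32x32 multiplies."""
    val1low, val1high = val1 & MASK32, (val1 >> 32) & MASK32
    val2low, val2high = val2 & MASK32, (val2 >> 32) & MASK32

    cross1 = (val1high * val2low) & MASK32          # mul ebx: only eax is kept
    low_product = val1low * val2low                 # mul esi: full edx:eax
    low = low_product & MASK32                      # ends up in esi
    high = (cross1 + (low_product >> 32)) & MASK32  # add ebx, edx
    cross2 = (val1low * val2high) & MASK32          # mul ecx: only eax is kept
    high = (high + cross2) & MASK32                 # add ebx, eax
    return (high << 32) | low                       # EBX:ESI


if __name__ == "__main__":
    a, b = 14695981039346656037, 0x100000001B3
    assert mul64_low(a, b) == (a * b) & MASK64
    print(hex(mul64_low(a, b)))
```

Carries out of the upper half are simply dropped, which is exactly what the 64-bit `mul` does when only rax is used.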



macomics 31 Oct 2024, 16:34
If the goal is to reduce the number of memory accesses, then like this. Your version forgets to store the result back to memory so that the next iteration can use it. With that added it is still 6 (7) memory accesses per iteration, the same as in my version.
Code:
Hash:
  label .lpszText    dword at esp + 20
  label .cchText     dword at esp + 24
    push ebp
    push ebx
    push esi
    push edi
    mov  ebx, [.hash.low]
    mov  ebp, [.hash.high]
    mov  esi, [.lpszText]
    mov  ecx, [.cchText]

  .loop:
    mov  eax, ebp
    mul  [.mult.high]
    xchg eax, ebp
    mul  [.mult.low]
    add  ebp, edx
    xchg eax, ebx
    mov  edi, eax
    mul  [.mult.high]
    add  ebx, eax
    mov  eax, ebp
    mul  [.mult.low]
    add  ebx, eax
    adc  ebp, edx
    lods byte [esi]
    xor  bl, al
    loop .loop

    mov  eax, ebx
    mov  edx, ebp
    pop  edi
    pop  esi
    pop  ebx
    pop  ebp
    retn 8

align 16
label .hash.low  dword
label .hash.high dword at .hash.low + 4
.hash         dq 14695981039346656037
label .mult.low  dword
label .mult.high dword at .mult.low + 4
.mult         dq 0100000001B3h    
The new version needs only 4 (5) memory accesses per iteration.



Roman 31 Oct 2024, 18:08
macomics,
this new 32-bit version gives me EDX=0x31C61D51 and EAX=0x9A3F4B1C for db "tasking".

The 64-bit program from the first post gives me:
db "tasking" ;out hash is 0x6431452F21E4F77A



macomics 31 Oct 2024, 20:57
Code:
hash64: ; rcx = cchLength, rdx = lpszText
    push rbx
    mov  rax, [.base_value]
    mov  r8,  rdx
    mov  r9,  [.mult_value]
    xor  rbx, rbx
  .loop:
    mul  r9
    mov  bl, [r8]
    xor  rax, rbx
    inc  r8
    loop .loop
    pop  rbx
    retn

align 8
  .base_value dq 14695981039346656037
  .mult_value dq 0100000001B3h    

Code:
hash32: ; ecx = cchLength, edx = lpszText
    push ebp
    push ebx
    push esi
    push edi
    mov  ebx, [.base_value.low]
    mov  ebp, [.base_value.high]
    mov  esi, edx

  .loop:
    mov  eax, [.mult_value.high]
    mov  edi, [.mult_value.low]
    mul  ebx
    xchg eax, ebx
    mul  edi
    xchg edi, eax
    add  ebx, edx
    mul  ebp
    lea  ebp, [ebx+eax]
    mov  ebx, edi
    lods byte [esi]
    xor  bl, al
    loop .loop
    mov  eax, ebx
    mov  edx, ebp
    pop  edi
    pop  esi
    pop  ebx
    pop  ebp
    retn

align 8
  label .base_value.low  dword
  label .base_value.high dword at $ + 4
  .base_value dq 14695981039346656037
  label .mult_value.low  dword
  label .mult_value.high dword at $ + 4
  .mult_value dq 0100000001B3h    
Code:
; string db 'tasking'
; hash32 = 0xC408FE07C248E99E
; hash64 = 0xC408FE07C248E99E    
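
A quick way to convince yourself that the two routines agree is to model both loops in Python and compare them on a few inputs. This is only a sketch: `hash64_model` and `hash32_model` are names I made up, the variables `ebx`/`ebp` stand for the registers holding the low and high halves in hash32, and the `ecx = 0` case (where `loop` would wrap around) is not modelled.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1
BASE_VALUE = 14695981039346656037
MULT_VALUE = 0x100000001B3


def hash64_model(data: bytes) -> int:
    """Model of hash64: one 64-bit multiply, then XOR of the byte."""
    h = BASE_VALUE
    for byte in data:
        h = ((h * MULT_VALUE) & MASK64) ^ byte
    return h


def hash32_model(data: bytes) -> int:
    """Model of hash32: the same step built from 32-bit partial products."""
    ebx = BASE_VALUE & MASK32            # low half of the running hash
    ebp = BASE_VALUE >> 32               # high half of the running hash
    mult_low = MULT_VALUE & MASK32
    mult_high = MULT_VALUE >> 32
    for byte in data:
        cross = (mult_high * ebx) & MASK32           # mul ebx, only eax kept
        product = ebx * mult_low                     # mul edi, full edx:eax
        cross = (cross + (product >> 32)) & MASK32   # add ebx, edx
        ebp = (cross + ((mult_low * ebp) & MASK32)) & MASK32  # mul ebp; lea ebp, [ebx+eax]
        ebx = (product & MASK32) ^ byte              # mov ebx, edi; xor bl, al
    return (ebp << 32) | ebx             # returned as edx:eax


if __name__ == "__main__":
    for sample in (b"a", b"tasking", bytes(range(256))):
        assert hash32_model(sample) == hash64_model(sample)
    print(hex(hash32_model(b"tasking")))
```

The printed value can be compared with what both assembly routines return for the same seven bytes.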



Roman 31 Oct 2024, 21:22
Thanks.
Very good.