flat assembler
Message board for the users of flat assembler.

how to properly rotate buffer in a loop

zhak



I have a loop that writes to a buffer. What would be the fastest way to rotate the buffer (wrap the write pointer back to the start)? Buffer rotation is an exceptional situation and will occur very rarely.

I have thought about three possible solutions:

Given:
rdx - pointer to buffer
rcx - buffer size

First: jump out of the loop to rotate and jump back. But I guess this is the worst one.
Code:
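mov rdi, rdx              ; rdi = write pointer
lea rbx, [rdx + rcx]      ; rbx = one past the end of the buffer

continue:
  cmp rdi, rbx            ; reached the end?
  jae rotate              ; rarely taken

do_stuff:
  . . .
  mov [rdi], ax
  add rdi, 2
  jmp continue

rotate:
  mov rdi, rdx            ; wrap back to the start
  jmp do_stuff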
    


Second, use cmovcc.
Code:
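mov rdi, rdx              ; rdi = write pointer
lea rbx, [rdx + rcx]      ; rbx = one past the end of the buffer

continue:
  push rdx                ; put the start pointer in memory (push/pop leave the flags intact)
  cmp rdi, rbx
  cmovae rdi, [rsp]       ; wrap to the start if rdi >= end
  pop rdx

do_stuff:
  . . .
  mov [rdi], ax
  add rdi, 2
  jmp continue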
    


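Third, use cmovcc with the start pointer stored in memory beforehand.
Code:
mov rdi, rdx              ; rdi = write pointer
mov [rbp - 8], rdx        ; keep a copy of the start pointer in a local
lea rbx, [rdx + rcx]      ; rbx = one past the end of the buffer

continue:
  cmp rdi, rbx
  cmovae rdi, [rbp - 8]   ; wrap to the start if rdi >= end

do_stuff:
  . . .
  mov [rdi], ax
  add rdi, 2
  jmp continue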
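
Note on the two cmovcc variants: cmovcc takes a register or a memory operand as its source, so the copy of the start pointer in memory is only needed when no register is free. A minimal sketch, assuming rdx still holds the buffer start inside the loop. The Linux ELF64 wrapper, BUF_SIZE and the word count of 100 are assumptions added only to make the fragment assemble and run.

```asm
; fasm sketch (Linux x86-64 wrapper is an assumption, not from the thread)
format ELF64 executable 3
entry start

BUF_SIZE = 16                           ; hypothetical size in bytes, multiple of 2

segment readable executable
start:
        lea     rdx, [buffer]           ; rdx = pointer to buffer
        mov     ecx, BUF_SIZE           ; rcx = buffer size
        mov     rdi, rdx                ; rdi = write pointer
        lea     rbx, [rdx + rcx]        ; rbx = one past the end
        mov     esi, 100                ; hypothetical number of words to write
        xor     eax, eax                ; placeholder data
next:
        cmp     rdi, rbx
        cmovae  rdi, rdx                ; wrap to the start, register source, no branch
        mov     [rdi], ax
        add     rdi, 2
        inc     eax
        dec     esi
        jnz     next

        xor     edi, edi                ; exit(0)
        mov     eax, 60
        syscall

segment readable writeable
buffer  rb BUF_SIZE
```

Whether this beats the jae version still has to be measured. The cmov puts the pointer update on the dependency chain of every iteration, while a well-predicted branch costs almost nothing.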
    


I remember from optimization guides that it's better to avoid conditional branches in loops. But I'm not sure whether cmovcc is better than a conditional jump when the branch predictor will predict "not taken" most of the time. The number of iterations may vary, but most of the time I expect a couple of hundred. Thanks in advance!
Post 16 Jun 2016, 21:52
revolution
zhak wrote:
I remember from optimization guides that it's better to avoid conditional branches in loops. But I'm not sure whether cmovcc is better than a conditional jump when the branch predictor will predict "not taken" most of the time. The number of iterations may vary, but most of the time I expect a couple of hundred. Thanks in advance!
You shouldn't be guessing this. The only way to know is to test it. Each system will give different results anyway so in most cases there is no single universal solution. And any advice or suggestion can only be taken with appropriate testing to see what happens in your use case.
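
One rough way to run such a test is to read the time-stamp counter around the loop and compare the variants on the target machine. The sketch below is only an illustration: the Linux ELF64 wrapper, BUF_SIZE, ITERATIONS and the elapsed variable are assumptions, and r8 replaces rdx as the start pointer because rdtsc overwrites rdx.

```asm
; fasm sketch of a timing harness (all names and sizes are hypothetical)
format ELF64 executable 3
entry start

BUF_SIZE   = 64
ITERATIONS = 1000000

segment readable executable
start:
        lea     r8, [buffer]            ; r8 = buffer start
        lea     rbx, [r8 + BUF_SIZE]    ; rbx = one past the end
        mov     rdi, r8                 ; rdi = write pointer
        mov     esi, ITERATIONS

        lfence                          ; keep earlier instructions out of the timed region
        rdtsc                           ; edx:eax = time-stamp counter
        shl     rdx, 32
        or      rax, rdx
        mov     r9, rax                 ; r9 = start timestamp
next:
        cmp     rdi, rbx
        cmovae  rdi, r8                 ; variant under test, swap in the jae version to compare
        mov     [rdi], si
        add     rdi, 2
        dec     esi
        jnz     next

        lfence
        rdtsc
        shl     rdx, 32
        or      rax, rdx
        sub     rax, r9                 ; rax = elapsed time-stamp ticks
        mov     [elapsed], rax          ; inspect in a debugger or print as needed

        xor     edi, edi                ; exit(0)
        mov     eax, 60
        syscall

segment readable writeable
elapsed dq 0
buffer  rb BUF_SIZE
```

A single run is noisy. Repeating the measurement and taking the minimum, with realistic buffer sizes and iteration counts, gives a more useful comparison.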
Post 17 Jun 2016, 01:27
bitRAKE



A very lazy way is to use a buffer twice the needed size. It is only really better when rotations are common or for block operations. The buffer always appears contiguous, but there is a large penalty (a full copy back to the start) for rotation at the end.
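
One reading of this suggestion, as a hedged sketch: allocate 2*WINDOW bytes, write linearly, and when the write pointer hits the end, copy the newest WINDOW bytes back to the start. Once WINDOW bytes have been written, the last WINDOW bytes always sit in one contiguous run just below the write pointer. The Linux ELF64 wrapper, WINDOW and the word count are assumptions for illustration.

```asm
; fasm sketch of the double-size buffer idea (names and sizes are hypothetical)
format ELF64 executable 3
entry start

WINDOW = 16                             ; bytes that must stay contiguous, multiple of 2

segment readable executable
start:
        lea     rdx, [buffer]           ; rdx = buffer start
        lea     rbx, [rdx + 2*WINDOW]   ; rbx = one past the end of the double-size buffer
        mov     rdi, rdx                ; rdi = write pointer
        mov     r8d, 100                ; hypothetical number of words to write
        xor     eax, eax                ; placeholder data
        cld                             ; string instructions count upwards
next:
        cmp     rdi, rbx
        jb      store
        ; rare path: move the newest WINDOW bytes back to the start
        lea     rsi, [rbx - WINDOW]     ; source = second half of the buffer
        mov     rdi, rdx                ; destination = first half
        mov     ecx, WINDOW
        rep movsb                       ; leaves rdi = rdx + WINDOW, the next write position
store:
        mov     [rdi], ax
        add     rdi, 2
        inc     eax
        dec     r8d
        jnz     next

        xor     edi, edi                ; exit(0)
        mov     eax, 60
        syscall

segment readable writeable
buffer  rb 2*WINDOW
```

The copy is the large penalty mentioned above. It pays off only when readers benefit from seeing the data as one block.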
Post 17 Jun 2016, 15:05
zhak



Yes, but I was implementing a UEFI-style function that returns a BUFFER_TOO_SMALL error and the required buffer size on overflow. I decided to keep cmov for now. I will do some testing later; it is not a big change to make.
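
The function itself is not shown in the thread, so the following is only a guess at its shape: the loop wraps with cmov so that stores stay inside the buffer, counts the bytes produced, and reports the required size at the end. The name emit_words, its register interface and the Linux ELF64 test wrapper are all hypothetical. EFI_BUFFER_TOO_SMALL is the 64-bit UEFI status value (error bit plus 5). The sketch assumes the buffer holds at least 2 bytes.

```asm
; fasm sketch, hypothetical interface, not the author's actual function
format ELF64 executable 3
entry start

EFI_SUCCESS          = 0
EFI_BUFFER_TOO_SMALL = 0x8000000000000005

segment readable executable

; emit_words (hypothetical)
;   rdx = buffer, rcx = pointer to buffer size in bytes (in/out), r8 = word count
;   returns rax = status; on EFI_BUFFER_TOO_SMALL, [rcx] holds the required size
;   assumes the buffer size is at least 2 bytes
emit_words:
        mov     r9, rdx                 ; r9 = write pointer
        mov     r10, [rcx]
        add     r10, rdx                ; r10 = one past the end
        xor     r11d, r11d              ; r11 = total bytes produced
        test    r8, r8
        jz      .done
.next:
        cmp     r9, r10
        cmovae  r9, rdx                 ; overflow: wrap so the store stays inside the buffer
        mov     [r9], r8w               ; placeholder data
        add     r9, 2
        add     r11, 2
        dec     r8
        jnz     .next
.done:
        xor     eax, eax                ; EFI_SUCCESS
        cmp     r11, [rcx]
        jbe     .exit
        mov     [rcx], r11              ; report the required size
        mov     rax, EFI_BUFFER_TOO_SMALL
.exit:
        ret

start:
        lea     rdx, [buffer]
        lea     rcx, [buf_size]
        mov     r8d, 100                ; hypothetical word count, more than the buffer holds
        call    emit_words              ; rax = status, [buf_size] updated on overflow
        xor     edi, edi                ; exit(0)
        mov     eax, 60
        syscall

segment readable writeable
buf_size dq 16
buffer   rb 16
```

After an overflow the buffer contents are not meaningful. The caller is expected to retry with a buffer of the reported size.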
Post 20 Jun 2016, 11:19
l4m2



I didn't catch what you mean.
Post 23 Aug 2016, 04:20