how to properly rotate buffer in a loop

zhak

Joined: 12 Apr 2005
Posts: 501
Location: Belarus

zhak 16 Jun 2016, 21:52

I have a loop that writes to a buffer. What would be the fastest way to rotate the buffer, that is, to wrap the write pointer back to the start once it reaches the end? Buffer rotation is an exceptional situation and will occur very rarely.

I have thought about three possible solutions:

Given:
rdx - pointer to buffer
rcx - buffer size

First: jump out of the loop to reset the pointer, then jump back in. But I guess this is the worst one.

Code:

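mov rdi, rdx            ; rdi = write pointer, starts at the buffer base
lea rbx, [rdx + rcx]    ; rbx = end of the buffer

continue:
  cmp rdi, rbx
  jae rotate            ; rare: the write pointer reached the end

do_stuff:
  . . .
  mov [rdi], ax
  add rdi, 2
  jmp continue

rotate:
  mov rdi, rdx          ; wrap back to the buffer base
  jmp do_stuff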

Second, use cmovcc.

Code:

mov rdi, rdx
lea rbx, [rdx + rcx]

continue:
  push rdx
  cmp rdi, rbx
  cmovae rdi, [rsp]     ; wrap to the buffer base when rdi >= rbx
  pop rdx

do_stuff:
  . . .
  mov [rdi], ax
  add rdi, 2
  jmp continue

Third, use cmovcc with the buffer pointer stored in memory once, before the loop.

Code:

mov rdi, rdx
mov [rbp - 8], rdx
lea rbx, [rdx + rcx]

continue:
  cmp rdi, rbx
  cmovae rdi, [rbp - 8] ; wrap to the saved buffer base when rdi >= rbx

do_stuff:
  . . .
  mov [rdi], ax
  add rdi, 2
  jmp continue

I remember from optimization guides that it's better to avoid conditional branches in loops. But I'm not sure whether cmovcc is better than a conditional jump when the branch predictor will predict "not taken" most of the time. The number of iterations may vary, but most of the time I expect a couple of hundred. Thanks in advance!

revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20895
Location: In your JS exploiting you and your system

revolution 17 Jun 2016, 01:27

zhak wrote:

I remember from optimization guides that it's better to avoid conditional branches in loops. But I'm not sure whether cmovcc is better than a conditional jump when the branch predictor will predict "not taken" most of the time. The number of iterations may vary, but most of the time I expect a couple of hundred. Thanks in advance!

You shouldn't be guessing this. The only way to know is to test it. Each system will give different results anyway, so in most cases there is no single universal solution. Any advice or suggestion should only be adopted after appropriate testing to see what happens in your use case.
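
As a rough illustration of what "test it" could look like, here is a minimal timing sketch in C rather than fasm. The names fill_branch, fill_select and time_fill are made up for this example, and a C compiler is free to emit a branch or a cmov for either variant, so the generated assembly should be inspected before drawing conclusions. For the hand-written loops, the same harness shape would apply with the assembly routines linked in instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Variant 1: wrap the write index with an ordinary, rarely taken branch. */
static void fill_branch(uint16_t *buf, size_t cap, size_t n)
{
    assert(cap > 0);
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        if (pos >= cap)              /* rare case: reached the end */
            pos = 0;
        buf[pos++] = (uint16_t)i;
    }
}

/* Variant 2: wrap with a select; the compiler may or may not use cmov. */
static void fill_select(uint16_t *buf, size_t cap, size_t n)
{
    assert(cap > 0);
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        pos = (pos >= cap) ? 0 : pos;
        buf[pos++] = (uint16_t)i;
    }
}

typedef void (*fill_fn)(uint16_t *, size_t, size_t);

/* Returns the CPU time in seconds spent on `reps` calls of `fn`. */
static double time_fill(fill_fn fn, uint16_t *buf, size_t cap, size_t n, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        fn(buf, cap, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_fill once per variant with the buffer size and iteration count from the real workload (a couple of hundred iterations here) gives numbers for one machine only, and clock() is coarse, so a large repetition count is needed before the difference means anything.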

bitRAKE

Joined: 21 Jul 2003
Posts: 4390
Location: vpcmpistri

bitRAKE 17 Jun 2016, 15:05

A very lazy way is to use a buffer twice the needed size. It is only really better when rotations are common, or for block operations. The buffer always appears contiguous, but there is a large penalty (a full copy back to the start) for rotation at the end.
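
One possible reading of the double-size idea, sketched in C: writes go into a buffer of 2 * WINDOW items, the most recent WINDOW items are always contiguous, and the rare rotation copies them back to the start. WINDOW, DoubleBuf and the db_* functions are hypothetical names chosen for this sketch, not anything from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WINDOW 8   /* assumed number of items that must stay contiguous */

typedef struct {
    uint16_t data[2 * WINDOW];
    size_t   end;              /* index one past the newest item */
} DoubleBuf;

static void db_init(DoubleBuf *b)
{
    b->end = 0;
}

static void db_push(DoubleBuf *b, uint16_t v)
{
    assert(b->end <= 2 * WINDOW);
    if (b->end == 2 * WINDOW) {
        /* Rare and expensive: copy the newest WINDOW items to the start. */
        memcpy(b->data, b->data + WINDOW, WINDOW * sizeof b->data[0]);
        b->end = WINDOW;
    }
    b->data[b->end++] = v;
}

/* Returns a pointer to the newest items (at most WINDOW), always contiguous. */
static const uint16_t *db_window(const DoubleBuf *b, size_t *count)
{
    size_t n = b->end < WINDOW ? b->end : WINDOW;
    *count = n;
    return b->data + (b->end - n);
}
```

The write path still needs one end-of-buffer check, so this does not remove the condition from the loop. What it buys is that readers never have to handle a wrapped region.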

zhak

Joined: 12 Apr 2005
Posts: 501
Location: Belarus

zhak 20 Jun 2016, 11:19

Yes, but I was implementing a function in UEFI style that returns a BUFFER_TOO_SMALL error and the required buffer size on overflow. I decided to keep cmov for now. I will do some testing later; it's not a big change to make.
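
A minimal C sketch of how such a function might behave, under the assumption that the loop keeps running after an overflow so it can count the required size, with the wrap keeping every write inside the buffer. Status, ST_SUCCESS, ST_BUFFER_TOO_SMALL and emit_items are placeholder names for this example and are not the real UEFI definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder status codes standing in for the UEFI ones. */
typedef enum { ST_SUCCESS = 0, ST_BUFFER_TOO_SMALL = 1 } Status;

/*
 * Writes `count` 16-bit items from `src` into `buf`.
 * On entry *size is the capacity of `buf` in bytes.
 * On return *size is the number of bytes required for all items.
 */
static Status emit_items(uint16_t *buf, size_t *size,
                         const uint16_t *src, size_t count)
{
    assert(size != NULL);
    const size_t cap = *size / sizeof(uint16_t);
    size_t pos = 0;     /* write index, wraps to 0 at the end of buf */
    size_t total = 0;   /* items produced, including overflowed ones */

    for (size_t i = 0; i < count; i++) {
        if (cap != 0) {
            pos = (pos >= cap) ? 0 : pos;   /* same role as the cmovae wrap */
            buf[pos++] = src[i];
        }
        total++;
    }

    *size = total * sizeof(uint16_t);
    return total <= cap ? ST_SUCCESS : ST_BUFFER_TOO_SMALL;
}
```

After an overflow the buffer contents are garbage, which is acceptable in this pattern because the caller is expected to retry with a buffer of the reported size.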

l4m2

Joined: 15 Jan 2015
Posts: 670

l4m2 23 Aug 2016, 04:20

I didn't catch what you mean.
