flat assembler
Message board for the users of flat assembler.

1-byte instruction that copies the byte at edi+eax into al?

Azu 07 Jul 2009, 18:07
I read about it a long time ago but had no use for it, and now I can't remember how I found it or what it was called. I thought it was latb, but there is no instruction by that name. Could someone please tell me what it is?

Goplat 07 Jul 2009, 18:17
You're probably thinking of xlatb (long form: "xlat byte [bx]" or "xlat byte [ebx]"). It reads a byte from memory at the address (E)BX + AL (RBX + AL in 64-bit mode), with AL treated as an unsigned 8-bit index, and stores it into AL.
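To make the description concrete, here is a minimal fasm sketch, assuming a Linux x86-64 target. The names start and table are made up for the illustration, and the table is shortened to four entries (a real translation table would have 256).

```asm
; Minimal sketch, assuming Linux x86-64; start and table are illustrative names.
format ELF64 executable 3
entry start

segment readable executable
start:
        lea     rbx, [table]    ; RBX = base address of the table
        mov     al, 2           ; AL = unsigned index into the table
        xlatb                   ; AL = byte [RBX + AL], here 30
        movzx   edi, al         ; pass the result as the exit status
        mov     eax, 60         ; sys_exit
        syscall

segment readable
table   db 10, 20, 30, 40       ; shortened; a full table has 256 entries
```

Only AL changes; RBX and the upper bytes of RAX are left as they were.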

Azu 07 Jul 2009, 18:29
Thank you, that's the one.

asmcoder 07 Jul 2009, 18:55
[content deleted]


Last edited by asmcoder on 14 Aug 2009, 14:49; edited 1 time in total

windwakr 07 Jul 2009, 20:45
asmcoder wrote:
xlatb takes 4 clocks! That's a lot, I think.


Seriously, 4 clocks "is much"? On a modern computer 4 cycles is what, roughly 0.0000000012 of a second (about 1.2 nanoseconds on my 3.3 GHz CPU).
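
For reference, the arithmetic behind that figure, assuming a constant 3.3 GHz clock and taking the quoted 4-cycle cost at face value:

```latex
t_{1} = \frac{4\ \text{cycles}}{3.3 \times 10^{9}\ \text{cycles/s}} \approx 1.21 \times 10^{-9}\ \text{s}
\qquad
t_{10^{9}} = 10^{9} \cdot t_{1} \approx 1.21\ \text{s}
```

So a single execution is negligible, while a billion back-to-back executions would add up to about 1.2 seconds if none of the latency overlapped with other work.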

...Like I said, I'm really sick of seeing asmcoder's posts. He is polluting our nice forum with this shit. 99% of the fucking time, his threads ask questions that could be answered with 2 seconds of searching. And I do not like his motives; malware writers should not be allowed here.


bitRAKE 07 Jul 2009, 20:54
I actually use XLATB in a situation requiring billions of iterations, where 4 cycles is a problem.
Code:
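        mov     rdi,rsi         ; translate in place: destination = source
.0:     lodsb                   ; AL = next input byte
        xadd    al,ah           ; AL = new byte + previous byte, AH = new byte
        xlatb                   ; AL = byte [RBX+AL]
        stosb                   ; store the translated byte
        loop    .0              ; decrement RCX, repeat while nonzero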
...any optimizations are appreciated.

pal 07 Jul 2009, 23:08
bitRAKE: Surely the loop instruction is slowing the code down. You can replace it with dec ecx / jnz .0, which would save at least 3 clocks (dec + jnz = 1 + 1 = 2 clocks, while loop takes 5-6 clocks; someone also said 5-10). Or use sub ecx,1 if there is something wrong with dec in this situation (which I have heard there is?).
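A sketch of that substitution as a complete fasm program, assuming a Linux x86-64 target. The buffer contents, the all-zero placeholder table and the names buffer, buffer_len and table are invented for the example, and the count is assumed to be nonzero and below 2^32 so that ECX can stand in for RCX.

```asm
; Sketch only: the loop from above with LOOP replaced by DEC/JNZ.
; buffer, buffer_len and table are illustrative, not from the original post.
format ELF64 executable 3
entry start

segment readable executable
start:
        lea     rbx, [table]    ; RBX = translation table used by XLATB
        lea     rsi, [buffer]   ; RSI = source bytes
        mov     rdi, rsi        ; translate in place, as in the original
        mov     ecx, buffer_len ; iteration count (must not be zero)
        xor     eax, eax        ; AH = 0: there is no previous byte yet
        cld                     ; make LODSB/STOSB step forward
.0:     lodsb                   ; AL = next input byte
        xadd    al, ah          ; AL = new + previous byte, AH = new byte
        xlatb                   ; AL = byte [RBX + AL]
        stosb                   ; store the translated byte
        dec     ecx             ; or "sub ecx, 1", which also writes CF
        jnz     .0
        xor     edi, edi        ; exit status 0
        mov     eax, 60         ; sys_exit
        syscall

segment readable writeable
buffer  db 1, 2, 3, 4
buffer_len = $ - buffer
table   rb 256                  ; placeholder table, all zeros
```

Unlike loop, the dec/jnz pair modifies the flags, which does not matter here because nothing in the loop body reads them. Whether it is actually faster depends on the CPU, so it is worth measuring.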

baldr 16 Jul 2009, 10:29
pal,

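That's probably the partial RFLAGS update stall issue: dec leaves CF unchanged, so on some CPUs a later instruction that reads the flags may have to wait for the old CF to be merged with the new flags. sub ecx,1 writes all the arithmetic flags and avoids that.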


bitRAKE,

Unroll and use SIMD.
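PEXTRB eax, xmm1, byte_index / PINSRB xmm1, byte [xlat_tab+eax], byte_index combo in place of xlatb (PEXTRB takes a 32-bit register or a byte in memory as destination, not al, and zero-extends the extracted byte).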
PADDB for summing (overflow is ignored, right?). PMADDUBSW with proper byte masks will help to combine stripes' sums afterward.

Use word granularity in translation if memory footprint is of no concern (a 128 KB table, 65536 entries of 2 bytes each, can be cumbersome though).
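
A rough sketch of the PEXTRB/PINSRB part only, assuming a CPU with SSE4.1 and a Linux x86-64 target. The names input and xlat_tab and the complementing table contents are invented for the illustration; the summing step with PADDB and PMADDUBSW is not shown.

```asm
; Sketch only: table lookup for 16 bytes held in XMM1 (needs SSE4.1).
; input and xlat_tab are illustrative names with made-up contents.
format ELF64 executable 3
entry start

segment readable executable
start:
        movdqu  xmm1, dqword [input]            ; load 16 bytes to translate
; the rept block is unrolled at assembly time with i = 0..15
rept 16 i:0 {
        pextrb  eax, xmm1, i                    ; EAX = byte i, zero-extended
        pinsrb  xmm1, byte [xlat_tab+rax], i    ; byte i = xlat_tab[byte i]
}
        movdqu  dqword [input], xmm1            ; store the translated bytes
        xor     edi, edi                        ; exit status 0
        mov     eax, 60                         ; sys_exit
        syscall

segment readable writeable
input   db 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
xlat_tab:
repeat 256
        db 256 - %                              ; entries 255, 254, ..., 0
end repeat
```

Each byte still costs one table load, so any gain over xlatb would come from removing the serial dependency through AL and from the unrolling. That is a guess to be checked by measurement rather than a guarantee.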
Copyright © 1999-2023, Tomasz Grysztar.