flat assembler
Message board for the users of flat assembler.
String length
r22 19 Jun 2014, 17:47
@Sasha
Code:
        and     ecx,$80808080
        and     edx,$80808080
        test    ecx,ecx         ;!!
        jnz     .sub8
        test    edx,edx
        jz      .scan
Can't you combine the AND and TEST to just be a TEST reg32, imm32?
Code:
        test    ecx,$80808080
        jnz     .sub8
        test    edx,$80808080
        jz      .scan
JohnFound 19 Jun 2014, 18:50
r22 wrote:
    Can't you combine the AND and TEST to just be a TEST reg32, imm32?
No, because it needs the result later in the code, where the last bytes are scanned byte by byte.
_________________
Tox ID: 48C0321ADDB2FE5F644BB5E3D58B0D58C35E5BCBC81D7CD333633FEDF1047914A534256478D9
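To illustrate the point about needing the result: TEST only sets flags, while AND keeps the masked value, and the lowest 0x80 bit of that value tells which byte was zero. Below is a minimal C sketch of the idea. It is not code from the library; the names `zero_byte_mask` and `first_zero_index` are made up for this illustration, and byte 0 is taken to be the least significant byte (little-endian, as on x86).

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if and only if x contains a zero byte. The lowest set 0x80 bit
   marks the first zero byte; marker bits above it can be set spuriously
   by borrow propagation, so only the lowest one is trusted. */
static uint32_t zero_byte_mask(uint32_t x)
{
    return (x - 0x01010101u) & ~x & 0x80808080u;
}

/* Index (0..3) of the first zero byte, given a nonzero mask from
   zero_byte_mask(). This is the information a plain TEST throws away. */
static unsigned first_zero_index(uint32_t mask)
{
    assert(mask != 0);
    unsigned i = 0;
    while ((mask & 0x80u) == 0) {
        mask >>= 8;
        ++i;
    }
    return i;
}
```

If only the yes/no answer were needed, TEST would be enough; the byte search afterwards is what makes the masked value worth keeping.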
r22 19 Jun 2014, 18:59
JohnFound wrote:
Couldn't you simply use the TEST reg8, imm8 (TEST cl, $80) encodings later in the code? Speed-wise, REG,REG and REG,IMM should be similar; the encoding will be a byte larger (some shuffling of registers could let you use EAX instead of ECX, which would make TEST al, $80 2 bytes instead of 3).
JohnFound 19 Jun 2014, 20:05
@r22 - well, it probably can be done this way...
Sasha 20 Jun 2014, 13:46
r22 wrote:
    @Sasha
We can remove the test at the end of the loop. It makes the loop smaller.
Code:
align 32
proc strlen_freshlib_opt_2 uses ebx esi edi,str
        mov     eax,[str]
        mov     ebx,-01010101h          ; adding ebx subtracts 1 from every byte
.aligning:
        test    eax,3                   ; scan byte by byte until eax is dword aligned
        jz      .scan
        mov     dl,[eax]
        test    dl,dl
        jz      .found
        inc     eax
        jmp     .aligning

align 32
.scan:
        mov     esi,[eax]               ; first dword
        mov     edi,[eax+4]             ; second dword
        lea     eax,[eax+8]
        lea     ecx,[esi+ebx]           ;! ecx = first dword - 01010101h
        lea     edx,[edi+ebx]
        not     esi
        not     edi
        and     ecx,esi
        and     edx,edi
        and     ecx,$80808080           ;!!!! nonzero if the first dword has a zero byte
        jnz     .sub8
        and     edx,$80808080
        jz      .scan
; a zero byte was found in the second dword, so search it by bytes.
        lea     eax,[eax-4]
        mov     ecx,edx
        jmp     .bytesearch
.sub8:
        lea     eax,[eax-8]
.bytesearch:                            ;!!! the lowest marked byte is the terminator
        test    cl,cl
        jnz     .found
        inc     eax
        test    ch,ch
        jnz     .found
        shr     ecx,16
        inc     eax
        test    cl,cl
        jnz     .found
        inc     eax
.found:
        sub     eax,[str]
        ret
endp
But it gives poorer results. I don't know why.
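For readers who want to follow the logic without the register juggling, here is a rough C model of the same scan. It is only a sketch under stated assumptions: it handles one dword per iteration instead of two, it assumes little-endian byte order, and it assumes the bytes up to the next 4-byte boundary after the terminator are readable (true for aligned loads on x86, but outside what the C standard promises). The name `strlen_words` is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static size_t strlen_words(const char *s)
{
    assert(s != NULL);
    const char *p = s;

    /* Byte by byte until p is 4-byte aligned (the .aligning loop). */
    while (((uintptr_t)p & 3u) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        ++p;
    }

    for (;;) {
        uint32_t w;
        memcpy(&w, p, sizeof w);   /* aligned 32-bit load */

        /* Nonzero if and only if w contains a zero byte. */
        uint32_t m = (w - 0x01010101u) & ~w & 0x80808080u;
        if (m != 0) {
            /* The lowest marker bit is the terminator (the .bytesearch part). */
            while ((m & 0x80u) == 0) {
                m >>= 8;
                ++p;
            }
            return (size_t)(p - s);
        }
        p += 4;
    }
}
```

The byte search only trusts the lowest marker because the markers above the first zero byte can be false positives caused by the borrow from the subtraction.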
Sasha 20 Jun 2014, 13:49
r22 wrote:
    Couldn't you simply use the TEST reg8, imm8 (TEST cl, $80) encodings later in the code. Speed wise REG,REG and REG,IMM should be similar, the encoding size will be a byte larger (some fiddling of registers could be done so that you use EAX instead of ECX which would make the TEST al, $80 2 bytes instead of 3).
I didn't understand the part about reg8,imm8. Did you mean this?
Code:
.bytesearch:                    ;!!!
        test    cl,80h
        jnz     .found
        inc     eax
        test    ch,80h
        jnz     .found
        shr     ecx,16
        inc     eax
        test    cl,80h
        jnz     .found
        inc     eax
HaHaAnonymous 20 Jun 2014, 14:25
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 18:10; edited 1 time in total |
Sasha 21 Jun 2014, 00:17
JohnFound wrote:
Of course you may use it. I don't know yet how to submit it there. Moreover, you will need to integrate it back into your function, as I've removed some of the preceding code.
Sasha 21 Jun 2014, 01:31
HaHaAnonymous wrote:
    The bigger disadvantage I see is that this function uses much more bytes (which of course will not be a problem if speed is more important).
Thanks. There are many in-between variations, if you want less code and more speed. Like:
Code:
proc strlen str
        mov     eax,[str]
.loop:
        mov     dx,[eax]
        inc     eax
        test    dl,dl
        jz      .found
        inc     eax
        test    dh,dh
        jnz     .loop
.found:
        sub     eax,[str]
        dec     eax
        ret
endp
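As a rough C equivalent of this in-between variant, the sketch below checks two bytes per loop iteration. It is an illustration only: the name `strlen_by2` is invented, and unlike the assembly (which fetches both bytes with one 16-bit load and may therefore read one byte past the terminator) it never touches the byte after the terminator.

```c
#include <assert.h>
#include <stddef.h>

static size_t strlen_by2(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    for (;;) {
        if (p[0] == '\0')                 /* corresponds to test dl,dl */
            return (size_t)(p - s);
        if (p[1] == '\0')                 /* corresponds to test dh,dh */
            return (size_t)(p - s) + 1;
        p += 2;                           /* one loop branch per two bytes */
    }
}
```

The gain over a plain byte loop comes from halving the number of loop branches, not from wider comparisons.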
Sasha 21 Jun 2014, 01:45
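Now I want to think about how to add a maximum length check and make the function search for any desired byte. Like this:
Code:
proc strchar uses edi,str,char,len
        mov     ecx,[len]       ; maximum number of bytes to scan
        mov     edi,[str]
        mov     eax,[char]      ; only AL is compared by SCASB
        repne scasb             ; scan forward (DF = 0 assumed) until a match or ECX = 0
        sub     edi,[str]
        lea     eax,[edi-1]     ; index of the match; also len-1 when nothing matched
        ret
endp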
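One possible shape for such a bounded search, sketched in C. The name `find_byte` and the convention of returning `len` when the byte is absent are assumptions made for this example, chosen so that "not found" cannot be confused with a match in the last position; this is not the library's actual interface.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first byte equal to c within the first len
   bytes of s, or len if there is no such byte. */
static size_t find_byte(const char *s, unsigned char c, size_t len)
{
    assert(s != NULL || len == 0);
    for (size_t i = 0; i < len; ++i) {
        if ((unsigned char)s[i] == c)
            return i;
    }
    return len;   /* not found within the limit */
}
```

Calling it with `c` set to 0 gives a length-limited strlen, which covers the maximum length check mentioned above.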
HaHaAnonymous 21 Jun 2014, 05:20
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 18:10; edited 1 time in total |
JohnFound 21 Jun 2014, 15:32
Sasha wrote:
    Of course you may use it. I don't know yet how to submit it there. Moreover, you will need to integrate it back into your function, as I've removed some of the preceding code.
OK, I will integrate it back into the library and will add a comment about your contribution. I will use the nickname Sasha. If you prefer another nickname, or your real name, let me know by PM.
Submitting to the repository is not so hard, but it requires using the Fossil version control system. Then you have to register in the main repository in order to get the needed permissions.
JohnFound 21 Jun 2014, 15:37
HaHaAnonymous wrote:
    This method is interesting. I decided to test it and it was a little less than 2 times faster than my ordinary "byte-by-byte" method (608 / 315ms). The bigger disadvantage I see is that this function uses much more bytes (which of course will not be a problem if speed is more important).
HaHaAnonymous, what kind of strings do you use in the benchmarks? The speed of this procedure is very dependent on the string length. As a rule, the longer the string, the higher the speed gain.
HaHaAnonymous 21 Jun 2014, 15:47
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 18:10; edited 1 time in total |
Sasha 22 Jun 2014, 20:14
HaHaAnonymous, it happens when the buffer is not aligned. The word-by-word routine is sensitive to misalignment.
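To check whether a test buffer is aligned, or how many leading bytes a routine must handle one at a time before it can switch to aligned loads, a small helper like the one below can be used. It is a hypothetical helper written for this illustration (the name `bytes_to_align` is not from any library), and it assumes that converting a pointer to `uintptr_t` yields the numeric address, which holds on the usual flat-memory targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes to step forward from p to reach an address that is a
   multiple of align. align must be a power of two. */
static size_t bytes_to_align(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size_t)(-(uintptr_t)p & (uintptr_t)(align - 1));
}
```

A result of 0 means the buffer already starts on the boundary, so benchmark runs with different results of this helper are not directly comparable.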
Sasha 22 Jun 2014, 22:03
There is some strange behavior even on aligned strings (unaligned strings behave really unpredictably). Look at the chart below.
Upd: The question is why yellow is faster than blue when they are the same (the string IS aligned, so you jump right after test eax,1), and the blue one is even smaller.
Last edited by Sasha on 22 Jun 2014, 23:36; edited 2 times in total |
AsmGuru62 22 Jun 2014, 22:57
Maybe try straightforward version:
Code:
strlen:
        mov     edx, [esp + 4]
        xor     eax, eax
        xor     ecx, ecx
align 16
@@:
        cmp     [edx + eax], cl
        je      .done
        add     eax, 1
        jmp     @r
.done:
        ret
Sasha 23 Jun 2014, 00:16
AsmGuru62, do you want to execute all those nops? I think you must first align the entire procedure, and then decide whether you need to fix the alignments inside.
revolution 23 Jun 2014, 00:22
How does it compare to rep cmpsb?
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.