flat assembler
Message board for the users of flat assembler.

strlen & strcmp

Ali.Z 05 Aug 2018, 16:17
Code:
    example:
                        ...
                        mov edi,example_test
                        mov esi,edi
                        mov edx,esi ; ESI == EDI == EDX == example_test
                        call strcmp
                        cmp dword ecx,00 ; if ECX is not '0' it means non-of the strings are equal
                        jnz str_err ; then throw an error if ECX => '1'
                        ...
    strlen:
                        mov eax,edx ; copy example_test to EAX
                        @@:
                        add eax,01 ; INC EAX
                        cmp byte [eax],00 ; then compare if NULL terminator found
                        jnz @B ; if not found then loop
                        sub eax,edx ; otherwise subtract to get the length
                        ret
    strcmp:
                        call strlen
                        sub eax,01 ; DEC EAX to remove NULL terminator from count, we dont want to compare '0's
                        mov ecx,eax ; mov length to ECX , used for loop
                        repz cmpsb ; repeat while they are equal and while ECX is not '0'
                        ret
    str_err:
                        ; some code to throw error

example_test db 'fasm',0 ; since both ESI and EDI have the same address then both strings are equal and ecx will be '0'    

_________________
Asm For Wise Humans



DimonSoft 05 Aug 2018, 19:16
The implementation is asking for trouble.

1) So, you say your strcmp has three parameters (passed in registers)? Which should be passed in EDX? Will the convention change if you decide to change the implementation by removing the useless call to strlen from the procedure?

2) Why would you even need strlen as part of strcmp? Just xor ecx, ecx / dec ecx.

3) Comparing ECX with zero is not the way to check for equality. ZF is. ECX might be zero after the REP loop regardless of the result of the last comparison.

P.S. If you decide to use string instructions for your task, feel free to use scasb instead of branching in the strlen implementation.
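
A minimal sketch of the scasb variant mentioned in the P.S., in fasm syntax. The label strlen_scas and the convention (address in EDX, length in EAX, ECX clobbered) are assumptions carried over from the first post, not a fixed interface. It also shows where xor ecx, ecx / dec ecx fits: the pair loads ECX with -1 so the scan is not cut short.

```asm
use32

; Assumed convention: EDX -> zero-terminated string,
; returns EAX = length without the terminator. ECX is clobbered.
strlen_scas:
        push edi                ; preserve the caller's EDI
        mov edi,edx             ; SCASB scans the bytes at [EDI]
        xor eax,eax             ; AL = 0, the byte to look for
        xor ecx,ecx
        dec ecx                 ; ECX = 0FFFFFFFFh, effectively "no limit"
        cld                     ; scan forward
        repne scasb             ; stops just past the first zero byte
        not ecx                 ; ECX = bytes scanned, terminator included
        dec ecx                 ; drop the terminator from the count
        mov eax,ecx
        pop edi
        ret
```

Unlike the loop in the first post, this version checks the first byte too, so an empty string yields 0.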



Ali.Z 05 Aug 2018, 20:09
please keep in mind this is just a snippet; you can keep the two routines separate if you prefer.


DimonSoft wrote:
So, you say your strcmp has three parameters (passed in registers)? Which should be passed in EDX?

it doesnt have to be like that, this is how i wrote it in my program.
because i keep my string addresses in ESI and EDI, if i want to get the length i would pass edx the address and call strlen. (or pass edx the address i want to compare to, and call strcmp)
DimonSoft wrote:
Will the convention change if you decide to change the implementation by removing the useless call to strlen from the procedure?

i can write different strcmp algorithm, which takes at least couple lines more.
DimonSoft wrote:
Why would you even need strlen as part of strcmp? Just xor ecx, ecx / dec ecx.

as said above, i can change it.
"xor ecx,ecx"? seriously? how would REPZ repeat then?
"dec ecx"? then i have to get the length. (unless i want to compare 3-5 byte or more/less not the whole string)
DimonSoft wrote:
Comparing ECX with zero is not the way to check for equality. ZF is. ECX might be zero after REP-loop no matter which was the result of the last comparison.

ecx will never be zero, UNTIL the last char is compared. (in case it was not '0' then string X isnt = to string Y, and ECX = NUM OF CHARS NOT COMPARED)
** please note ECX = NUM OF CHARS NOT COMPARED it depends on the length of what you are comparing with, a constant-string or from input.
* yes i know i can use jz jnz but what remains in ecx do matter.
DimonSoft wrote:
P.S. If you decide to use string instructions for your task, feel free to use scasb instead of branching in the strlen implementation.

i would use scasb for other purposes. but thank you anyway.

DimonSoft 06 Aug 2018, 08:32
Ali.A wrote:
DimonSoft wrote:
So, you say your strcmp has three parameters (passed in registers)? Which should be passed in EDX?

it doesnt have to be like that, this is how i wrote it in my program.
because i keep my string addresses in ESI and EDI, if i want to get the length i would pass edx the address and call strlen. (or pass edx the address i want to compare to, and call strcmp)

That breaks the encapsulation of your procedure: you have to know the internals of the function to call it correctly. This, in turn, tends to become a terrible problem as soon as your program grows beyond, say, 1000 LoC: you either end up doing things inefficiently or have difficulty figuring out what went wrong with your code.

Ali.A wrote:
DimonSoft wrote:
Will the convention change if you decide to change the implementation by removing the useless call to strlen from the procedure?

i can write different strcmp algorithm, which takes at least couple lines more.

It’s not about the number of lines, it’s about maintainability of the outer code.

Ali.A wrote:
DimonSoft wrote:
Why would you even need strlen as part of strcmp? Just xor ecx, ecx / dec ecx.

as said above, i can change it.
"xor ecx,ecx"? seriously? how would REPZ repeat then?
"dec ecx"? then i have to get the length. (unless i want to compare 3-5 byte or more/less not the whole string)
DimonSoft wrote:
Comparing ECX with zero is not the way to check for equality. ZF is. ECX might be zero after REP-loop no matter which was the result of the last comparison.

ecx will never be zero, UNTIL the last char is compared. (in case it was not '0' then string X isnt = to string Y, and ECX = NUM OF CHARS NOT COMPARED)
** please note ECX = NUM OF CHARS NOT COMPARED it depends on the length of what you are comparing with, a constant-string or from input.
* yes i know i can use jz jnz but what remains in ecx do matter.

My mistake: the xor ecx, ecx / dec ecx stuff should have been put near the description of strlen, not strcmp. It sets ECX to -1, so that a REPNE SCASB scan is effectively unbounded.

But you’d better think a little bit before arguing, to get the idea of what is suggested.

Having a strlen call as part of strcmp is not that bad, but it raises a question: which of the two strings should you pass? Actually, either will do, but the time taken will differ. In any case your algorithm goes like “walk through one of the strings, then walk it again comparing it to the other string”.

In fact, what you need is an endless loop which stops as soon as either there’s a null character in one of the strings or the two characters being compared are not equal. Such an implementation requires only a single pass through each string, but you’ll definitely have to give up the REP prefix.
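
A minimal sketch of such a single-pass loop, in fasm syntax. The label streq_single_pass and the convention (ESI and EDI point to the strings, the result comes back in ZF) are assumptions made for illustration, not part of the original code.

```asm
use32

; Assumed convention: ESI, EDI -> zero-terminated strings.
; Returns ZF = 1 if the strings are equal, ZF = 0 otherwise.
; AL, ESI and EDI are clobbered.
streq_single_pass:
.next:
        mov al,[esi]
        cmp al,[edi]
        jne .done               ; mismatch: return with ZF = 0
        inc esi
        inc edi
        test al,al              ; bytes were equal; was it the terminator?
        jnz .next               ; no: compare the next pair
                                ; yes: fall through with ZF = 1
.done:
        ret
```

A caller would then write call streq_single_pass followed by jne str_err. The loop never reads past the terminator of the shorter string, because that terminator either mismatches or ends the comparison.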

As for comparing ECX, you’re wrong. Consider these two strings:
Code:
abcd, 0
abc, 0    

In your code you ignore the result of the last comparison, which IS important. But wait, it’s even worse, since you decrease the number of comparisons by 1. Let’s check two cases.

1) The longer string is passed. strlen returns 4. You decrease it by 1, thus having 3 comparisons. The first 3 characters of both strings are equal. ECX becomes zero. You think they’re equal. But they aren’t.
2) The shorter string is passed. strlen returns 3. You decrease it by 1, thus having 2 comparisons. The first 2 characters of both strings are equal. ECX becomes zero. You think they’re equal. But they aren’t, and the last 2 characters of the longer string and the last character of the shorter string haven’t even taken part in the comparison.

What remains in ECX doesn’t really matter since ECX is just an auxiliary register for your loop. The result of comparison is what matters.
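
A sketch of the same two strings compared with the terminator included in the count and the verdict taken from ZF. The names compare_fixed, s1, s2 and s1_size are hypothetical, and the count is assumed to be the length of the first string plus 1.

```asm
use32

; Returns EAX = 1 if s1 and s2 are equal, EAX = 0 otherwise.
; ECX is clobbered.
compare_fixed:
        push esi
        push edi
        mov esi,s1
        mov edi,s2
        mov ecx,s1_size         ; 5: 'abcd' plus its zero byte
        cld
        repe cmpsb              ; a, b, c match; 'd' vs 0 stops it: ZF = 0, ECX = 1
        sete al                 ; take the verdict from ZF, not from ECX
        movzx eax,al
        pop edi
        pop esi
        ret

s1 db 'abcd',0
s1_size = $ - s1
s2 db 'abc',0
```

To see why ECX alone is not enough, take 'abcd' and 'abcx' with a count of 4: the loop ends with ECX = 0 and ZF = 0, so a test on ECX would report a match that is not there.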
Ali.A wrote:
DimonSoft wrote:
P.S. If you decide to use string instructions for your task, feel free to use scasb instead of branching in the strlen implementation.

i would use scasb for other purposes. but thank you anyway.

Although that is exactly the case where it has some pros compared to branching. Funny.



Ali.Z 06 Aug 2018, 11:50
well, the reason is:
see my attachment to understand better, although it's slightly different. in my program i take input from users; i have only 3-5 commands and each one has a different name.

one of them is the CLS command, typing C, CL, or CLS will clear the screen.
any other letter will generate error.

YES this is useless if i have multiple commands starting with 'C'
but in my case NO.

so the idea behind this is to make it slightly similar to DISKPART.EXE, where you can type the whole command or part of it.

DimonSoft 06 Aug 2018, 17:47
Then you definitely shouldn’t give such procedures the same names as well-known procedures which do well-known things.

As for your task, have you considered using an automaton? I guess the DFA for your task would be quite small while working at least an order of magnitude faster (O(N) instead of O(N * L / 2), where N is the number of commands, L is an average length of the command).
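
For a single command such an automaton degenerates into a chain of states, one per character, each of which accepts when the input ends. A minimal sketch under that assumption follows, in fasm syntax. The names match_prefix and cmd_cls and the register convention are hypothetical, and a real DFA covering several commands would need a transition table, which is not shown here.

```asm
use32

; Assumed convention: ESI -> zero-terminated user input,
; EDI -> zero-terminated command name (e.g. cmd_cls).
; Returns EAX = 1 if the input is a non-empty prefix of the command
; ("C", "CL" or "CLS"), EAX = 0 otherwise. ESI, EDI and DL are clobbered.
match_prefix:
        xor eax,eax             ; assume "no match"
        cmp byte [esi],0
        je .done                ; empty input is rejected
.step:
        mov dl,[esi]
        test dl,dl
        jz .accept              ; input ended while still on the chain
        cmp dl,[edi]
        jne .done               ; no transition for this character
        inc esi
        inc edi
        jmp .step
.accept:
        inc eax                 ; EAX = 1
.done:
        ret

cmd_cls db 'CLS',0
```

The match is case-sensitive, and input longer than the command (such as 'CLSX') is rejected when it is compared against the command's terminator.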



Ali.Z 06 Aug 2018, 19:37
DimonSoft wrote:
(O(N) instead of O(N * L / 2), where N is the number of commands, L is an average length of the command).

this is good, thanks.

okay, the site admin or a moderator can completely remove this thread if it's 100% useless.

jochenvnltn 18 Jun 2019, 18:18
This is what I use for string comparison:

Code:
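proc CmpStr uses ecx esi edi,len,str1,str2

  cld                 ; compare forward (ESI/EDI increment)
  xor eax, eax        ; EAX = 0; also sets ZF, so len = 0 reports "equal"
  mov ecx,[len]       ; number of bytes to compare
  mov esi,[str1]
  mov edi,[str2]
  repe cmpsb          ; stop at the first mismatch or when ECX reaches 0
  sete al             ; EAX = 1 if all compared bytes matched, else 0
  ret

endp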
    
bitRAKE 18 Jun 2019, 19:41
contribution to your string library...
Code:
Multi_ConcatenationA:
        pop rdx
        mov r8,rsi
        mov r9,rdi
        pop rdi
        pop rsi

@@:     lodsb           ;       lodsw
        stosb           ;       stosw
        test al,al      ;       test ax,ax
        jnz @B

        pop rsi
        dec rdi
        test rsi,rsi
        jnz @B

        mov rsi,r8
        mov rdi,r9
        jmp rdx    
Just push zero, then a bunch of strings to join, and finally where you want them. No error checking obviously.
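
A hedged usage sketch for the routine above, as a 64-bit fasm fragment meant to be assembled together with Multi_ConcatenationA. The labels part1, part2 and buffer are hypothetical. Reading the routine, the strings are copied in the reverse of their push order, so the piece that should come first is pushed last, just before the destination.

```asm
        ; fragment: place it inside your own code, next to Multi_ConcatenationA
        cld                     ; the routine uses LODSB/STOSB and relies on DF = 0
        push 0                  ; end-of-list marker, popped last
        lea rax,[part2]
        push rax                ; pushed first among the strings, copied last
        lea rax,[part1]
        push rax                ; pushed last among the strings, copied first
        lea rax,[buffer]
        push rax                ; destination goes on last of all
        call Multi_ConcatenationA
        ; buffer now holds part1 followed by part2, zero-terminated;
        ; the routine has already popped all four arguments

        ; data, to be placed wherever your data lives
part1   db 'flat ',0
part2   db 'assembler',0
buffer  rb 32                   ; must be large enough: nothing checks it
```

RSI and RDI are preserved by the routine, while RAX, RDX, R8 and R9 are overwritten.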

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup


Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
