flat assembler
Message board for the users of flat assembler.

Index > Windows > String functions

Asm++ 06 Feb 2013, 18:39
Kazyaka wrote:
Asm++,

Yeah, you're kind of right about this. But if StringCopy is just MemoryCopy (exported as RtlMoveMemory from ntdll.dll), you could also say that StringLen is like lstrlen or strlen from the WinAPI, and that would be true too. So why should you use one of my functions? Just run Cheat Engine or a similar tool and look into Microsoft's procedures: they're much larger and slower. That's all I can say about this.


What I mean is that with strcpy you do not have to specify the number of bytes to copy, since it is determined by the terminating 0 (NUL) character, whereas with memcpy you must specify the byte count, because nothing else indicates how many bytes to copy. So they basically work the same way; the difference is how the byte count is determined. :)
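
A minimal C sketch of that point, assuming the buffers do not overlap and the destination is large enough. The two helper names are made up for this illustration. With memcpy the count has to be supplied (strlen(src) + 1, so that the terminator is copied as well), while strcpy finds the end on its own:

```c
#include <string.h>

/* Hypothetical helper: copies a zero-terminated string using memcpy.
   The byte count has to be supplied explicitly: strlen(src) bytes of
   text plus 1 for the terminating 0. Returns the number of bytes copied. */
size_t copy_with_count(char *dst, const char *src)
{
    size_t count = strlen(src) + 1;   /* length is computed up front */
    memcpy(dst, src, count);          /* memcpy itself never looks for a 0 */
    return count;
}

/* Hypothetical helper: same result with strcpy, where the count is
   implied by the terminator found while copying. */
size_t copy_until_zero(char *dst, const char *src)
{
    strcpy(dst, src);
    return strlen(dst) + 1;
}
```

Both helpers leave the same bytes in dst; only the way the count is obtained differs.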

For this kind of task, I usually don't use the Windows API.

Have you ever tested the speed of the CopyMemory API?
As I remember, I once tested its speed against the C standard function memcpy. The results were very close (sometimes identical), so it is actually very good. 8)

_________________
Binary is nice, but Assembly is better!
AsmGuru62 06 Feb 2013, 18:55
This topic grew fast!
Looks like string functions are a pain for coders.

Let's talk more about replacement.
Of course, when I talked about allocations -- I meant that the programmer
must ensure that there is room in the string buffer to do a replacement.

Something like this:
Code:
Buffer rb 256
sHello db 'HELLO',0
...
invoke lstrcpyA, Buffer, sHello
;
; Then here you call a replacement function, so there will
; be enough room in 'Buffer'.
;
stdcall StringReplace, ...
    

Another possibility is to create a string object that makes sure
the room is always there and, if not, expands the buffer by reallocation.
But doing that involves a lot of effort: writing the allocator for it and
implementing all the functions, such as concatenation, insertion, removal, searching for characters and substrings, and so on -- a full library.
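
To give an idea of what such a string object involves, here is a rough C sketch of only the "make room, then append" part. The type and function names (GrowStr, growstr_reserve, growstr_append, growstr_free) are hypothetical, the doubling growth policy is just one common choice, and the C runtime's realloc stands in for the custom allocator mentioned above:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: 'len' excludes the terminating 0,
   'cap' is the allocated size in bytes (0 while 'data' is NULL). */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} GrowStr;

/* Makes sure at least 'needed' bytes are allocated, doubling the
   capacity until it fits. Returns 0 on success, -1 if realloc fails. */
static int growstr_reserve(GrowStr *s, size_t needed)
{
    if (needed <= s->cap)
        return 0;
    size_t cap = s->cap ? s->cap : 16;
    while (cap < needed)
        cap *= 2;
    char *p = realloc(s->data, cap);
    if (p == NULL)
        return -1;                    /* the old buffer is still valid */
    s->data = p;
    s->cap  = cap;
    return 0;
}

/* Appends a zero-terminated string, growing the buffer when required. */
int growstr_append(GrowStr *s, const char *text)
{
    size_t n = strlen(text);
    if (growstr_reserve(s, s->len + n + 1) != 0)
        return -1;
    memcpy(s->data + s->len, text, n + 1);   /* +1 copies the terminator */
    s->len += n;
    return 0;
}

/* Releases the buffer and resets the object to the empty state. */
void growstr_free(GrowStr *s)
{
    free(s->data);
    s->data = NULL;
    s->len = s->cap = 0;
}
```

An object is started as `GrowStr s = {0};` and every further operation (insert, remove, replace) would go through the same reserve step before touching the buffer.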
Asm++ 06 Feb 2013, 19:01
f0dder wrote:
Well, it depends on your goal - size vs. speed.

But, assuming we follow the Windows/x86 register preservation rules and the C calling convention, can we do a strcpy-that-relies-on-calling-strlen shorter than this?
Code:
strcpy:; 24 bytes
        push    esi
        push    edi

        mov     edi, [esp + 12] ; dst
        mov     esi, [esp + 16] ; src
        
        push    esi
        call    strlen
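        mov     ecx, eax        ; ecx = strlen(src), so the terminating 0 is not copied;
        rep     movsb           ; "lea ecx, [eax + 1]" instead would copy it too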
        
        pop     eax ; to get rid of arg for strlen
        pop     edi
        pop     esi
        ret    


Here's a couple of straightforward alternatives that don't rely on first calling strlen:
Code:
strcpy_1:       ; 19 bytes
        push    esi
        push    edi
        mov     edi, [esp + 12] ; dst
        mov     esi, [esp + 16] ; src
.copy:
        lodsb
        stosb
        test    al, al
        jnz     .copy

        pop     edi
        pop     esi
        ret

strcpy_2:       ; 19 bytes
        mov     edx, [esp + 4]  ; dst
        mov     ecx, [esp + 8]  ; src
.copy:
        mov     al, [ecx]
        mov     [edx], al
        inc     ecx
        inc     edx
        test    al, al
        jnz     .copy
        ret    


Not claiming any of those are optimal (or even good), and they haven't been tested - they're just food for thought.


I haven't tested your code yet, but I'm curious: how did you calculate its size?
If I'm not mistaken, the first one is 30 bytes, and the second and third are 27 bytes each! :?

Asm++ 06 Feb 2013, 19:11
AsmGuru62 wrote:
This topic grew fast!
Looks like string functions are a pain for coders.

Let's talk more about replacement.
Of course, when I talked about allocations -- I meant that the programmer
must ensure that there is room in the string buffer to do a replacement.

Something like this:
Code:
Buffer rb 256
sHello db 'HELLO',0
...
invoke lstrcpyA, Buffer, sHello
;
; Then here you call a replacement function, so there will
; be enough room in 'Buffer'.
;
stdcall StringReplace, ...
    

Another possibility is to create a string object that makes sure
the room is always there and, if not, expands the buffer by reallocation.
But doing that involves a lot of effort: writing the allocator for it and
implementing all the functions, such as concatenation, insertion, removal, searching for characters and substrings, and so on -- a full library.


The replacement is supposed to be done on the original string, NOT on a copy of it, right? Please correct me if I misunderstood.

f0dder 06 Feb 2013, 20:03
AsmGuru62 wrote:
Looks like string functions are a pain for coders.

Indeed - and this is still in the realm of trivial single-byte encodings... once you need to deal with different encodings and Unicode, and need to take multithreading into consideration, things become way... funnier :)

Asm++ wrote:
I haven't tested your code yet, but I'm curious: how did you calculate its size?
If I'm not mistaken, the first one is 30 bytes, and the second and third are 27 bytes each! :?

That's pretty simple - fasm :)


Attachment: stringfun.zip (3 KB)

_________________
carpe noctem
HaHaAnonymous 06 Feb 2013, 20:13
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 21:41; edited 1 time in total
f0dder 06 Feb 2013, 20:24
HaHaAnonymous wrote:
Quote:
That's pretty simple - fasm

In other words: You can count yourself. If you aren't lazy, of course.

1) I've never had a reason to memorize instruction encodings (a few have stuck, though).
2) I'm lazy indeed - why do something that a computer can do for me, several times faster, with guaranteed results and no downsides? :)

baldr 06 Feb 2013, 20:39
f0dder,

"Several times"? Methinks "several orders of magnitude" will fit better. ;)

I'm lazy too (that's why I keep references handy). As a side note, a proper repne movs implementation could reduce strncpy() to something much simpler.
Kazyaka 06 Feb 2013, 21:06
AsmGuru62 wrote:
This topic grew fast!
Yeah, it looks like the birth of a monster.


What do you think about this simple StringLen?
Code:
mov edi,pString
mov al,0
mov ecx,-1
repne scasb    
However, the result ends up in the ECX register in inverted form. If that's not a problem, you can use it.
AsmGuru62 06 Feb 2013, 22:19
I used that same -1 trick and then inverted ECX.
I have that function already.
If you take this code into a debugger, you will see that inversion is not enough - you'll need a DEC to get the correct length:
Code:
align 32
TString_Length:
; ---------------------------------------------------------------------------
; INPUT:
;   EDI = ANSI text (zero-byte terminated)
; OUTPUT:
;   ECX = text length (not including zero terminator)
; ---------------------------------------------------------------------------
    push      eax edi
    or        ecx, -1
    xor       eax, eax
    repne     scasb
    not       ecx
    dec       ecx
    pop       edi eax
    ret
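
For anyone without a debugger at hand, the arithmetic can be checked with a small C model of the instruction sequence. This is only a sketch of what REPNE SCASB does to ECX (one decrement per byte scanned, terminator included), and the two function names are made up for the illustration:

```c
#include <stdint.h>

/* Models "or ecx,-1 / xor eax,eax / repne scasb" on a zero-terminated
   string and returns the raw value left in ECX afterwards. */
uint32_t ecx_after_scan(const char *text)
{
    uint32_t ecx = 0xFFFFFFFFu;       /* or ecx, -1 */
    for (;;) {
        ecx--;                        /* ECX drops by one for every byte scanned */
        if (*text++ == 0)             /* scasb compares the byte with AL = 0 */
            break;                    /* stop once the terminator has been scanned */
    }
    return ecx;
}

/* not ecx / dec ecx: turns the raw counter into the string length. */
uint32_t length_from_ecx(uint32_t ecx)
{
    return ~ecx - 1;
}
```

For 'HELLO' six bytes are scanned (five characters plus the terminator), so NOT alone gives 6 and the DEC brings it down to 5.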
    
Asm++ 07 Feb 2013, 02:58
f0dder,
Yes, the sizes you specified are correct once use32 is added. FASM generates 16-bit code by default, and that changes the size of the generated code. The first time, I just copied your code from the forum and did not add use32! :shock:

Why doesn't FASM use 32-bit as the default rather than 16-bit?

Asm++ 07 Feb 2013, 03:08
HaHaAnonymous wrote:
Quote:
That's pretty simple - fasm

In other words: You can count yourself. If you aren't lazy, of course.


I think there are more important things to invest one's time in than sitting down and counting bytes! ;)

f0dder 07 Feb 2013, 11:17
baldr wrote:
"Several times"? Methinks "several orders of magnitude" will fit better. ;)

Are you calling me slow? HOW INSULTING! ... :p - you're of course right. Now imagine if we could directly program our brains...

Asm++ wrote:
Why doesn't FASM use 32-bit as the default rather than 16-bit?

Probably because it's the starting mode for x86 processors? And because one of the most widespread uses for binary output was .com files?

I added a couple of strlen routines in the zip above, might as well post them inline as well:
Code:
strlen_1:       ; 17 bytes
        mov     eax, [esp + 4]
.scan:
        inc     eax
        cmp     byte [eax-1], 0
        jne     .scan

        sub     eax, [esp + 4]
        dec     eax
        ret

strlen_2:       ; 15 bytes
        mov     ecx, [esp + 4]
        xor     eax, eax
        dec     eax
.scan:
        inc     eax
        cmp     byte [ecx + eax], 0
        jne     .scan
        ret

strlen_3:       ; 20 bytes
        push    edi
        mov     edi, [esp + 8]
        xor     ecx, ecx
        not     ecx
        xor     al, al
        repne   scasb
        not     ecx
        lea     eax, [ecx - 1]
        pop     edi
        ret


Again, untested :)
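
Since the routines are marked as untested, the off-by-one handling of the first two can at least be checked with a C model. This is a sketch that mirrors the instruction order one to one. It says nothing about the encodings or sizes, and the function names are made up:

```c
#include <stddef.h>

/* Model of strlen_1: the pointer is advanced first and the byte behind
   it is tested, so it ends up one past the terminator. */
size_t strlen_ptr(const char *s)
{
    const char *p = s;
    do {
        p++;                          /* inc eax */
    } while (p[-1] != 0);             /* cmp byte [eax-1], 0 / jne .scan */
    return (size_t)(p - s) - 1;       /* sub eax, [esp+4] / dec eax */
}

/* Model of strlen_2: the index starts at -1 so that the first
   increment brings it to 0. */
size_t strlen_idx(const char *s)
{
    size_t i = (size_t)-1;            /* xor eax, eax / dec eax */
    do {
        i++;                          /* inc eax */
    } while (s[i] != 0);              /* cmp byte [ecx+eax], 0 / jne .scan */
    return i;
}
```

Both models return the length without the terminator, including 0 for an empty string, which suggests the final DEC in strlen_1 and the initial DEC in strlen_2 are placed correctly.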
HaHaAnonymous 07 Feb 2013, 12:34
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 21:41; edited 1 time in total
Asm++ 07 Feb 2013, 18:04
f0dder wrote:
Probably because it's the starting mode for x86 processors? And because one of the most widespread uses for binary output was .com files?

Yes, maybe, but I think it should be changed, as we are no longer in the DOS era. 8)

f0dder wrote:
I added a couple of strlen routines in the zip above, might as well post them inline as well

Thanks for your time, I will check them out.

HaHaAnonymous wrote:
Asm++ wrote:

I think there are more important things to invest one's time in than sitting down and counting bytes! ;)

That is important; a wrong calculation can lead to bugs, crashes, exceptions and other problems (if any).

The time is mine, let me spend it the way I want to. You people just keep caring about useless details (such as the license change) and now you throw stones at me for this (detail).

I can't understand.

Hi HaHaAnonymous,
The calculations are done by the assembler, so there is no need to do them yourself (at least in normal cases).

I don't think I said anything wrong :? It is just MY OPINION about how to invest my own time. Your time is yours, and you are free to spend it as you want. NO ONE SAID THAT YOU HAVE TO DO THINGS THE WAY OTHERS WANT, AND NO ONE "threw stones" at you; this is just a discussion. :)

f0dder 07 Feb 2013, 18:18
Asm++ wrote:
f0dder wrote:
Probably because it's the starting mode for x86 processors? And because one of the most widespread uses for binary output was .com files?

Yes, maybe, but I think it should be changed, as we are no longer in the DOS era. 8)

Dunno about that - a 16-bit binary file is probably more generally useful, since it can be executed as a .com file on a bunch of platforms. I don't know of anything commonly used that'll eat a flat 32-bit binary. :)

HaHaAnonymous 07 Feb 2013, 19:04
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 21:39; edited 1 time in total
ly47 08 Feb 2013, 09:39
Hi,

Here is a StringFind (StringSearch) function.
Returns:
* Found: eax = start position of the match, ebx = position just past its end.
* Not found: eax = -1.
Based on RosAsm.

ly

Code:
; format PE GUI 4.0

include "win32ax.inc"

macro showhex caption,value {
        local .over,.str
        jmp     .over
        .str    db caption," = %08Xh",0
 .over: pushad
        mov     ebx,value
        stdcall [GlobalAlloc],GMEM_MOVEABLE+GMEM_ZEROINIT,1000h
        push    eax
        push    eax
        stdcall [GlobalLock],eax
        push    eax
        ccall   [wsprintf],eax,.str,ebx
        pop     eax
        stdcall [MessageBox],0,eax,0,MB_OK+MB_ICONASTERISK+MB_APPLMODAL
        call    [GlobalUnlock]
        call    [GlobalFree]
        popad
}

macro sra {
        showhex 'eax',eax
}

macro srb {
        showhex 'ebx',ebx
}

macro src {
        showhex 'ecx',ecx
}

macro srd {
        showhex 'edx',edx
}

macro sredi {
        showhex 'edi',edi
}

macro sresi {
        showhex 'esi',esi
}

.data
        source db "This is a kind of text. This kind is a big fat kind",0
        search db "kind",0
        search2 db 'is a', 0
        replace db "line",0
        dest dd ?

.code
main:
        invoke MessageBox, 0, source, "", 0

        stdcall StringSearch, source, search, 1
        sra
        srb

        stdcall StringSearch, source, search2, 1
        sra
        srb

ret

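proc StringSearch uses ecx edx esi edi, Buffer, Find, First

    ; Note: NextBytePos is a stack local, so its value does not survive
    ; between calls; only First = 1 (search from the start) is reliable.
    local NextBytePos dd ?

        cmp byte[First],1
        jne @F
        mov [NextBytePos], -1
        @@:
        mov esi, [Buffer]
        mov edi, [Find]
        mov eax, 0
        mov edx, -1             ; edx = start of the current partial match, -1 = none
        mov ecx, [NextBytePos]

    .L0: inc ecx
        mov bl, byte[edi]       ; test the pattern first, so that a match
        cmp bl, 0               ; ending exactly at the end of Buffer is found
        je .L9
        mov al, byte[esi+ecx]
        cmp al, 0
        je .L9

            .if al = bl
                cmp edx, -1
                jne @F
                mov edx,ecx     ; remember where the match started
              @@:
                inc edi
                jmp .L0
            .elseif edx <> -1
                mov ecx,edx     ; partial match failed: resume right after its start
                mov edx,-1
            .endif
            mov edi,[Find]
            jmp .L0

    .L9: .if bl = 0             ; whole pattern consumed
            mov eax, edx        ; eax = start of the match (-1 if Find is empty)
            mov ebx, ecx        ; ebx = index just past the match
        .else
            mov eax, -1         ; end of Buffer reached first: not found
        .endif

        mov [NextBytePos],ecx
        ret
endp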

.end main
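
For comparison, the same contract (start position returned, end position handed back separately, -1 when nothing is found) can be sketched in C. This is not a translation of the routine above, just a plain brute-force search under the assumption that both strings are zero-terminated. The function name and the explicit 'from' parameter (in place of First/NextBytePos) are made up for the illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical C counterpart of StringSearch: looks for 'find' in
   'buffer', starting at index 'from'. On success it returns the start
   index of the match and stores the index just past the match in *end.
   It returns -1 when there is no match or 'find' is empty. */
long string_search(const char *buffer, const char *find, size_t from, size_t *end)
{
    size_t blen = strlen(buffer);
    size_t flen = strlen(find);

    /* Nothing to find, or not enough room left for a match. */
    if (flen == 0 || flen > blen || from > blen - flen)
        return -1;

    for (size_t i = from; i <= blen - flen; i++) {
        if (memcmp(buffer + i, find, flen) == 0) {
            *end = i + flen;          /* same meaning as ebx above */
            return (long)i;           /* same meaning as eax above */
        }
    }
    return -1;
}
```

To continue a search, the caller passes the previous end position as 'from', which avoids keeping any state inside the function.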

    