flat assembler
Message board for the users of flat assembler.

new version of dynamic string library

decard 19 Sep 2003, 05:20
Here's a new version of John Found's dynamic string library. It is part of the Fresh project, but these routines should be useful for many Win32 apps. It is incompatible with the previous version, but IMO the library itself benefited from those changes...

What exactly was modified:
- all functions are now stdcall (except for NumToStr, but that will change in the next release too)
- they return their values in eax
- three new functions were added
More routines coming soon :)

What do you think about it? Please post your opinions here.

I hope you will find it useful.

(current version is posted below)


Last edited by decard on 25 Sep 2003, 16:42; edited 4 times in total

roticv 19 Sep 2003, 08:26
Replacing string opcodes with branches? I think string opcodes are slow..

decard 19 Sep 2003, 13:05
Hi roticv,
I have looked into some documents about optimizing assembly code and realized that you're right. Your version of StrLen should run faster on Pentium and above :) (I don't know about older machines). I don't have enough experience with more complex optimization (all those U and V pipes, branch prediction... too difficult for me to deal with right now), and since I think optimizing string operations is very important, maybe you would be a better person to take over the StrLib? What do you think?

regards,
decard

scientica 19 Sep 2003, 13:15
I have an (old) tool which shows how the code will pair. I'll see if I can find it on the net (if not, I'll upload it, unless I find some text that prohibits it).

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself."
- Bradley Kuhn

roticv 19 Sep 2003, 13:15
Try this
Code:
strlen:
        mov     ecx, [esp+4]            ; first parameter: pointer to the string
code_base:
        mov     eax, 1
        cpuid                           ; caution: cpuid overwrites eax, ebx, ecx and edx
        test    edx, 800000h            ; bit 23 of edx: MMX supported
        db      2Eh                     ; branch prediction hint: not taken
        jz      no_mmx_code
mmx_code:
    @@:                                 ; byte scan until ecx is 8-byte aligned
        mov     al, byte [ecx]
        inc     ecx
        test    al, al
        je      done
        test    ecx, 7
        jne     @b
        pxor    mm0, mm0
    @@:                                 ; check 48 bytes per iteration for a zero byte
        movq    mm1, qword [ecx]
        movq    mm2, qword [ecx + 8]
        movq    mm3, qword [ecx + 16]
        movq    mm4, qword [ecx + 24]
        movq    mm5, qword [ecx + 32]
        movq    mm6, qword [ecx + 40]
        pcmpeqb mm1, mm0
        pcmpeqb mm2, mm0
        pcmpeqb mm3, mm0
        pcmpeqb mm4, mm0
        pcmpeqb mm5, mm0
        pcmpeqb mm6, mm0
        por     mm1, mm2
        por     mm3, mm4
        por     mm5, mm6
        por     mm1, mm3
        por     mm1, mm5
        add     ecx, 48
        packsswb mm1, mm1
        movd    eax, mm1
        test    eax, eax
        jz      @b
        sub     ecx, 48                 ; step back to the block that holds the zero
        emms
no_mmx_code:
        cmp     byte [ecx], 0
        lea     ecx, [ecx+1]            ; lea leaves the flags from cmp intact
        jnz     no_mmx_code
done:                                   ; ecx points one past the terminator
        sub     ecx, [esp+4]
        xchg    eax, ecx
        dec     eax                     ; return value in eax
        ret     4                       ; stdcall: callee removes the argument
    

Don't worry about optimisation, we will optimise it as we go along...

Anyway, I hope you don't mind if I remove the stack frame. I do not see the need for a stack frame in string functions.

Don't mind any mistakes that pop up. I was coding in Notepad and did not attempt to compile the code. :oops:
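
For readers who prefer to see the idea outside of assembly, here is a minimal C sketch of the same three-phase structure as the routine above: scan byte by byte until the pointer is 8-byte aligned, test one aligned 8-byte block at a time for a zero byte, then find the terminator inside the last block. The name strlen_wordwise is hypothetical, and the zero-byte test uses a well-known integer bit trick in place of pcmpeqb, so treat it as an illustration and not a translation of the posted code. Like the MMX loop, it may read a few bytes past the terminator (here never beyond the aligned 8-byte block that contains it), which strict C does not allow for arbitrary arrays.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch of the align / block-scan / byte-scan idea. */
static size_t strlen_wordwise(const char *s)
{
    const char *p = s;

    /* Phase 1: scan byte by byte until p is 8-byte aligned. */
    while (((uintptr_t)p & 7u) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Phase 2: test 8 bytes per iteration for a zero byte. */
    for (;;) {
        uint64_t v;
        memcpy(&v, p, sizeof v);            /* aligned 8-byte load */
        /* This expression is nonzero exactly when some byte of v is zero. */
        if ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL)
            break;
        p += 8;
    }

    /* Phase 3: locate the terminator inside the final block. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

The design trade-off is the one discussed in this thread: the block scan is faster on long strings, but it costs extra code and needs care near the end of the buffer.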

scientica 19 Sep 2003, 13:19

roticv 19 Sep 2003, 13:37
Okay, I just realised that I made a tiny mistake, since ebx and ecx are modified by cpuid (damnable).
Code:
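strlen:
code_base:
        mov     eax, 1
        push    ebx                     ; cpuid overwrites ebx, so save it
        cpuid
        mov     ecx, [esp+8]            ; first parameter (+4 for the saved ebx)
        test    edx, 800000h            ; bit 23 of edx: MMX supported
        db      2Eh                     ; branch prediction hint: not taken
        jz      no_mmx_code
mmx_code:
    @@:                                 ; byte scan until ecx is 8-byte aligned
        mov     al, byte [ecx]
        inc     ecx
        test    al, al
        je      done
        test    ecx, 7
        jne     @b
        pxor    mm0, mm0
    @@:                                 ; check 48 bytes per iteration for a zero byte
        movq    mm1, qword [ecx]
        movq    mm2, qword [ecx + 8]
        movq    mm3, qword [ecx + 16]
        movq    mm4, qword [ecx + 24]
        movq    mm5, qword [ecx + 32]
        movq    mm6, qword [ecx + 40]
        pcmpeqb mm1, mm0
        pcmpeqb mm2, mm0
        pcmpeqb mm3, mm0
        pcmpeqb mm4, mm0
        pcmpeqb mm5, mm0
        pcmpeqb mm6, mm0
        por     mm1, mm2
        por     mm3, mm4
        por     mm5, mm6
        por     mm1, mm3
        por     mm1, mm5
        add     ecx, 48
        packsswb mm1, mm1
        movd    eax, mm1
        test    eax, eax
        jz      @b
        sub     ecx, 48                 ; step back to the block that holds the zero
        emms
no_mmx_code:
        cmp     byte [ecx], 0
        lea     ecx, [ecx+1]            ; lea leaves the flags from cmp intact
        jnz     no_mmx_code
done:                                   ; ecx points one past the terminator
        pop     ebx
        sub     ecx, [esp+4]            ; ebx is popped, so the parameter is at +4 again
        xchg    eax, ecx
        dec     eax                     ; return value in eax
        ret     4                       ; stdcall: callee removes the argument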

    


Last edited by roticv on 19 Sep 2003, 13:58; edited 2 times in total

Tomasz Grysztar 19 Sep 2003, 13:49
A slightly off-topic note: you are using the following construction in your code (a feature of the latest Intel processors):
Code:
   db 2Eh ; branch prediction hint: not taken
   jz   no_mmx_code

It's enough to write it this way:
Code:
cs jz no_mmx_code    

and if you want it to be more logical, you can define some aliases for this purpose, for example:
Code:
lt equ ds ; likely taken
ut equ cs ; unlikely taken

ut jz no_mmx_code
    

or even define them as macros, to allow them as prefixes only...

JohnFound 19 Sep 2003, 14:03
Hi, guys.

IMO: we don't need speed optimization, especially in exchange for size. Maybe later we will make some ultra-fast libraries. Writing the strlib without string instructions (mov al, [esi]/inc esi instead of lodsb) is good, because it doesn't bloat the code much and it's easy for beginners to read, but duplicating routines with and without MMX is not a good idea, I think.

Regards.

decard 19 Sep 2003, 14:47
Well... I agree that for now we shouldn't optimize the string functions with MMX (BTW: doesn't the cpuid make StrLen too slow?... or maybe not?), so in the next release StrLen will start from "no_mmx_code:"...

roticv, IMO you are right that there is no need for a stack frame... I was simply converting those routines to stdcall with macros :) It will be fixed in the next release.

regards

JohnFound 19 Sep 2003, 15:25
decard wrote:
roticv, IMO you are right that there is no need for a stack frame... I was simply converting those routines to stdcall with macros :) It will be fixed in the next release.
regards


Of course roticv is right, but please, please, please :) keep the source readable. It's very important. Describe the parameters very clearly, including which [esp+???] corresponds to which parameter. Note that if you use the stack, the offset of the same parameter will be different in different places of the routine, which invites bugs.

regards.

decard 20 Sep 2003, 10:59
When I was testing the stdcall version of StrLib, I forgot about StrDel... and of course it had one stupid bug. It's fixed now, and the release now includes roticv's version of StrLen.

BTW: what do you think about the NumToStr routine: is it better to create two functions (one for unsigned numbers and one for signed), or to code one function with an additional parameter that specifies whether to treat the number as signed or unsigned?

scientica 20 Sep 2003, 11:44
Just one word from me: will you write a "wrapper" for the str functions, so that registers are preserved (while still leaving the register parameter passing version available, for compatibility and "hand in hand code")?

decard 20 Sep 2003, 11:56
OK, no problem... by compatibility you mean preserving the function name? I wanted NumToStr to be the name of the wrapper function, but you are the StrLib user :D :D

But what about my question?

scientica 20 Sep 2003, 12:05
Suggestion for wrapper names:
StrNum and StrNumU
IMO it's better to have two functions rather than one with an argument specifying whether it's a signed or unsigned number.

JohnFound 20 Sep 2003, 12:16
decard wrote:
BTW: what do you think about the NumToStr routine: is it better to create two functions (one for unsigned numbers and one for signed), or to code one function with an additional parameter that specifies whether to treat the number as signed or unsigned?


There are two NumToStr functions: NumToStr (signed) and NumToStrU (unsigned). I think that we must rename these functions to _NumToStr and _NumToStrU and write a wrapper function:

Code:
ntsSigned         = $00000
ntsUnsigned       = $10000
ntsZeroTerminated = $20000
ntsFixedWidth     = $40000

ntsBin  = $02
ntsQuad = $04
ntsOct  = $08
ntsDec  = $0a
ntsHex  = $10

;***********************************************************
; NumToStr - converts a number to any radix.
; num - number to convert
; str - handle of the string. If NULL, a new string is created.
; index - offset in the string where the converted number is put.
; flags:
;   byte 0 - contains the radix for the conversion.
;   byte 1 - number of digits if ntsFixedWidth is set.
;   bytes 2,3 - flags.
; Returns:
;   eax - handle of the string (new one or passed in [str])
;   edx - pointer to the string.
;
;***********************************************************
proc NumToStr, num, str, index, flags

; Example of use:

        stdcall  NumToStr, $12345, NULL, 1, ntsUnsigned or ntsHex
        mov      byte [edx], '$'
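
To make the proposed flags layout concrete, here is a minimal C sketch of how such a wrapper could decode its flags argument. The nts* values are copied from the post; everything else is an assumption made for illustration: the name num_to_str, the NTS_WIDTH helper, a plain caller-supplied char buffer instead of a StrLib string handle, and the choice to keep only the low-order digits when a fixed width is too small.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag values copied from the post; the rest is a hypothetical sketch. */
enum {
    ntsSigned         = 0x00000,
    ntsUnsigned       = 0x10000,
    ntsZeroTerminated = 0x20000,
    ntsFixedWidth     = 0x40000
};
enum { ntsBin = 0x02, ntsQuad = 0x04, ntsOct = 0x08, ntsDec = 0x0a, ntsHex = 0x10 };

/* Hypothetical helper: places the digit count into byte 1 of the flags. */
#define NTS_WIDTH(n) (((uint32_t)(n) & 0xFFu) << 8)

/* Writes num into buf starting at index and returns the number of
   characters written, not counting an optional terminator.
   The caller must provide a large enough buffer. */
static size_t num_to_str(uint32_t num, char *buf, size_t index, uint32_t flags)
{
    static const char digits[] = "0123456789ABCDEF";
    uint32_t radix = flags & 0xFFu;            /* byte 0: radix */
    uint32_t width = (flags >> 8) & 0xFFu;     /* byte 1: digit count */
    char tmp[32];                              /* 32 binary digits at most */
    size_t n = 0, out = index;
    int negative = 0;

    if (radix < 2 || radix > 16)
        return 0;
    if (!(flags & ntsUnsigned) && (num & 0x80000000u)) {
        negative = 1;
        num = 0u - num;                        /* magnitude of the signed value */
    }
    do {                                       /* least significant digit first */
        tmp[n++] = digits[num % radix];
        num /= radix;
    } while (num != 0);

    if (negative)
        buf[out++] = '-';
    if (flags & ntsFixedWidth) {
        size_t pad;
        if (n > width)
            n = width;                         /* assumption: keep low-order digits */
        for (pad = width - n; pad > 0; pad--)
            buf[out++] = '0';                  /* left-pad with zeros */
    }
    while (n > 0)
        buf[out++] = tmp[--n];                 /* emit digits in reading order */
    if (flags & ntsZeroTerminated)
        buf[out] = '\0';
    return out - index;
}
```

Mirroring the assembly example above, calling num_to_str(0x12345, buf, 1, ntsUnsigned | ntsHex | ntsZeroTerminated) and then setting buf[0] = '$' leaves "$12345" in buf.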
    
decard 20 Sep 2003, 13:03
Well, John, I like your idea. That would be a very powerful routine... To get more 'specific' routines we could use some macros...
But what about the 'ntsZeroTerminated' flag? What would be its purpose?

regards

JohnFound 20 Sep 2003, 13:11
decard wrote:
But what about the 'ntsZeroTerminated' flag? What would be its purpose?


When you convert a number to a string, you rarely need a plain string with only that one number in it. In most cases you need to insert the number into some other string that contains other text. That is why I removed the zero terminator from the original NumToStr functions. The [index] argument is there for the same reason. Of course you can build a plain number string and then use other string functions to concatenate it with any other string, but in most cases that is not optimal.
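
A small C sketch may help show why an unterminated conversion is convenient here. The helper put_decimal is hypothetical (it is not part of StrLib); it writes decimal digits at a given index and deliberately does not append a terminator, so the text that follows the number survives.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the decimal digits of value at dst[index...]
   and does NOT append a '\0'. Returns the number of digits written. */
static size_t put_decimal(uint32_t value, char *dst, size_t index)
{
    char tmp[10];                   /* a 32-bit value has at most 10 digits */
    size_t n = 0, i;

    do {                            /* least significant digit first */
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    for (i = 0; i < n; i++)         /* copy in reading order */
        dst[index + i] = tmp[n - 1 - i];
    return n;
}
```

With a buffer holding "Found ___ files", calling put_decimal(128, line, 6) overwrites only the three placeholder characters. Had the routine written a terminator after the digits, the trailing " files" would have been cut off and a separate concatenation step would be needed.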

regards.

decard 20 Sep 2003, 13:21
Sounds good :D, so I'm starting to code it... thanks!

roticv 20 Sep 2003, 16:40
Attached are one StrLCase, one StrUCase and one StrCopyMMX. One thing is that I preserved all the registers that were used; uncomment that if the register preservation is not needed. :/ Grr.. irritating... txt file not accepted.


Attachment: MoreStringFunctions.zip (649 bytes)
Description: Additional string functions

