flat assembler
Message board for the users of flat assembler.

Splitting a String




Nameless 04 May 2010, 14:35
I wrote this code to get the second part of the string, but I can't get the first part :(
Any ideas?
Code:
format PE GUI 4.0

include 'win32ax.inc'


.data
 lpBuf         db "a test|b test",0
 szBuffer      rb 1024

.code

start:
   mov ebx, lpBuf
   jmp split

  next_byte:
   add ebx, 1

  split:
   mov dl, byte[ebx]
   cmp dl, 124 ;ASCII of |
   jne next_byte
   add ebx, 1

   invoke MessageBox, 0, ebx, " ", 0
.end start
    


PS: I hope you don't mind having a super n00b among you. I tried to read most topics made by other members and none matched my level; they're all about OS development and I'm asking about strings :S
Just PM me if I'm embarrassing this board.



Nameless 04 May 2010, 15:44
OK, a little update.
It took a while to write this function; I'm trying as hard as I can.

Code:
format PE GUI 4.0

include 'win32ax.inc'


.data
 lpBuf         db "a test|b test",0
 szBuffer      rb 1024
 szBuffer2     rb 1024
 sChar               db      128 dup(?)

.code

proc Split, lpBuffer, Str1, Str2
  pusha

  mov ebx, [lpBuffer]
  jmp split

  next_byte:
   mov dl, byte[ebx]
   mov [sChar], dl
   invoke lstrcat, [Str1], sChar
   add ebx, 1

  split:
   mov dl, byte[ebx]
   cmp dl, 124 ;ASCII of |
   jne next_byte
   add ebx, 1
   invoke lstrcat, [Str2], ebx

  popa
  ret
endp

start:

   stdcall Split, lpBuf, szBuffer, szBuffer2
   invoke MessageBox, 0, szBuffer, szBuffer, 0
.end start
    


Any hints or tips about it? Size? Calls? Anything is really appreciated.



baldr 04 May 2010, 16:04
Nameless,

Forget about strings, think bytes. Your function must copy part of the buffer at [lpBuffer] (up to '|') into the buffer at [Str1], and the rest into the buffer at [Str2]? Then copy bytes, don't append single-byte zstrings (what if [Str1] points to a non-empty string?).

You must also handle the case where '|' doesn't appear in the zstring at [lpBuffer].
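
A minimal sketch of that byte-copy approach, for illustration only. The return convention (eax = 1 if '|' was found, 0 otherwise) and the choice to leave [Str2] empty when there is no delimiter are assumptions made for this example, not something specified in the thread. It also assumes both destination buffers are large enough, since no bounds are checked.
Code:
format PE GUI 4.0

include 'win32ax.inc'

.data
 lpBuf         db "a test|b test",0
 szBuffer      rb 1024
 szBuffer2     rb 1024

.code

; Copies the bytes before '|' to [Str1] and the bytes after it to [Str2].
; Returns eax = 1 if '|' was found, eax = 0 if not (assumed convention).
proc Split uses esi edi, lpBuffer, Str1, Str2
        cld                             ; make lodsb/stosb move forward
        mov     esi, [lpBuffer]
        mov     edi, [Str1]
  .first:
        lodsb                           ; al = next source byte
        test    al, al
        jz      .no_delimiter           ; end of string before any '|'
        cmp     al, '|'
        je      .found
        stosb                           ; copy the byte into Str1
        jmp     .first
  .found:
        mov     byte [edi], 0           ; terminate Str1
        mov     edi, [Str2]
  .second:
        lodsb
        stosb                           ; copy the rest, including the final 0
        test    al, al
        jnz     .second
        mov     eax, 1
        ret
  .no_delimiter:
        mov     byte [edi], 0           ; Str1 holds the whole string
        mov     edi, [Str2]
        mov     byte [edi], 0           ; Str2 is left empty
        xor     eax, eax
        ret
endp

start:
        stdcall Split, lpBuf, szBuffer, szBuffer2
        invoke  MessageBox, 0, szBuffer2, szBuffer, 0
        invoke  ExitProcess, 0
.end start

Because the bytes are written straight into the destination, whatever was in [Str1] or [Str2] before the call does not matter, and nothing has to be zeroed first.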



Nameless 04 May 2010, 17:01
How should I do that?
Sorry, I'm too much of a n00b here.

I also tried StrStr and StrChr. StrChr didn't work, and StrStr returned only the second part of the string.

Edit: is this better?
Code:
proc Split, lpBuffer, Str1, Str2
  pusha
  mov ebx, [lpBuffer]

  invoke lstrlen, [Str1]
  invoke RtlZeroMemory, [Str1], eax
  invoke lstrlen, [Str2]
  invoke RtlZeroMemory, [Str2], eax

 .split:
   mov dl, byte[ebx]
   mov [sChar], dl
   invoke lstrcat, [Str1], sChar
   inc ebx
   cmp byte[ebx], 124 ; 124 = ASCII of |
   jne .split
   inc ebx
   invoke lstrcat, [Str2], ebx

  popa
  ret
endp      



edemko 04 May 2010, 19:27
Code:
format pe gui 4.0
include 'win32ax.inc'








section '' code import readable writable executable
library advapi32, 'advapi32.dll',\
        kernel32, 'kernel32.dll',\
        user32,   'user32.dll'
include 'api\advapi32.inc'
include 'api\kernel32.inc'
include 'api\user32.inc'




BUF_START:
buf db 'a-b +c* d/e\f. g,h<i>j (k)[l]{m}&n:       o;^%$#p     ''"q!r`  s~t^$ uvwxyz',0,'hello',13,13,13,10,10,9,9,9,0,'http://www.board.flatassembler.net',0
BUF_END:

entry $

        cld
        mov     esi,buf
        mov     edi,esi
        mov     ecx,BUF_END-BUF_START
  .loop:
        lodsb
        test    al,al
        jnz     .more_word_breaks
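  .word_break:
        lea     eax,[edi+1]
        neg     eax
        add     eax,esi                 ; eax = esi-(edi+1) = length of the current word
        cmovz   edi,esi                 ; empty word: just move the word start forward
        jz      .continue               ; (cmovz leaves the flags set by add intact)
        ;; do something of your own with the word at edi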
        mov     byte[esi-1],0
        push    ecx
        invoke  MessageBoxA,0,edi,0,0
        pop     ecx
        ;;
        mov     edi,esi
        jmp     .continue
  .more_word_breaks:
        cmp     al,9
        je      .word_break
        cmp     al,10
        je      .word_break
        cmp     al,13
        je      .word_break
        cmp     al,32
        je      .word_break
  .continue:
        loop    .loop

        invoke  ExitProcess,0

    



Picnic 04 May 2010, 20:19
Hi Nameless,

Here is a minimal implementation of a routine which slices a null-terminated string into portions.
Code:
Tokenizer:
            push edi
            mov ah, al
@@:         mov al, [esi]
            test al, al
            je .fin
            cmp al, ah
            je @F
            mov [edi], al
            inc edi
            inc esi
            jmp @B
@@:         inc esi
.fin:       mov BYTE [edi], 0
            pop edi
            ret


           source db 'Alpha|Bravo|Charlie|Delta',0
           dest rb 256
    


Extract the 1st portion of string in which all words are separated by |

Code:
            mov al, '|'                 ; The delimiter
            mov esi,  source             ; Source string's offset
            mov edi,  dest               ; Destination string's offset
            call Tokenizer              ; Tokenize the string (1st portion)

            ; dest = 'Alpha'
    


Extract the 2nd portion of string in which all words are separated by |
Code:
            mov al, '|'
            mov esi,  source
            mov edi,  dest
            call Tokenizer           ; Tokenize the string (1st portion)
            call Tokenizer           ; Tokenize the string again (2nd portion)

            ; dest = 'Bravo'
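
To walk through every portion rather than a fixed number of them, the same routine can be called in a loop until esi reaches the terminating zero. The following is only a sketch built around the Tokenizer routine above (repeated here so the program is complete); the MessageBox call stands in for whatever you want to do with each portion. Note that al is reloaded before each call, because Tokenizer overwrites it.
Code:
format PE GUI 4.0

include 'win32ax.inc'

.data
 source db 'Alpha|Bravo|Charlie|Delta',0
 dest   rb 256

.code

Tokenizer:
            push edi
            mov ah, al
@@:         mov al, [esi]
            test al, al
            je .fin
            cmp al, ah
            je @F
            mov [edi], al
            inc edi
            inc esi
            jmp @B
@@:         inc esi
.fin:       mov BYTE [edi], 0
            pop edi
            ret

start:
            mov esi, source             ; esi walks through the source string
  .next:
            cmp byte [esi], 0           ; nothing left to read?
            je .done
            mov al, '|'                 ; reload the delimiter for every call
            mov edi, dest
            call Tokenizer              ; dest = current portion, esi = start of the next one
            invoke MessageBox, 0, dest, dest, 0   ; esi survives the API call
            jmp .next
  .done:
            invoke ExitProcess, 0
.end start

Two adjacent delimiters produce an empty portion, and a delimiter at the very end of the string does not produce a trailing empty one; whether that is the behaviour you want depends on your data.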
    



Nameless 04 May 2010, 22:25
Thanks a lot :D

Now I'm going to study this and see how it works.
It's kind of complicated for me.



Tyler 05 May 2010, 01:53
Code:
format PE GUI 4.0

include 'win32ax.inc'


.data
 lpBuf         db "a test|b test",0

.code

proc Split, lpBuffer
  pushad

  mov esi, [lpBuffer]   ; the brackets load the pointer passed as the argument
  .nextbyte:
     lodsb
     ; You may want to add a check for 0 here, in case the string contains no '|'
     cmp al,'|'
     jne .nextbyte
  ;esi contains pointer to byte after '|'
  invoke MessageBox, 0, esi, esi, 0
  popad
  ret
endp

start:

   stdcall Split, lpBuf
   invoke ExitProcess, 0
.end start
    

I don't use procs, but I don't think you have to write the ret or the push/pops yourself; unless I'm mistaken, proc does that for you.

Nameless wrote:

I hope you don't mind having a super n00b among you. I tried to read most topics made by other members and none matched my level; they're all about OS development and I'm asking about strings :S
Just PM me if I'm embarrassing this board.

Nah dude, if they've been willing to put up with ~300 posts by me(and believe me, I've asked WAY worse), you'll be fine.


revolution 05 May 2010, 02:35
Tyler wrote:
I don't use procs, but I don't think you have to write the ret or the push/pops yourself; unless I'm mistaken, proc does that for you.
The proc macro will only push/pop things if you tell it to. It is not automatic.
Code:
proc Split uses esi, lpBuf    



Nameless 05 May 2010, 16:00
Tyler wrote:
Nah dude, if they've been willing to put up with ~300 posts by me(and believe me, I've asked WAY worse), you'll be fine.
That's comforting :D

Also, now I'm going to try with arrays (I don't know yet whether they exist in FASM), so that if the string can be split into more than 2 pieces (a test|b test|c test|d test) it will split them all and fill the array.

Just an idea: why isn't there some sort of snippets base here?

Thanks a lot for helping me :D You don't know how much I learned from your posts :)



Tyler 05 May 2010, 21:20
You've already used an array: the string you used is an array of chars. ;)



Nameless 06 May 2010, 03:47
So I need an array of arrays then, lol



baldr 06 May 2010, 05:16
Nameless,

Assembler is not fundamentally different from HLLs in data structures, and it probably can't be: the difficulty arises from the fact it doesn't hide implementation details under some abstraction veil as HLLs do. From this comes great power (and great confusion): while HLL can operate similarly with values of distinct types (e.g. ShortString, AnsiString and WideString in Delphi behave likewise, yet they have different physical representations), assembly language requires exact instructions to handle them.

Back to the topic: the choice of data structure depends on its usage. For sequential access, an array of substring lengths can be sufficient; random access may require a more elaborate approach. Specify exactly what your function expects as arguments and what result it should produce. Essentially, the task could be done with successive calls to a function that returns a pointer to the current token and a pointer to the rest of the string (you may look at the specification of the strtok() C library function).
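
As a rough illustration of the "successive calls" idea, here is a sketch of such a function. Everything about its interface is an assumption made for this example: the name NextToken, returning the current token in eax and the rest of the string in edx (0 when nothing is left), and cutting the string in place by overwriting the delimiter with 0. Unlike strtok() it keeps no hidden state and does not skip empty tokens. The caller stores the returned pointers in a small fixed-size array, which gives random access to the pieces.
Code:
format PE GUI 4.0

include 'win32ax.inc'

DELIM      = '|'        ; numeric constant, so the macros do not treat it as a string
MAX_TOKENS = 16         ; arbitrary capacity chosen for this sketch

.data
 text   db 'a test|b test|c test|d test',0
 tokens rd MAX_TOKENS   ; array of pointers to the tokens

.code

; Returns eax = pointer to the current token (now zero-terminated),
;         edx = pointer to the rest of the string, or 0 if this was the last token.
proc NextToken uses esi, lpString, delimiter
        mov     esi, [lpString]
        mov     eax, esi                ; the token starts where the string starts
        mov     dl, byte [delimiter]
  .scan:
        mov     cl, [esi]
        test    cl, cl
        jz      .last                   ; end of string: no more tokens after this one
        inc     esi
        cmp     cl, dl
        jne     .scan
        mov     byte [esi-1], 0         ; overwrite the delimiter to cut the token
        mov     edx, esi                ; the rest begins right after it
        ret
  .last:
        xor     edx, edx
        ret
endp

start:
        mov     edx, text
        xor     ebx, ebx                ; ebx = number of tokens stored so far
  .next:
        cmp     ebx, MAX_TOKENS
        jae     .show                   ; array full: stop splitting
        stdcall NextToken, edx, DELIM
        mov     [tokens+ebx*4], eax
        inc     ebx
        test    edx, edx
        jnz     .next
  .show:
        xor     esi, esi                ; esi = index of the token to display
  .print:
        invoke  MessageBox, 0, [tokens+esi*4], [tokens], 0
        inc     esi
        cmp     esi, ebx
        jb      .print
        invoke  ExitProcess, 0
.end start

Storing pointers costs 4 bytes per token and requires the source buffer to be writable; an array of lengths, as mentioned above, would leave the source untouched but only suits sequential access.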
Copyright © 1999-2025, Tomasz Grysztar.