flat assembler
Message board for the users of flat assembler.

Variable Length String

Masood.Sandking



Joined: 12 Jan 2012
Posts: 65
Location: Iran
Hi assemblers...!

In this program:

Code:

include 'win32ax.inc'

macro strcpy stra , strb
{
   local label1
   local label2
   mov ecx,0
   jmp label2
   label1:
   inc ecx
   label2:
   mov al,[strb+ecx]
   mov [stra+ecx],al
   cmp [strb+ecx],0
   jne label1
}

.code

  start:
        strcpy str1 , str3
        strcpy str1 , str2
        invoke  MessageBox,HWND_DESKTOP,str1,invoke GetCommandLine,MB_OK
        invoke  ExitProcess,0

.end start

.data

str1 db 'hello',0
str2 db 'harry',0
str3 db 'jupiter',0    


The output is "r", because str1, str2 and str3 sit right next to each other in memory, so copying a longer string into str1 overwrites str2.
How can I solve this problem? What is the solution in high-level languages?
I want to implement variable-length strings later...
Post 13 Sep 2012, 18:39
marcinzabrze12



Joined: 07 Aug 2011
Posts: 60
Run this code. It shows you what is happening:

Code:
include 'win32ax.inc'

macro strcpy stra , strb
{
   local label1
   local label2
   mov ecx,0
   jmp label2
   label1:
   inc ecx
   label2:
   mov al,[strb+ecx]
   mov [stra+ecx],al
   cmp [strb+ecx],0
   jne label1
}

.code

  start:
        strcpy str1 , str3
        invoke  MessageBox,HWND_DESKTOP,str2,invoke GetCommandLine,MB_OK
        strcpy str1 , str2
        invoke  MessageBox,HWND_DESKTOP,str1,invoke GetCommandLine,MB_OK
        invoke  ExitProcess,0

.end start

.data

str1 db 'hello',0
str2 db 'harry',0
str3 db 'jupiter',0  
    


Besides, in your macro you use cmp [memory],value and mov ecx,0.
Change them to cmp al,0 and xor ecx,ecx; it will be faster.
PS: why don't you just use the lstrcpy function (WinAPI)?
Also, assembly has special instructions for text operations:
movs*, stos*, lods* ... They are faster, and that matters because text operations are executed many times.


Last edited by marcinzabrze12 on 13 Sep 2012, 19:09; edited 2 times in total
Post 13 Sep 2012, 19:01
typedef



Joined: 25 Jul 2010
Posts: 2913
Location: 0x77760000
You can make your macro strict, meaning it fails if the destination buffer is shorter than the source string.

Or you can copy only as many bytes as fit in the destination buffer. If the function is meant for arbitrary strings you can leave it as is; otherwise you also have to account for the null byte.
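
The second option can be sketched as a bounded copy. This is only a sketch under assumptions: strcpyn and the str1.size constant are names made up for this example (they are not part of the fasm includes), and max must be at least 1.

Code:
```asm
include 'win32ax.inc'

; Hypothetical macro: copies at most max-1 characters from src to dest
; and always writes a terminating zero. Clobbers eax and ecx.
macro strcpyn dest, src, max
{
   local copy_next, copy_done
   push  esi edi
   mov   esi, src
   mov   edi, dest
   mov   ecx, max
   dec   ecx                    ; keep one byte for the terminator
   jz    copy_done
copy_next:
   mov   al, [esi]
   test  al, al
   jz    copy_done              ; end of the source string
   mov   [edi], al
   inc   esi
   inc   edi
   dec   ecx
   jnz   copy_next              ; stop when the buffer is full
copy_done:
   mov   byte [edi], 0
   pop   edi esi
}

.data
  str1 db 'hello',0
  str1.size = $ - str1          ; buffer size known at assembly time (6 bytes)
  str2 db 'harry',0
  str3 db 'jupiter',0

.code
  start:
        strcpyn str1, str3, str1.size
        invoke  MessageBox,HWND_DESKTOP,str1,str2,MB_OK   ; text "jupit", caption "harry"
        invoke  ExitProcess,0

.end start
```

The copy is truncated to the five characters that fit, and str2 is left untouched. Truncation avoids the overwrite but loses data, so it only helps when a shortened string is acceptable.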

PS: Didn't mama teach you how to indent code? ;)
Post 13 Sep 2012, 19:03
Masood.Sandking



Joined: 12 Jan 2012
Posts: 65
Location: Iran
Thanks for the answer...

'hello',0,'harry',0,'jupiter',0
=> 'jupiter',0,'rry',0,'jupiter',0
=> 'r',0,'piter',0,'rry',0,'jupiter',0

It happens because the last character of str1 sits right before the first character of str2. But what happens in high-level programming languages like Basic or Pascal? How can I fix this problem? I heard something about dynamic memory allocation... Do I need to do that?
Post 13 Sep 2012, 19:18
typedef



Joined: 25 Jul 2010
Posts: 2913
Location: 0x77760000
Another approach, but rather slow.

Code:
macro strcpy stra , strb
{
        local     label1

        push      edi  esi

        cld
        mov       esi, strb
        mov       edi, stra
label1:
        lodsb                    ; al = [esi], then esi = esi + 1
        stosb                    ; Or mov byte[edi], al / then inc edi
        test      al,   al       ; the terminating zero has been copied as well
        jnz       label1

        pop       esi  edi
}
    
Post 13 Sep 2012, 19:23
Masood.Sandking



Joined: 12 Jan 2012
Posts: 65
Location: Iran
I don't know what the lengths of those three strings will be at run time. I mean, I don't want fixed-length strings... Help me, guys...
Post 13 Sep 2012, 19:27
r22



Joined: 27 Dec 2004
Posts: 805
High-level languages use pointers to string objects, so assigning a string only changes a pointer.
Since the only operation you need is copy, you can do something a little simpler:

Code:

include 'win32ax.inc'

macro strcpy stra , strb
{
    MOV eax, [strb]
    MOV [stra], eax
}

.code

  start:
        strcpy pstr1 , pstr3
        strcpy pstr1 , pstr2
        invoke  MessageBox,HWND_DESKTOP,[pstr1],invoke GetCommandLine,MB_OK
        invoke  ExitProcess,0

.end start

.data

str1 db 'hello',0
str2 db 'harry',0
str3 db 'jupiter',0
pstr1 dd str1
pstr2 dd str2
pstr3 dd str3
    
Post 13 Sep 2012, 19:49
typedef



Joined: 25 Jul 2010
Posts: 2913
Location: 0x77760000
This is more of a proc than a macro. I'm not a macro guru like revolution.

And this is not tested by the way. It's just to give you ideas. There might be bugs in it.

Code:
macro valloc size {
           invoke VirtualAlloc, 0, size, MEM_RESERVE or MEM_COMMIT, PAGE_READWRITE
}
macro  free pointer {
           invoke VirtualFree, pointer, 0, MEM_RELEASE
}

;
; Returns string length in eax (clobbers ecx)
;
macro  strlen string {

       local   lbl_0
       local   lbl_1

       push    esi

       xor     ecx,  ecx            ; count in ecx, because lodsb overwrites al
       cld
       mov     esi,   string
lbl_0:
       lodsb
       cmp     al,    $00
       je      lbl_1
       inc     ecx

       jmp     lbl_0
lbl_1:
       mov     eax,   ecx
       pop     esi

}

;
; Copies strb over stra if it fits; otherwise copies it into a new
; VirtualAlloc buffer. Returns the address of the result in eax.
;
macro strcpy stra , strb
{
        local     label1
        local     copy

        push      edi   esi   ebx

        strlen    stra
        mov       ebx,  eax             ; ebx = length of the destination string
        strlen    strb                  ; eax = length of the source string

        mov       edi,  stra
        cmp       ebx,  eax
        jae       copy                  ; the source fits into the destination

        inc       eax                   ; one more byte for the terminating zero
        valloc    eax                   ; new pages are zero-filled; no error check here
        mov       edi,  eax

copy:
        mov       ebx,  edi             ; remember where the result starts
        mov       esi,  strb
label1:
        lodsb
        stosb                           ; the terminating zero is copied as well
        test      al,   al
        jnz       label1

        mov       eax,  ebx
        pop       ebx   esi   edi
}
    


Also shows you various ways of moving strings
Post 13 Sep 2012, 20:00
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17341
Location: In your JS exploiting you and your system
For most string related purposes using Virtual* functions is not a good choice. Consider using Heap* or Local* functions instead. They are a better fit for multiple small buffer allocations.

And remember that the Windows API has lstrcpy and related functions. You are reinventing the wheel if you write your own.
Post 13 Sep 2012, 20:27
AsmGuru62



Joined: 28 Jan 2004
Posts: 1412
Location: Toronto, Canada
I once took OllyDbg and traced lstrcpy. I did not like it: too many strange calls into other stuff, probably for safety. So I wrote my own CopyString. :)

Strings are fun to work with.
When I was writing a parser, I noticed a pattern in string allocation.
A lot of small strings had to be allocated in a loop and then freed.
So, to improve it, I allocated a big block (8 MB) and kept a pointer to the start of the block.
Whenever I needed some room, I simply moved the pointer forward. At the end of the loop, I moved it back to the block start.
I got almost a twofold speed-up over HeapAlloc.
There is a danger of overrunning these 8 MB on a huge file, but files that huge
are impractical. The danger is there, however.
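
This scheme is commonly called an arena (or bump) allocator. A rough sketch of it, assuming hypothetical names (arena_base, arena_ptr, arena_alloc) and ignoring alignment; unlike the description above, it checks the block limit instead of overrunning it:

Code:
```asm
include 'win32ax.inc'

ARENA_SIZE = 8*1024*1024                ; 8 MB, the block size mentioned above

.data
  arena_base dd 0                       ; start of the big block
  arena_ptr  dd 0                       ; next free byte inside the block
  text       db 'hello',0

.code

; Hypothetical helper: returns a pointer to nbytes fresh bytes in eax,
; or 0 when the block is exhausted. Clobbers ecx and edx.
proc arena_alloc nbytes
        mov     eax, [arena_ptr]
        mov     edx, eax
        add     edx, [nbytes]           ; edx = pointer after this allocation
        mov     ecx, [arena_base]
        add     ecx, ARENA_SIZE         ; ecx = end of the block
        cmp     edx, ecx
        ja      .full
        mov     [arena_ptr], edx        ; "allocate" by moving the pointer forward
        ret
  .full:
        xor     eax, eax
        ret
endp

  start:
        invoke  VirtualAlloc, 0, ARENA_SIZE, MEM_RESERVE+MEM_COMMIT, PAGE_READWRITE
        test    eax, eax
        jz      finish
        mov     [arena_base], eax
        mov     [arena_ptr], eax

        stdcall arena_alloc, 6          ; room for 'hello' plus the zero
        mov     ebx, eax
        invoke  lstrcpy, ebx, text
        invoke  MessageBox, HWND_DESKTOP, ebx, "arena", MB_OK

        mov     eax, [arena_base]       ; free everything at once: rewind the pointer
        mov     [arena_ptr], eax

        invoke  VirtualFree, [arena_base], 0, MEM_RELEASE
  finish:
        invoke  ExitProcess, 0

.end start
```

Individual strings cannot be freed in this design; everything is released together by rewinding the pointer, which is what makes it cheaper than a general-purpose heap.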
Post 14 Sep 2012, 02:05
typedef



Joined: 25 Jul 2010
Posts: 2913
Location: 0x77760000
AsmGuru62 wrote:
Strings are fun to work with.
When I was writing a parser, I noticed a pattern in string allocation.
A lot of small strings had to be allocated in a loop and then freed.
So, to improve it, I allocated a big block (8 MB) and kept a pointer to the start of the block


This sounds like a good idea. I have to try it out in my app.

So basically it's like a "workplace" for manipulating strings. Thanks for this idea.

I think it will fit with my app since the strings I'll be handling are not above 5KB.

Maybe 2 MB will work. But I'll see.
Post 14 Sep 2012, 03:35
Jimmus



Joined: 13 Sep 2012
Posts: 3
You could pick an arbitrary maximum length and reserve that much space:
Code:
str1 db 'hello',0
     rb 20 
str2 db 'harry',0 
str3 db 'jupiter',0 
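
The same idea can be applied to every string with a small macro. This is a sketch under assumptions: fixedstr and MAX_STR are names invented for this example, and the padding trick relies on rb rejecting a negative count.

Code:
```asm
include 'win32ax.inc'

MAX_STR = 32                    ; assumed capacity of every buffer, terminator included

; Hypothetical macro: defines a string and pads its buffer to MAX_STR bytes.
; If the text is too long, the rb count turns negative, which is expected
; to stop the assembly with an error.
macro fixedstr name, text
{
  name  db text, 0
        rb MAX_STR - ($ - name)
}

.data
  fixedstr str1, 'hello'
  fixedstr str2, 'harry'
  fixedstr str3, 'jupiter'

.code
  start:
        invoke  lstrcpy, str1, str3     ; 8 bytes, fits into the 32-byte buffer
        invoke  lstrcpy, str1, str2
        invoke  MessageBox, HWND_DESKTOP, str1, "fixedstr", MB_OK   ; shows "harry"
        invoke  ExitProcess, 0

.end start
```

Each copy now stays inside its own buffer as long as the source is shorter than MAX_STR; longer strings still need a bounded copy or dynamic allocation.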
    
Post 14 Sep 2012, 15:38
Masood.Sandking



Joined: 12 Jan 2012
Posts: 65
Location: Iran
I prefer dynamic allocation... but I have never done it before. Is there a simple example, especially for working with strings? Or any useful example for understanding HeapAlloc?
Post 19 Sep 2012, 20:15
rohagymeg



Joined: 19 Aug 2011
Posts: 77
Memory management in Windows works like this:

A program allocates its dynamic memory from heaps.
Every process has a default heap, so if your program is simple, you don't need to create a heap at all:

Code:
invoke GetProcessHeap;Returns the handle of the main heap.
invoke HeapAlloc, eax, HEAP_ZERO_MEMORY, 100
    


There you have it. HeapAlloc takes the process heap handle in eax, and HEAP_ZERO_MEMORY guarantees that all the bytes are 0. The last parameter is what you wanted to know: the size in bytes. As with any other function, you can pass a constant, a register, or a memory operand. HeapAlloc returns a pointer to the new block in eax, or 0 if the allocation fails.

A large program that allocates and frees a lot of memory can benefit from private heaps, because destroying a heap with HeapDestroy releases everything allocated from it at once. This is where HeapCreate comes into play:

Code:
;MyMem and MyAlloc are dd variables in the data section

invoke HeapCreate, HEAP_GENERATE_EXCEPTIONS, 0, 0;A growable private heap

mov [MyMem], eax

invoke HeapAlloc, eax, HEAP_ZERO_MEMORY, 100

mov [MyAlloc], eax

;HeapAlloc can be called again for further blocks. To resize an existing block, use HeapReAlloc:

invoke HeapReAlloc, [MyMem], HEAP_GENERATE_EXCEPTIONS+HEAP_ZERO_MEMORY, eax, 200;The block now has 200 bytes

mov [MyAlloc], eax;The block may have moved, so keep the returned pointer

;To access your memory, the best way is to load the address into a register:

mov ebx, [MyAlloc];The pointer to the first byte is loaded into ebx

mov dword [ebx], "Test"

add ebx, 4;Now ebx points to the zero byte which is after "Test"

mov word [ebx], "in"

add ebx, 2

mov byte [ebx], "g"

;If you want to write more:

inc ebx;So on...

;If you no longer need the block, just call HeapFree

invoke HeapFree, [MyMem], 0, [MyAlloc]

;If you no longer need the whole heap, call HeapDestroy

invoke HeapDestroy, [MyMem]
    


It's not my job to explain what MSDN explains, and even if I did, I couldn't explain it more clearly than MSDN. Please look up these functions to understand the parameters. Hope I helped :)
EDIT: HeapAlloc can be called any number of times on the same heap. Each call returns a separate block, and the blocks already allocated keep their data. HeapReAlloc resizes a single block, not the heap, and keeps the data up to the smaller of the old and new sizes.
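
Applied to the strings from the first post, the heap functions could be combined with lstrlen and lstrcpy roughly as follows. This is a minimal sketch, not a complete string library: str_assign, hHeap and pstr1 are hypothetical names chosen for this example.

Code:
```asm
include 'win32ax.inc'

.data
  str2  db 'harry',0
  str3  db 'jupiter',0
  hHeap dd 0                    ; handle of the default process heap
  pstr1 dd 0                    ; pointer to the current heap copy, 0 = none

.code

; Hypothetical helper: makes [pstr1] point to a fresh heap copy of src.
proc str_assign uses ebx, src
        invoke  lstrlen, [src]
        inc     eax                             ; room for the terminating zero
        invoke  HeapAlloc, [hHeap], 0, eax
        test    eax, eax
        jz      .done                           ; allocation failed, keep the old string
        mov     ebx, eax
        invoke  lstrcpy, ebx, [src]
        cmp     [pstr1], 0
        je      .store
        invoke  HeapFree, [hHeap], 0, [pstr1]   ; release the previous copy
  .store:
        mov     [pstr1], ebx
  .done:
        ret
endp

  start:
        invoke  GetProcessHeap
        mov     [hHeap], eax
        stdcall str_assign, str3
        stdcall str_assign, str2
        invoke  MessageBox, HWND_DESKTOP, [pstr1], "heap string", MB_OK   ; shows "harry"
        invoke  HeapFree, [hHeap], 0, [pstr1]
        invoke  ExitProcess, 0

.end start
```

Because every assignment gets a block sized for the new text, the length of the previous string no longer matters and neighbouring data cannot be overwritten.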
Post 23 Sep 2012, 17:07

Copyright © 1999-2020, Tomasz Grysztar. Also on YouTube, Twitter.

Website powered by rwasa.