flat assembler
Message board for the users of flat assembler.

Strings and Local Variables

Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 10 Dec 2009, 12:30
Hi.

Can someone explain to me how to work with strings and local variables?

I have a lot of difficulty understanding string operations (and ESP, EBP, ESI, EDI, SS, DS, ...).


Well, I need to understand this because I want to make a syntax highlighter (using a RichEdit; is that the best choice?). Thanks for any help.
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 10 Dec 2009, 13:23
Teehee wrote:
Hi. How can I work with strings and local variables? ... (and ESP, EBP, ESI, EDI, SS, DS, ...)
Please browse the FAQ here first:
http://board.flatassembler.net/topic.php?t=2530
Quote:
... to make a syntax highlighter (using a RichEdit; is that the best choice?).
Absolutely not!!! You have been warned ;)
Anyway, here is "one of the victims of RichEdit", Iczelion, with his very good tutorials on RichEdit and on almost all of Win32 assembly. They are simple to adapt to fasm:
http://www.website.masmforum.com/tutorials/iczelion/iczelion.zip

I think there is a fasm translation of these tutorials in the main FAQ.

Regards,
hopcode
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 10 Dec 2009, 15:24
Thanks for your answer!

hopcode wrote:
Please browse the FAQ here first:
http://board.flatassembler.net/topic.php?t=2530
I took a look at http://flatassembler.net/docs.php?article=manual#2.1.8 but I think I need some simple examples to understand each of those instructions.

hopcode wrote:
Absolutely not!!! You have been warned ;)
So what should I use instead? :)

_________________
Sorry if bad english.
hopcode



Joined: 04 Mar 2008
Posts: 563
Location: Germany
hopcode 10 Dec 2009, 16:07
Quote:
So what should I use instead? :)

Coding it yourself is the best solution... but it requires a lot loT lOT LOT of time.
Scintilla, for example, is the best edit control IMHO, but it is not written in asm. If you use it, you should allow for 320 KB for the main module, plus hundreds of KB more for each lexer you add.

My new fasmlab version (available on SourceForge in a few days) will use Scintilla. Until now it has used only RichEdit... You can browse the code yourself.

In the new version of fasmlab I have used only the available asm lexer, and it compiles to 350 KB. In the near future I will embed the PHP/HTML/C lexers.

As for other edit components... there are not many.

For an example using fasm, follow this thread:
http://board.flatassembler.net/topic.php?p=96265

Regards
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 11 Dec 2009, 13:14
hopcode wrote:
Coding it yourself is the best solution...

Wow.. I have no idea where to start.. :) nvm..

Well, while no one has helped me with strings and locals yet, let me ask for help with this:

I made this struct:
Code:
struct CHARFORMAT
    cbSize             dd ?
    dwMask             dd ?
    dwEffects          dd ?
    yHeight            dd ?
    yOffset            dd ?
    crTextColor        dd ?
    bCharSet           db ?
    bPitchAndFamily    db ?
    szFaceName         db 32 dup (?)
ends       

and this code:

Code:
        mov     [cf.cbSize],sizeof.CHARFORMAT
        mov     [cf.dwMask],CFM_COLOR
        mov     [cf.dwEffects],0
        mov     [cf.crTextColor],0x00FF0000
        invoke  SendMessage,[editHwnd],EM_SETCHARFORMAT,SCF_SELECTION,cf
        cmp     eax, 0
        je      exit
    


But it doesn't work. It always returns 0. What am I doing wrong?

_________________
Sorry if bad english.
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 11 Dec 2009, 15:25
Code:
struct CHARFORMAT 
    cbSize             dd ? 
    dwMask             dd ? 
    dwEffects          dd ? 
    yHeight            dd ? 
    yOffset            dd ? 
    crTextColor        dd ? 
    bCharSet           db ? 
    bPitchAndFamily    db ? 
    szFaceName         db 32 dup (?) 
    _wPad2             dw ?          ; <------------------------+
ends    

Damn line :) Why do I need that?

_________________
Sorry if bad english.
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 11 Dec 2009, 15:31
I believe it is there to make the structure's size a multiple of the DWORD size. What I don't understand is why Microsoft doesn't document it: http://msdn.microsoft.com/en-us/library/bb787881%28VS.85%29.aspx
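
The size arithmetic behind that guess can be written out. This is only a sketch: it assumes the C declaration of CHARFORMAT is compiled with 4-byte alignment (the linked MSDN page does not spell this out) and that the control compares cbSize against the C structure size.

```asm
; Members as declared in the posts above:
;   6 dwords (cbSize .. crTextColor)   = 24 bytes
;   bCharSet + bPitchAndFamily         =  2 bytes
;   szFaceName                         = 32 bytes
;                                total = 58 bytes
; A C compiler rounds the size of a struct up to a multiple of its
; alignment (4 here, because of the dword members), giving 60 bytes.
struct CHARFORMAT
    cbSize             dd ?
    dwMask             dd ?
    dwEffects          dd ?
    yHeight            dd ?
    yOffset            dd ?
    crTextColor        dd ?
    bCharSet           db ?
    bPitchAndFamily    db ?
    szFaceName         db 32 dup (?)
    _wPad2             dw ?          ; explicit padding: 58 -> 60 bytes
ends
; With the padding, sizeof.CHARFORMAT evaluates to 60.
```

Under that assumption, a cbSize of 58 describes no structure the control knows about, which would explain why EM_SETCHARFORMAT returned 0 before the padding was added.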
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 12 Dec 2009, 10:43
My first string operation, woohoo:
Code:
        stA db 'test',0 
        stB db ?      

        mov ecx, 4 
        mov esi, stA 
        mov edi, stB 
        rep movsb 

        invoke  MessageBox,NULL, stB ,NULL,MB_OK 
    

1. Is everything alright in the code?
2. Isn't there a simpler way to do that, like copying the address of stA to stB? :B

_________________
Sorry if bad english.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20516
Location: In your JS exploiting you and your system
revolution 12 Dec 2009, 10:55
Teehee wrote:
My first string operation, woohoo:
Code:
        stA db 'test',0 
        stB db ?      

        mov ecx, 4 
        mov esi, stA 
        mov edi, stB 
        rep movsb 

        invoke  MessageBox,NULL, stB ,NULL,MB_OK 
    

1. Is everything alright in the code?
Sorry, no.

Your allocated space for stB is only one byte, but you move four bytes there and overwrite the instruction 'mov ecx,4'.

Also you didn't copy the terminating zero byte and simply got lucky that the 'mov ecx,4' instruction had a zero in there to terminate.

In short: increase the size of stB buffer to at least the same length as stA, and copy all bytes including the zero terminator byte.
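
A minimal sketch of that advice, letting the assembler compute the length so the buffer and the byte count cannot drift apart. The name stA_size is made up for this example, and the data is assumed to live in its own data section, away from the code.

```asm
; data section
stA       db 'test',0
stA_size  = $ - stA             ; 5: four characters plus the zero terminator
stB       rb stA_size           ; reserve exactly as many bytes as stA occupies

; code section
        cld                     ; direction flag clear: MOVSB moves upwards
        mov   ecx, stA_size     ; number of bytes to copy, terminator included
        mov   esi, stA          ; source address
        mov   edi, stB          ; destination address
        rep   movsb             ; copies 't','e','s','t',0
        invoke MessageBox,NULL,stB,NULL,MB_OK
```

If a second copy of the text is not really needed, passing stA itself to MessageBox avoids the copy altogether.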
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 12 Dec 2009, 11:08
revolution wrote:
Sorry, no.

Oh, damn :)
Quote:
...and overwrite the instruction 'mov ecx,4'.

what do you mean?

so:
Code:
        stA db 'test',0
        stB db 5 dup ?

        mov ecx, 5
        mov esi, stA
        mov edi, stB
        rep movsb    

?

_________________
Sorry if bad english.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20516
Location: In your JS exploiting you and your system
revolution 12 Dec 2009, 12:03
Teehee wrote:
Quote:
...and overwrite the instruction 'mov ecx,4'.

what do you mean?
Directly after the stB buffer you have put the instruction 'mov ecx,4' so the bytes following the stB buffer are the instructions the CPU is executing. If you were to loop back to the instructions again then the CPU would start executing some different code because it would see the new bytes of 'est' and try to interpret them as instructions and execute them.
Teehee wrote:


so:
Code:
        stA db 'test',0
        stB db 5 dup ?

        mov ecx, 5
        mov esi, stA
        mov edi, stB
        rep movsb    

?
Yes.
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 12 Dec 2009, 12:20
revolution wrote:
Directly after the stB buffer you have put the instruction 'mov ecx,4'....

Oh, I only posted the parts that were relevant to the example; in my real code I keep data and code in separate sections :)

Quote:
Yes.

Yay!

Question:
After studying the MOVS and CMPS instructions, I'm not sure I understand the purpose of SCAS. Is it just like CMPS, but using a register?

_________________
Sorry if bad english.
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 12 Dec 2009, 18:13
Hello!

I'm studying a little more and writing some functions to help me learn. I'll post them here so you can all tell me whether they're correct.

Code:
; section data
    stA db "4324 234 ç ã abcd'nwxyz123 231",0
; section code
_start:
        ccall strToUpper,stA
        invoke MessageBox,NULL,stA,NULL,MB_OK
        jmp exit
    


strToUpper function:
Code:
proc strToUpper str:dword

        mov   esi, [str]        ; define src
 .next: mov   edi, esi          ; define dest [and later, new dest]
        lodsb                   ; AL = a char
        cmp   al, 0             ; if end_of_string exit
        je   .exit
        mov   bl, al
        and   bl, 'a'
        and   bl, 'z'
        cmp   bl, 'a'-1
        jnz  .next
        sub   al, 'a'-'A'       ; offset
        stosb                   ; stores AL to str
        jmp  .next
 .exit: ret
endp    


Is anything wrong? Can it be optimized?


strToLower function:
Code:
; Same as above, just change:
    cmp   bl, 'a'-1    ->   cmp   bl, 'A'-1
    sub   al, 'a'-'A'  ->   add   al, 'a'-'A'
    


strcmp function:
Code:
proc strcmp str1:dword, str2:dword, str1Size:dword

        mov ecx, [str1Size]
        mov esi, [str1]
        mov edi, [str2]
        rep cmpsb
        jz  @f
        mov eax,1      ; strings differ: return 1
        ret
    @@: xor eax,eax    ; strings match: return 0
        ret

endp    


Is anything wrong? Can it be optimized?

That's it for now.

_________________
Sorry if bad english.
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 12 Dec 2009, 19:56
Teehee wrote:
Question:
After studying the MOVS and CMPS instructions, I'm not sure I understand the purpose of SCAS. Is it just like CMPS, but using a register?
Yes, it scans using the value in eax, ax or al (depending on which string instruction you use: SCASD, SCASW or SCASB). It is similar to the difference between STOS and MOVS, where the former stores the value in eax/ax/al at the address in edi instead of taking the value from wherever esi points at that moment.
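
A typical use of SCAS is finding the length of a zero-terminated string. A small sketch, assuming stA is a label on such a string as in the earlier posts:

```asm
; Fragment: length of the zero-terminated string at stA.
        cld                     ; scan upwards
        mov   edi, stA          ; SCASB compares AL with the byte at [EDI]
        xor   eax, eax          ; AL = 0, the byte to search for
        mov   ecx, -1           ; effectively unlimited count
        repne scasb             ; repeat while byte <> AL; each step EDI += 1, ECX -= 1
        not   ecx               ; ECX = number of bytes scanned, terminator included
        dec   ecx               ; ECX = number of characters before the terminator
```

For 'test',0 the scan touches five bytes, so ECX goes from -1 to -6; NOT turns that into 5 and DEC leaves 4.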


As for optimizations, why not use your own calling convention? Use the real call instruction instead of the ccall macro, and pass the arguments directly in registers rather than pushing them on the stack and then loading them into registers. This especially applies to the strcmp function, e.g.:

Code:
_start:
  mov esi, str1     ; parameter
  mov edi, str2     ; parameter
  mov ecx, str1size ; parameter
  call strcmp
  jz .equal
  [...]
  .equal:
  [...]

strcmp:
  rep cmpsb
  ret
    
Or alternatively, just embed the function directly in your code; it's pretty small.

Also, you can see I used the flag directly as the 'return value'. Why not use that instead of uselessly passing it through eax? Remember, this is assembly, you don't need to follow HLL conventions unless you interact with bloated functions (as with the Windows API).

Code:
_start:
  mov esi, str1
  mov edi, str2
  mov ecx, str1size
  rep cmpsb
  jz .equal    
This is obviously better and doesn't even need a function anymore. There is no point in not inlining functions this small.

Some tips for StrToUpper:

  • in this case it's the same, but in general, when comparing a register with 0, use 'test reg, reg' -- it's smaller in most cases.

    i.e: test al, al in this case

  • logical AND operation is commutative -- having two consecutive 'and' instructions in a row is pointless. You can combine the constants.

    i.e: and bl, 'a' and 'z'

  • why do you mov bl, al? why not just use al directly?

  • you use 'stosb' only once so why not just use mov [esi], al and avoid 'edi' completely?


here's my take:
Code:
        mov   esi, [str]        ; define src (you should pass the parameter directly in registers IMO)
 .next: lodsb                   ; AL = a char
        test  al, al            ; if end_of_string exit
        je   .exit
        and   al, 'a' and 'z'   ; 'and' is commutative :)
        cmp   al, 'a'-1
        jnz  .next
        sub   al, 'a'-'A'       ; offset
        mov   [esi], al         ; stores AL to str (replace lowercase with uppercase)
        jmp  .next    
alternatively if you want speed instead of size, put this instead of jmp .next
Code:
        lodsb
        test al, al
        jne @b    
where '@@' is a label that points at the 'and' instruction.

also if you know that your strings cannot be empty, you can avoid the beginning 'test al, al' test and jump.

_________________
Previously known as The_Grey_Beast
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 12 Dec 2009, 21:23
Quote:

alternatively if you want speed instead of size, put this instead of jmp .next

Or for something intermediate:
Code:
        mov   esi, [str]        ; define src (you should pass the parameter directly in registers IMO)
        jmp   .loadChar
 .convert:
        and   al, 'a' and 'z'   ; 'and' is commutative :)
        cmp   al, 'a'-1
        jnz   .loadChar
        sub   al, 'a'-'A'       ; offset
        mov   [esi], al         ; stores AL to str (replace lowercase with uppercase)
.loadChar:
        lodsb                   ; AL = a char
        test  al, al            ; if end_of_string exit
        jnz   .convert
.exit:
    


Sorry, I'm too sleepy to think right now, but does the AND thing actually work? Why not just this:
Code:
sub al, 'a'
cmp al, 'z' - 'a'
ja  .next    

PS: BTW, there is a mistake in all the posted code: it should be "mov [esi-1], al", not "mov [esi], al" or stosb (unless ESI is decremented first, in which case both would be valid).
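
On the AND question: 'a' and 'z' evaluates to 61h and 7Ah = 60h, so the test only checks that bits 5 and 6 are both set. That holds for every byte in 60h..7Fh and E0h..FFh, so characters such as ` { | } ~ get converted as well. A sketch of the range check that leaves the original character intact by doing the test in AH; the register convention (pointer passed in ESI) is just an assumption for this example:

```asm
strToUpper:                     ; in: ESI -> zero-terminated string; clobbers EAX, ESI
        cld
 .next: lodsb                   ; AL = char, ESI now points past it
        test  al, al
        jz    .done             ; zero terminator: finished
        mov   ah, al            ; work on a copy so AL keeps the original char
        sub   ah, 'a'           ; 'a'..'z' becomes 0..25, anything else wraps above 25
        cmp   ah, 'z'-'a'
        ja    .next             ; unsigned compare: not a lowercase ASCII letter
        sub   al, 'a'-'A'       ; convert to uppercase
        mov   [esi-1], al       ; write back over the char that was just loaded
        jmp   .next
 .done: ret
```

This handles plain ASCII a..z only; accented letters depend on the code page and would need their own ranges or a lookup table.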
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 12 Dec 2009, 22:21
Borsuc wrote:
Remember, this is assembly, you don't need to follow HLL conventions unless you interact with bloated functions (as with the Windows API).

Oh, I was just thinking to make a DLL with string functions to use in c++ code later.. hehe
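
For a routine that will be called from C or C++, the caller expects the result in EAX and expects EBX, ESI, EDI and EBP to be preserved. A sketch using the proc macro from the fasm Windows headers: the c keyword selects the C convention (as far as I know, a proc without it defaults to stdcall, which pairs with the stdcall macro rather than ccall). The name memEqual and its parameter names are made up for this example.

```asm
proc memEqual c uses esi edi, buf1, buf2, len
        mov   esi, [buf1]
        mov   edi, [buf2]
        mov   ecx, [len]        ; number of bytes to compare
        xor   eax, eax          ; assume "not equal"
        cmp   ecx, ecx          ; force ZF=1 so that len = 0 counts as equal
        cld
        repe  cmpsb             ; stop at the first mismatch or when ECX reaches 0
        jne   .done             ; ZF=0: a byte differed, return 0
        inc   eax               ; every byte matched, return 1
 .done: ret                     ; the macro restores ESI/EDI and emits a plain ret
endp
```

From fasm it would be called as ccall memEqual,stA,stB,5 and the C side would declare it as a cdecl function returning int.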

Quote:

Some tips for StrToUpper:

  • in this case it's the same, but in general, when comparing a register with 0, use 'test reg, reg' -- it's smaller in most cases.

    i.e: test al, al in this case

  • logical AND operation is commutative -- having two consecutive 'and' instructions in a row is pointless. You can combine the constants.


i.e: and bl, 'a' and 'z'


Nice, I didn't know that! And one line less! :)

Quote:
  • why do you mov bl, al? why not just use al directly?


Because AND changes the value in AL, so when I SUB the offset I get a wrong value. I need to SUB from the original value, so I do the AND on BL and the SUB on AL (the original) :)

Quote:
  • you use 'stosb' only once so why not just use mov [esi], al and avoid 'edi' completely?


I don't know.. I just tried that here and it crashes the application. (Your take crashes too :P)

LODS should update ESI and STOS should update EDI. But they don't seem to! (I don't know why) :( So I update EDI manually on each loop.

This still works:
Code:
proc strToUpper str:dword

        mov   esi, [str]    
 .next: mov   edi, esi   
        lodsb         
        test  al, al         
        je   .exit
        mov   bl, al
        and   bl, 'a' and 'z'
        cmp   bl, 'a'-1
        jnz  .next
        sub   al, 'a'-'A'   
        stosb                
        jmp  .next
 .exit: ret

endp    

_________________
Sorry if bad english.
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 12 Dec 2009, 22:22
Yeah, sorry about that. I was in a hurry and didn't even bother to check whether it worked; those were just at-a-glance optimizations. And you're correct that it should be mov [esi-1], al.

_________________
Previously known as The_Grey_Beast
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 12 Dec 2009, 22:27
Borsuc wrote:
Yeah, sorry about that. I was in a hurry and didn't even bother to check whether it worked; those were just at-a-glance optimizations. And you're correct that it should be mov [esi-1], al.


oooo, now it works:
Code:
proc strToUpper str:dword

        mov   esi, [str]    
 .next: lodsb              
        test  al, al       
        je   .exit
        mov   bl, al
        and   bl, 'a' and 'z'
        cmp   bl, 'a'-1
        jnz  .next
        sub   al, 'a'-'A'     
        mov   [esi-1], al
        jmp  .next
 .exit: ret

endp
    


Why mov [esi-1],al?

_________________
Sorry if bad english.
Teehee



Joined: 05 Aug 2009
Posts: 570
Location: Brazil
Teehee 12 Dec 2009, 22:33
LocoDelAssembly wrote:

Sorry, I'm too sleepy to think right now but does the AND thing actually work? Why not just this:
Code:
sub al, 'a'
cmp al, 'z' - 'a'
ja  .next    


I think the way you proposed can't work for characters like Ç â é ã..., while the AND way does.


But I didn't test it with the whole ASCII table. Can anyone make a macro that builds a whole ASCII table to test that routine? I still don't know how to work with macros.
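
A table like that doesn't need a macro; fasm's repeat directive can generate it at assembly time. A sketch (the label asciiTable is made up for this example):

```asm
; data section
asciiTable:
        repeat 255
            db %                ; % is the current repetition number: 1, 2, ... 255
        end repeat
        db 0                    ; zero terminator, so string routines stop here
```

Passing asciiTable to the routine and inspecting the buffer in a debugger shows exactly which byte values it changes; a MessageBox is less useful here because of the control characters at the start of the table.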

_________________
Sorry if bad english.
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 12 Dec 2009, 22:56
Teehee wrote:
Why mov [esi-1],al?
Because lodsb loads the char into al and increments esi by 1 (to the next char). Here is a visualization:

Code:
this is a string    
at first you start at 't'
Code:
this is a string
^    
then you execute lodsb
Code:
this is a string
 ^

al = 't'    
Clearly, after you check and see that it's lowercase, you need to replace 't', not 'h', with 'T' (the calculated uppercase letter), but esi was already incremented to the next char when you loaded al (that's what lodsb does). :)


BTW, your code here uses 'bl', not 'al', which is probably a mistake ;)

_________________
Previously known as The_Grey_Beast
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.