flat assembler
Message board for the users of flat assembler.

String Coding problem

kidscracker 19 Sep 2005, 16:44
Well, I have this problem: when I receive data from a socket, I sometimes get strange characters that I can't display. Here is an example:
Quote:
۝۞۩¢NICK%20URL

Well, it doesn't look exactly like that, but something similar. The point is: how can I decode it? I think it's UTF-8 encoded, but I'm not sure. I wrote a function that decodes some of these characters, but not all of them. Here is my code:
Code:
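proc    STR_FixCharsString,lptrLine,lptrBuffer
        pushad
        mov     esi,[lptrLine]          ; ESI = source string
        mov     edi,[lptrBuffer]        ; EDI = destination buffer

.FixLoop:
        lodsb                           ; load the next byte
        or      al,al                   ; NUL terminator?
        jz      .Done
        cmp     al,0C2h                 ; lead byte 0C2h: U+0080..U+00BF
        jz      .Prefix0C2
        cmp     al,0C3h                 ; lead byte 0C3h: U+00C0..U+00FF
        jnz     .NoPrefix               ; any other byte is copied unchanged
        lodsb                           ; continuation byte (80h..0BFh)
        or      al,0C0h                 ; set the two high bits: 0C0h..0FFh
        stosb
        jmp     .FixLoop

.Prefix0C2:
        lodsb                           ; continuation byte is already the character
.NoPrefix:
        stosb
        jmp     .FixLoop
.Done:
        stosb                           ; write the terminator (AL = 0)

        popad
        ret
endp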
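
For comparison, the 0C2h/0C3h special case above is one instance of the general two-byte UTF-8 rule: the code point is the low 5 bits of the lead byte shifted left by 6, combined with the low 6 bits of the continuation byte. The C sketch below is only an illustration, not part of the original post. `decode_two_byte` and `fix_latin1` are hypothetical names, and unlike the assembly it checks that the second byte really is a continuation byte. Lead bytes from 0C4h upwards give code points above 0FFh, which no longer fit in one byte, and that would explain why only some characters get decoded.

```c
#include <stddef.h>

/* Decode one two-byte UTF-8 sequence (lead 0xC2..0xDF, continuation
   0x80..0xBF). The result is always in the range U+0080..U+07FF. */
unsigned decode_two_byte(unsigned char lead, unsigned char cont)
{
    return ((unsigned)(lead & 0x1F) << 6) | (unsigned)(cont & 0x3F);
}

/* Hypothetical C counterpart of the routine above: 0xC2/0xC3 sequences
   become a single Latin-1 byte, every other byte is copied unchanged.
   dst must be at least as large as src, terminator included. */
void fix_latin1(const unsigned char *src, unsigned char *dst)
{
    while (*src) {
        if ((*src == 0xC2 || *src == 0xC3) && (src[1] & 0xC0) == 0x80) {
            *dst++ = (unsigned char)decode_two_byte(src[0], src[1]);
            src += 2;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = 0;
}
```

With lead byte 0C2h the result equals the continuation byte, and with 0C3h it equals the continuation byte OR 0C0h, which matches what the assembly does for those two prefixes.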


Thanks in advance.

P.S.: The strings I receive come from the Messenger server; as you can see, this one is a nickname.


Attachment: STRINGS.GIF (how the string looks)


kidscracker 19 Sep 2005, 20:35
Well, since nobody answered my question, I had to look for a solution myself. As I suspected, the text was UTF-8 encoded, so I looked up the specification and wrote my own implementation. Here it is; I hope you find it useful.
Code:
; Converts a NUL-terminated UTF-8 string to 16-bit Unicode characters.
; Code points above U+FFFF are truncated to 16 bits; input is not validated.
proc    STR_UTF8ToUnicode,lptrUTF,lptrBuffer

        pushad
        mov     esi,[lptrUTF]
        mov     edi,[lptrBuffer]
        xor     ebx,ebx                 ; EBX accumulates the code point
        xor     ecx,ecx                 ; CL = continuation bytes still expected
.ConvertLoop:
        lodsb                           ; load a byte

        or      cl,cl                   ; continuation bytes pending?
        jz      .NoReadingBytes         ; zero: this byte starts a new character

        and     al,3Fh                  ; keep only the low 6 bits
        shl     ebx,6                   ; make room for 6 more bits
        or      bl,al                   ; combine

        dec     cl                      ; decrement the counter
        jnz     .ConvertLoop            ; not zero: next byte

        mov     eax,ebx                 ; code point value
        stosw                           ; write it (low 16 bits only)
        xor     ebx,ebx
        jmp     .ConvertLoop            ; next byte

.NoReadingBytes:
        or      al,al                   ; NUL?
        jz      .Done                   ; finished

        test    al,80h                  ; bit 7 set?
        jz      .IsASCII                ; no: plain ASCII

        mov     ch,al                   ; copy of the byte in CH
        shl     ch,1                    ; shift out bit 7
        xor     cl,cl                   ; clear CL (byte counter)
.BytesCountLoop:
        shl     ch,1                    ; shift the next bit into CF
        jnc     .ExitBytesCountLoop     ; it was 0: counting is done
        inc     cl                      ; increment the counter
        jmp     .BytesCountLoop         ; it was 1: test the next bit
.ExitBytesCountLoop:

        mov     ah,07Fh                 ; mask of the low 7 bits
        shr     ah,cl                   ; drop one mask bit per continuation byte
        and     al,ah                   ; keep the payload bits of the lead byte
        movzx   ebx,al                  ; EBX holds the high bits
        jmp     .ConvertLoop            ; next byte

.IsASCII:
        xor     ah,ah
        stosw                           ; write the character (WORD)
        jmp     .ConvertLoop

.Done:
        xor     eax,eax
        stosw                           ; write the terminating NUL word

        popad
        ret
endp
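
As a cross-check for the routine above, here is a rough C sketch of the same conversion. It is not the code from this post: `utf8_to_utf16` is a hypothetical name, and it makes two assumptions the assembly does not, namely that code points above U+FFFF should be written as UTF-16 surrogate pairs and that malformed or truncated sequences should become U+FFFD. It does not reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a NUL-terminated UTF-8 string to NUL-terminated UTF-16.
   Returns the number of 16-bit units written, excluding the terminator.
   dst needs room for strlen(src) + 1 units: UTF-16 never uses more
   units than UTF-8 uses bytes. */
size_t utf8_to_utf16(const unsigned char *src, uint16_t *dst)
{
    size_t n = 0;

    while (*src) {
        unsigned char b = *src++;
        uint32_t cp;
        int extra;                      /* continuation bytes expected */

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;   extra = 0; } /* stray byte */

        while (extra > 0) {
            if ((*src & 0xC0) != 0x80) {    /* truncated sequence (or NUL) */
                cp = 0xFFFD;
                break;
            }
            cp = (cp << 6) | (uint32_t)(*src++ & 0x3F);
            extra--;
        }

        /* Surrogate code points and values past U+10FFFF are not valid. */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = 0xFFFD;

        if (cp >= 0x10000) {            /* needs a surrogate pair */
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 + (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 + (cp & 0x3FF));
        } else {
            dst[n++] = (uint16_t)cp;
        }
    }
    dst[n] = 0;
    return n;
}
```

For strings that contain only characters up to U+FFFF and no malformed bytes, this should produce the same words as STR_UTF8ToUnicode, so it can be used to compare outputs on a few test nicknames.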