flat assembler
Message board for the users of flat assembler.

Good solution for copying text 3 times?

Roman 16 Aug 2020, 07:38
I have txt1 db 'P4331,buf1',0
and an output buffer txt2 db 128 dup (0).

I need to copy txt1 into txt2 three times, so that txt2 ends up containing this:
Code:
mov [P4331],buf1
mov [P4331+8],32
mov [P4331+12],buf1
    


For testing I use invoke MessageBox,0,txt2,0,0

PS: My current code is big and ugly, and it is not easy to read.


revolution 16 Aug 2020, 08:04
Your question isn't clear. Are you trying to generate opcodes from a template? Some kind of pre-processor to create .asm files?

But in general to copy text you can use:
Code:
rep movsb    
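
If the goal is the output shown in the first post (one 'NAME,VALUE' pair turned into three mov lines), the job is closer to template expansion than to a plain copy. The following is a minimal sketch of that logic in Python, meant only as a readable model and not as fasm code. The function name expand_pair is hypothetical, and the offsets 8 and 12 and the constant 32 are simply copied from the example output.

```python
def expand_pair(txt1: str) -> str:
    """Turn 'NAME,VALUE' into the three mov lines from the first post."""
    # Split once at the first comma; both halves are reused in the output.
    name, value = txt1.split(",", 1)
    lines = [
        f"mov [{name}],{value}",
        f"mov [{name}+8],32",        # offset 8 and constant 32 come from the example
        f"mov [{name}+12],{value}",  # offset 12 comes from the example
    ]
    # CR LF line breaks, as a Windows text control would expect.
    return "\r\n".join(lines)


print(expand_pair("P4331,buf1"))
```

In assembly the same structure becomes a sequence of copies: the fixed pieces of each line and the two fields are appended to txt2 one after another, for example with rep movsb for each piece.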
Roman 16 Aug 2020, 08:12
I am writing a text parser.
The parser receives arbitrary text (let's say txt1) and writes its output to txt2.

Quote:
rep movsb

And they also say that I love HLL :)
revolution 16 Aug 2020, 08:20
I'm sure awk could do that.

awk is one step higher than an HLL. :P
bitRAKE 16 Aug 2020, 14:50
No need to use a message box. Here is a test program: write your parser, assemble, enter text on the left side, and click on the right side to update. That way you can find where the parser is broken. The example just copies the text three times. All text is assumed to be UTF-8.


Attachment: langplay.zip (4.57 KB)
Description: simple test of tokenizer/parser


_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
Roman 16 Aug 2020, 15:12
I tried to run tokens.exe,
but Windows 7 reports an error: the program tokens.exe is not a Win32 application!

I looked in the code and saw this:
Code:
format PE64 GUI 6.2
heap 1 shl 20,1 shl 30 ; commit meg, reserve gig
include 'win64wxp.inc'
    


bitRAKE 16 Aug 2020, 16:55
Windows 7 is what? 6.1 or 6.0? I also assume RICHEDIT50W, but I think that is present. Right-clicking anywhere on the border is sufficient to exit, or the regular Alt-F4, etc.


Attachment: langplay.W7.zip (4.55 KB)
Description: probably works on W7


bitRAKE 21 Aug 2020, 17:40
Three copies of the input make for a rather boring parser. So, here is a basic four-state parser:
Code:
struc Δ txt& ; give a label the sizeof its data, and not the sizeof its data type
        label . : .#_end - .
        txt
        .#_end:
end struc



Tokenizer:
        enter 32,0
        ; assume every token is one byte w/o whitespace! yikes!
        imul rax,[input_bytes],27
        mov [result_bytes],rax
        invoke HeapAlloc,[hHeap],4,[result_bytes]
        mov [result_buffer],rax
        leave

        push rsi rdi rbx
        mov rsi,[input_buffer]
        mov r11,[result_bytes]
        mov rdi,[result_buffer]
        add r11,rdi ; memory limit of output
        call FourState
        sub r11,rdi ; unused bytes
        sub [result_bytes],r11 ; actual output bytes (not including null byte)
        xor eax,eax
        stosb
        pop rbx rdi rsi

        retn



CLASS_NUMBER    = 0000_0000b
CLASS_LETTER    = 0100_0000b
CLASS_SYMBOL    = 1000_0000b
CLASS_CTRL      = 1100_0000b

CTRL_ENDOFINPUT = 0
CTRL_LINECOMMENT= 1
CTRL_IGNORE     = 2


FourState:
namespace FourState
        mov rbx,tab_UTF8
; go from the empty class to the class of first character
init:   lodsb
        xlatb                           ; byte classes and decoding
        test al,1100_0000b
        jz number_start
        jns letter_start
        jpe other_start
symbol_start: ; punc, math, etc ...     ; state initialization
        lea rdx,[rsi-1]                 ; preserve name start
symbol_more:
        lodsb
        xlatb
        test al,1100_0000b
        jns symbol_end
        jpo symbol_more
symbol_end:                             ; state termination
        push rax
        pushfq
        lea rcx,[rsi-1]
        mov rax,' symbol' or ($0D shl 56)
        call _display
        popfq
        pop rax
        js other_start
        jpe number_start
letter_start:                           ; [A-Za-z][0-9A-Za-z]*
        lea rdx,[rsi-1]
letter_more:
        lodsb
        xlatb
        test al,1100_0000b
        jpe letter_end
        jns letter_more
letter_end:
        push rax
        pushfq
        lea rcx,[rsi-1]
        mov rax,' letter' or ($0D shl 56)
        call _display
        popfq
        pop rax
        jpo symbol_start
        js other_start
number_start:                           ; [0-9][0-9A-Za-z]*( '(' [0-9][0-9]? ')' )?
        lea rdx,[rsi-1]
number_more:
        lodsb
        xlatb
        test al,1100_0000b
        jz number_more
number_end:
        push rax
        pushfq
        lea rcx,[rsi-1]
        mov rax,' number' or ($0D shl 56)
        call _display
        popfq
        pop rax
        jns letter_start
        jpo symbol_start
other_start:                            ; ? control class ?
        and eax,0x3F
        cmp byte [Control_Table-1],al
        jbe _errorX
        push rax
        call qword [Control_Table+rax*8]
        jc _error
        pop rax
        jmp init
_errorX:
        movzx eax,byte [Control_Table-1]
        push rax
_error:
        mov rax,('ERROR: ' shl 8) or 13
        stosq
        pop rax
        push rsi
        mov rsi,[Error_Table+rax*8]
        lodsb
        movzx ecx,al
        rep movsb
        pop rsi
        retn
end namespace ; FourState



EndOfInput:
        stc
        retn

LineComment:
  @@:   lodsb
        test al,al
        jz EndOfInput
        cmp al,10
        jz @F
        cmp al,13
        jnz @B
  @@:   sub rsi,1

IgnoreCtrl:
        clc
        retn


_display:
        push rax

        ; don't exceed buffer - just give up if full
        mov eax,ecx
        sub eax,edx
        lea rax,[rdi+rax+8*3+3]
        cmp r11,rax
        jc .buffer_full

        mov rax,('class:' shl 16) or $090D
        stosq
        pop rax
        stosq
        mov rax,('value: ' shl 8) or $09
        stosq
        mov al,'"'
        stosb
        ; copy data string to output
        push rsi
        mov rsi,rdx
        sub ecx,edx
        rep movsb
        stosb
        pop rsi
        retn

.buffer_full:   ; just don't produce more output
        pop rax
        retn


.data

align 64
rb 7
db (sizeof Control_Table) shr 3
Control_Table Δ dq \
        EndOfInput,\
        LineComment,\
        IgnoreCtrl

Error_Table dq \
        Error_EOI,\
        Error_EOI,\
        Error_NoError,\
        Error_CtlRng

Error_EOI       Δ db sizeof Error_EOI-1,\
        'end of input expected'
Error_NoError   Δ db sizeof Error_NoError-1,\
        "this doesn't happen"
Error_CtlRng    Δ db sizeof Error_CtlRng-1,\
        'unsupported control byte'



align 64
tab_UTF8:
repeat 256,i:0
        match A =i B,: 9 10 11 12 13 32:
                db CTRL_IGNORE or CLASS_CTRL
        else
        if i > 127
                db $3F or CLASS_LETTER
        else if i = 0
                db CTRL_ENDOFINPUT or CLASS_CTRL
        else if i = ';'
                db CTRL_LINECOMMENT or CLASS_CTRL
        else if (i >= 'A') & (i <= 'Z')
                db (i-'A'+10) or CLASS_LETTER
        else if (i >= 'a') & (i <= 'z')
                db (i-'a'+10) or CLASS_LETTER
        else if (i >= '0') & (i <= '9')
                db (i-'0') or CLASS_NUMBER
        else
                db (i and $3F) or CLASS_SYMBOL
        end if
        end match
end repeat    
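
The dispatch in FourState relies on the top two bits of each table entry: after test al,1100_0000b the zero, sign and parity flags together identify one of the four classes, so each state needs only conditional jumps. The sketch below models the table built by the repeat block and the flag logic in Python. It is a reading aid under that interpretation of the posted code, not a replacement for it; the names build_table, flags_after_test and dispatch are made up for this example.

```python
CLASS_NUMBER = 0b0000_0000
CLASS_LETTER = 0b0100_0000
CLASS_SYMBOL = 0b1000_0000
CLASS_CTRL = 0b1100_0000

CTRL_ENDOFINPUT, CTRL_LINECOMMENT, CTRL_IGNORE = 0, 1, 2


def build_table() -> bytes:
    """Mirror the 256-entry tab_UTF8 table from the post, entry by entry."""
    table = bytearray(256)
    for i in range(256):
        if i in (9, 10, 11, 12, 13, 32):          # whitespace is ignored
            table[i] = CTRL_IGNORE | CLASS_CTRL
        elif i > 127:                              # UTF-8 lead/continuation bytes
            table[i] = 0x3F | CLASS_LETTER
        elif i == 0:
            table[i] = CTRL_ENDOFINPUT | CLASS_CTRL
        elif i == ord(";"):
            table[i] = CTRL_LINECOMMENT | CLASS_CTRL
        elif ord("A") <= i <= ord("Z"):
            table[i] = (i - ord("A") + 10) | CLASS_LETTER
        elif ord("a") <= i <= ord("z"):
            table[i] = (i - ord("a") + 10) | CLASS_LETTER
        elif ord("0") <= i <= ord("9"):
            table[i] = (i - ord("0")) | CLASS_NUMBER
        else:
            table[i] = (i & 0x3F) | CLASS_SYMBOL
    return bytes(table)


def flags_after_test(al: int):
    """Return (ZF, SF, PF) as x86 sets them for: test al,1100_0000b."""
    r = al & 0xC0
    zf = r == 0
    sf = bool(r & 0x80)
    pf = bin(r).count("1") % 2 == 0   # PF is set when the number of 1 bits is even
    return zf, sf, pf


def dispatch(al: int) -> str:
    """Same decision order as the jumps after the init label."""
    zf, sf, pf = flags_after_test(al)
    if zf:
        return "number"    # jz  number_start
    if not sf:
        return "letter"    # jns letter_start
    if pf:
        return "control"   # jpe other_start
    return "symbol"        # fall through to symbol_start


TABLE = build_table()
for ch in "a7+ ;":
    print(repr(ch), "->", dispatch(TABLE[ord(ch)]))
```

The low six bits of each entry carry a payload (digit value, letter value, or control index), which is why other_start masks with 0x3F before indexing Control_Table.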
One can just plug it into the previous tool, and try to break it. I've refactored the code so I can build a whole directory of parser testers at once. I'm going to add a timing function and maybe a very simple fuzzer (will require a different definition for the parser).
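
The factor 27 in Tokenizer appears to follow from the record that _display writes: three 8-byte stosq stores, an opening quote, the token bytes, and a closing quote. A one-byte token therefore costs 8+8+8+1+1+1 = 27 bytes, which is the worst case when every input byte is its own token. The Python sketch below rebuilds that record to check the arithmetic; display_record and worst_case_output are hypothetical names, and the byte layout is inferred from the constants in the posted code.

```python
def display_record(label: bytes, data: bytes) -> bytes:
    """Rebuild the bytes _display appends for one token.

    label is the 7-character class name as written in the post,
    for example b' symbol', b' letter' or b' number'.
    """
    if len(label) != 7:
        raise ValueError("label must be exactly 7 bytes")
    return (
        b"\r\tclass:"        # ('class:' shl 16) or $090D, stored little-endian
        + label + b"\r"      # ' symbol' or ($0D shl 56)
        + b"\tvalue: "       # ('value: ' shl 8) or $09
        + b'"' + data + b'"'
    )


def worst_case_output(input_bytes: int) -> int:
    """Upper bound used by Tokenizer: every input byte is a one-byte token."""
    return 27 * input_bytes


record = display_record(b" symbol", b"+")
print(len(record), record)
```

The bound check in _display adds 8*3+3 to the token length, one byte more than the record itself, presumably to leave room for the terminating null that Tokenizer stores at the end.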
