flat assembler
Message board for the users of flat assembler.

fasm preprocessor: string slice operator (experimental)

Grom PE 24 Oct 2016, 16:16
I wanted fasm to be able to manipulate string arguments to preprocessor
instructions, such as "format", so I dug into the source code and made this.

Specifically, I wanted this to be possible:
Code:
match v,"thingama.jig"
{
  rept v[] n:0
  \{
    match '.',v\[v[]-1-n]
    \\{
      match ext,v\\[v[]-n:]
      \\\{
        format binary as ext ; results in "jig"
      \\\}
    \\}
  \}
}
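
For readers less used to nested fasm macros, here is a rough Python sketch of what the example above is meant to compute. It is only a model of the intent, not of the preprocessor: the function name `extensions_found` is mine, and like `rept` it keeps going after a match instead of stopping at the first dot. One difference to keep in mind is that a trailing dot would make the patched preprocessor report an out-of-range value, while Python quietly yields an empty string.

```python
def extensions_found(v):
    """Collect every suffix that follows a '.', scanning from the end of v."""
    found = []
    for n in range(len(v)):                # rept v[] n:0   -> n = 0 .. len-1
        if v[len(v) - 1 - n] == ".":       # match '.',v[v[]-1-n]
            found.append(v[len(v) - n:])   # match ext,v[v[]-n:]
    return found


print(extensions_found("thingama.jig"))
```

With a single dot in the name there is exactly one match, so the `format` directive is emitted once.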
    


This patch adds a string slice operator, inspired by Python's,
that lets the preprocessor manipulate strings like so:

"abc"[2] converts to "c"

"slice and dice"[6:9] converts to "and"
"slice and dice"[:9] converts to "slice and"
"slice and dice"[6:] converts to "and dice"
"slice and dice"[] converts to string length, 14

Like the "#" operator, it only works inside macros.
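
Since the operator is modelled on Python, the results listed above can be cross-checked with Python's own slicing. This only illustrates the intended results, not how the patch works; the variable names are mine, and `len()` stands in for the empty-bracket form, which has no Python slice equivalent.

```python
text = "slice and dice"

single = "abc"[2]      # fasm: "abc"[2]
middle = text[6:9]     # fasm: "slice and dice"[6:9]
head = text[:9]        # fasm: "slice and dice"[:9]
tail = text[6:]        # fasm: "slice and dice"[6:]
length = len(text)     # fasm: "slice and dice"[]

print(single, middle, head, tail, length, sep=" | ")
```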

The addition to the preprocessor, gpe_slice.inc:
Code:
      slice:
        ; dl = type of the previous token
        cmp     dl,1Ah
        je      symbol_slice
        cmp     dl,22h
        je      string_slice
        jmp     slice_ignore
      no_slice:
        cmp     esi,edi
        je      before_macro_operators
        jmp     after_macro_operators
      symbol_slice:
        cmp     byte [esi],1Ah
        jne     no_slice
        ; have to ignore in cases like "macro name [param]"
        jmp     slice_ignore ; todo: check if this introduces bugs
      string_slice:
        ; ebx = pointer to the 4-byte length of the string
        mov     eax, [ebx]
        mov     [slice_srclen], eax
        cmp     byte [esi],']'
        je      string_length
        push    [error_line] ebx edi
        ; ebp = end of the line
        mov     edi,ebp
        xor     eax,eax
        cmp     byte [esi],':'
        je      slice_begin_skipped
        ; input: esi = start of expression, edi = free space
        call    precalculate_value
        ; output: eax = result (max 0x7fffffff), esi = end of expression, ecx ebx = trashed, [error_line] = trashed
      slice_begin_skipped:
        mov     [slice_begin], eax
        inc     eax
        mov     [slice_end], eax
        cmp     byte [esi],':'
        jne     finish_slice_parameters
        lodsb
        cmp     byte [esi],']'
        mov     eax,[slice_srclen]
        je      slice_end_skipped
        call    precalculate_value
      slice_end_skipped:
        mov     [slice_end], eax
      finish_slice_parameters:
        pop     edi ebx [error_line]
        lodsb
        cmp     al,']'
        jne     missing_slice_closing_bracket

        ; todo: handle negative index like Python?
        ; todo: allow double slice? "string"[1][0]
        mov     eax,[slice_begin]
        mov     edx,[slice_srclen]
        cmp     eax,0
        jl      value_out_of_range
        cmp     eax,edx
        jge     value_out_of_range
        mov     ecx,[slice_end]
        cmp     ecx,0
        jl      value_out_of_range
        cmp     ecx,edx
        jg      value_out_of_range ; todo: allow out of bounds, just clamp
        sub     ecx,eax
        jb      value_out_of_range
        mov     [ebx],ecx
        push    esi
        lea     esi,[ebx+4+eax]
        lea     edi,[ebx+4]
        rep     movsb
        pop     esi
      slice_shift:
        push    edi
        mov     ecx,ebp
        sub     ecx,esi
        rep     movsb
        pop     edi
        mov     esi,edi
; esi, edi must be correct to not get a crash
        jmp     after_macro_operators
      string_length:    
        mov     eax,[ebx]
        cmp     eax,4;255
        jl      one_byte_string_length
        ; todo: convert to symbol, using string this way is dirty
        mov     [ebx+4],eax
        mov     eax,4
        mov     [ebx],eax
        lodsb
        lea     edi,[ebx+8]
        jmp     slice_shift
      one_byte_string_length:
        mov     [ebx+4],al
        mov     al,1
        mov     [ebx],al
        lodsb
        lea     edi,[ebx+5]
        jmp     slice_shift

missing_slice_closing_bracket:
        push    _missing_slice_closing_bracket
        jmp     error_with_source

_missing_slice_closing_bracket db 'missing slice closing bracket',0    
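
To make the bounds checks and the length encoding easier to follow, here is a small Python model of my reading of the routine above. It is a sketch, not a port: `patch_slice`, `length_bytes` and `ValueOutOfRange` are names made up for illustration, and the model assumes one byte per character.

```python
class ValueOutOfRange(Exception):
    """Stand-in for the preprocessor's value_out_of_range error."""


def patch_slice(src, begin=None, end=None, colon=True):
    """Model of string_slice: src[begin:end] with the patch's strict bounds.

    colon=False models the single-index form such as "abc"[2],
    where the patch sets end = begin + 1.
    """
    srclen = len(src)
    if begin is None:              # slice_begin_skipped, e.g. [:9]
        begin = 0
    if not colon:
        end = begin + 1
    elif end is None:              # slice_end_skipped, e.g. [6:]
        end = srclen
    # Same comparisons, in the same order, as the assembly.
    if begin < 0 or begin >= srclen:
        raise ValueOutOfRange("begin")
    if end < 0 or end > srclen:
        raise ValueOutOfRange("end")
    if end < begin:
        raise ValueOutOfRange("end before begin")
    return src[begin:end]


def length_bytes(src):
    """Model of string_length: the bytes that replace the string contents."""
    n = len(src)
    if n < 4:                      # cmp eax,4 / jl one_byte_string_length
        return bytes([n])
    return n.to_bytes(4, "little")  # mov [ebx+4],eax stores a little-endian dword


print(patch_slice("slice and dice", 6, 9))
print(length_bytes("slice and dice"))
```

In this model, as in the assembly, begin must be strictly less than the string length, so "abc"[3:] and a slice of an empty string are errors, whereas Python would return an empty string for both.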


I don't have a good knowledge of fasm internals: the only documentation
is partial and old, and there are few comments in the code.

(In particular, I spent a lot of time before finding that the call to precalculate_value trashes [error_line], which later results in an access violation.)

So the addition is of "kinda works" quality. Use at your own risk!
Perhaps Tomasz can direct me further on that.

Regardless, hopefully this serves as an example and helps someone digging into the fasm source code.

Related links:
Updated guide to fasm internals/porting
String directives for manipulating text
REQUEST: iterate over characters of symbol/literate.

Attached the patch file and examples.


Description: A patch for fasm source code that adds string slice operator to the preprocessor. Experimental, use with care.
Filename: fasm_superstring.zip
Filesize: 4.56 KB

Grom PE 29 Oct 2016, 07:45
Found a conflict with macros defined inside a macro, like:
Code:
macro s
{ macro t a,[b] \{\} }
    

and fixed the patch. Updated the post and the attachment.