flat assembler
Message board for the users of flat assembler.

fasm preprocessor: string slice operator (experimental)

Grom PE 24 Oct 2016, 16:16
I wanted fasm to be able to manipulate string arguments to preprocessor
instructions, such as "format", so I dug into the source code and made this.

Specifically, I wanted this to be possible:
match v,"thingama.jig" {
  rept v[] n:0 \{
    match '.',v\[v[]-1-n] \\{
      match ext,v\\[v[]-n:] \\\{
        format binary as ext ; results in "jig"
      \\\} \\} \} }
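
For readers who do not read fasm macro syntax fluently, here is a rough Python rendering of what the loop above is meant to do. It is only a sketch: the function name `extension` is made up for illustration, and unlike the rept loop it stops at the first dot found from the end.

```python
def extension(name):
    """Return the text after the last '.', or None if there is no dot."""
    length = len(name)                  # corresponds to v[]
    for n in range(length):             # corresponds to "rept v[] n:0"
        if name[length - 1 - n] == ".": # corresponds to v[v[]-1-n]
            return name[length - n:]    # corresponds to v[v[]-n:]
    return None


print(extension("thingama.jig"))
```

The index arithmetic is the same as in the macro: n counts characters back from the end of the string, and the slice starts one character after the dot.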

This patch adds a string slice operator, inspired by Python,
that lets the preprocessor manipulate strings like so:

"abc"[2] converts to "c"

"slice and dice"[6:9] converts to "and"
"slice and dice"[:9] converts to "slice and"
"slice and dice"[6:] converts to "and dice"
"slice and dice"[] converts to the string length, 14
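
The forms above can be modelled in Python, the language the operator borrows from. This is a sketch of the behaviour described in this post, not of the patch's code; `fasm_slice` and its `colon` flag are hypothetical names. The range checks follow the assembly listing further down, which rejects out-of-range values instead of clamping them the way Python does.

```python
def fasm_slice(s, begin=None, end=None, colon=False):
    """Model "s"[], "s"[i], "s"[a:b], "s"[:b] and "s"[a:]."""
    n = len(s)
    if begin is None and end is None and not colon:
        return n                 # "s"[] gives the length
    if begin is None:
        begin = 0                # "s"[:b] starts at the first character
    if not colon:
        end = begin + 1          # "s"[i] is a one-character slice
    elif end is None:
        end = n                  # "s"[a:] runs to the end of the string
    # Same conditions as the checks in the listing: no clamping.
    if not (0 <= begin < n and 0 <= end <= n and begin <= end):
        raise ValueError("value out of range")
    return s[begin:end]


print(fasm_slice("slice and dice", 6, 9, colon=True))
```

Note one difference from Python that follows from those checks: a start index equal to the string length is an error here, while Python would return an empty string.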

Like the "#" operator, it only works inside macros.

The addition to the preprocessor, gpe_slice.inc:
        ; dl = type of the previous token
        cmp     dl,1Ah
        je      symbol_slice
        cmp     dl,22h
        je      string_slice
        jmp     slice_ignore
        cmp     esi,edi
        je      before_macro_operators
        jmp     after_macro_operators
        cmp     byte [esi],1Ah
        jne     no_slice
        ; have to ignore in cases like "macro name [param]"
        jmp     slice_ignore ; todo: check if this introduces bugs
string_slice:
        ; ebx = pointer to the 4-byte length of the string
        mov     eax, [ebx]
        mov     [slice_srclen], eax
        cmp     byte [esi],']'
        je      string_length
        push    [error_line] ebx edi
        ; ebp = end of the line
        mov     edi,ebp
        xor     eax,eax
        cmp     byte [esi],':'
        je      slice_begin_skipped
        ; input: esi = start of expression, edi = free space
        call    precalculate_value
        ; output: eax = result (max 0x7fffffff), esi = end of expression, ecx ebx = trashed, [error_line] = trashed
        mov     [slice_begin], eax
        inc     eax
        mov     [slice_end], eax
        cmp     byte [esi],':'
        jne     finish_slice_parameters
        cmp     byte [esi],']'
        mov     eax,[slice_srclen]
        je      slice_end_skipped
        call    precalculate_value
        mov     [slice_end], eax
        pop     edi ebx [error_line]
        cmp     al,']'
        jne     missing_slice_closing_bracket

        ; todo: handle negative index like Python?
        ; todo: allow double slice? "string"[1][0]
        mov     eax,[slice_begin]
        mov     edx,[slice_srclen]
        cmp     eax,0
        jl      value_out_of_range
        cmp     eax,edx
        jge     value_out_of_range
        mov     ecx,[slice_end]
        cmp     ecx,0
        jl      value_out_of_range
        cmp     ecx,edx
        jg      value_out_of_range ; todo: allow out of bounds, just clamp
        sub     ecx,eax
        jb      value_out_of_range
        mov     [ebx],ecx
        push    esi
        lea     esi,[ebx+4+eax]
        lea     edi,[ebx+4]
        rep     movsb
        pop     esi
slice_shift:
        push    edi
        mov     ecx,ebp
        sub     ecx,esi
        rep     movsb
        pop     edi
        mov     esi,edi
; esi, edi must be correct to not get a crash
        jmp     after_macro_operators
string_length:
        mov     eax,[ebx]
        cmp     eax,4;255
        jl      one_byte_string_length
        ; todo: convert to symbol, using string this way is dirty
        mov     [ebx+4],eax
        mov     eax,4
        mov     [ebx],eax
        lea     edi,[ebx+8]
        jmp     slice_shift
one_byte_string_length:
        mov     [ebx+4],al
        mov     al,1
        mov     [ebx],al
        lea     edi,[ebx+5]
        jmp     slice_shift

missing_slice_closing_bracket:
        push    _missing_slice_closing_bracket
        jmp     error_with_source

_missing_slice_closing_bracket db 'missing slice closing bracket',0    
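
As a reading aid for the listing above, here is a Python sketch of the range checks and the in-place rewrite of the string token. It assumes, as the comments in the listing suggest, that a string token is the byte 22h followed by a 4-byte little-endian length and then the characters. The function and parameter names are hypothetical, and the bracket expression is kept as raw text for simplicity, which is not how fasm stores it.

```python
import struct


def slice_token_in_place(line, tok, rest, begin, end):
    """Shrink the string token in `line` to its [begin:end] part.

    line : bytearray holding the preprocessed line
    tok  : offset of the 4-byte length field (ebx in the listing)
    rest : offset of the first byte after the closing ']' (esi)
    Returns the offset just past the shortened string (edi).
    """
    (srclen,) = struct.unpack_from("<I", line, tok)
    # Same checks as the listing: out-of-range values are errors.
    if not (0 <= begin < srclen and 0 <= end <= srclen and begin <= end):
        raise ValueError("value out of range")
    new_len = end - begin
    struct.pack_into("<I", line, tok, new_len)   # mov [ebx],ecx
    data = tok + 4
    # First "rep movsb": move the selected characters to the front.
    line[data:data + new_len] = line[data + begin:data + end]
    # Second "rep movsb": pull the rest of the line up behind them.
    tail = bytes(line[rest:])
    del line[data + new_len:]
    line += tail
    return data + new_len


text = b"slice and dice"
line = bytearray(b"\x22" + struct.pack("<I", len(text)) + text + b"[6:9]" + b" tail")
resume = slice_token_in_place(line, tok=1, rest=1 + 4 + len(text) + 5, begin=6, end=9)
print(bytes(line), resume)
```

The returned offset plays the role of the final `mov esi,edi`: processing resumes right after the shortened string, with the remainder of the line already moved into place.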

I don't know fasm internals well: the only documentation is partial
and out of date, and the source code has few comments.

(In particular, I spent a lot of time before discovering that the call to precalculate_value trashes [error_line], which later results in an access violation.)

So the addition is of "kinda works" quality. Use at your own risk!
Perhaps Tomasz can direct me further on that.

Regardless, I hope this serves as an example and helps someone digging into the fasm source code.

Related links:
Updated guide to fasm internals/porting
String directives for manipulating text
REQUEST: iterate over characters of symbol/literate.

Attached the patch file and examples.

Description: A patch for fasm source code that adds a string slice operator to the preprocessor. Experimental, use with care.
Filename: fasm_superstring.zip
Filesize: 4.56 KB
Downloaded: 654 Time(s)

Grom PE 29 Oct 2016, 07:45
Found a conflict with a macro defined inside another macro, such as:
macro s
{ macro t a,[b] \{\} }

and fixed the patch. Updated the post and the attachment.