fasm preprocessor: string slice operator (experimental)
Grom PE
I wanted fasm to be able to manipulate string arguments to preprocessor
instructions, such as "format", so I dug into the source code and made this.

Specifically, I wanted this to be possible:
Code:
match v,"thingama.jig"
{
  rept v[] n:0
  \{
    match '.',v\[v[]-1-n]
    \\{
      match ext,v\\[v[]-n:]
      \\\{
        format binary as ext ; results in "jig"
      \\\}
    \\}
  \}
}
    


This patch adds a string slice operator, inspired by Python,
that lets the preprocessor manipulate strings like so:

"abc"[2] converts to "c"

"slice and dice"[6:9] converts to "and"
"slice and dice"[:9] converts to "slice and"
"slice and dice"[6:] converts to "and dice"
"slice and dice"[] converts to the string length, 14

Like the "#" operator, it only works inside macros.
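
For readers who know Python better than fasm macros, here is a rough model of the forms listed above and of the extension-finding example at the top of the post. It is only a sketch, not the patch: the names fasm_slice and find_extension are made up for illustration, the bracket contents are assumed to be plain non-negative integers (the patch evaluates expressions), and out-of-range errors are not modeled.

```python
def fasm_slice(s, spec):
    """Model of "s"[spec] as described in the post (hypothetical helper)."""
    if spec == "":
        return len(s)                  # "..."[] -> string length
    if ":" not in spec:
        i = int(spec)
        return s[i:i + 1]              # single index -> one character
    begin, end = spec.split(":", 1)
    b = int(begin) if begin else 0     # omitted begin -> start of string
    e = int(end) if end else len(s)    # omitted end -> end of string
    return s[b:e]


def find_extension(name):
    """Mirrors the rept/match nesting: n runs from 0, scanning from the end.

    Unlike the rept loop in the post, this stops at the first dot found.
    """
    length = fasm_slice(name, "")
    for n in range(length):
        if fasm_slice(name, str(length - 1 - n)) == ".":
            return fasm_slice(name, f"{length - n}:")
    return None


print(find_extension("thingama.jig"))
```

With "thingama.jig" the dot is found at n = 3 (index 8), so the slice taken is [9:].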

The addition to the preprocessor, gpe_slice.inc:
Code:
      slice:
        ; dl = type of the previous token
        cmp     dl,1Ah
        je      symbol_slice
        cmp     dl,22h
        je      string_slice
        jmp     slice_ignore
      no_slice:
        cmp     esi,edi
        je      before_macro_operators
        jmp     after_macro_operators
      symbol_slice:
        cmp     byte [esi],1Ah
        jne     no_slice
        ; have to ignore in cases like "macro name [param]"
        jmp     slice_ignore ; todo check if this introduces bugs
      string_slice:
        ; ebx = pointer to the 4-byte length of the string
        mov     eax,[ebx]
        mov     [slice_srclen],eax
        cmp     byte [esi],']'
        je      string_length
        push    [error_line] ebx edi
        ; ebp = end of the line
        mov     edi,ebp
        xor     eax,eax
        cmp     byte [esi],':'
        je      slice_begin_skipped
        ; input esi = start of expression, edi = free space
        call    precalculate_value
        ; output eax = result max 0x7fffffff, esi = end of expression, ecx ebx = trashed, [error_line] = trashed
      slice_begin_skipped:
        mov     [slice_begin],eax
        inc     eax
        mov     [slice_end],eax
        cmp     byte [esi],':'
        jne     finish_slice_parameters
        lodsb
        cmp     byte [esi],']'
        mov     eax,[slice_srclen]
        je      slice_end_skipped
        call    precalculate_value
      slice_end_skipped:
        mov     [slice_end],eax
      finish_slice_parameters:
        pop     edi ebx [error_line]
        lodsb
        cmp     al,']'
        jne     missing_slice_closing_bracket

        ; todo handle negative index like Python?
        ; todo allow double slice? "string"10
        mov     eax,[slice_begin]
        mov     edx,[slice_srclen]
        cmp     eax,0
        jl      value_out_of_range
        cmp     eax,edx
        jge     value_out_of_range
        mov     ecx,[slice_end]
        cmp     ecx,0
        jl      value_out_of_range
        cmp     ecx,edx
        jg      value_out_of_range ; todo allow out of bounds, just clamp
        sub     ecx,eax
        jb      value_out_of_range
        mov     [ebx],ecx
        push    esi
        lea     esi,[ebx+4+eax]
        lea     edi,[ebx+4]
        rep     movsb
        pop     esi
      slice_shift:
        push    edi
        mov     ecx,ebp
        sub     ecx,esi
        rep     movsb
        pop     edi
        mov     esi,edi
; esi, edi must be correct to not get a crash
        jmp     after_macro_operators
      string_length:
        mov     eax,[ebx]
        cmp     eax,4;255
        jl      one_byte_string_length
        ; todo convert to symbol, using string this way is dirty
        mov     [ebx+4],eax
        mov     eax,4
        mov     [ebx],eax
        lodsb
        lea     edi,[ebx+8]
        jmp     slice_shift
      one_byte_string_length:
        mov     [ebx+4],al
        mov     al,1
        mov     [ebx],al
        lodsb
        lea     edi,[ebx+5]
        jmp     slice_shift

missing_slice_closing_bracket:
        push    _missing_slice_closing_bracket
        jmp     error_with_source

_missing_slice_closing_bracket db 'missing slice closing bracket',0    
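
The range checks and the two result encodings in the listing above can be easier to follow in a higher-level form. The Python below is a sketch under stated assumptions, not a translation of the patch. It assumes a string token is laid out as the type byte 22h, a 4-byte little-endian length, then the data. It returns a new token, whereas the patch rewrites the line in place and shifts the rest of it. The names make_token, slice_token, length_token and SliceRangeError are invented for illustration.

```python
import struct

STRING_TOKEN = 0x22  # token type the patch tests for with "cmp dl,22h"


class SliceRangeError(ValueError):
    """Hypothetical stand-in for fasm's value_out_of_range error."""


def make_token(data):
    # Assumed layout: type byte, 4-byte little-endian length, then the data.
    return bytes([STRING_TOKEN]) + struct.pack("<I", len(data)) + data


def token_data(token):
    (srclen,) = struct.unpack_from("<I", token, 1)
    return token[5:5 + srclen]


def slice_token(token, begin, end):
    """Model of "string"[begin:end]; a single index i corresponds to end = i + 1."""
    data = token_data(token)
    srclen = len(data)
    # Same order of checks as in the listing.
    if begin < 0 or begin >= srclen:
        raise SliceRangeError("begin out of range")
    if end < 0 or end > srclen:
        raise SliceRangeError("end out of range")
    if end < begin:
        raise SliceRangeError("end before begin")
    return make_token(data[begin:end])


def length_token(token):
    """Model of "string"[]: the length is stored back as a string token."""
    srclen = len(token_data(token))
    if srclen < 4:
        return make_token(bytes([srclen]))        # one-byte result
    return make_token(struct.pack("<I", srclen))  # four-byte result
```

Two consequences of the checks as written: an empty slice such as [1:1] is accepted, but begin equal to the string length is rejected, so "abc"[3:] reports an out-of-range value rather than yielding an empty string.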


I don't know fasm internals well: the only documentation is partial and
outdated, and the code has few comments.

(In particular, I spent a lot of time before discovering that the call to precalculate_value trashes [error_line], which later results in an access violation.)

So the addition is of "kinda works" quality. Use at your own risk!
Perhaps Tomasz can give me further direction on that.

Regardless, I hope this serves as an example and helps someone digging into the fasm source code.

Related links:
Updated guide to fasm internals/porting
String directives for manipulating text
REQUEST: iterate over characters of symbol/literate.

Attached the patch file and examples.


Description: A patch for fasm source code that adds string slice operator to the preprocessor. Experimental, use with care.
Filename: fasm_superstring.zip
Filesize: 4.56 KB

Post 24 Oct 2016, 16:16

Grom PE
Found a conflict with macros defined inside a macro, like:
Code:
macro s
{ macro t [a,b] \{\} }
    

and fixed the patch. Updated the post and the attachment.
Post 29 Oct 2016, 07:45