flat assembler
Message board for the users of flat assembler.

fasm preprocessor: string slice operator (experimental)

Grom PE



I wanted fasm to be able to manipulate string arguments to preprocessor
instructions, such as "format", so I dug into the source code and made this.

Specifically, I wanted this to be possible:
Code:
match v,"thingama.jig" {
  rept v[] n:0 \{
    match '.',v\[v[]-1-n] \\{
      match ext,v\\[v[]-n:] \\\{
        format binary as ext ; results in "jig"
      \\\}
    \\}
  \}
}
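For readers less used to nested fasm macro escapes, here is a rough Python sketch of what the snippet above does, assuming `rept v[] n:0` counts n from 0 up to the string length minus one and that nothing stops the loop after the first match. The function name `extension_via_slices` is made up for illustration and is not part of the patch.

```python
def extension_via_slices(name):
    """Mirror the rept/match nesting: try every n from 0 to len(name)-1."""
    length = len(name)                       # v[]
    found = []
    for n in range(length):                  # rept v[] n:0
        if name[length - 1 - n] == ".":      # match '.',v[v[]-1-n]
            found.append(name[length - n:])  # match ext,v[v[]-n:]
    return found


print(extension_via_slices("thingama.jig"))  # ['jig']
```

Since the loop does not stop at the first dot, a name with several dots would produce one match per dot.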


This patch adds a string slice operator, inspired by Python's,
allowing the preprocessor to manipulate strings like so:

"abc"[2] converts to "c"

"slice and dice"[6:9] converts to "and"
"slice and dice"[:9] converts to "slice and"
"slice and dice"[6:] converts to "and dice"
"slice and dice"[] converts to the string length, 14

Like the "#" operator, it only works inside macros.
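The rules above can be modelled in a few lines of Python. This is only a sketch of the semantics as I read them from the examples and from the range checks in the listing (out-of-range values are an error rather than being clamped as in Python). The function `fasm_slice` and its `colon` flag are hypothetical names for illustration.

```python
def fasm_slice(s, begin=None, end=None, colon=False):
    """Model of "str"[], "str"[i], "str"[a:b], "str"[:b] and "str"[a:]."""
    srclen = len(s)
    if begin is None and end is None and not colon:
        return srclen                  # "..."[] gives the string length
    b = 0 if begin is None else begin  # omitted begin defaults to 0
    if colon:
        e = srclen if end is None else end  # omitted end defaults to the length
    else:
        e = b + 1                      # single index: one character
    # Assumption: same checks as the posted listing, no clamping, no negatives.
    if b < 0 or b >= srclen or e < 0 or e > srclen or e < b:
        raise ValueError("value out of range")
    return s[b:e]


print(fasm_slice("abc", 2))                             # c
print(fasm_slice("slice and dice", 6, 9, colon=True))   # and
print(fasm_slice("slice and dice", end=9, colon=True))  # slice and
print(fasm_slice("slice and dice", 6, colon=True))      # and dice
print(fasm_slice("slice and dice"))                     # 14
```

Note that under these checks something like "abc"[3:] is rejected, whereas Python would return an empty string.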

The addition to the preprocessor, gpe_slice.inc:
Code:
slice:
        ; dl = type of the previous token
        cmp dl,1Ah
        je symbol_slice
        cmp dl,22h
        je string_slice
        jmp slice_ignore
no_slice:
        cmp esi,edi
        je before_macro_operators
        jmp after_macro_operators
symbol_slice:
        cmp byte [esi],1Ah
        jne no_slice
        ; have to ignore in cases like "macro name [param]"
        jmp slice_ignore
        ; todo: check if this introduces bugs
string_slice:
        ; ebx = pointer to the 4-byte length of the string
        mov eax, [ebx]
        mov [slice_srclen], eax
        cmp byte [esi],']'
        je string_length
        push [error_line] ebx edi
        ; ebp = end of the line
        mov edi,ebp
        xor eax,eax
        cmp byte [esi],':'
        je slice_begin_skipped
        ; input: esi = start of expression, edi = free space
        call precalculate_value
        ; output: eax = result (max 0x7fffffff), esi = end of expression, ecx ebx = trashed, [error_line] = trashed
slice_begin_skipped:
        mov [slice_begin], eax
        inc eax
        mov [slice_end], eax
        cmp byte [esi],':'
        jne finish_slice_parameters
        lodsb
        cmp byte [esi],']'
        mov eax,[slice_srclen]
        je slice_end_skipped
        call precalculate_value
slice_end_skipped:
        mov [slice_end], eax
finish_slice_parameters:
        pop edi ebx [error_line]
        lodsb
        cmp al,']'
        jne missing_slice_closing_bracket
        ; todo: handle negative index like Python?
        ; todo: allow double slice? "string"[1][0]
        mov eax,[slice_begin]
        mov edx,[slice_srclen]
        cmp eax,0
        jl value_out_of_range
        cmp eax,edx
        jge value_out_of_range
        mov ecx,[slice_end]
        cmp ecx,0
        jl value_out_of_range
        cmp ecx,edx
        jg value_out_of_range
        ; todo: allow out of bounds, just clamp
        sub ecx,eax
        jb value_out_of_range
        mov [ebx],ecx
        push esi
        lea esi,[ebx+4+eax]
        lea edi,[ebx+4]
        rep movsb
        pop esi
slice_shift:
        push edi
        mov ecx,ebp
        sub ecx,esi
        rep movsb
        pop edi
        mov esi,edi
        ; esi, edi must be correct to not get a crash
        jmp after_macro_operators
string_length:
        mov eax,[ebx]
        cmp eax,4;255
        jl one_byte_string_length
        ; todo: convert to symbol, using string this way is dirty
        mov [ebx+4],eax
        mov eax,4
        mov [ebx],eax
        lodsb
        lea edi,[ebx+8]
        jmp slice_shift
one_byte_string_length:
        mov [ebx+4],al
        mov al,1
        mov [ebx],al
        lodsb
        lea edi,[ebx+5]
        jmp slice_shift
missing_slice_closing_bracket:
        push _missing_slice_closing_bracket
        jmp error_with_source
_missing_slice_closing_bracket db 'missing slice closing bracket',0
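To make the in-place part of the listing easier to follow, here is a simplified Python model of it. It assumes a string token is a 22h byte followed by a 4-byte little-endian length and the raw bytes (which is how the listing treats ebx), and it stands in for the tokenized bracket expression with plain text. The function `slice_string_token` and its parameters are invented for this sketch; the real code works on registers, not arguments.

```python
import struct


def slice_string_token(line, len_pos, rest_pos, begin, end):
    """Slice a string token in place and pull the rest of the line up behind it.

    line     -- bytearray holding the preprocessed line (simplified model)
    len_pos  -- offset of the 4-byte length field (ebx in the listing)
    rest_pos -- offset of the first byte after the closing ']' (esi)
    Returns the offset where the rest of the line starts afterwards.
    """
    (srclen,) = struct.unpack_from("<I", line, len_pos)
    if not (0 <= begin < srclen and 0 <= end <= srclen and begin <= end):
        raise ValueError("value out of range")
    new_len = end - begin
    struct.pack_into("<I", line, len_pos, new_len)             # mov [ebx],ecx
    data = len_pos + 4
    line[data:data + new_len] = line[data + begin:data + end]  # first rep movsb
    new_rest = data + new_len
    line[new_rest:] = line[rest_pos:]                          # slice_shift
    return new_rest


# Placeholder line: the string token, a stand-in for the "[6:9]" tokens, a tail.
line = bytearray(b"\x22" + struct.pack("<I", 14) + b"slice and dice" + b"[6:9]" + b" tail")
pos = slice_string_token(line, 1, 24, 6, 9)
print(bytes(line[5:pos]))  # b'and'
```

The model truncates the buffer after shifting, which the assembly does not need to do since it tracks the line end separately.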


I don't know fasm internals well: the only documentation is partial and
out of date, and there are few comments in the code.

(In particular, I spent a lot of time before finding out that the call to precalculate_value trashes [error_line], which later results in an access violation.)

So the addition is of "kinda works" quality. Use at your own risk!
Perhaps Tomasz can point me in the right direction on that.

Regardless, hopefully this serves as an example and helps someone digging into the fasm source code.

Related links:
Updated guide to fasm internals/porting
String directives for manipulating text
REQUEST: iterate over characters of symbol/literate.

Attached the patch file and examples.


Description: A patch for fasm source code that adds a string slice operator to the preprocessor. Experimental, use with care.
Filename: fasm_superstring.zip
Filesize: 4.56 KB

Post 24 Oct 2016, 16:16
Grom PE



Found a conflict with macros defined inside a macro, like:
Code:
macro s { macro t a,[b] \{\} }

and fixed the patch. Updated the post and the attachment.
Post 29 Oct 2016, 07:45
Copyright © 2004-2018, Tomasz Grysztar.
