flat assembler
Message board for the users of flat assembler.
Unicode string value
l_inc 28 Feb 2013, 03:57
yoshimitsu
There's no native way to do that, but you may want to try something like this:
Code:
    macro u [arg] {
        common
        inst equ
        irps argn,arg \{
            \forward
            inst equ inst argn
            match \`argn\#'',argn \\{
                \\local ustr
                restore inst
                virtual
                    du argn
                    assert $-$$ <= 8
                    dq 0
                    load ustr qword from $$
                end virtual
                inst equ inst ustr
            \\}
            \common
            match inst,inst \\{ inst \\}
            \forward
            restore inst
        \}
        restore inst
    }
Your example would then look like this:
Code:
    u cmp eax,'AC'
And an hour of wonderful music can become reality just like that:
Code:
    u invoke Beep,'z','07'
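As a rough illustration of the value the macro substitutes for the quoted string (a sketch only, not part of the macro above; the names ansi and wide are made up for this example): a quoted string used as a number takes its first character as the least significant byte, while du widens every character to 16 bits before the value is loaded back.

```fasm
; Sketch only: 'ansi' and 'wide' are illustrative names.
ansi = 'AC'                     ; 'A' becomes the low byte: 0x4341

virtual
    du 'AC'                     ; emits the words 0x0041 and 0x0043
    dq 0                        ; padding, so a full qword can always be loaded
    load wide qword from $$     ; read the widened characters back as a number
end virtual

assert ansi = 0x4341
assert wide = 0x00430041
```

Under these assumptions, u cmp eax,'AC' would assemble like cmp eax,0x00430041.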
yoshimitsu 28 Feb 2013, 12:59
Thank you for the macro, l_inc.
Small question: isn't the '' in \`argn\#'' just an empty literal that doesn't get matched at all, so you could omit it in the first place? Your macro seems like a good solution, although a nice native way would be more welcome.
l_inc 28 Feb 2013, 15:38
yoshimitsu
Quote:
    isn't the '' in \`argn\#'' just an empty literal and doesn't get matched at all, so you can omit it in the first place
'' is an empty string, not an empty literal. In general this concatenation with an empty string is done in order to avoid problems in case argn is an empty argument, because in this case without concatenation you'll get a construction match `, which is expanded into match "," which in turn is not a valid construction. But you are right. In this specific case it is not necessary to concatenate with an empty string, because irps does not produce empty arguments.
Quote:
    Your macro seems like a good solution
I personally would prefer an even simpler solution, which however needs more coding overhead:
Code:
    macro ustrdef [arg*] {
        forward
        match name==val,arg \{
            \local ustr
            name equ ustr
            virtual
                du val
                assert $-$$ <= 8
                dq 0
                load ustr qword from $$
            end virtual
        \}
    }
And this is the related coding overhead:
Code:
    ustrdef x='ab',y='cd'
    cmp eax,x
    mov eax,y
    restore x,y
Quote:
    although a nice native way would be more welcome
That's up to the author to decide. I could live without this possibility without problems, but it would make sense to consider including the corresponding feature in fasm 2.
yoshimitsu 28 Feb 2013, 16:36
Thanks again.
However, I don't quite understand why match 'AC''','AC' matches correctly. Also, does the documentation say anything about matching strings, or rather extracting the text out of a string? Because I wouldn't have thought that match `x,x would work.
l_inc 28 Feb 2013, 16:47
yoshimitsu
Quote:
    I don't quite understand why match 'AC''','AC' matches correctly.
It doesn't. In my example there's a concatenation operator in between. And therefore it's match `AC # '','AC' -> match 'AC','AC', and this does match correctly.
Quote:
    Also does the documentation say anything about matching strings or rather extracting the text out of a string?
It does not extract anything out of a string, because one quoted string is a single standalone inseparable symbol (not a character though):
1.2.1 Instruction syntax wrote:
    If the first character of symbol is either a single or double quote, it integrates any sequence of characters following it, even the special ones, into a quoted string, which should end with the same character, with which it began (the single or double quote)
and:
2.3.6 Conditional preprocessing wrote:
    any of symbol characters and any quoted string should be matched exactly as is.
Quote:
    Because I wouldn't have thought that match `x,x would work.
This is a common way to check at the preprocessing stage whether x is a quoted string symbol.
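A minimal sketch of that check, assuming it is wrapped in a helper macro (showkind is a hypothetical name, not something from the fasm package or from the macros in this thread):

```fasm
; Sketch only: 'showkind' is a hypothetical helper.
macro showkind x* {
    ; `x turns x into a quoted string; a quoted string stays as it is,
    ; so pattern and matched text are identical only for quoted strings.
    match `x,x \{ display 'quoted string',13,10 \}
}

showkind 'AC'   ; becomes match 'AC','AC' and the match succeeds
showkind eax    ; becomes match 'eax',eax and the match fails
```

The argument is marked as required with * so that the empty-argument case discussed earlier in the thread cannot occur.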
yoshimitsu 28 Feb 2013, 17:39
Sorry, overlooked a couple of things and had some major brain lag.
Everything's understood now, thanks l_inc ;)
PS: I wonder what Tomasz thinks of using L, u or something else in front of or behind a string literal to make it a unicode value, like
    cmp eax,L'AC'
    cmp eax,u'AC'
    cmp eax,'AC'L
    cmp eax,'AC'u
It wouldn't add much overhead and one also wouldn't be forced to use it, so nothing breaks.
l_inc 28 Feb 2013, 17:55
yoshimitsu
Quote:
    I wonder what Tomasz thinks of using L, u or sth else in front/behind of a string literal to make it a unicode value
I think it's quite problematic to make it fit into the current syntax. First of all, it's questionable how to handle the following situations:
Code:
    du "ab"
    db L"ab"
The second and more important problem is that the source code is not in Unicode. Thus you need some rules to convert the string specified between quotation marks into Unicode. These rules are called "encoding". Just appending a zero byte to a character value does not make it Unicode automatically. The encoding problem with the du directive is solved by including a standard header with a redefining macro corresponding to the desired encoding. With the current fasm architecture it's impossible to redefine such an inlined L with a macro.
yoshimitsu 28 Feb 2013, 18:23
Thing is, I never did anything with Unicode so far.
That means I don't have a clue about it :s At the moment I'm thinking of porting some code to Unicode (Windows Unicode, which is UTF-16, so 2 bytes per char, afaik). In that code I heavily use ASCII strings as dword values for easier string comparison, which would then break: the values would be 8 bytes in size and no longer fit into one instruction, so I'd have to split them. I'd also need to fill the values with zeros, which would be easy with hex values instead of strings, but it'd make the source unreadable. Guess the easiest solution is just to stick with your macro then.
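A sketch of the split described above, assuming 32-bit code and a pointer to the text in esi (wideLo, wideHi and is_abcd are illustrative names): once widened, the four-character tag occupies 8 bytes, so it is loaded back as two dwords and compared in two steps, while the source still shows the readable string.

```fasm
; Sketch only: names and register use are assumptions for illustration.
use32

virtual
    du 'ABCD'                       ; 8 bytes: 41 00 42 00 43 00 44 00
    load wideLo dword from $$       ; 'A','B' -> 0x00420041
    load wideHi dword from $$+4     ; 'C','D' -> 0x00440043
end virtual

is_abcd:                            ; in: esi -> UTF-16 text, out: ZF=1 on match
    cmp dword [esi],wideLo
    jne .done                       ; first half differs, ZF is already 0
    cmp dword [esi+4],wideHi        ; ZF now reflects the second half
  .done:
    ret
```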
l_inc 28 Feb 2013, 19:20
yoshimitsu
You may need to implement some higher-level macros. I'm not sure whether the following example is suitable, but when I needed to push strings onto the stack, I implemented a more or less high-level construction, which in most cases does not fit into a single assembly instruction:
Code:
    ;Allows to detect current code generation mode (2,4,8)
    macro detectMode byteCount {
        virtual
            mov eax,[0]
            byteCount = 1 shl ($-$$-3)
        end virtual
    }

    ;Allows to save a string onto the stack
    ;usage: pushstr "This is gonna be a stack string",0
    struc pushstr [arg*] {
        common
        local pushCount, buf, modeByteCount
        detectMode modeByteCount
        virtual
            db arg
            pushCount = ($-$$+modeByteCount-1)/modeByteCount
        end virtual
        . = pushCount*modeByteCount
        repeat pushCount
            virtual
                db arg
                db (modeByteCount-1) dup 0
                if modeByteCount = 2
                    load buf word from $$+(pushCount-%)*modeByteCount
                else if modeByteCount = 4
                    load buf dword from $$+(pushCount-%)*modeByteCount
                else if modeByteCount = 8
                    if % < pushCount
                        load buf qword from $$+(pushCount-%-1)*modeByteCount
                    else
                        load buf qword from $$+(pushCount-1)*modeByteCount
                    end if
                else
                    display 'Error: unknown code generation mode',13,10
                    err
                end if
            end virtual
            if modeByteCount = 8
                push rax
                mov rax,buf
            else
                push buf
            end if
        end repeat
        if modeByteCount = 8
            xchg rax,qword[rsp+(pushCount-1)*modeByteCount]
        end if
    }
In this case porting to Unicode means just replacing the first two db directives with du.
P.S. With newer fasm abilities (addressing space labels), this implementation is not the best possible, but I didn't have much time to modify my macros appropriately.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.