flat assembler
Message board for the users of flat assembler.
Compiler Internals > unicode
b1528932 03 Feb 2011, 20:35
Why doesn't fasm support UTF-8/UTF-16?
Hasn't it been a standard for a couple of years?
Tyler 03 Feb 2011, 20:39
Because maintaining assembly is hell, and updating it's even worse.
revolution 03 Feb 2011, 20:49
b1528932 wrote: Why fasm doesnt support utf8...?
revolution 03 Feb 2011, 20:56
See here for previous discussions.
revolution 03 Feb 2011, 21:07
The file I uploaded doesn't have a BOM. And it assembles fine on my system.
b1528932 03 Feb 2011, 21:15
It accepts any byte != 0x00.
I can even feed it illegal UTF-8 characters. I can also put in an unfinished UTF-8 character, and it swallows it. 0xFC works as the name of a label, when it should be an error. I bet it uses a 1 byte = 1 character engine. I wouldn't call it support of Unicode.
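To make the two cases above concrete, here is a minimal Python sketch (not fasm code) of what a strict UTF-8 decoder rejects. The helper name `is_valid_utf8` is made up for illustration; it relies only on Python's built-in `bytes.decode`.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


# A lone 0xFC byte never appears in valid UTF-8.
lone_fc = b"\xfc"
# First two bytes of the three-byte sequence E2 82 AC (the euro sign).
truncated = b"\xe2\x82"
# The complete three-byte sequence.
complete = b"\xe2\x82\xac"

for sample in (lone_fc, truncated, complete):
    print(sample.hex(), is_valid_utf8(sample))
```

A byte-oriented tool performs no such check, which is why both the lone byte and the truncated sequence pass through it unchanged.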
revolution 03 Feb 2011, 21:18
If you take a little time to read the thread I linked, you will also see the trick for dealing with the BOM.
Start your file with a single line
Code:
zxcvbnm = 0
Code:
<hidden BOM>zxcvbnm = 0
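Why this trick is harmless can be sketched with a toy model in Python. This is not fasm's parser: `first_symbol` is a hypothetical helper that simply treats the bytes before `=` on the first line as a name, the way a byte-oriented tool might, and the `start:` label is made-up sample content. With a UTF-8 BOM (bytes EF BB BF) in front, the throwaway line defines a differently named, unused symbol, and the rest of the file is unaffected.

```python
import codecs


def first_symbol(source: bytes) -> bytes:
    """Toy model: the name defined by a first line of the form `name = value`."""
    first_line = source.split(b"\n", 1)[0]
    name, _, _ = first_line.partition(b"=")
    return name.strip()  # strips ASCII whitespace only; BOM bytes are kept


plain = b"zxcvbnm = 0\nstart:\n"
with_bom = codecs.BOM_UTF8 + plain  # what an editor that writes a BOM would save

print(first_symbol(plain))     # b'zxcvbnm'
print(first_symbol(with_bom))  # b'\xef\xbb\xbfzxcvbnm'
```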
b1528932 03 Feb 2011, 21:19
That's called hax, not support.
revolution 03 Feb 2011, 21:21
b1528932 wrote: I wouldnt call it support of unicode.
Tomasz Grysztar 03 Feb 2011, 21:22
fasm itself works on byte-based text encoding, and it doesn't care what actual encoding you use for your regional characters. This is because of the SSSO (Same Source, Same Output) principle: exactly the same file is always interpreted by fasm in the same way. And it cannot treat the BOM in any special way, because it is a sequence of bytes just like any other. You can have a label with such a name at the very beginning of your file, and if fasm ignored these characters and defined some different label (or displayed an error), it would go against its principles.
And also, what is an invalid sequence of bytes in UTF-8 may still be a perfectly valid sequence in some other encoding. And fasm has to handle them both.
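As an illustration of that last point (a sketch in Python, not anything fasm itself does): the single byte 0xFC is rejected by a strict UTF-8 decoder, yet it is the letter ü in both ISO-8859-1 (Latin-1) and Windows-1252.

```python
raw = b"\xfc"

# Try the same byte under three encodings; only UTF-8 refuses it.
for encoding in ("latin-1", "cp1252", "utf-8"):
    try:
        print(encoding, repr(raw.decode(encoding)))
    except UnicodeDecodeError as exc:
        print(encoding, "invalid:", exc.reason)
```

A tool that does not know which encoding the author used cannot tell these cases apart from the bytes alone.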
b1528932 03 Feb 2011, 22:03
OK, I see the point of this. Fasm does support the 'original' instruction names, as the CPU manufacturer named and documented them, right?
Fasm supports more than just the x86 line, right? Imagine that a new popular CPU emerges and you would like to support its instructions. The manual is written in Korean only, as is the instruction documentation. Or China could even buy AMD and introduce SSE7 with only Chinese names. You would then have to rewrite the entire engine to support it. Based on the assumption that fasm supports original instruction names, the lack of a real Unicode parser is a bug. Everything taken from the user that is character/name based must be processed as Unicode. And seriously, considering what English-speaking countries are doing, I would prepare to support Asian and Arabic languages ASAP.
Tyler 04 Feb 2011, 03:08
Mnemonics are just arbitrary memory aids. What's to say Tomasz can't just give them English-like names? There are enough English-speaking asm programmers that there would be an unofficial English version of the Chinese mnemonics that he could use.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.