Message board for the users of flat assembler.
[feature request] Extended ASCII characters compiler "bug"
EladAshkcenazi335 26 Jun 2017, 05:22
If you type eASCII characters (using Windows's special eASCII keyboard feature, Alt+0 through Alt+255) in fasm's text editor and compile, you end up with the hex value of the ASCII character '?'. If you try this with other text editors, you get the characters' hex values in that editor's encoding (Unicode, UTF-8...).
So what I'm suggesting is to add a special keyword that tells FASM to translate these specific (Unicode, UTF-8...) characters into eASCII hex values, only when compiling, so that FASM can handle these specific characters properly.
*eASCII stands for extended ASCII.
Last edited by EladAshkcenazi335 on 26 Jun 2017, 14:44; edited 3 times in total
EladAshkcenazi335 26 Jun 2017, 06:55
This video explains why this weird bug even exists in the first place:
https://www.youtube.com/watch?v=qBex3IDaUbU - it explains the bug more specifically.
In short, you can't represent real eASCII characters in non-eASCII forms like ANSI, Unicode, UTF-8...
This is because eASCII characters require bit 7 to be set, which is also the bit that UTF-8 uses to tell whether a character is one byte long (non-extended ASCII) or two bytes long (or even longer). Thus eASCII characters cannot be represented with their real hex values, and when eASCII characters are used FASM emits the wrong values (their ANSI, Unicode, UTF-8... forms) instead of the real hex values.
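The bit-7 clash described above can be checked outside of fasm. The following is a minimal Python sketch, assuming that "eASCII" means code page 437 (the post does not name a code page); the variable names are illustrative only.

```python
# Illustrative sketch: code page 437 is ASSUMED to be the "eASCII" table.
char = "▓"  # U+2593

cp437_byte = char.encode("cp437")  # one byte in the code page
utf8_bytes = char.encode("utf-8")  # several bytes in UTF-8

# Bit 7 (0x80) is set in the single code-page byte...
high_bit_in_cp437 = bool(cp437_byte[0] & 0x80)
# ...and in every byte of the UTF-8 sequence, where it marks a
# multi-byte character rather than a code-page slot.
high_bit_in_utf8 = all(b & 0x80 for b in utf8_bytes)

# On its own, the code-page byte is a UTF-8 continuation byte,
# so it cannot be decoded as UTF-8.
try:
    cp437_byte.decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False

print(cp437_byte.hex(), utf8_bytes.hex(), lone_byte_is_valid_utf8)
```

Under that assumption the same character is one byte in the code page and three bytes in UTF-8, and the code-page byte by itself is not valid UTF-8.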
I'm basically suggesting a special keyword that tells FASM to compile these fake eASCII characters into real eASCII hex values (only when the keyword is used; otherwise the default behavior applies).
For example:
The eASCII character ▓ has an (eASCII) hex value of 0xB2, but in (_) encoded text FASM will compile the following as:
mov al,'▓' =
mov al,(UTF-8 ) 0x9396E2 ; because this value is greater than 0xFF (the maximum value of a byte), fasm will report an error!! (try it if you don't believe me), even though the user probably intended to use the eASCII hex value instead of the UTF-8 one.
mov al,(UTF-16) 0x2593 ; an error will occur here too
mov al,(UTF-32) 0x00002593 ; same
BUT with the keyword, FASM would theoretically compile the original 0xB2 value (mov al,0xB2).
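The hex values in this example can be reproduced with a short Python sketch. It assumes that the quoted character is read as a little-endian number (first source byte in the lowest position) and that "eASCII" means code page 437; the helper name `as_little_endian_number` is hypothetical.

```python
char = "▓"  # U+2593

def as_little_endian_number(data: bytes) -> int:
    # ASSUMPTION: the quoted bytes are taken as a number whose first
    # byte is the least significant one.
    return int.from_bytes(data, "little")

values = {
    "UTF-8": as_little_endian_number(char.encode("utf-8")),
    "UTF-16": as_little_endian_number(char.encode("utf-16-le")),
    "UTF-32": as_little_endian_number(char.encode("utf-32-le")),
    "CP437": as_little_endian_number(char.encode("cp437")),
}

for name, value in values.items():
    fits_in_al = value <= 0xFF  # AL is an 8-bit register
    print(f"{name:7} 0x{value:X} fits in AL: {fits_in_al}")
```

Only the code-page value fits in a single byte, which matches the errors described for the UTF-8, UTF-16 and UTF-32 cases.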
Tomasz Grysztar 29 Jun 2017, 14:18
fasm is generally encoding-neutral, hence the strings of bytes are copied literally from the source text into the output. If you need to convert between different encodings, the right way to do it is with macros, just like the "du" encoding macros that come as includes in the fasm for Windows package.
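Because the bytes are copied literally, the encoding of the source file decides which bytes end up in the output. The Python sketch below illustrates that point by re-encoding a source line before assembly. It is not the macro approach recommended in the reply, only an alternative illustration; the function name `transcode_source` and the choice of code page 437 are assumptions.

```python
def transcode_source(utf8_source: bytes, target: str = "cp437") -> bytes:
    """Re-encode assembler source text from UTF-8 into a single-byte code page."""
    text = utf8_source.decode("utf-8")
    # The default strict error handling raises UnicodeEncodeError when a
    # character has no slot in the target code page.
    return text.encode(target)

source = "mov al,'▓'\n".encode("utf-8")
converted = transcode_source(source)
print(converted)
```

After the conversion the quoted character occupies a single byte, 0xB2, so a literal copy of the string yields the value the original poster expected.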