flat assembler
Message board for the users of flat assembler.
OS Construction > Unicode/International Languages question
decard 20 Jun 2004, 19:14
Maybe you could use the UTF-8 character format? It uses a variable character width: normal (English) characters are encoded as usual in one byte, while other characters (like ą, ś, ń, ź in Polish) are encoded in two or more bytes. This system is good, as it saves space and makes life a lot easier when working with the "standard" character set. However, things like a StrLen routine are more complicated... Anyway, it is a good system, and you should consider implementing it.
I can't help you much with a more detailed explanation, as I have only used routines that deal with UTF-8 and never coded my own. Take a look at the Allegro library, http://alleg.sourceforge.net/; it has a rich set of UTF-8 routines (AFAIR they are coded in assembly).
Regards, Mateusz
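To illustrate why StrLen gets more complicated: with UTF-8 the byte count and the character count differ, so a length routine has to skip continuation bytes (those of the form 10xxxxxx). Below is a minimal sketch in C, not taken from Allegro or any other library; the name `utf8_strlen` is hypothetical, and the code assumes the input is well-formed, zero-terminated UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Count code points (not bytes) in a zero-terminated UTF-8 string.
   Assumption: the input is well-formed UTF-8; nothing is validated. */
size_t utf8_strlen(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; ++p) {
        /* Continuation bytes have the bit pattern 10xxxxxx.
           Every other byte starts a new code point. */
        if ((*p & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For plain ASCII text this gives the same result as an ordinary byte-counting StrLen; for the two-character Polish string "ąś" (bytes C4 85 C5 9B) it reports 2 where a byte count would report 4.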
Gomer73 21 Jun 2004, 19:47
Thanks, UTF-8 looks like the way to go with my OS.
I can still use 0 for string termination. Also, I didn't realize that Unicode has more than 64K code points, so even two bytes per character isn't sufficient. This way I can write my source in any editor and not have to worry about it; otherwise it would be kind of a pain trying to save in a two-byte format. It takes up a little more space for Asian characters, but oh well.
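The size trade-offs mentioned here can be seen in a small encoder. This is only a sketch under the usual UTF-8 rules (code points up to U+10FFFF, surrogates rejected); the names `utf8_encode` and `utf8_encoded_len` are made up for illustration. Note that a zero byte is produced only for code point U+0000, which is why 0 still works as a string terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8.
   Returns the number of bytes written (1..4), or 0 if cp is not encodable. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* e.g. Polish letters: two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* e.g. most CJK: three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* beyond 64K: four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* outside the Unicode range */
}

/* Convenience wrapper: how many bytes would this code point take? */
size_t utf8_encoded_len(uint32_t cp)
{
    unsigned char tmp[4];
    return utf8_encode(cp, tmp);
}
```

With this, 'A' (U+0041) takes one byte, ą (U+0105) two, a CJK character such as U+4E2D three (one more than in a two-byte format), and anything above U+FFFF four.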
Juras 11 Nov 2004, 19:41
UTF-8 would be the best choice, I think. It's the application's task to support 8-bit code pages when editing or viewing text files (there are still many text files encoded in native code pages). UTF-8 is perfect. However, to write a correct FAT12/16/32 driver you'll need in-kernel code page support in order to encode old DOS 8.3 file names with 8-bit code pages. Use something like a DosCP variable, as I plan to in my kernel.
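One way the DosCP idea might look is a kernel-wide pointer to the active code page table, consulted whenever a Unicode code point has to be stored in an 8.3 name. The sketch below is an assumption, not Juras's actual design: the struct, the function `unicode_to_dos`, and the '_' fallback for unmappable characters are all hypothetical, and the table holds only a handful of CP437 entries rather than the full code page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cp_entry {
    uint16_t unicode;   /* Unicode code point */
    uint8_t  byte;      /* its byte value in the 8-bit code page */
};

/* Partial CP437 table: a few uppercase accented letters only. */
static const struct cp_entry cp437_partial[] = {
    { 0x00C7, 0x80 },   /* Ç */
    { 0x00C4, 0x8E },   /* Ä */
    { 0x00C5, 0x8F },   /* Å */
    { 0x00C9, 0x90 },   /* É */
    { 0x00D6, 0x99 },   /* Ö */
    { 0x00DC, 0x9A },   /* Ü */
};

/* Hypothetical kernel setting: the code page used for 8.3 names. */
static const struct cp_entry *DosCP = cp437_partial;
static size_t DosCP_len = sizeof cp437_partial / sizeof cp437_partial[0];

/* Map one code point to a byte in the active DOS code page.
   Assumption: characters with no mapping become '_'. */
uint8_t unicode_to_dos(uint32_t cp)
{
    assert(DosCP != NULL);
    if (cp < 0x80)
        return (uint8_t)cp;             /* ASCII is shared by DOS code pages */
    for (size_t i = 0; i < DosCP_len; ++i) {
        if (DosCP[i].unicode == cp)
            return DosCP[i].byte;
    }
    return (uint8_t)'_';                /* not representable in this code page */
}
```

Switching code pages would then just mean pointing DosCP at a different table; a real driver would also need the reverse mapping and the uppercase rules for short names.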
|
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.