flat assembler
Message board for the users of flat assembler.
Compiler Internals > Fasm Multilangage Encoding
edfed 19 Mar 2010, 11:22
Totally useless.
If someone wants to code in asm, they should accept learning the Latin alphabet, and English.

revolution 19 Mar 2010, 11:41
Although the opcodes are based upon English they are not really English. I think they should be taken as just what they are; ASCII sequences to form a mnemonic for the user.
What would be other language equivalents of FYL2X?

hopcode 19 Mar 2010, 12:01
revolution wrote: ...opcodes...they should be taken as just what they are; ASCII sequences to form a mnemonic for the user.
The concept is transliteration of the source code, done by fasm.
Quote: What would be other language equivalents of FYL2X?
This:
Quote: فيلءكس
NOTE: You can read it because PHP allows a 1:1 transliteration with a different (Arabic) encoding, sending headers to an encoding-enabled browser.
NOTE 2: Or by sending &# + the HTML-encoded char (but the song still remains the same).
_________________ ⠓⠕⠏⠉⠕⠙⠑

revolution 19 Mar 2010, 12:11
hopcode wrote: transliteration
If so then I don't see how that helps the user in any way.

hopcode 19 Mar 2010, 12:31
revolution wrote: ...So it is not really a language change, but it is a script change?...
I would say it is a middle way between a transcription (like the change of writing standards in the 9th-11th century, Carolingian) and a transliteration through a script writing system like Cyrillic.
Quote: If so then I don't see how that helps the user in any way.
Yes, because you, Tomasz, edfed and I all use almost the same encoding, and our files are "ascii" (with different codepages for different OS languages). If I write a UTF-16 file, or save the same file as UTF-16, I could reopen it in Arabic with the same unchanged semantics:
Code:
mov eax,ecx

revolution 19 Mar 2010, 12:43
Are you writing this as a pre-preprocessor?
language1("ABCD") ---> fasm_language("mov")
language1("XYZ") ---> fasm_language("fyl2x")

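The pre-preprocessor revolution describes can be pictured as a lookup table applied to the first token of each line before fasm sees it. The following is a minimal sketch in Python, not anything fasm provides: the table `MNEMONICS` and the function `translate_line` are hypothetical names, the Arabic-script spelling of `fyl2x` is the one hopcode posted, the spelling of `mov` is taken from hopcode's UTF-16 example later in the thread, and real fasm syntax (labels, macros, strings) is ignored.

```python
# Hypothetical translation table: localized mnemonic -> fasm mnemonic.
MNEMONICS = {
    "\u0645\u0648\u0641": "mov",                      # Arabic-script spelling of "mov"
    "\u0641\u064a\u0644\u0621\u0643\u0633": "fyl2x",  # spelling posted in the thread
}

def translate_line(line: str) -> str:
    """Replace a leading localized mnemonic; leave everything else untouched."""
    head, sep, rest = line.partition(" ")
    return MNEMONICS.get(head, head) + sep + rest

print(translate_line("\u0645\u0648\u0641 eax,ecx"))  # mov eax,ecx
print(translate_line("ret"))                          # ret
```

Unknown tokens pass through unchanged, so a source that already uses the standard mnemonics is not altered by this sketch.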
hopcode 19 Mar 2010, 13:28
I mean a built-in feature, because fasm could read the encoding and BOM from the command line, or from the first line of the file (as with HTML headers).
For example, fasm identifies a UTF-16 file in this way:
fasm source.asm -d ARABIC
(in Arabic, -d ARABIC is implicit, because of the encoding)
And I could do it in my ISO encoding with this Arabic command line (simply switching the encoding, as I wrote above in bold):
Quote: فاسم صورك.أسم د أرابيك
The file contains this instruction:
فيلءكس
This corresponds to the HTML encoding (single separated letters, visible if you quote the post and read its source):
ف (F)
ي (Y)
ل (L)
٢ (2)
اكس (X)
Corresponding, in our encoding:
U+0046 (F)
U+0059 (Y)
U+004C (L)
U+0032 (2)
U+0058 (X)
...

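Reading the encoding from a BOM, as hopcode proposes, amounts to comparing the first bytes of the file against a few known signatures. A minimal sketch in Python, assuming nothing about how fasm itself reads files: `BOMS` and `detect_encoding` are hypothetical names, and the fallback to ASCII when no BOM is present is an assumption made for illustration.

```python
import codecs
from typing import Tuple

# Checked in this order because the UTF-32 LE BOM begins with the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def detect_encoding(data: bytes, default: str = "ascii") -> Tuple[str, bytes]:
    """Return (encoding, payload without BOM); use `default` if no BOM is found."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return default, data

encoding, payload = detect_encoding(b"\xfe\xff\x00m\x00o\x00v")
print(encoding, payload.decode(encoding))  # utf-16-be mov
```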
shoorick 19 Mar 2010, 13:40
Translation of opcodes, as well as transliteration, will lead to a real mess. Maybe when we have a great new CPU with original Arabic mnemonics, then we will learn it.
BTW, there has already been a Russian translation of fasm (on the wasm.ru forum); it looked a bit funny.
_________________ UNICODE forever!

hopcode 19 Mar 2010, 20:58
Quote: translation of opcodes...
There is a reason why I place it in between.
Quote: a russian translation of fasm (on wasm.ru forum)
What exactly is a "translation" there? Did you ever think of expressing symbols like "aeiou" or "äüöß" with your Ukrainian codepage? No, I imagine. You would have to switch codepage or, worst of all, install a West European OS. The reason is not that your keyboard cannot do it at that moment; the reason is that one byte alone is not enough to express Cyrillic and West European characters at the same time (apart from the fact that there are a lot of Cyrillic encodings).
Now, I don't know the wasm-fasm you mention. But one could state about it, a priori, with 100% certainty:
1) it is not international (a multilanguage version)
2) it is "ASCII"
3) it is funny (because of what I explained above)
4) it is a mess by design, not as a result, because of the one-byte encoding.
I also thought about:
1) internal UTF-16, which gives internal stability to fasm (it avoids the meager UTF-8 variance that spares the US user 100 bytes while adding another 10000 slowing ones in the parser for Arabic or Japanese users, etc.)
2) fixed tables, built in (or pre-preprocessed in some way, as supposed above)
3) this leads purely and only to Unicode source files.
At that stage fasm will be completely multilanguage, without doubt, and not as funny as you say the wasm-fasm is or, if you like, supposedly messy. I think it is clear now, and I won't insist any more in this thread. Only one thing, please: think about the following table and draw your comments/conclusions.
Quote: (table not preserved)
Bye, hopcode

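The one-byte limitation hopcode describes, and the size difference between UTF-8 and UTF-16, can both be checked directly. A small sketch in Python: `fits` is a hypothetical helper, and cp1251 and cp1252 are picked here only as representative Cyrillic and Western European one-byte code pages (the thread does not name specific ones). The byte counts say nothing about parser speed, which is hopcode's own claim.

```python
cyrillic = "\u0436"                    # Cyrillic letter zhe
western = "\u00e4\u00fc\u00f6\u00df"   # a-umlaut, u-umlaut, o-umlaut, sharp s

def fits(text: str, codec: str) -> bool:
    """True if every character of `text` can be encoded with `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Each one-byte code page covers one script but not the other.
print(fits(cyrillic, "cp1251"), fits(western, "cp1251"))  # True False
print(fits(cyrillic, "cp1252"), fits(western, "cp1252"))  # False True
print(fits(cyrillic + western, "utf-16-be"))              # True

# Encoded size of ASCII text versus Arabic-script text.
ascii_src = "mov eax,ecx"
arabic_src = "\u0641\u064a\u0644\u0621\u0643\u0633"
for text in (ascii_src, arabic_src):
    print(len(text.encode("utf-8")), len(text.encode("utf-16-be")))
# 11 22
# 12 12
```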
revolution 20 Mar 2010, 10:39
hopcode: Are you saying the source code file does not change?
e.g. how does the source file look for:
Code:
mov eax,-1
ret
Code:
6D 6F 76 20 65 61 78 2C 2D 31 0D 0A 72 65 74

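The byte listing in revolution's question is the ASCII encoding of the two source lines joined by a CR LF. A short Python sketch reproduces it and, for comparison, shows the start of the same text stored as UTF-16 BE with a BOM; `hexdump` is a hypothetical helper written for this illustration.

```python
source = "mov eax,-1\r\nret"

def hexdump(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

print(hexdump(source.encode("ascii")))
# 6D 6F 76 20 65 61 78 2C 2D 31 0D 0A 72 65 74

utf16 = b"\xfe\xff" + source.encode("utf-16-be")
print(len(utf16), hexdump(utf16[:8]))
# 32 FE FF 00 6D 00 6F 00 76
```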
hopcode 20 Mar 2010, 13:18
revolution wrote: hopcode: Are you saying the source code file does not change?
Exceptionally, I will answer you, because you always have a "good mood" effect on my bad soul. I will simply build it (a little test example, in the near future) without any more explanation here, because I need it, especially if fasm remains ASCII (and I think it will remain ASCII).
OK. Please, s-t-o-p thinking ASCII. Now, to make it simple:
1) save the following as west.asm and compile it as fasm west.asm west.txt (42 bytes)
Code:
;--- west euro encoding UTF-16 BE
db 0FEh,0FFh,\
00,6Dh,\
00,6Fh,\
00,76h,\
00,20h,\
00,65h,\
00,61h,\
00,78h,\
00,20h,\
00,2Ch,\
00,20h,\
00,2Dh,\
00,20h,\
00,31h,\
00,0Dh,\
00,0Ah,\
00,72h,\
00,65h,\
00,74h,\
00,0Dh,\
00,0Ah
2) open it with notepad
3) save the following as arab.asm and compile it as fasm arab.asm arab.txt. The output has the same number of bytes as the west one (42 bytes)
Code:
;--- arab encoding UTF-16 BE
db 0FEh,0FFh,\
06,45h,\
06,48h,\
06,41h,\
00,20h,\
06,25h,\
06,43h,\
06,33h,\
00,20h,\
06,0Ch,\
00,20h,\
00,2Dh,\
00,20h,\
06,61h,\
00,0Dh,\
00,0Ah,\
06,31h,\
06,4Ah,\
06,2Ah,\
00,0Dh,\
00,0Ah
4) open it with notepad
Now, all together: save the following as multicolor.asm (languages are like colors for me) and compile it as fasm multicolor.asm multicolor.txt (84 bytes)
Code:
;--- west euro encoding UTF-16 BE
db 0FEh,0FFh,\
00,6Dh,\
00,6Fh,\
00,76h,\
00,20h,\
00,65h,\
00,61h,\
00,78h,\
00,20h,\
00,2Ch,\
00,20h,\
00,2Dh,\
00,20h,\
00,31h,\
00,0Dh,\
00,0Ah,\
00,72h,\
00,65h,\
00,74h,\
00,0Dh,\
00,0Ah
;--- arab encoding UTF-16 BE
db 0FEh,0FFh,\
06,45h,\
06,48h,\
06,41h,\
00,20h,\
06,25h,\
06,43h,\
06,33h,\
00,20h,\
06,0Ch,\
00,20h,\
00,2Dh,\
00,20h,\
06,61h,\
00,0Dh,\
00,0Ah,\
06,31h,\
06,4Ah,\
06,2Ah,\
00,0Dh,\
00,0Ah
Open multicolor.txt with notepad. Do you see it? I cannot go so far in the explanation at the moment. I will do it simply. I am extending my fasmlab.
fasmlab, IMHO, should not be a toy like most of the other IDEs out there currently (...in small steps it will grow, if God assists me). I hope that Tomasz understands what I mean and will make fasm multilanguage, in this way or in another way I cannot imagine now.
Sorry, Bye, hopcode

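What those `db` listings contain can be checked without an editor by decoding the same byte pairs as UTF-16. The Python sketch below copies the pairs from the west.asm and arab.asm listings in hopcode's post; the names `WEST` and `ARAB` are hypothetical, and how Notepad or any other editor renders the result is not modeled.

```python
# Byte pairs copied from the two listings, BOM (FE FF) first.
WEST = bytes.fromhex(
    "FEFF 006D 006F 0076 0020 0065 0061 0078 0020 002C 0020"
    " 002D 0020 0031 000D 000A 0072 0065 0074 000D 000A"
)
ARAB = bytes.fromhex(
    "FEFF 0645 0648 0641 0020 0625 0643 0633 0020 060C 0020"
    " 002D 0020 0661 000D 000A 0631 064A 062A 000D 000A"
)

# The "utf-16" codec reads the leading BOM, picks the byte order, and drops it.
west_text = WEST.decode("utf-16")
arab_text = ARAB.decode("utf-16")
print(len(WEST), len(ARAB))  # 42 42
print(repr(west_text))       # 'mov eax , - 1\r\nret\r\n'

# Concatenated file: only the first BOM is consumed; the second one stays
# in the text as the character U+FEFF.
multi = (WEST + ARAB).decode("utf-16")
print(len(WEST + ARAB), multi.count("\ufeff"))  # 84 1
```

Each Arabic-script code point sits at the same position as its Latin counterpart, which is why both files have the same length.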
revolution 20 Mar 2010, 13:26
This is what I see in my editor: (attached screenshot not preserved)

revolution 20 Mar 2010, 13:30
It appears you want to make the file multi-encoded with lots of different character sets, using Unicode or UTF-16. Is that correct?
And you want to be able to edit the version in your own language and fasm will automatically alter all the other versions to follow. Is that correct?
BTW: That Arabic is backwards. There should be a reversing character in there.

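On the direction remark: Unicode assigns every character a bidirectional class, and a renderer orders mixed text from those classes; explicit controls such as U+202E RIGHT-TO-LEFT OVERRIDE also exist. Whether that is the "reversing character" revolution has in mind is an assumption. The Python sketch below only prints the classes of a few characters that appear in the examples in this thread.

```python
import unicodedata

samples = {
    "m": "Latin letter",
    "\u0645": "Arabic letter meem",
    "1": "European digit",
    "\u0661": "Arabic-Indic digit one",
}
for ch, label in samples.items():
    print(f"U+{ord(ch):04X} {label}: {unicodedata.bidirectional(ch)}")
# U+006D Latin letter: L
# U+0645 Arabic letter meem: AL
# U+0031 European digit: EN
# U+0661 Arabic-Indic digit one: AN
```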
shoorick 22 Mar 2010, 07:11
As a native Cyrillic user I can say that transliteration is a pain in the ass and will not bring us any benefit. I would agree with translation, preferably where original and translated keywords are accepted alike. We have such systems, where both English and Russian keywords are accepted at the same time; they are mostly scripting languages in business software. It makes sense there, but not for assembler, as it has a different area of application.
BTW, have you asked any Arabic programmer whether it is a good idea? What do you think, will a Polish programmer agree to change mov to mow? Etc., etc. I do know what I'm talking about: see the screenshot of 1C (1S), which we use at work, with part of my code. (attached screenshot not preserved)
Regards!

MHajduk 22 Mar 2010, 18:17
shoorick wrote: how do you think, will polish programmer agree to change mov to mow ?

hopcode 22 Mar 2010, 20:37
revolution wrote: This is what I see in my editor
Look at the code: there are extra spaces I added in. Note that notepad produces its own meaningful error/ambiguity when rendering such text with 2 directions...
I am working on simple incomplete samples at the moment (incomplete because I don't implement my translit idea there; I need a framework to emulate them, time, etc.). Perhaps I will manage to have them tomorrow or in a couple of days, if all goes as expected. But I cannot promise it.
Downloadable PDFs of patents related to "transliteration" are here:
http://www.google.com/patents/about?id=vIcLAAAAEBAJ&dq=transliteration&num=4&client=internal-uds
Lots of ideas about it... and 2 constant factors to consider:
- a 2-byte encoding gives a lot of manipulation possibilities
- UNICODE allows custom code points beyond the BMP (plane 0), in the Private Use Planes, for example U-000F0000 -> U-0010FFFF
To Tomasz, imagine this: something like the "Challenger" that eats (16-bit) code points instead of ASCII. Also, why not then grouped by 2, 3, 4 code points?
- 1) But eating 2, 3, 4 (ad libitum) code points is almost the length of an asm instruction!
- 2) lots of opcoding possibilities in 2, 3, 4 bytes
- 3) also -> automatically multilanguage
- 4) also -> automatically assemblable
- also ... add your comments or fantasy
Cheers, hopcode

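One detail behind the two "constant factors": code points in the Private Use Planes lie above U+FFFF, so in UTF-16 each one occupies two 16-bit units (a surrogate pair), while a 32-bit unit holds any code point directly. A small Python sketch with a hypothetical helper `hex_bytes` shows the encodings of the two ends of the range hopcode cites.

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

first_private = chr(0xF0000)   # start of the range cited in the post
last_in_range = chr(0x10FFFF)  # end of the range cited in the post

print(hex_bytes(first_private.encode("utf-16-be")))  # DB 80 DC 00
print(hex_bytes(first_private.encode("utf-32-be")))  # 00 0F 00 00
print(hex_bytes(last_in_range.encode("utf-16-be")))  # DB FF DF FF
```

A tool that consumes fixed 16-bit units would therefore see two units for each such code point.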
Tomasz Grysztar 22 Mar 2010, 21:01
hopcode wrote: something like the "Challenger" that eats (16bit) code-points, instead of ASCII.
The Challenger operates on 32-bit codepoints, not ASCII.
Other than that, I'm not able to comment on this, because, frankly, I'm not able to grasp the idea you're trying to present here.

hopcode 22 Mar 2010, 21:27
Tomasz Grysztar wrote: The Challenger operates on 32-bit codepoints, not ASCII.
That is a good start.
Quote: ...I'm not able to comment on this...
No problem. After all, it is not so simple to convey my abstract ideas and their direction. I need some more time to draw lines around the thing. I am elaborating and trying to realize something that is not strictly related to fasm: thinking about it primarily in the abstract, as a patent, then realizing it as software/code, etc.
Cheers, hopcode

revolution 22 Mar 2010, 23:45
hopcode: Are you just simply proposing that fasm be a unicode (or UTF-16) based assembler?
What was the purpose of the mixed ASCII/Arabic example you posted? Is it supposed to be two versions of the same code? Or what?
BTW: Not everyone uses plain Notepad. I hope your proposal is not relying upon everyone using just Notepad. And don't forget that Notepad is very different on different versions of Windows. You will have to say which version of Notepad your proposal works with.

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.