flat assembler
Message board for the users of flat assembler.

Need a new unicode macro

pearlz
Joined: 07 Jun 2010
Posts: 55
Location: Viet Nam
Hi all!
In my country there is an input method named Telex.
It works like this:
if I type 'dd' it produces 'đ',
'af' -> 'à'
'as' -> 'á'
'aj' -> 'ạ'
'ee' -> 'ê'
'eej' -> 'ệ'
etc.

I want to write a macro that can convert strings like this:

'Vieejt Nam' -> 'Việt Nam'
'Xin chafo' -> 'Xin chào'
; it means 'Hello' in English

'Casm own casc bajn' -> 'Cám ơn các bạn'
; it means 'Thank you'
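
As a rough model of the requested conversion (written in Python rather than fasm, purely for illustration), a longest-match lookup over a small table is enough for the examples above. The table holds only the sequences listed in this post plus 'ow', which the 'Thank you' example needs. The names TELEX, MAX_LEN and telex are made up for this sketch.

```python
# Hypothetical table: only the sequences quoted in this post, plus "ow".
TELEX = {
    "dd": "\u0111",   # đ
    "af": "\u00e0",   # à
    "as": "\u00e1",   # á
    "aj": "\u1ea1",   # ạ
    "ee": "\u00ea",   # ê
    "eej": "\u1ec7",  # ệ
    "ow": "\u01a1",   # ơ
}
MAX_LEN = max(len(key) for key in TELEX)


def telex(text):
    """Replace Telex key sequences with the characters they stand for."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so that "eej" wins over "ee".
        for n in range(MAX_LEN, 0, -1):
            chunk = text[i:i + n]
            if chunk in TELEX:
                out.append(TELEX[chunk])
                i += len(chunk)
                break
        else:
            # No sequence starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(telex("Vieejt Nam"))
print(telex("Casm own casc bajn"))
```

A real Telex table is much larger (every vowel with every tone key, in both cases), but the matching rule stays the same.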

I found the du macro in fasm/macro/encoding/*.inc.
It contains the encoding macro, but I do not understand how it works.

Can somebody tell me how to write a macro for this input method?
Thanks!

_________________
welcome to VietNam!
Post 17 Oct 2010, 23:01
revolution
When all else fails, read the source
Joined: 24 Aug 2004
Posts: 17287
Location: In your JS exploiting you and your system
pearlz: This is not for Unicode, this is an input method. Usually an editor and/or OS would handle the input methods to create the characters in the code page you use.
Post 18 Oct 2010, 08:25
baldr
Joined: 19 Mar 2008
Posts: 1651
pearlz,

ENCODING\UTF8.INC can be used as a template:
Code:
macro du [arg] {
  local length, current, char, char2, char3
  if arg eqtype ""
    virtual at 0
      db arg
      length = $
    end virtual
    current = 0
    while current<length
      virtual at 0
        db arg
        load char from current
        current = current+1
        if current<length
          if char='a'
            load char2 from current
            current = current+1
            if char2='f'
              char = 0x00E0
            else if char2='j'
              char = 0x1EA1
            else if char2='s'
              char = 0x00E1
            else; not a defined sequence
              current = current-1; roll back
            end if
          else if char = 'd'
            load char2 from current
            current = current+1
            if char2='d'
              char = 0x0111
            else
              current = current-1; see above
            end if
          else if char = 'e'
            load char2 from current
            current = current+1
            if char2='e'
              char = 0x00EA; assume "ee"
              if current<length
                load char3 from current
                current = current+1
                if char3='j'
                  char = 0x1EC7
                else
                  current = current-1
                end if
              end if; current<length
            else
              current = current-1
            end if
          else if char = 'o'
            load char2 from current
            current = current+1
            if char2='w'
              char = 0x01A1
            else
              current = current-1
            end if
          end if
        end if; current<length
      end virtual
      dw char
    end while
  else
    dw arg
  end if
}
;;; Test
dw 0xfeff
du "Vieejt Nam"
dw 10
du "Xin chafo"
dw 10
du "Casm own casc bajn"
dw 10    
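
A note on the test output, sketched in Python as an illustration: dw emits each value as a little-endian 16-bit word, so the leading dw 0xfeff acts as a byte order mark and the result is a UTF-16LE text file. The helper name dw below is only a stand-in for the fasm directive, and the code points are the ones the macro above assigns for "Vieejt Nam".

```python
import struct


def dw(*values):
    """Model fasm's dw: each value becomes one little-endian 16-bit word."""
    return b"".join(struct.pack("<H", value) for value in values)


# "Vieejt Nam" after conversion: V i U+1EC7 t space N a m, then a line feed.
body = dw(ord("V"), ord("i"), 0x1EC7, ord("t"), ord(" "),
          ord("N"), ord("a"), ord("m"), 10)
data = dw(0xFEFF) + body

# The "utf-16" codec reads the byte order mark and picks the byte order from it.
print(data.decode("utf-16"))
```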
Post 18 Oct 2010, 16:46
mindcooler
Joined: 01 Dec 2009
Posts: 423
Location: Västerås, Sweden
Any way of doing it without ifs? This could be useful for Japanese.

Can you implement a DFA engine in macros? :)
Post 18 Oct 2010, 19:07
LocoDelAssembly
Your code has a bug
Joined: 06 May 2005
Posts: 4633
Location: Argentina
mindcooler,
Somewhat off-topic perhaps, because it does not work completely at compile time, but I wanted to share one of MHajduk's best: http://board.flatassembler.net/topic.php?t=5735

Maybe it can give some ideas to make a fully assembly-time automaton.
Post 18 Oct 2010, 21:47
baldr
mindcooler,

Not exactly a DFA; I thought about an LZW encoder with a fixed dictionary. Here's the result:
Code:
format binary as "txt"
macro dictionary [code, prefix, char] {
common
  times 128 db -1, -1, (%-1); ASCIIs are literals
  db 3*(65536-128) dup 0
forward
  store word prefix at code*3
  store char at code*3+2
}

macro du [arg] {
  if arg eqtype ""
    virtual at 0
      db arg
      length = $
    end virtual
    current = 0
    prefix = word -1
    while current<length
      virtual at 0
        db arg
        load char from current
      end virtual
      virtual at 0
        dictionary 0x00E0, 'a', 'f',\
                   0x1EA1, 'a', 'j',\
                   0x00E1, 'a', 's',\
                   0x0111, 'd', 'd',\
                   0x00EA, 'e', 'e',\
                   0x1EC7, 0x00EA, 'j',\; "ee" is a prefix for "eej"
                   0x01A1, 'o', 'w'
        codes = $/3
        index = 0
        while index<codes; find prefix+char in dictionary
          load p word from 3*index
          load c from 3*index+2
          if p = prefix & c = char; found
            prefix = index
            break
          end if
          index = index+1
        end while
      end virtual
      if index=codes; not found
        dw prefix
        prefix = char
      end if
      current = current+1
    end while
    if prefix<>-1; something left (almost always it is)
      dw prefix
    end if
  else
    dw arg
  end if
}
;;; Test
dw 0xfeff
du "Vieejt Nam", 10
du "Xin chafo", 10
du "Casm own casc bajn", 10
du "dd", 10    
Quick'n'dirty, and slooow, but works.
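
The lookup can be modelled outside fasm. The Python sketch below is a simplification under stated assumptions, not a translation of the macro: it keeps the (prefix, char) -> code idea but stores the pairs in a dict instead of a table with 3 bytes per entry, and it treats a single character as its own code instead of storing ASCII literals in the dictionary. PAIRS and encode are names invented for this illustration.

```python
# (pending code, next character) -> new code, mirroring the dictionary entries.
PAIRS = {
    (ord("a"), "f"): 0x00E0,
    (ord("a"), "j"): 0x1EA1,
    (ord("a"), "s"): 0x00E1,
    (ord("d"), "d"): 0x0111,
    (ord("e"), "e"): 0x00EA,
    (0x00EA, "j"): 0x1EC7,   # "ee" (U+00EA) is the prefix for "eej"
    (ord("o"), "w"): 0x01A1,
}


def encode(text):
    """Return the list of 16-bit codes for a Telex-typed string."""
    codes = []
    prefix = None                          # nothing pending yet
    for ch in text:
        if prefix is None:
            prefix = ord(ch)               # a lone character is its own code
        elif (prefix, ch) in PAIRS:
            prefix = PAIRS[(prefix, ch)]   # extend the pending sequence
        else:
            codes.append(prefix)           # flush, then restart from ch
            prefix = ord(ch)
    if prefix is not None:
        codes.append(prefix)               # flush whatever is left at the end
    return codes


print("".join(chr(code) for code in encode("Casm own casc bajn")))
```

Seen this way, each code is a state and each (state, char) pair is a transition, which is why the table behaves much like the automaton asked about earlier in the thread.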
Post 18 Oct 2010, 23:24
mindcooler
Well, at least it is a LUT. Would probably work excellently for a romaji translator.
Post 18 Oct 2010, 23:29
pearlz
Thanks to all!
Special thanks to baldr, it works fine. :D

Post 22 Oct 2010, 03:00
pearlz
Code:
macro Telex [args]{
  local char, current, length, char2, char3
  if args eqtype ''
    virtual at 0
      db args
      length = $
    end virtual
    current = 0
    while current<length
    virtual at 0
      db args
      ;---------------------------------------------------------
      load char from current
      current=current+1
      if char='a'
        if current<length
          load char2 from current
          current=current+1
          if (char2='a')|(char2='A')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;aas
                char=0x1EA5
              else if (char3='f')|(char3='F')
                char=0x1EA7
              else if (char3='x')|(char3='X') ;aax: circumflex + tilde
                char=0x1EAB
              else if (char3='r')|(char3='R') ;aar: circumflex + hook above
                char=0x1EA9
              else if (char3='j')|(char3='J')
                char=0x1EAD
              else
                char=0xE2
                current=current-1
              end if
            else
              char=0x103 ; "aa" at end of string: should be 0xE2, see the discussion later in this thread
            end if
          else if (char2='w')|(char2='W')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;aws
                char=0x1EAF
              else if (char3='f')|(char3='F')
                char=0x1EB1
              else if (char3='x')|(char3='X')
                char=0x1EB5
              else if (char3='r')|(char3='R')
                char=0x1EB3
              else if (char3='j')|(char3='J')
                char=0x1EB7
              else
                char=0x103
                current=current-1
              end if
            else
              char=0x103
            end if
          else if (char2='s')|(char2='S')
            char=0xE1
          else if (char2='f')|(char2='F')
            char=0xE0
          else if (char2='x')|(char2='X')
            char=0xE3
          else if (char2='r')|(char2='R')
            char=0x1EA3
          else if (char2='j')|(char2='J')
            char=0x1EA1
          else
            current=current-1
          end if
        end if
      ;a
      else if char='d'
        if current<length
          load char2 from current
          current=current+1
          if (char2='d')|(char2='D')
            char=0x111
          else
            current=current-1 ; not "dd": roll back so the next character is not skipped
          end if
        end if
      ;d
      else if char='e'
        if current<length
          load char2 from current
          current=current+1
          if (char2='e')|(char2='E')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;ees
                char=0x1EBF
              else if (char3='f')|(char3='F')
                char=0x1EC1
              else if (char3='x')|(char3='X')
                char=0x1EC5
              else if (char3='r')|(char3='R')
                char=0x1EC3
              else if (char3='j')|(char3='J')
                char=0x1EC7
              else
                char=0xEA
                current=current-1
              end if
            else
              char=0xEA
            end if
          else if (char2='s')|(char2='S')
            char=0xE9
          else if (char2='f')|(char2='F')
            char=0xE8
          else if (char2='x')|(char2='X')
            char=0x1EBD
          else if (char2='r')|(char2='R')
            char=0x1EBB
          else if (char2='j')|(char2='J')
            char=0x1EB9
          else
            current=current-1
          end if
        end if
      ;e
      else if char='i'
        if current<length
          load char2 from current
          current=current+1
          if (char2='s')|(char2='S')
            char=0xED
          else if (char2='f')|(char2='F')
            char=0xEC
          else if (char2='x')|(char2='X')
            char=0x129
          else if (char2='r')|(char2='R')
            char=0x1EC9
          else if (char2='j')|(char2='J')
            char=0x1ECB
          else
            current=current-1
          end if
        end if
      ;i
      else if char='o'
        if current<length
          load char2 from current
          current=current+1
          if (char2='o')|(char2='O')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;oos
                char=0x1ED1
              else if (char3='f')|(char3='F')
                char=0x1ED3
              else if (char3='x')|(char3='X')
                char=0x1ED7
              else if (char3='r')|(char3='R')
                char=0x1ED5
              else if (char3='j')|(char3='J')
                char=0x1ED9
              else
                char=0xF4
                current=current-1
              end if
            else
              char=0xF4
            end if
          else if (char2='w')|(char2='W')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;ows
                char=0x1EDB
              else if (char3='f')|(char3='F')
                char=0x1EDD
              else if (char3='x')|(char3='X')
                char=0x1EE1
              else if (char3='r')|(char3='R')
                char=0x1EDF
              else if (char3='j')|(char3='J')
                char=0x1EE3
              else
                char=0x1A1
                current=current-1
              end if
            else
              char=0x1A1
            end if
          else if (char2='s')|(char2='S')
            char=0xF3
          else if (char2='f')|(char2='F')
            char=0xF2
          else if (char2='x')|(char2='X')
            char=0xF5
          else if (char2='r')|(char2='R')
            char=0x1ECF
          else if (char2='j')|(char2='J')
            char=0x1ECD
          else
            current=current-1
          end if
        end if
      ;o
      else if char='u'
        if current<length
          load char2 from current
          current=current+1
          if (char2='w')|(char2='W')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;uws
                char=0x1EE9
              else if (char3='f')|(char3='F')
                char=0x1EEB
              else if (char3='x')|(char3='X')
                char=0x1EEF
              else if (char3='r')|(char3='R')
                char=0x1EED
              else if (char3='j')|(char3='J')
                char=0x1EF1
              else
                char=0x1B0
                current=current-1
              end if
            else
              char=0x1B0
            end if
          else if (char2='s')|(char2='S')
            char=0xFA
          else if (char2='f')|(char2='F')
            char=0xF9
          else if (char2='x')|(char2='X')
            char=0x169
          else if (char2='r')|(char2='R')
            char=0x1EE7
          else if (char2='j')|(char2='J')
            char=0x1EE5
          else
            current=current-1
          end if
        end if
      ;u
      else if char='y'
        if current<length
          load char2 from current
          current=current+1
          if (char2='s')|(char2='S')
            char=0xFD
          else if (char2='f')|(char2='F')
            char=0x1EF3
          else if (char2='x')|(char2='X')
            char=0x1EF9
          else if (char2='r')|(char2='R')
            char=0x1EF7
          else if (char2='j')|(char2='J')
            char=0x1EF5
          else
            current=current-1
          end if
        end if
      ;y
      else if char='A'
        if current<length
          load char2 from current
          current=current+1
          if (char2='a')|(char2='A')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;aas
                char=0x1EA4
              else if (char3='f')|(char3='F')
                char=0x1EA6
              else if (char3='x')|(char3='X')
                char=0x1EAA
              else if (char3='r')|(char3='R')
                char=0x1EA8
              else if (char3='j')|(char3='J')
                char=0x1EAC
              else
                char=0xC2
                current=current-1
              end if
            else
              char=0xC2
            end if
          else if (char2='w')|(char2='W')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;aws
                char=0x1EAE
              else if (char3='f')|(char3='F')
                char=0x1EB0
              else if (char3='x')|(char3='X')
                char=0x1EB4
              else if (char3='r')|(char3='R')
                char=0x1EB2
              else if (char3='j')|(char3='J')
                char=0x1EB6
              else
                char=0x102
                current=current-1
              end if
            else
              char=0x102
            end if
          else if (char2='s')|(char2='S')
            char=0xC1
          else if (char2='f')|(char2='F')
            char=0xC0
          else if (char2='x')|(char2='X')
            char=0xC3
          else if (char2='r')|(char2='R')
            char=0x1EA2
          else if (char2='j')|(char2='J')
            char=0x1EA0
          else
            current=current-1
          end if
        end if
      ;A
      else if char='D'
        if current<length
          load char2 from current
          current=current+1
          if (char2='d')|(char2='D')
            char=0x110
          else
            current=current-1 ; not "DD": roll back so the next character is not skipped
          end if
        end if
      ;D
      else if char='E'
        if current<length
          load char2 from current
          current=current+1
          if (char2='e')|(char2='E')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;ees
                char=0x1EBE
              else if (char3='f')|(char3='F')
                char=0x1EC0
              else if (char3='x')|(char3='X')
                char=0x1EC4
              else if (char3='r')|(char3='R')
                char=0x1EC2
              else if (char3='j')|(char3='J')
                char=0x1EC6
              else
                char=0xCA
                current=current-1
              end if
            else
              char=0xCA
            end if
          else if (char2='s')|(char2='S')
            char=0xC9
          else if (char2='f')|(char2='F')
            char=0xC8
          else if (char2='x')|(char2='X')
            char=0x1EBC
          else if (char2='r')|(char2='R')
            char=0x1EBA
          else if (char2='j')|(char2='J')
            char=0x1EB8
          else
            current=current-1
          end if
        end if
      ;E
      else if char='I'
        if current<length
          load char2 from current
          current=current+1
          if (char2='s')|(char2='S')
            char=0xCD
          else if (char2='f')|(char2='F')
            char=0xCC
          else if (char2='x')|(char2='X')
            char=0x128
          else if (char2='r')|(char2='R')
            char=0x1EC8
          else if (char2='j')|(char2='J')
            char=0x1ECA
          else
            current=current-1
          end if
        end if
      ;I
      else if char='O'
        if current<length
          load char2 from current
          current=current+1
          if (char2='o')|(char2='O')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;oos
                char=0x1ED0
              else if (char3='f')|(char3='F')
                char=0x1ED2
              else if (char3='x')|(char3='X')
                char=0x1ED6
              else if (char3='r')|(char3='R')
                char=0x1ED4
              else if (char3='j')|(char3='J')
                char=0x1ED8
              else
                char=0xD4
                current=current-1
              end if
            else
              char=0xD4
            end if
          else if (char2='w')|(char2='W')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;ows
                char=0x1EDA
              else if (char3='f')|(char3='F')
                char=0x1EDC
              else if (char3='x')|(char3='X')
                char=0x1EE0
              else if (char3='r')|(char3='R')
                char=0x1EDE
              else if (char3='j')|(char3='J')
                char=0x1EE2
              else
                char=0x1A0
                current=current-1
              end if
            else
              char=0x1A0
            end if
          else if (char2='s')|(char2='S')
            char=0xD3
          else if (char2='f')|(char2='F')
            char=0xD2
          else if (char2='x')|(char2='X')
            char=0xD5
          else if (char2='r')|(char2='R')
            char=0x1ECE
          else if (char2='j')|(char2='J')
            char=0x1ECC
          else
            current=current-1
          end if
        end if
      ;O
      else if char='U'
        if current<length
          load char2 from current
          current=current+1
          if (char2='w')|(char2='W')
            if current<length
              load char3 from current
              current=current+1
              if (char3='s')|(char3='S') ;Uws
                char=0x1EE8
              else if (char3='f')|(char3='F')
                char=0x1EEA
              else if (char3='x')|(char3='X')
                char=0x1EEE
              else if (char3='r')|(char3='R')
                char=0x1EEC
              else if (char3='j')|(char3='J')
                char=0x1EF0
              else
                char=0x1AF
                current=current-1
              end if
            else
              char=0x1AF
            end if
          else if (char2='s')|(char2='S')
            char=0xDA
          else if (char2='f')|(char2='F')
            char=0xD9
          else if (char2='x')|(char2='X')
            char=0x168
          else if (char2='r')|(char2='R')
            char=0x1EE6
          else if (char2='j')|(char2='J')
            char=0x1EE4
          else
            current=current-1
          end if
        end if
      ;U
      else if char='Y'
        if current<length
          load char2 from current
          current=current+1
          if (char2='s')|(char2='S')
            char=0xDD
          else if (char2='f')|(char2='F')
            char=0x1EF2
          else if (char2='x')|(char2='X')
            char=0x1EF8
          else if (char2='r')|(char2='R')
            char=0x1EF6
          else if (char2='j')|(char2='J')
            char=0x1EF4
          else
            current=current-1
          end if
        end if
      ;Y
      end if
      ;--------------------------------------------------------------
    end virtual
    dw  char
    end while

  else
    dw args
  end if
}
    

Hey baldr!
Here is my full macro for Telex, my country's input method.
Thanks again, baldr. :D
Post 22 Oct 2010, 09:21
pearlz
My first macro:
Telex to Unicode for Viet Nam!


Attachment: Telex.inc (11.63 KB)
Description: Telex macro!


Post 22 Oct 2010, 09:34
baldr
pearlz,

The condition char or 0x20 = 'a' is equivalent to char = 'A' | char = 'a'. Moreover, capital and small letters are often adjacent code points that differ only in the lowest bit, so you can process them together: char = 0x1EA4+char shr 5 and 1 for the 'Ấ'/'ấ' pair (the logic is simple: char is either 'A' or 'a', and char shr 5 and 1 evaluates to 0 if char = 'A', 1 if char = 'a'). Some pairs are further apart, which is not a big problem: char = 0xC2 or (char and 0x20) for 'Â'/'â'.
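
To check these expressions outside the assembler, here is a small Python sketch (the function names are made up for illustration). Python's operator precedence differs from fasm's, so the parentheses are written out explicitly.

```python
def is_letter_a(char):
    # fasm: char or 0x20 = 'a'   (setting bit 5 folds 'A' onto 'a')
    return (char | 0x20) == ord("a")


def a_circumflex_acute(char):
    # fasm: 0x1EA4 + char shr 5 and 1   (U+1EA4 for 'A', U+1EA5 for 'a')
    return 0x1EA4 + ((char >> 5) & 1)


def a_circumflex(char):
    # fasm: 0xC2 or (char and 0x20)     (U+00C2 for 'A', U+00E2 for 'a')
    return 0xC2 | (char & 0x20)


for letter in "Aa":
    code = ord(letter)
    print(letter, is_letter_a(code),
          hex(a_circumflex_acute(code)), hex(a_circumflex(code)))
```

The last two helpers assume char has already been checked to be 'A' or 'a'; for any other input the result is meaningless.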

Have I understood your source right? It appears that "aa" at the end of string means 'ă' ('a' breve, U+0103), but inside string (if not followed by one of "fjrsx") it means 'â' ('a' circumflex, U+00E2).
Post 22 Oct 2010, 14:02
pearlz
I also intended to do that, but it is too obscure.
A first macro needs to be understandable. Thanks!
[From Google Translate! My English is too bad.]
Post 22 Oct 2010, 14:12
baldr
pearlz,

I meant this:
Code:
              else
                char=0xE2
                current=current-1
              end if
            else
              char=0x103
            end if    
Probably just a copy-paste error (because "aw" produces 0x0103 too).
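
The intended handling of "aa" can be sketched in Python (an illustration only: AA_TONES and decode_aa are hypothetical names, and the code points are the standard Unicode ones for a-circumflex with each tone mark). Both the "no tone key follows" case and the "end of string" case should produce U+00E2.

```python
# Tone key typed after "aa" -> code point of a-circumflex with that tone.
AA_TONES = {
    "s": 0x1EA5,  # acute
    "f": 0x1EA7,  # grave
    "r": 0x1EA9,  # hook above
    "x": 0x1EAB,  # tilde
    "j": 0x1EAD,  # dot below
}


def decode_aa(text, pos):
    """text[pos:pos+2] is "aa"; return (code point, next position)."""
    nxt = pos + 2
    if nxt < len(text) and text[nxt].lower() in AA_TONES:
        return AA_TONES[text[nxt].lower()], nxt + 1
    # No tone key follows, or the string ends here: plain a-circumflex.
    return 0x00E2, nxt


print(decode_aa("aa", 0), decode_aa("aab", 0), decode_aa("aas", 0))
```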
Post 22 Oct 2010, 14:45
pearlz
Yes, it's a copy-paste problem: if "aa" is at the end of the string, it returns the wrong value. :D
Post 23 Oct 2010, 00:11

Copyright © 1999-2020, Tomasz Grysztar.