Joined: 24 Mar 2012
Posts: 812
Location: Russian Federation, Sochi
ProMiNick 02 Dec 2019, 13:48
Source of the source file (compile it, then compile the output):
format binary as 'ASM'
db $66,$6F,$72,$6D,$61,$74,$20,$62,$69,$6E,$61,$72,$79,$20,$61,$73
db $20,$27,$74,$78,$74,$27,$0D,$0A,$69,$6E,$63,$6C,$75,$64,$65,$20
db $27,$65,$6E,$63,$6F,$64,$69,$6E,$67,$2F,$77,$69,$6E,$31,$32,$35
db $31,$2E,$69,$6E,$63,$27,$0D,$0A,$64,$75,$20,$27,$D4,$F3,$ED,$EA
db $F6,$E8,$E8,$20,$EF,$EE,$EB,$FC,$E7,$EE,$E2,$E0,$F2,$E5,$EB,$FF
db $27,?    

Decoded, it is just a format directive for a text file, an include of the UTF-8 encoding, and du with the Russian for 'user functions'.
When I use encodings other than UTF-8, everything is OK,
but with UTF-8 I get:
Error: value out of range.
dw 0D7C0h+wide shr 10,0DC00h or(wide and 3FFh)    

What I expect to see in the final output (a text file with this content):
$D0, $A4, $D1, $83, $D0, $BD, $D0, $BA, $D1, $86, $D0, $B8, $D0, $B8, ; Функции
$20, ; space
$D0, $BF, $D0, $BE, $D0, $BB, $D1, $8C, ; поль
$D0, $B7, $D0, $BE, $D0, $B2, $D0, $B0, ; зова
$D1, $82, $D0, $B5, $D0, $BB, $D1, $8F ; теля    
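
The expected bytes above can be cross-checked outside fasm. This is only a sketch in Python (not part of the original post): it takes the Windows-1251 bytes of the string from the dump, decodes them, and re-encodes the text as UTF-8. The variable names are made up for illustration.

```python
# Windows-1251 bytes of the quoted string from the db dump above.
cp1251_source = bytes([
    0xD4, 0xF3, 0xED, 0xEA, 0xF6, 0xE8, 0xE8, 0x20,
    0xEF, 0xEE, 0xEB, 0xFC, 0xE7, 0xEE, 0xE2, 0xE0,
    0xF2, 0xE5, 0xEB, 0xFF,
])

# Interpret the bytes as Windows-1251, then encode the text as UTF-8.
text = cp1251_source.decode("cp1251")
expected_utf8 = text.encode("utf-8")

print(text)
print(", ".join(f"${b:02X}" for b in expected_utf8))
```

Each Cyrillic letter becomes two bytes and the space stays one byte, which matches the listing above.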

Last edited by ProMiNick on 03 Dec 2019, 12:55; edited 1 time in total
ProMiNick 03 Dec 2019, 07:16
Every single Cyrillic character causes the error, except these three:
format binary as 'txt'
include 'encoding/utf8.inc'
du 'А'    
format binary as 'txt'
include 'encoding/utf8.inc'
du 'Ё'    
format binary as 'txt'
include 'encoding/utf8.inc'
du 'ё'    

Is it impossible to encode a single Cyrillic character in UTF-8?
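
One observation about these three exceptions, offered as a guess rather than a confirmed explanation: if the source file is saved in Windows-1251, then А, Ё and ё are the only Russian letters whose byte value is $C0 or lower. Assuming the macro treats only bytes above $C0 as the start of a multi-byte sequence (an assumption about utf8.inc, not something stated in this thread), those three would pass through untouched. The Python sketch below checks only the byte values.

```python
# All 33 Russian letters, upper and lower case.
alphabet = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"
letters = alphabet + alphabet.lower()

# Letters whose single Windows-1251 byte is $C0 or lower.
# The $C0 threshold is an assumption about the macro, see the text above.
at_or_below_c0 = sorted(ch for ch in letters if ch.encode("cp1251")[0] <= 0xC0)

for ch in at_or_below_c0:
    print(ch, f"${ch.encode('cp1251')[0]:02X}")
```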
Joined: 16 Jun 2003
Posts: 8361
Location: Kraków, Poland
Tomasz Grysztar 03 Dec 2019, 08:46
The "encoding/utf8.inc" macro is not to convert to UTF-8, it converts from UTF-8. So your source file should be using UTF-8 (and not Windows 1251 as it does), and what "du" produces is always UTF-16.
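
To make the direction of the conversion concrete, here is a small Python sketch (an illustration only, not fasm code). It models the intended pipeline as described above: UTF-8 bytes in the source, UTF-16 words in the output. It assumes the words are stored little-endian, as dw does on x86. It also shows that the Windows-1251 bytes of the same text are not valid UTF-8, so a UTF-8 decoder cannot make sense of them.

```python
text = "Функции пользователя"

# What the source file should contain: the string saved as UTF-8.
utf8_source = text.encode("utf-8")

# What du is described as producing: UTF-16 (little-endian words assumed).
utf16_output = utf8_source.decode("utf-8").encode("utf-16-le")

# The same text saved as Windows-1251 does not decode as UTF-8.
cp1251_source = text.encode("cp1251")
try:
    cp1251_source.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

print(utf16_output[:4].hex(" "))
print("Windows-1251 bytes are valid UTF-8:", decodes_as_utf8)
```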
ProMiNick 03 Dec 2019, 12:54
Thanks, that helps.
macro utf8 [arg] {
 local char,..data,size
        if arg eqtype ''
                virtual at 0
                        ..data:: ; area label that the "load" below reads from
                        db arg
                        size = $
                end virtual
                repeat size
                        load char byte from ..data:%-1
                        if char < $80
                                db char ; ASCII passes through unchanged
                        else
                                ; look up the code point in the __encoding table (defined elsewhere)
                                load char word from __encoding:char*2
                                if char > $7FF
                                        db $E0 + char shr (6*2),$80 + (char shr 6) and $3F,$80 + char and $3F
                                else
                                        db $C0 + (char shr 6) and $3F,$80 + char and $3F
                                end if
                        end if
                end repeat
        else if arg eqtype 0
                if arg > $7FF
                        db $E0 + arg shr (6*2),$80 + (arg shr 6) and $3F,$80 + arg and $3F
                else if arg > $7F
                        db $C0 + (arg shr 6) and $3F,$80 + arg and $3F
                else
                        db arg
                end if
        else ; let the standard directive handle the error
                db arg
        end if }

struc utf8 [args]
 { common label . word
   utf8 args }      
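
For readers who want to check the arithmetic in the macro without assembling anything, the same three cases can be sketched in Python. This is an illustration, not a port of the macro: the function names are invented, the code-page lookup uses Python's cp1251 codec in place of the __encoding table, and only code points below $10000 are handled, as in the macro.

```python
def utf8_bytes(code_point: int) -> bytes:
    """Encode one code point below 0x10000 using the macro's three cases."""
    if code_point > 0x7FF:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 + (code_point >> 12),
                      0x80 + ((code_point >> 6) & 0x3F),
                      0x80 + (code_point & 0x3F)])
    if code_point > 0x7F:
        # Two bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 + ((code_point >> 6) & 0x3F),
                      0x80 + (code_point & 0x3F)])
    # One byte: plain ASCII
    return bytes([code_point])


def cp1251_to_utf8(raw: bytes) -> bytes:
    """Convert Windows-1251 bytes to UTF-8, one byte at a time."""
    out = bytearray()
    for byte in raw:
        if byte < 0x80:
            code_point = byte
        else:
            # Stand-in for the table lookup; raises for the unassigned byte 0x98.
            code_point = ord(bytes([byte]).decode("cp1251"))
        out += utf8_bytes(code_point)
    return bytes(out)


sample = "Функции пользователя"
print(cp1251_to_utf8(sample.encode("cp1251")) == sample.encode("utf-8"))
```

Comparing against Python's own UTF-8 encoder is a quick way to confirm that the shifts and masks are right for the one-, two- and three-byte cases.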

Use the utf8 macro (or struc) only as an encoding layer on top of the standard WIN... encodings for the du directive.

Or at least an appropriate table has to be defined somewhere:
virtual at 0
end virtual     

If anybody is interested in why such a conversion was needed, there is a small example from my work. I created a receipt in fasm to show some developers that the available set of operators is catastrophically small, yet still enough to solve the problem. Creating that receipt manually is possible too, but more masochistic.

