flat assembler
Message board for the users of flat assembler.

[fasmarm] UTF-8 encoding in Android

Picnic



Joined: 05 May 2007
Posts: 1439
Location: Piraeus, Greece
Picnic 30 Jan 2016, 17:13
Hi,

This program correctly prints: Hello, Здравей, Γεια σου, Cześć, Привет

How come? I declared it as a byte string (db).

Thank you.

Quote:

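format ELF executable
segment readable writeable executable

entry $
    adr r1, s           ; r1 = address of the string
    mov r2, s.len       ; r2 = length in bytes, not characters
    mov r0, 1           ; r0 = file descriptor 1 (stdout)
    mov r7, 4           ; sys_write
    swi 0

    mov r7, 1           ; sys_exit
    swi 0

align 4
s db 'Hello, Здравей, Γεια σου, Cześć, Привет',10,0
.len = $-s              ; s.len, counts the line feed and the trailing 0 as well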
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20754
Location: In your JS exploiting you and your system
revolution 30 Jan 2016, 19:22
UTF8 is a string of bytes. That is exactly what you should be declaring.
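A quick way to see this is to look at the bytes behind the greeting. The sketch below uses Python purely as an inspection tool (it is not part of fasmarm) and assumes the source file was saved as UTF-8, so that db stores the same bytes the editor wrote. It shows why the length passed to sys_write has to be a byte count rather than a character count.

```python
# The greeting from the program above, as a string of Unicode code points.
text = "Hello, Здравей, Γεια σου, Cześć, Привет\n"

# UTF-8 encodes each code point as 1 to 4 bytes; ASCII characters stay 1 byte.
data = text.encode("utf-8")

chars = len(text)    # number of code points
nbytes = len(data)   # number of bytes a UTF-8 source file holds for this text

print("characters:", chars)
print("bytes:     ", nbytes)
```

The byte count is larger than the character count because every Cyrillic, Greek and accented Latin letter here takes more than one byte, which is why `$-s` (a byte difference) is the right length to use.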
Picnic 30 Jan 2016, 20:10
Oh, I thought it had to be a du string.
revolution 30 Jan 2016, 20:53
DU is for 16-bit Unicode strings (UTF-16).
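To make the db/du difference concrete, here is a small comparison of the two layouts for one word from the greeting. It is only a sketch in Python, assuming that du data is meant to hold little-endian 16-bit units (UTF-16) while db data holds the UTF-8 byte sequence as saved in the source file.

```python
word = "Cześć"

# Byte-oriented form: what a UTF-8 terminal expects to receive.
utf8 = word.encode("utf-8")

# 16-bit form: every character becomes one little-endian word, no BOM.
utf16 = word.encode("utf-16-le")

print("UTF-8 :", utf8.hex(" "))
print("UTF-16:", utf16.hex(" "))
```

In the 16-bit form every ASCII letter is followed by a zero byte, so a terminal that expects UTF-8 would most likely not render it as intended.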
revolution 30 Jan 2016, 22:29
We can use UTF8 in comments, label names, macro names, EQUs and strings.
Code:
    
This will compile and run fine. However, Windows does not use UTF8 by default, so it will mangle the rendered text by treating it as ANSI/ASCII in whatever code page is active.

You can use code page 65001 in the Windows console (CHCP 65001) with the Consolas font and UTF8 works well there. But at least one utility fails in this code page: "find", it just seems to hang.

Edit: I notice that some of the characters are not properly displayed by the browser/board. Use "quote" to get the original source text.
Picnic 02 Feb 2016, 06:45
I see, thanks for the clarification, revolution.