flat assembler
Message board for the users of flat assembler.
Skip BOM in sources
Jin X 10 Feb 2024, 15:45
Hello Tomasz.
Please let fasm 1 skip the Unicode BOM signature in sources. I often use Unicode in comments and would like to add BOM signatures.
revolution 11 Feb 2024, 06:55
You can also combine the colon with the first instruction/directive.
Code:
:format elf executable
mov eax, 1
int 0x80

Code:
~ hd BOM.asm
00000000  ef bb bf 3a 66 6f 72 6d  61 74 20 65 6c 66 20 65  |...:format elf e|
00000010  78 65 63 75 74 61 62 6c  65 0a 6d 6f 76 20 65 61  |xecutable.mov ea|
00000020  78 2c 20 31 0a 69 6e 74  20 30 78 38 30           |x, 1.int 0x80|
0000002d
~ fasm BOM.asm && ./BOM
flat assembler version 1.73.31 (16384 kilobytes memory)
1 passes, 91 bytes.
~
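
To reproduce a test file like this without depending on an editor's BOM setting, the bytes can be written out directly. The following is a minimal Python sketch and not part of the original post. The helper name `build_bom_source` is hypothetical, and the file name BOM.asm is taken from the listing. Judging by the hex dump, the colon directly follows the three BOM bytes, so the BOM apparently ends up as part of a label name rather than as stray characters.

```python
# UTF-8 byte order mark: the three bytes EF BB BF.
UTF8_BOM = b"\xef\xbb\xbf"


def build_bom_source() -> bytes:
    """Return the bytes of the test file: BOM, colon, then the code.

    Hypothetical helper; the source text is copied from the listing.
    """
    source = ":format elf executable\nmov eax, 1\nint 0x80"
    return UTF8_BOM + source.encode("ascii")


if __name__ == "__main__":
    data = build_bom_source()
    # Write in binary mode so no newline or encoding translation happens.
    with open("BOM.asm", "wb") as f:
        f.write(data)
    print(len(data))  # 45 bytes, i.e. 0x2d, the final offset in the hex dump
```

Writing in binary mode keeps the file byte-for-byte identical to the dump, which matters here because the trick depends on the colon immediately following the BOM.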
macomics 11 Feb 2024, 11:11
And then what about this?
main.asm
Code:
: ; BOM
format ELF64 executable 3
segment executable
entry $
call secondProc
mov eax, 60
xor dil, dil
syscall
include "second.asm"

second.asm
Code:
: ; BOM
secondProc:
mov edx, .length
lea rsi, [.hello]
push 1
pop rdi
mov eax, edi
syscall
.hello db 'Hello world!'
.length = $ - .hello

Code:
$ fasm -m 102400 main.asm
flat assembler version 1.73.32 (102400 kilobytes memory, x64)
second.asm [1]:
: ; BOM
processed: :
error: symbol already defined.

You can fix it like this:

Code:
=0 ; BOM
format ELF64 executable 3
segment executable
entry $
call secondProc
mov eax, 60
xor dil, dil
syscall
include "second.asm"

Code:
=0 ; BOM
secondProc:
mov edx, .length
lea rsi, [.hello]
push 1
pop rdi
mov eax, edi
syscall
.hello db 'Hello world!'
.length = $ - .hello

Code:
$ fasm -m 102400 main.asm
flat assembler version 1.73.32 (102400 kilobytes memory, x64)
2 passes, 166 bytes.

But in the absence of a BOM, we get this:

Code:
$ fasm -m 102400 main.asm
flat assembler version 1.73.32 (102400 kilobytes memory, x64)
main.asm [1]:
=0 ; BOM
processed: =0
error: illegal instruction.
revolution 11 Feb 2024, 11:24
The fix for BOM and BOM-less sources?
Code:
BOM=0 ; works whether invisible BOM is present or not
;...

Last edited by revolution on 11 Feb 2024, 17:09; edited 1 time in total
Furs 11 Feb 2024, 16:41
revolution wrote: The fix for BOM and BOM-less sources?
revolution 11 Feb 2024, 16:54
Furs wrote: Some text editors don't display the file as UTF-8 without the BOM and assume it's ASCII instead. Which, to be honest, is a sane default because ASCII files definitely don't have any BOM, so it's the most backwards-compatible solution.
The editor I use the most detects UTF-8 vs ASCII vs ISO-8859 perfectly well without any BOM. It isn't hard; it only needs a small amount of logic. Requiring a BOM would be worse. No, scratch that, it is worse. Very few apps add a BOM for UTF-8 (because it isn't needed), so it would be an awful experience for users to have to figure out manually what they are looking at.
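
As a rough illustration of the "small amount of logic" mentioned here, the following Python sketch guesses an encoding without requiring a BOM. It is an assumption about how such detection could work, not the algorithm of any particular editor, and the function name `guess_encoding` is hypothetical.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Guess a text encoding for `data` without requiring a BOM.

    Hypothetical helper for illustration; real editors use more elaborate rules.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"      # explicit UTF-8 signature present
    try:
        data.decode("ascii")
        return "ascii"          # only bytes in the range 00..7F
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")    # strict decoding rejects malformed sequences
        return "utf-8"
    except UnicodeDecodeError:
        return "iso-8859-1"     # fallback: every byte value decodes as Latin-1
```

The order matters: pure ASCII is also valid UTF-8, so it is tested first, and ISO-8859-1 is only a fallback because any byte sequence decodes in it. A guess like this can still be wrong, for example when short ISO-8859 text happens to form valid UTF-8 sequences.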
Jin X 12 Feb 2024, 10:24
It works, but this is a crutch solution.
I think it's quite easy to add BOM support to the compiler.
revolution 12 Feb 2024, 10:53
Jin X wrote: I think it's quite easy to add BOM support to compiler.
Furs 12 Feb 2024, 16:46
revolution wrote: Sure, some editors that are annoying.
Last edited by Furs on 12 Feb 2024, 16:46; edited 1 time in total
Jin X 12 Feb 2024, 16:46
Where to post, here?
revolution 12 Feb 2024, 17:14
Furs wrote: Heuristics are never perfect.
Jin X wrote: Where to post, here?
Jin X 18 Feb 2024, 17:12
The BOM checker is done.
I made two versions: a normal one and an extended one (with support for extra BOMs) in PREPROCE.EXT.INC. All my insertions are marked "Jin X". The main code from PREPROCE.INC:

Code:
mov eax,[esi]
cmp ax,0FEFFh ; UTF-16 (LE) / UTF-32 (LE)
je unsuppoted_bom
cmp ax,0FFFEh ; UTF-16 (BE)
je unsuppoted_bom
cmp eax,0FFFE0000h ; UTF-32 (BE)
je unsuppoted_bom
cmp eax,3ABFBBEFh ; UTF-8 + colon char
je bom_no_skip ; don't skip if colon trick is used (for backward compatibility)
and eax,00FFFFFFh
cmp eax,00BFBBEFh ; UTF-8
jne bom_no_skip
add esi,3 ; skip BOM
bom_no_skip:
mov ebx,esi ; moved down by Jin X
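
For readers who do not read x86 assembly, the checks in this listing can be mirrored in Python roughly as follows. This is an illustrative sketch, not code from the patch. The function name `bom_skip_length` is hypothetical, and raising an error for the UTF-16/UTF-32 cases is an assumption, since the listing does not show what happens at the `unsuppoted_bom` label. The byte strings are the little-endian memory layout of the constants compared against `ax` and `eax`.

```python
def bom_skip_length(source: bytes) -> int:
    """Return how many leading bytes to skip, mirroring the checks in the listing.

    Raises ValueError where the assembly jumps to its unsupported-BOM label
    (assumed here to be an error path).
    """
    head = source[:4]
    if head[:2] == b"\xff\xfe":         # UTF-16 (LE) / UTF-32 (LE): ax = 0FEFFh
        raise ValueError("unsupported BOM")
    if head[:2] == b"\xfe\xff":         # UTF-16 (BE): ax = 0FFFEh
        raise ValueError("unsupported BOM")
    if head == b"\x00\x00\xfe\xff":     # UTF-32 (BE): eax = 0FFFE0000h
        raise ValueError("unsupported BOM")
    if head == b"\xef\xbb\xbf:":        # UTF-8 BOM + colon: keep it for the colon trick
        return 0
    if head[:3] == b"\xef\xbb\xbf":     # plain UTF-8 BOM: skip its three bytes
        return 3
    return 0                            # no BOM at all
```

For example, `bom_skip_length(b"\xef\xbb\xbfformat ELF64")` returns 3, while a source that starts with the BOM followed by a colon returns 0 so that existing files relying on the colon trick still assemble as before.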
macomics 18 Feb 2024, 17:34
This trick has already been discussed. It doesn't work for BOM in multiple files.
Code:
cmp byte [esi+ecx],':' ; BOM + colon trick
je bom_no_skip ; don't skip for backward compatibility
add esi,ecx ; skip BOM
bom_no_skip:
Jin X 18 Feb 2024, 18:08
Ok, fixed and checked (for both versions).
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.