flat assembler
Message board for the users of flat assembler.
DOS > Uppercasing characters
bitshifter 26 Jun 2010, 01:04
This is not the best, but it is very simple:
Code:
toupper: ; DS:SI -> ASCIIZ string
    lodsb                  ; AL = [DS:SI], then advance SI
    test al,al
    jz .finished           ; stop at the terminating zero
    cmp al,'a'
    jb toupper             ; below 'a': leave unchanged
    cmp al,'z'
    ja toupper             ; above 'z': leave unchanged
    sub al,32              ; 'a'..'z' -> 'A'..'Z'
    dec si                 ; step back to the byte just loaded
    mov byte[ds:si],al
    inc si
    jmp toupper
.finished:
    ret
_________________
Coding a 3D game engine with fasm is like trying to eat an elephant, you just have to keep focused and take it one 'byte' at a time.
LocoDelAssembly 26 Jun 2010, 01:36
Let's simplify it further:
Code:
toupper: ; DS:SI -> ASCIIZ string
    lodsb
    test al,al
    jz .finished
    sub al,'a'            ; --+
    cmp al,'z'-'a'        ;   |
    ja toupper            ;   |
                          ;   |
    sub al,'a'-'A' - 'a'  ; <-+
    mov [si-1],al
    jmp toupper
.finished:
    ret
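A note on why one unsigned compare is enough here: after "sub al,'a'" the letters 'a'..'z' land on 0..25, while anything below 'a' wraps around to a large unsigned value, so a single "cmp" covers both bounds. Below is a minimal sketch of just that test, assuming a DOS .COM program assembled with fasm; "is_lower" is a hypothetical helper name, not something from the posts above.

```asm
; Sketch only. is_lower is a hypothetical helper, not code from the thread.
org 100h                        ; assumption: DOS .COM program built with fasm

start:
        mov     al,'q'
        call    is_lower        ; CF=1: 71h-61h = 10h (16), below 26
        mov     al,'@'
        call    is_lower        ; CF=0: 40h-61h wraps around to 0DFh (223)
        mov     al,'{'
        call    is_lower        ; CF=0: 7Bh-61h = 1Ah (26), just past 'z'
        ret                     ; .COM exit: returns to the int 20h at PSP:0

; in:  AL = character
; out: CF set if AL was 'a'..'z', clear otherwise; AL is modified
is_lower:
        sub     al,'a'          ; 'a'..'z' -> 0..25, everything else -> 26..255
        cmp     al,'z'-'a'+1    ; unsigned: CF=1 exactly when AL < 26
        ret
```

The code above in the thread expresses the same test as "cmp al,'z'-'a'" followed by "ja"; the sketch only moves the result into the carry flag so the three cases are easy to compare.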
b1528932 26 Jun 2010, 11:46
Character upper/lower case isn't something you can simply convert to.
You need a character map which includes entries for the characters that have lowercase and uppercase versions (and a field indicating the presence of both of them, one, or none). Another thing is encoding: characters can be stored as ANSI, UTF-16, UTF-8 or by other methods. This requires not only a map, but an API.
revolution 26 Jun 2010, 12:23
b1528932: This is the DOS forum. So things like UTF8, ANSI etc. are pretty much ruled out for most programs. I think it is safe to assume that ASCII will be the character set used in 99.99% of cases.
adroit 26 Jun 2010, 12:54
bitshifter, it is much simpler than mine, and easier to understand too.
LocoDelAssembly,
Code:
    ...
    sub al, 'a'-'A' - 'a'
    ...
b1528932, when I said convert I didn't mean converting ASCII to Unicode.
revolution 26 Jun 2010, 12:59
MeshNix wrote:
... it computes to -65. Does the compiler treat signed integers as unsigned integers, or does it just wrap around?

As you might expect, -1==255, ... ,-128==+128.
adroit 26 Jun 2010, 13:16
This would mean that -65==191, doesn't it?
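To make the arithmetic concrete: a byte holds 8 bits, so -65 and 191 (256 - 65) are both stored as 0BFh. Here is a small sketch, assuming a DOS .COM program assembled with fasm; the label and message names are made up for illustration.

```asm
; Sketch only: start, .done and same_msg are hypothetical names.
org 100h                        ; assumption: DOS .COM program built with fasm

start:
        mov     al,-65          ; assembled as the byte 0BFh
        cmp     al,191          ; 191 = 256-65, also 0BFh
        jne     .done           ; never taken: the bit patterns are identical
        mov     dx,same_msg
        mov     ah,09h          ; DOS: print '$'-terminated string at DS:DX
        int     21h
.done:
        ret                     ; back to DOS

same_msg db '-65 and 191 are the same byte',13,10,'$'
```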
|
revolution 26 Jun 2010, 13:19
MeshNix wrote: This would mean that -65==191, doesn't it?
Picnic 26 Jun 2010, 13:32
This should also work, although it raises LocoDelAssembly's code to 21 bytes.
Code:
    .if ( al > 60h & al < 7bh )
        xor al, 20h ; toggle case
    .endif
adroit 26 Jun 2010, 17:00
revolution, how does this actually work? -65 was treated as if it was unsigned. Does signed or unsigned matter in ASCII?
Picnic wrote:
this should also work, although raises LocoDelAssembly's code to 21 bytes.

Actually it would make the character lowercased. Try:
Code:
    .if ( al > 40h & al < 5Bh )
        xor al, 20h ; toggle case
    .endif
Great! Now I know how to do reverse uppercasing.
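For reference, checking both conditions against the ASCII table: 61h..7Ah is 'a'..'z' and 41h..5Ah is 'A'..'Z'. So, assuming the .if macro compares unsigned, the 60h/7Bh test picks out lowercase letters and the xor turns them into uppercase, while the 40h/5Bh test picks out uppercase letters and turns them into lowercase. The same two checks written as plain instructions, as a sketch for a DOS .COM program assembled with fasm (the routine names are hypothetical):

```asm
; Sketch only: upper_al and lower_al are hypothetical names.
org 100h                        ; assumption: DOS .COM program built with fasm

start:
        mov     al,'m'
        call    upper_al        ; 6Dh -> 4Dh, AL = 'M'
        mov     al,'M'
        call    lower_al        ; 4Dh -> 6Dh, AL = 'm'
        ret                     ; back to DOS

; in/out: AL = character; 'a'..'z' become 'A'..'Z', anything else is unchanged
upper_al:
        cmp     al,60h
        jbe     .skip           ; AL <= 60h: below 'a'
        cmp     al,7Bh
        jae     .skip           ; AL >= 7Bh: above 'z'
        xor     al,20h          ; flip bit 5: 61h..7Ah -> 41h..5Ah
.skip:
        ret

; in/out: AL = character; 'A'..'Z' become 'a'..'z', anything else is unchanged
lower_al:
        cmp     al,40h
        jbe     .skip           ; AL <= 40h: below 'A'
        cmp     al,5Bh
        jae     .skip           ; AL >= 5Bh: above 'Z'
        xor     al,20h          ; flip bit 5: 41h..5Ah -> 61h..7Ah
.skip:
        ret
```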
revolution 26 Jun 2010, 17:28
The CPU doesn't know or care about signed/unsigned, it is all just binary. Only the programmers will know about signs when testing the flags.
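A small illustration of that point, assuming a DOS .COM program assembled with fasm (the labels are made up): the same "cmp" sets the flags in the same way both times, and the choice between an unsigned jump (ja/jb) and a signed one (jg/jl) is where the programmer decides how to read the byte.

```asm
; Sketch only: the label names are hypothetical.
org 100h                        ; assumption: DOS .COM program built with fasm

start:
        mov     al,0BFh         ; 191 if read as unsigned, -65 if read as signed
        cmp     al,1
        ja      .unsigned_above ; taken: 191 > 1 (CF=0 and ZF=0)
        ret                     ; not reached
.unsigned_above:
        cmp     al,1            ; same operands, same flags as before
        jg      .signed_greater ; not taken: -65 < 1 (SF=1, OF=0)
        ret                     ; reached: back to DOS
.signed_greater:
        ret                     ; not reached
```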
|
adroit 26 Jun 2010, 18:02
Oh, I see; it's up to the programmer.
|
edfed 26 Jun 2010, 21:00
Maybe just a 256-byte lookup table would be enough, and better.
For example, to change the case of éàèêîëï, all these chars are in ascii, depending on the charset used of course.
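A table-driven version might look like the sketch below, assuming a DOS .COM program assembled with fasm. The table here only maps 'a'..'z'; entries for accented letters would have to be patched in for whichever code page is in use, and they are left out because their positions differ between code pages. All label names are hypothetical.

```asm
; Sketch only: text and upcase_table are hypothetical names.
org 100h                        ; assumption: DOS .COM program built with fasm

start:
        mov     si,text
        mov     bx,upcase_table
        cld                     ; make lodsb move forward
.next:
        lodsb
        test    al,al
        jz      .print          ; stop at the terminating zero
        xlatb                   ; AL = [DS:BX+AL], the table entry for this char
        mov     [si-1],al
        jmp     .next
.print:
        mov     dx,text
        mov     ah,09h          ; DOS: print '$'-terminated string at DS:DX
        int     21h
        ret                     ; back to DOS

text db 'hello, DOS!',13,10,'$',0

; 256 entries built at assembly time: identity, except 'a'..'z' -> 'A'..'Z'
upcase_table:
        repeat 256
          if %-1 >= 'a' & %-1 <= 'z'
            db %-1-20h
          else
            db %-1
          end if
        end repeat
```

The loop itself never needs to change: supporting another character set only means different table contents.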
DOS386 27 Jun 2010, 11:26
> http://board.flatassembler.net/topic.php?t=9736 search for "SSUPPER"
> to change the case of éàèêîëï, all these chars are in ascii, depending on the charset used of course.
Where ??? This never worked AFAIK
rugxulo 28 Jun 2010, 06:36
revolution wrote:
b1528932: This is the DOS forum. So things like UTF8, ANSI etc. are pretty much ruled out for most programs. I think it is safe to assume that ASCII will be the character set used in 99.99% of cases.

It seems Europeans are more conscientious of this than others. And, IIRC, even Tomasz uses this (int 21h, 652xh) in FASMD: http://www.delorie.com/djgpp/doc/rbinter/id/77/31.html

P.S. I always just used "and dl,0DFh" to uppercase. Toggle is "xor dl, 20h". And, unless I'm remembering wrong, "or dl, 20h" will lowercase.
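Those three masks all act on bit 5 (20h), the only bit that differs between an ASCII uppercase letter and its lowercase form. A short sketch of their effect, assuming a DOS .COM program assembled with fasm; note that, without a range check, the masks change non-letters too.

```asm
; Sketch only: the values in the comments follow from the ASCII codes.
org 100h                        ; assumption: DOS .COM program built with fasm

start:
        mov     dl,'a'          ; 61h
        and     dl,0DFh         ; clear bit 5:  61h -> 41h 'A'
        xor     dl,20h          ; toggle bit 5: 41h -> 61h 'a'
        or      dl,20h          ; set bit 5:    61h stays 61h 'a'
        and     dl,0DFh         ; back to 41h 'A'
        mov     ah,02h          ; DOS: write the character in DL
        int     21h             ; prints A

        mov     dl,'1'          ; 31h, not a letter
        and     dl,0DFh         ; 31h -> 11h: the digit becomes a control code
        ret                     ; back to DOS
```

So the bare masks are fine when the input is known to be a letter; otherwise they need a range check like the ones earlier in the thread.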
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.