flat assembler
Message board for the users of flat assembler.

String to Integer and vice versa
Tyler

I'm trying to write a string-to-integer converter (for strings consisting of the digits 0-9). The only easy way I could think of was to get the string length and then, given say "1587", do something like:
1*10^3
5*10^2...
I know how to do everything this requires except the "^x" part. Also, if anyone knows how to convert multi-digit numbers to strings, that would be greatly appreciated too.
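
For reference, the place-value idea can be done without any power function by scanning the string right to left and keeping a running multiplier. The following is only a minimal C sketch of that idea; the function name is made up for illustration, and it assumes the value fits in a long.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (the name is invented for this sketch).
   Returns the value, or -1 if a non-digit is found.
   No overflow checking: assumes the value fits in a long. */
long place_value_to_long(const char *s)
{
    unsigned long total = 0;
    unsigned long place = 1;          /* 10^0, then 10^1, 10^2, ... */
    size_t i = strlen(s);

    while (i > 0) {
        char c = s[--i];              /* walk from the last digit to the first */
        if (c < '0' || c > '9')
            return -1;
        total += (unsigned long)(c - '0') * place;
        place *= 10;                  /* next power of ten, no "^x" needed */
    }
    return (long)total;
}
```

For "1587" this adds 7*1, 8*10, 5*100 and 1*1000 in that order.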
04 Dec 2009, 05:13
kohlrak

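To convert from a number, divide by 10 and add '0' (yes, the character) to the remainder, repeating until the quotient is zero; the digits come out last-first, so you'll have to store the characters backwards. To convert from a string, multiply the running total by 10, take a character, subtract '0', and add the result to the total, then repeat until the end of the string.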
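
A minimal C sketch of both directions, just to make the steps concrete. The function names and the buffer convention are assumptions made for this example, and overflow is not checked.

```c
#include <stddef.h>

/* String -> integer: total = total * 10 + digit, stopping at the first non-digit. */
unsigned str_to_uint(const char *s)
{
    unsigned total = 0;
    while (*s >= '0' && *s <= '9') {
        total = total * 10 + (unsigned)(*s - '0');
        s++;
    }
    return total;
}

/* Integer -> string: the remainders come out last digit first, so they are
   collected in a temporary buffer and then copied out in reverse.
   buf must have room for every digit plus the terminating 0. Returns buf. */
char *uint_to_str(unsigned n, char *buf)
{
    char tmp[3 * sizeof n + 1];       /* generous upper bound on the digit count */
    size_t len = 0;
    size_t i = 0;

    do {
        tmp[len++] = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);

    while (len > 0)
        buf[i++] = tmp[--len];
    buf[i] = '\0';
    return buf;
}
```

The do/while loop makes sure that 0 still produces the single digit "0".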
04 Dec 2009, 05:22
Tyler

Thanks, I see why that works (the ASCII codes for '0'-'9' are in order). My original function uses that principle; I was just being stupid and making it harder than it had to be.
Anyway, thanks for answering my noob-ish question.
04 Dec 2009, 05:31
Borsuc

ASCII to integer:
Code:
```
;
; REQUIREMENTS before entering this code:
; esi points at string (first char of string already loaded in 'al', so it points to second char)
; also, eax is zeroed except for 'al'...
;

;
; Convert the ANSI decimal chars to integers
;
        xor al, '0'                   ; Translates '0'..'9' to 0..9
        xor edx, edx                  ; prepare edx to store the converted number
        cmp al, 10                    ; is it outside this range? (chars smaller than '0' will get negative values)
        jae .not_number               ; yeah, so it's not a digit (negative numbers are BIG when doing UNSIGNED comparisons)

@@:
        lea edx, dword [edx+4*edx]    ; edx*=5
        lea edx, dword [eax+2*edx]    ; edx=edx*2+eax
        ; Therefore we computed  "edx*10 + eax" (this is for decimal ANSI to integer conversion)
        lodsb
        xor al, '0'
        cmp al, 10
        jb @b                         ; more digits, so convert some more

;
; at this point, edx contains the number, and al a non-digit character (that ended the number)
;
```
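
For readers who want to check the arithmetic, here is a rough C model of the loop above. It is only a sketch: the function name is invented, `value` stands in for edx and `d` for al, and unlike the assembly it simply returns 0 when the first character is not a digit instead of jumping to .not_number.

```c
#include <stdint.h>

/* Hypothetical C model of the xor/lea loop (not the original routine). */
uint32_t parse_digits_xor(const char *s)
{
    uint32_t value = 0;

    for (;;) {
        uint8_t d = (uint8_t)((uint8_t)*s++ ^ '0');  /* '0'..'9' -> 0..9 */
        if (d >= 10)                                 /* anything else ends the number */
            break;
        value = value + 4 * value;                   /* lea edx, [edx+4*edx] : value *= 5 */
        value = d + 2 * value;                       /* lea edx, [eax+2*edx] : value = value*2 + d */
    }
    return value;
}
```

The two steps together compute value*10 + d, which is the same accumulate step as a plain multiply and add.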

_________________
Previously known as The_Grey_Beast
04 Dec 2009, 17:26
LocoDelAssembly

Quote:

(chars smaller than '0' will get negative values)

Since you used XOR instead of SUB that is not going to happen. The unsigned comparison is still important though.
04 Dec 2009, 18:21
edfed

another str2int routine.
Code:
```
str2num:
;esi = signed string, formatted as below:
;   db '        900098     ',0
;   there can be many spaces ' ' before the number, and it ends with a space ' ' or 0.
;ebx = returned signed 32-bit value (eax is used as scratch)

@@:
mov     al,[esi]
inc     esi
cmp     al,' '
je      @b
@@:
mov     al,[esi]
inc     esi
cmp     al,' '
je @f
cmp     al,'.'
je @f
cmp     al,','
je @f
cmp     al,'!'
je @f
cmp     al,0
jne     @b
@@:
dec     esi
push    esi
dec     esi
xor     ebx,ebx
mov     edx,1
@@:
movzx   eax,byte[esi]
dec     esi
cmp     al,'-'
je      .neg
cmp     al,' '
je      .end
cmp     al,','
je      .end
cmp     al,'.'
je      .end
sub     al,'0'
jl      @f
cmp     al,9
jg      @f
imul    eax,edx         ; digit * current place value
add     ebx,eax         ; accumulate into the result
imul    edx,10          ; next place value
jmp     @b
@@:
pop     esi
.!?:
cmp byte[esi],'!'
jne @f
;        inc esi
cmp ebx,0
jle .null
mov eax,1
.!:
imul eax,ebx
dec ebx
jne .!
mov ebx,eax
@@:
stc
ret
.neg:
mov     al,[esi]
cmp     al,' '
jne     @b
neg     ebx
.end:
pop     esi
stc
ret
.null:
xor ebx,ebx
clc
ret
```

This can be very interesting for formula computations.

Strings like:

db ' 21123 * 32143 + 4343 ',0 can be interpreted easily with this little algorithm:

Code:
```
compute:
        mov [result],0
        mov esi,string
@@:
        call str2num
        cmp byte[esi],0
        je .end
        call str2operator
        cmp byte[esi],0
        jne @b
.end:
        ret
```

But it needs the str2operator function to work, and some adjustments.
04 Dec 2009, 18:31
Borsuc

LocoDelAssembly wrote:
Since you used XOR instead of SUB that is not going to happen. The unsigned comparison is still important though.
They will because of 'cmp'... also xor works in ASCII (due to how digits and '0' are ordered, having the unique bits set), but yeah, I could have used sub instead.

04 Dec 2009, 22:18
LocoDelAssembly

Perhaps I didn't understand the context of the comment but what I saw is that, for instance, the NULL char will be transformed to '0' which is well above 10 but below 128. And no char will get its sign flipped because of that XOR (contrary to what SUB could do).

I don't mean your code isn't correct, I was just talking about the comment I quoted.
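
The difference between the two operations can be checked with a small C sketch (helper names are made up for the example; uint8_t plays the role of al). XOR with '0' (0x30) never touches bit 7, while SUB turns every character below '0' into a value with bit 7 set. Either way, a single unsigned compare against 10 accepts exactly '0'..'9'.

```c
#include <stdint.h>

/* Hypothetical helpers modelling "xor al, '0'" and "sub al, '0'". */
uint8_t digit_via_xor(uint8_t c) { return (uint8_t)(c ^ '0'); }
uint8_t digit_via_sub(uint8_t c) { return (uint8_t)(c - '0'); }

/* The unsigned compare "cmp al, 10 / jb" on each result. */
int is_digit_xor(uint8_t c) { return digit_via_xor(c) < 10; }
int is_digit_sub(uint8_t c) { return digit_via_sub(c) < 10; }
```

For the NULL char, the XOR version yields 0x30 (48, i.e. '0'), and the SUB version yields 0xD0, which is negative when read as a signed byte.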
04 Dec 2009, 22:36
kohlrak

Borsuc wrote:
ASCII to integer: [code snipped, see Borsuc's post above]

I just woke up and this is the first thing I see... To think I wrote a routine like this just before going to bed (for AFS), and it doesn't even compare to this.
04 Dec 2009, 22:56
Borsuc

LocoDelAssembly wrote:
Perhaps I didn't understand the context of the comment but what I saw is that, for instance, the NULL char will be transformed to '0' which is well above 10 but below 128. And no char will get its sign flipped because of that XOR (contrary to what SUB could do).

I don't mean your code isn't correct, I was just talking about the comment I quoted.
Oh I see the confusion. Yes sub would have been more clear, but '0' is a much higher ASCII number than 9 (max integer), and '0' is "aligned" on specific bits (so it works).

sub is more general-purpose and works in all cases though (unlike xor), I agree. The xor-ASCII tricks are just a snippet I saw a long time ago (not only for string-to-int), that's why.

Here's a tip for those who don't know:

General-purpose bounds check (i.e. checking whether a number is between x and y, INCLUDING x and y):

Code:
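```
sub reg, x          ; reg = reg - x; values below x wrap around to large unsigned numbers
cmp reg, y-x+1      ; y-x+1 = number of values in the range [x, y]
jb .true            ; unsigned "below": taken only if x <= original reg <= y
```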
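
The same trick written as a C sketch, for anyone who wants to test it outside an assembler. The function name is made up, and it assumes lo <= hi and that hi - lo + 1 does not wrap around (so the range is not the whole 32-bit space).

```c
#include <stdint.h>

/* Hypothetical helper: 1 if lo <= v <= hi, using one unsigned comparison. */
int in_range_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
    /* v - lo wraps to a huge value when v < lo, so one "below" test covers both ends */
    return (uint32_t)(v - lo) < (uint32_t)(hi - lo + 1);
}
```

With lo = '0' and hi = '9' this is the digit test from earlier in the thread: in_range_u32(c, '0', '9').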

04 Dec 2009, 23:57
edfed

Code:
```
sub reg,x ; instead of defining x & y, why not define x & size?
cmp reg,y-x+1
jb true
```

can be improved like this:
Code:
```
base dd ? ;vector approach
size dd ? ;module & argument
sub reg,base
jl @f
sub reg,size
jge @f
true:
.
@@:
```
05 Dec 2009, 02:30
LocoDelAssembly

You are using two conditional jumps, and the code was supposed to avoid that. Also, Borsuc's "y-x+1" already plays the "size" role, and "x" alone plays the "base" role.

Your code has the advantage of being able to define the boundaries at run-time, but it should be something more like this:
Code:
```
base dd ?
size dd ?
.
.
; More code
.
.

sub reg, [base]
cmp reg, [size]
jae out_of_range    ; unsigned: reg-base >= size means reg is outside [base, base+size)

inside_range:
.
.
.
out_of_range:
.
.
.
```
(No difference with Borsuc except for the imms replaced with [mem]s)
05 Dec 2009, 03:05
Tyler

I ended up with this,
Code:
```
str_to_int:
mov si, str_to_convert
mov eax, 00000000h
mov ebx, eax
@@:
lodsb
sub al, '0'
cmp al, 00h
jb @f
cmp al, 09h
ja invalid_input
movsx ebx, al
imul eax, 10
jmp @b
@@:
cmp al, 00h - '0'
jne invalid_input
retn
```

It's screwed up, why? It causes the program to end without completing its execution.

You're probably wondering why I didn't just copy one of the many GREAT examples; if you are, I'll explain. If I did that, I would neglect actually learning the code, because I'm lazy like that. So I force myself to rewrite every function I get help on, just in a slightly different way. I find that more helpful for learning.

btw, thanks for all the responses.
05 Dec 2009, 05:25
windwakr

I haven't looked at your routine, but I'd just like to point out that you don't need all those zeros to clear eax. Just "mov eax, 0" will do.

You could even use "xor eax, eax", which is 3 bytes smaller. A lot of people think using "mov reg, 0" is better than "xor reg, samereg", and a lot of people think the opposite way. I personally use the xor way any time I code anything.
There's some discussion on the subject here:
http://board.flatassembler.net/topic.php?t=6339&postdays=0&postorder=asc&start=0

05 Dec 2009, 05:47
Tyler

Quote:

I haven't looked at your routine, but I'd just like to point out that you don't need all those zeros to clear eax. Just "mov eax, 0" will do.

I do that to help me keep in mind how much can fit into the reg. Like for al, I use 00h.
05 Dec 2009, 05:52
Borsuc

I like "xor eax, eax"; to my mind it signals a special "clear register" instruction.

Also, my code is full of comments and whitespace columns (whether asm or HLL)... I prefer clean & descriptive code.
05 Dec 2009, 16:55
Tyler

Quote:

also my code is full of comments and whitespace columns

I don't comment much because I screw up a lot and have to go back and change my code. If I commented every time I changed it I would spend more time commenting than coding.

btw, can anyone tell me where I screwed up?
Code:
```
str_to_int:
xor ax, ax
xor bx, bx
xor cx, cx
@@:
lodsb
cmp al, 00h ;Check for null termination
je @f
sub al, '0'
cmp al, 00h ;This and next 3 lines check if it's a valid #
jb invalid_input
cmp al, 09h
ja invalid_input
movsx bx, al ;Should I change this to movzx?
imul cx, 10 ;adjust for next digit
add cx, bx ;insert next digit
jmp @b ;next digit
@@:
mov [int_num1], cx
retn
```
06 Dec 2009, 01:26
LocoDelAssembly

Quote:

cmp al, 00h ;This and next 3 lines check if it's a valid #
jb invalid_input

This jump will never be taken because no unsigned number is below 0. You may have wanted to use JL, but actually you don't need this check at all (the JA below already covers "negative" numbers by interpreting them as very high values).

I can't see any error; what did you find wrong with the code?

I'll give you another version just in case, but as far as I can tell your code already converts a string to a number, provided it is below 2^16.

Code:
```
str_to_int:
        pusha ; 16-bit pusha suffices: we don't touch the upper 16 bits of any register (does the rest of the code even need them preserved?)
        xor ax, ax
        xor cx, cx
        lodsb                 ; load the first character before entering the loop

.processChar:
        sub al, '0'
        cmp al, 9
        ja .invalidInput

        imul cx, 10
        add cx, ax ; It is OK doing this, AH is always zero in this code

        lodsb
        test al, al
        jnz .processChar

        mov [int_num1], cx
        mov [int_error], 0

.return:
        popa
        retn

.invalidInput:
        mov [int_error], 1
        jmp .return
```

Forgot to add: this is how to use the result:
Code:
```
mov si, string
call str_to_int
cmp [int_error], 0
jne .invalidInt
; You can use int_num1 here
```
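
On the "provided it is below 2^16" caveat: the routines in this thread do not detect overflow. A possible way to add that check is sketched below in C. This is an extension, not something posted in the thread; the function name is made up, the return value plays the role of int_error, and *out plays the role of int_num1.

```c
#include <stdint.h>

/* Hypothetical sketch: returns 0 and stores the value in *out on success;
   returns 1 on empty input, a non-digit, or a value that does not fit in 16 bits. */
int str_to_u16(const char *s, uint16_t *out)
{
    uint32_t total = 0;               /* wider than the result, so the check below is safe */

    if (*s == '\0')
        return 1;

    for (; *s != '\0'; s++) {
        uint8_t d = (uint8_t)(*s - '0');
        if (d > 9)                    /* one unsigned compare covers both ends */
            return 1;
        total = total * 10 + d;
        if (total > UINT16_MAX)       /* would no longer fit in a 16-bit register */
            return 1;
    }
    *out = (uint16_t)total;
    return 0;
}
```

In assembly the equivalent check would test the carry or overflow flag after the multiply and the add; that part is left out here.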
06 Dec 2009, 18:22
DOS386

> String to Integer and vice versa

You can find such code in the FASM parser (string -> integer) and the FASM IDE (integer -> string), the latter even twice (YES, I did report this "bug" ...).
06 Dec 2009, 22:58
bitshifter

Now... process 4 byte chunks without jumping.
Then you will be cooking with gas
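
One common way to read "4-byte chunks without jumping" is the SWAR (SIMD within a register) trick sketched below in C. This is an interpretation, not something posted in the thread: the function names are invented, exactly four characters are consumed, and the first character is placed in the low byte explicitly so the result does not depend on the machine's byte order.

```c
#include <stdint.h>

/* Pack four characters into one 32-bit word, first character in the low byte. */
uint32_t load4(const char *s)
{
    return (uint32_t)(uint8_t)s[0]
         | (uint32_t)(uint8_t)s[1] << 8
         | (uint32_t)(uint8_t)s[2] << 16
         | (uint32_t)(uint8_t)s[3] << 24;
}

/* 1 if all four bytes are '0'..'9': the high nibble must be 3, and it must
   still be 3 after adding 6 (which pushes 0x3A..0x3F up to 0x40..0x45). */
int all_digits4(uint32_t v)
{
    return ((v & 0xF0F0F0F0u) | (((v + 0x06060606u) & 0xF0F0F0F0u) >> 4)) == 0x33333333u;
}

/* Convert four ASCII digits with two multiplies and no loop.
   The input is assumed to have passed all_digits4. */
uint32_t parse4(const char *s)
{
    uint32_t v = load4(s);
    v = ((v & 0x0F0F0F0Fu) * 2561u) >> 8;      /* 2561 = 10*256 + 1: adjacent digits -> 0..99 */
    v = ((v & 0x00FF00FFu) * 6553601u) >> 16;  /* 6553601 = 100*65536 + 1: pairs -> 0..9999 */
    return v;
}
```

Whether a compiler emits truly jump-free code for this is not guaranteed, but the source contains no loops or conditionals in the conversion itself.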
07 Dec 2009, 03:18