flat assembler
Message board for the users of flat assembler.

 Index > Windows > displaying hex-values
Author
Furs

Joined: 04 Mar 2016
Posts: 2453
Furs 19 Jan 2018, 20:29
I'm familiar only with those that start with the least significant digit first (because little endian rules and big endian sucks anyway). Typically, at the end, you simply reverse the buffer if you write in-place. If you don't write in-place then just make a buffer large enough to hold the largest number (in ASCII), write from the end, then copy from where the pointer ended up.
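A minimal C sketch of the "write from the end of the buffer" approach, just to illustrate the idea. The name `u32_to_str` and the 33-byte buffer size (32 binary digits plus a terminator) are assumptions made for this example, not anything from an existing library.

```c
#include <stdint.h>

/* Hypothetical helper: converts value to text in the given base (2..36).
   Digits are produced least significant first and stored from the end of
   buf backwards, so no reversal is needed afterwards.
   Returns a pointer to the first digit inside buf. */
char *u32_to_str(uint32_t value, unsigned base, char buf[33])
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char *p = buf + 32;          /* last slot holds the terminator */

    *p = '\0';
    do {
        *--p = digits[value % base];
        value /= base;
    } while (value != 0);
    return p;
}
```

The returned pointer is "where the pointer ended up", so the caller can print or copy from there.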

There's an alternative: take the floor of the logarithm of the number in the respective base and add 1; this gets you the number of digits (for a nonzero number). It depends how fast log can be implemented in that base (but it's only done once, not for each digit).
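For reference, here is a small C sketch that gives the same digit count. Note that it swaps the real logarithm for a plain loop of integer divisions, so it is only an illustration of what the count is, not the fast log the paragraph has in mind. The name `digit_count` is made up for this example.

```c
#include <stdint.h>

/* Number of digits of value when written in the given base (base >= 2).
   For value > 0 this equals floor(log_base(value)) + 1; zero counts as
   one digit. */
unsigned digit_count(uint32_t value, unsigned base)
{
    unsigned n = 1;

    while (value >= base) {
        value /= base;
        n++;
    }
    return n;
}
```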

Of course all this applies only if you don't want leading zeroes in the output. If you do, then it's a moot point, just write from right-to-left using least significant digit first. (e.g. writing 0x00F1C3FF for a 32-bit int always has a "fixed" length so you can just start from the right)
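The fixed-length case could look roughly like this in C (a sketch only; `u32_to_hex8` is a name invented here). Since the output is always 8 digits, the loop fills the buffer from the right and needs no division at all.

```c
#include <stdint.h>

/* Writes exactly 8 uppercase hex digits plus a terminator into out.
   The lowest nibble goes into the rightmost position first. */
void u32_to_hex8(uint32_t value, char out[9])
{
    out[8] = '\0';
    for (int i = 7; i >= 0; i--) {
        out[i] = "0123456789ABCDEF"[value & 0xF];
        value >>= 4;
    }
}
```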

div can be useful in your case I guess because the divisor changes on each iteration (since you start with most significant digit)? div is also useful for code-size optimizations I suppose. (and well, dividing by a non-constant, obviously)
19 Jan 2018, 20:29
fasmnewbie

Joined: 01 Mar 2011
Posts: 555
fasmnewbie 20 Jan 2018, 02:42
@Furs, I don't think u understand how division by multiplication-constant akin to Agner Fog's style works. Logically, there's no way you can extract the last digit (Least Significant Digit) after the first MUL. It is lost forever ;D
20 Jan 2018, 02:42
Furs

Joined: 04 Mar 2016
Posts: 2453
Furs 20 Jan 2018, 12:58
Yes you can? Say we use base10 for simplicity. You have a number 1234, and it's 123 after division by 10. You want to extract 4, so using a bit of logic, just multiply 123 by 10 -> 1230, then subtract: 1234-1230 = 4.

So for base26:
Code:
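```
; input number in ecx
mov eax, 0x4EC4EC4F   ; magic constant: ceil(2^35 / 26)
mul ecx               ; edx:eax = ecx * magic
shr edx, 3            ; edx = ecx / 26 (quotient)
imul eax, edx, 26     ; eax = quotient * 26
sub ecx, eax          ; ecx = ecx mod 26 (remainder)

; ecx = last digit, do stuff with it

mov ecx, edx  ; next input

; now loop until ecx = 0
```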
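The same arithmetic written out in C, as a sketch for checking the constant rather than a replacement for the asm. The function names are invented for this example. The shift by 35 is the 32 bits dropped by taking the high half of the product, plus the `shr edx, 3`.

```c
#include <stdint.h>

/* Quotient n / 26 via multiplication by 0x4EC4EC4F = ceil(2^35 / 26),
   keeping only the bits above position 35 of the 64-bit product. */
uint32_t div26(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0x4EC4EC4Fu) >> 35);
}

/* Remainder n % 26 by multiplying the quotient back and subtracting. */
uint32_t mod26(uint32_t n)
{
    return n - div26(n) * 26u;
}
```

For 274 this gives quotient 10 and remainder 14, so the last digit falls out without any DIV.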
20 Jan 2018, 12:58
Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8344
Location: Kraków, Poland
Tomasz Grysztar 20 Jan 2018, 14:11
Multiplying the result by the divisor and then subtracting from the original number is a universal method of obtaining the remainder, no matter what algorithm was used to divide. However, even in the case of division through multiplication by a "magic" number it is possible, at least in some cases, to have the remainder obtained directly from the main algorithm. See what I wrote about my alternative approach to these techniques.
20 Jan 2018, 14:11
fasmnewbie

Joined: 01 Mar 2011
Posts: 555
fasmnewbie 20 Jan 2018, 14:50
@Furs

Ofc you can always start from the back digits, if you focus your algorithm that way. But by doing so, you're making your algorithm even slower than a regular DIV. It is pointless. It involves 3 MULS even before you get to the next iteration. If you're doing it from the back, just use a DIV ;D

If u need a faster approach, start from the front digits. A simplified technique for a 2-digit BASE-26 of decimal 274 (AE).

Code:
```
    mov     eax,274
    mov     ebx,0x4ec4ec4f
    xor     edx,edx   ;redundant: MUL overwrites EDX
    mul     ebx
    shr     edx,3     ;EDX = 274 / 26 = 10
    mov     ecx,edx   ;first digit
    mov     eax,edx
    mov     ebx,26
    mul     ebx       ;EAX = 10 * 26 = 260
    mov     esi,274
    sub     esi,eax   ;second digit (274 - 260 = 14)
```

But this too is not any better than a regular DIV due to the need for branches. So, just like I said, just use a DIV for a more standardized way to convert to any base.

Last edited by fasmnewbie on 20 Jan 2018, 14:53; edited 1 time in total
20 Jan 2018, 14:50
fasmnewbie

Joined: 01 Mar 2011
Posts: 555
fasmnewbie 20 Jan 2018, 14:51
Tomasz Grysztar wrote:
Multiplying the result by the divisor and then subtracting from the original number is a universal method of obtaining the remainder, no matter what algorithm was used to divide. However, even in the case of division through multiplication by a "magic" number it is possible, at least in some cases, to have the remainder obtained directly from the main algorithm. See what I wrote about my alternative approach to these techniques.
Interesting...
20 Jan 2018, 14:51
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20142
revolution 20 Jan 2018, 14:58
fasmnewbie wrote:
... slower than a regular DIV .... not any better than a regular DIV
"slower" depends upon the system being used. Some systems may be faster, some slower, some the same.

"better" is a subjective term. Different people will interpret that differently. Maybe you can qualify in which way you suggest it is better. For readability? For "speed"? For ease of programming? For fewer instruction bytes? Something else?
20 Jan 2018, 14:58
fasmnewbie

Joined: 01 Mar 2011
Posts: 555
fasmnewbie 20 Jan 2018, 15:12
revolution wrote:
fasmnewbie wrote:
... slower than a regular DIV .... not any better than a regular DIV
"slower" depends upon the system being used. Some systems may be faster, some slower, some the same.

"better" is a subjective term. Different people will interpret that differently. Maybe you can qualify in which way you suggest it is better. For readability? For "speed"? For ease of programming? For fewer instruction bytes? Something else?

revolution, let the different ideas flow more freely on this board. This board is lacking this specific kind of discussion because every time, there are some people who say "stop it peeps, it all depends on the system. So these discussions are useless. End the discussions now".

And as I recall, this is the first time in about two years or so that we have had this kind of beginner's question on conversion. The last one was handled gracefully by AsmGuru. This latest one is already being bombarded with high-performance advanced optimization techniques. No wonder beginners' questions are so rare on this board. hahaha ;D
20 Jan 2018, 15:12
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20142
revolution 20 Jan 2018, 15:18
Hmm, okay, I wasn't trying to stop you discussing anything. If it appears that way then I guess I worded things badly. I was trying to get you to define what you mean more clearly, and also to realise that faster/slower are not absolutes. Others will experience different behaviour on their systems. I think it is important to acknowledge that.
20 Jan 2018, 15:18
fasmnewbie

Joined: 01 Mar 2011
Posts: 555
fasmnewbie 20 Jan 2018, 15:23
It doesn't matter what terminology is being used. People will eventually get to it in their own ways. OhEmGee, you're so tight! ;D
20 Jan 2018, 15:23
Furs

Joined: 04 Mar 2016
Posts: 2453
Furs 20 Jan 2018, 16:53
Tomasz Grysztar wrote:
Multiplying the result by the divisor and then subtracting from the original number is a universal method of obtaining the remainder, no matter what algorithm was used to divide. However, even in the case of division through multiplication by a "magic" number it is possible, at least in some cases, to have the remainder obtained directly from the main algorithm. See what I wrote about my alternative approach to these techniques.
Really awesome thread, I'm even saving it offline for reference, thanks.

fasmnewbie wrote:
Ofc you can always start from the back digits, if you focus your algorithm that way. But by doing so, you're making your algorithm even slower than a regular DIV. It is pointless. It involves 3 MULS even before you get to the next iteration. If you're doing it from the back, just use a DIV ;D
It's 2 muls, but I have a hard time understanding how div can be "faster" (note: not "just as fast", but "faster"), but it's still good for size optimizations (which is this case, probably, since I'd rather the code be smaller for stuff like displaying which doesn't need to be fast anyway).

According to Agner (for my CPU), mul/imul have like 3 clock cycle latency, and div is like 22-29 (for 32-bit number, for 64-bit numbers it's even more). 2 serial muls would have combined 6 clock cycle latency which is still far from 22, even if you add the sub (1 clock cycle).
20 Jan 2018, 16:53
fasmnewbie

Joined: 01 Mar 2011
Posts: 555
fasmnewbie 20 Jan 2018, 17:53
@Furs ... in this thread you need to separate the idea of optimizations in mathematical sense and the other one in string conversion sense. In mathematical sense, people talk about how fast an algorithm is based on the speed of a computational result. Look at Tomasz's own thread. He's talking "speed" purely from hypothetical mathematics POV. No strings attached.

This thread is about string conversion, where IMO even the fastest division algorithm out there does not make any significant improvement once strings are involved. In this particular sense, DIV is not as slow as many people like to believe. That's misleading.
20 Jan 2018, 17:53
fasmnewbie

Joined: 01 Mar 2011
Posts: 555
fasmnewbie 20 Jan 2018, 18:04
Quote:
It's 2 muls, but I have a hard time understanding how div can be "faster"

I've seen code employing MULs that is slower than a DIV operation.

Quote:
According to Agner (for my CPU), mul/imul have like 3 clock cycle latency, and div is like 22-29 (for 32-bit number, for 64-bit numbers it's even more). 2 serial muls would have combined 6 clock cycle latency which is still far from 22, even if you add the sub (1 clock cycle).

Ofc, in theory they have lower latency. But in practice, a string conversion reads / writes from memory (that's a heavy latency). So your 'fast' division algorithm will be completely overwhelmed by the sum of latencies to memory WRITE/READ. Not to mention I/O processes being used in say, C's printf or kernel's I/O routines.

Pretty pointless isn't it? ;D
20 Jan 2018, 18:04
Ali.Z

Joined: 08 Jan 2018
Posts: 660
Ali.Z 21 Jan 2018, 10:52
yeohhs wrote:
Ali.A wrote:
and honestly i dont wanna create a c/c++ program and reverse engineer it to find what function is that.

I just call printf with the hex formatting string.

well that was simple in c/c++ i didnt know that %8.x will display the result as a hex.
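For reference, a small C sketch of the printf-family route. It uses `snprintf` with `"%08" PRIX32` from `<inttypes.h>`, which pads to 8 digits with leading zeros (a plain width without the `0` flag pads with spaces instead). The wrapper name `format_hex` is made up for this example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats value as 8 uppercase hex digits with leading zeros.
   out must have room for 8 characters plus the terminator. */
void format_hex(uint32_t value, char out[9])
{
    snprintf(out, 9, "%08" PRIX32, value);
}
```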

about all the other posts from great users: shr and shl for simplicity, honestly.
the in-depth control is with div. (and it's actually more advanced)

but the algorithm required for it can be a bit long, which will cost extra microseconds (not milliseconds)

_________________
Asm For Wise Humans
21 Jan 2018, 10:52
rugxulo

Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 07 Feb 2018, 20:05
In assembly (but not x64?), for hex (base 16), you can use the old low-nibble conversion trick: "cmp al,10 // sbb al,105 // das".
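DAS is indeed not valid in 64-bit mode, so that trick is limited to 16/32-bit code. A portable way to get the same nibble-to-character mapping, sketched in C, uses plain arithmetic instead; this is not an emulation of the DAS sequence, just the same result. The name `nibble_to_hex` is invented for this example.

```c
/* Maps 0..15 to '0'..'9' and 'A'..'F' without a lookup table.
   The extra 7 skips the ASCII characters between '9' and 'A'. */
char nibble_to_hex(unsigned n)
{
    n &= 0xF;
    return (char)(n + '0' + (n > 9 ? 7 : 0));
}
```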

My own HLL code is fairly naive. But yes, the big advantage to hex is that it's fast and easy to convert to string without slow DIV.

It may not matter for major platforms (e.g. Windows or Linux), but printf() can be a pig. It's both bloated and slow, at least when statically linking (e.g. DJGPP). When writing a partial hexdump / od clone, I wrote my own crude routine (in C) which was significantly smaller and faster. Similarly, I wrote my own for Turbo Pascal 5.5 since it lacked hexstr() (TP 7 might have it ??; well, at least FPC does). Of course, buffering helped a lot, too.

It's also faster to not have one function for everything. FPC's hexstr() is fast, but I found that it was faster in TP 5.5 to use two separate, specialized routines (bytehex, longhex) instead of only one universal one.

The more specific and isolated you can be, the more optimized it is. Writing generic functions that do it all is great, but sometimes you only need the bare minimum. You don't need to call a full printf() implementation, reparsing your format string over and over again, if all you need is simple hex output.
07 Feb 2018, 20:05
Ali.Z

Joined: 08 Jan 2018
Posts: 660
Ali.Z 08 Feb 2018, 04:49
i think my main problem with consoles are:
displaying and clearing the console screen.

im not sure why i have issues with C runtime library (msvcrt.dll) it can be my bad for using it wrong.
currently im using console APIs (WriteConsole) but its not easy to use.

as for algorithms there are many, and people here mentioned many as well.
some of them are a bit confusing, some of them look simple.

also for sure it might not be a big problem to display a 1-byte hex value, but what if i need to display a 4-byte long hex value.
say: 3 million decimal to hex, which in this case i have no idea how to deal with.
08 Feb 2018, 04:49
Ali.Z

Joined: 08 Jan 2018
Posts: 660
Ali.Z 29 Jun 2018, 23:02
ok guys, previously you all helped me converting to hex format.
now in my program for some reason i need to convert some hex values (read from a file, so it's actually chars) to a 4-byte value (dword)
say i have a string in my file: 4001FC8 as a hex.

converting from base16 to 10 is difficult to me. (ive no idea too)
29 Jun 2018, 23:02
Picnic

Joined: 05 May 2007
Posts: 1386
Location: Piraeus, Greece
Picnic 30 Jun 2018, 07:28
Ali.A wrote:
say i have a string in my file: 4001FC8 as a hex.

converting from base16 to 10 is difficult to me. (ive no idea too)

Hi Ali.A,
Here is a simple HexToDword routine. Converts 4001FC8 to 67117000 in EAX. No error checking whatsoever.

Code:
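```
; input:  ESI pointer to zero-terminated string buffer (0-9, A-F only)
; output: EAX

HexToDword:
    push ebx esi
    xor ebx, ebx      ; ebx accumulates the result
    cld               ; make lodsb move forward
.loop:
    lodsb             ; al = next char, esi++
    test al, al
    je .return        ; stop at the terminating zero
    sub al, '0'       ; '0'..'9' -> 0..9, 'A'..'F' -> 17..22
    cmp al, 10
    jl @F             ; already 0..9
    sub al, 7         ; 17..22 -> 10..15
@@:
    shl ebx, 4        ; make room for the next nibble
    or bl, al         ; store the nibble
    jmp .loop
.return:
    mov eax, ebx
    pop esi ebx
    ret
```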
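In case it helps to see the same steps outside of asm, here is a rough C equivalent of the routine above. It keeps the same limitations (digits 0-9 and uppercase A-F only, no error checking); the name `hex_to_dword` is just chosen for this sketch.

```c
#include <stdint.h>

/* Same steps as HexToDword: for each character, turn it into a value
   0..15, shift the result left by one nibble and OR the value in. */
uint32_t hex_to_dword(const char *s)
{
    uint32_t result = 0;

    while (*s != '\0') {
        unsigned d = (unsigned)((unsigned char)*s++ - '0');
        if (d >= 10)
            d -= 7;              /* 'A'..'F' (17..22) become 10..15 */
        result = (result << 4) | d;
    }
    return result;
}
```

Calling it with "4001FC8" returns 67117000, the same value the asm routine leaves in EAX.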

Ali.A wrote:
i think my main problem with consoles are:
displaying and clearing the console screen.

Here is a CLS routine to get you started.

30 Jun 2018, 07:28
Ali.Z

Joined: 08 Jan 2018
Posts: 660
Ali.Z 30 Jun 2018, 09:11
already im using 0x0D 0x0A to write 100 of lines lol, anyhow i want to understand:

- sub al,'0' ; subtract 30 hex, (ascii table 30 = 0)
- then compare if its 10? 0x0A if its less go forward otherwise sub 7? why?
- bx is 0, shifting it to left will result 64d, then or 64 with content of al!

this algorithm is out of my thinking range.

btw, i was thinking to use SetConsoleCursorPosition .. do i really have to use STRUCT and pass a pointer for this struct? cant i just create a label and define dwXpos and Ypos?
30 Jun 2018, 09:11
Picnic

Joined: 05 May 2007
Posts: 1386
Location: Piraeus, Greece
Picnic 30 Jun 2018, 11:05
Yes you can, just as you imagine it.
Code:
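```
; data section: three equivalent ways to define it (use only one)

; 1) two words: X, then Y
dwCoord dw 40,10

; 2) one dword: Y in the high word, X in the low word
dwCoord dd 0x000A0028

; 3) a label followed by named fields
dwCoord:
x dw 40
y dw 10
```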

Code:
```
; code section
invoke SetConsoleCursorPosition, dword [hOut], dword [dwCoord]
```
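The reason a single dword works: COORD is just two 16-bit values, X followed by Y, and it is passed by value, so on little-endian x86 X ends up in the low word and Y in the high word. A tiny C sketch of that packing (the helper `pack_coord` is hypothetical, and it deliberately avoids windows.h so it can be checked anywhere):

```c
#include <stdint.h>

/* Packs X into the low word and Y into the high word, the same layout
   as two consecutive 16-bit fields read back as one little-endian dword. */
uint32_t pack_coord(int16_t x, int16_t y)
{
    return (uint32_t)(uint16_t)x | ((uint32_t)(uint16_t)y << 16);
}
```

With x = 40 and y = 10 this yields 0x000A0028, matching the `dd` form above.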

p.s. CLS.asm was a wrong choice of name. It will conflict with cmd.exe CLS command. Please rename it after download.

Code:
```
    sub al, '0'    ; subtract 48 from the char
    cmp al, 10     ; see if it was a digit 0-9 and not a letter (assume A-F)
    jl @F          ; jump if it was a digit
    sub al, 7      ; letter A-F: 17..22 becomes 10..15
@@:
    shl ebx, 4     ; get space for next nibble
    or bl, al      ; store the nibble
```
30 Jun 2018, 11:05