flat assembler
Message board for the users of flat assembler.

displaying hex-values
Ali.Z

Joined: 08 Jan 2018
Posts: 233
Say I'm doing some calculations and I want the output shown as hex, not decimal. How can I achieve this?

I know that in C/C++ I can use:
cout << setbase(16) << endl;

Honestly, I don't want to write a C/C++ program and reverse engineer it just to find out which function that is.

_________________
Asm For Wise Humans
18 Jan 2018, 13:55
fasmnewbie

Joined: 01 Mar 2011
Posts: 553
Basically, you can achieve that by repeatedly dividing your number by 16. Each division leaves the last (least-significant) digit of the number in RDX and the remaining quotient in RAX. RDX has to be cleared before each DIV, because DIV divides RDX:RAX.

RBX = Divisor (16)

Then you turn that digit into ASCII by adding 30h to it. If the result is greater than 39h ('9'), add 7 more to turn it into a hex letter ('A' to 'F') instead of a numeric character.

Repeat the process until RAX = 0. The digits come out least-significant first, so store them in reverse order before printing.

These are just the basic steps to give you the general idea.
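A rough sketch of those steps in C, in case it helps to see them in one place. The function name to_hex_div and the buffer handling are assumptions made for illustration and are not taken from BASELIB; the % and / lines stand in for what DIV leaves in RDX and RAX.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert value to uppercase hex by repeated division by 16.
   out must hold at least 17 bytes (16 digits + NUL).
   Returns the number of digits written. */
size_t to_hex_div(uint64_t value, char *out)
{
    char tmp[16];
    size_t n = 0;

    do {
        unsigned digit = (unsigned)(value % 16); /* remainder, as DIV leaves in RDX */
        value /= 16;                             /* quotient, as DIV leaves in RAX */
        char c = (char)(digit + 0x30);           /* 0..9 -> '0'..'9' */
        if (c > 0x39)                            /* past '9'? */
            c += 7;                              /* 10..15 -> 'A'..'F' */
        tmp[n++] = c;
    } while (value != 0);

    /* Digits were produced least-significant first, so reverse them. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```

With a 17-byte buffer, to_hex_div(0x34566c, buf) leaves the string 34566C in buf.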

If you need a visual tool to understand it better, you can use the dumpreg routine from my BASELIB to see what's going on at every step.

For example, using the "base64w.asm" source:

Code:
```
        mov     rax,34566ch     ;Dividend
        mov     rbx,16          ;Divisor

        ;Iteration 1
        xor     rdx,rdx         ;Clear the high half of RDX:RAX
        div     rbx
        call    dumpreg         ;Look at RAX and RDX

        ;Iteration 2
        xor     rdx,rdx
        div     rbx
        call    dumpreg         ;Look at RAX and RDX
```

Observe how RAX and RDX change after every iteration. You can download it here: https://board.flatassembler.net/topic.php?t=18099
18 Jan 2018, 14:39
revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 16951
In a binary computer we can divide by 16 by masking and bit-shifting instead of using div. For many applications it won't make any perceivable difference, and the code is a bit harder to understand, but if you require high-speed hex conversion (for whatever reason) then avoiding the div instruction might be useful.
Code:
```
;...
mov rdx,rax
and rdx,0x0f ;remainder when dividing by 16
shr rax,4 ;quotient when dividing by 16
;...
```
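For comparison, here is the same idea as a complete conversion loop, sketched in C. The name to_hex_shift and the lookup table are assumptions made for illustration; the two commented lines correspond to the and/shr pair above.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert value to uppercase hex using mask and shift only.
   out must hold at least 17 bytes. Returns the number of digits. */
size_t to_hex_shift(uint64_t value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    char tmp[16];
    size_t n = 0;

    do {
        tmp[n++] = digits[value & 0x0f]; /* remainder when dividing by 16 */
        value >>= 4;                     /* quotient when dividing by 16 */
    } while (value != 0);

    /* Least-significant digit came first, so reverse into out. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```

The table lookup replaces the "add 30h, then maybe add 7" step; either way of mapping a nibble to a character works.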
18 Jan 2018, 14:53
Ali.Z

Joined: 08 Jan 2018
Posts: 233
Thanks fasmnewbie, I'll take a look at that code. Also thanks for reminding me of ASCII; I've worked with the ASCII instructions too (AAA, etc.).

revolution, I prefer your way for faster execution, and I'm pretty comfortable with shr and shl.

Thanks guys.
18 Jan 2018, 16:32
revolution

Joined: 24 Aug 2004
Posts: 16951
Ali.A wrote:
revolution, I prefer your way for faster execution...
Note that "faster execution" is not a guaranteed* result. If it is important enough for your usage, I would suggest comparing timings to see if it really is faster.

* There are many interactions within the CPU that cannot be predicted with static analysis. And those interactions will change based upon the current state of the CPU and the point in the program flow that instructions are executed. There are situations where using div can actually be faster, so it is risky to simply assume something is faster by merely looking at the instructions used.
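One way to compare timings, as a minimal sketch in C. Everything here is hypothetical scaffolding (digit_sum_div, digit_sum_shift, time_routine, and the volatile divisor that stops the compiler from turning the division into a shift); a serious measurement would also need repeated runs and realistic surrounding code.

```c
#include <stdint.h>
#include <time.h>

/* volatile so the compiler cannot replace the division with a shift */
static volatile uint64_t g_divisor = 16;

/* Sum of the hex digits of v, using real division. */
uint64_t digit_sum_div(uint64_t v)
{
    uint64_t d = g_divisor, sum = 0;
    while (v != 0) {
        sum += v % d;
        v /= d;
    }
    return sum;
}

/* Same result, using mask and shift. */
uint64_t digit_sum_shift(uint64_t v)
{
    uint64_t sum = 0;
    while (v != 0) {
        sum += v & 0x0f;
        v >>= 4;
    }
    return sum;
}

/* Run fn over `iterations` inputs and return the elapsed CPU seconds.
   The checksum keeps the work from being optimized away. */
double time_routine(uint64_t (*fn)(uint64_t), uint64_t iterations,
                    uint64_t *checksum)
{
    uint64_t acc = 0;
    clock_t start = clock();
    for (uint64_t i = 0; i < iterations; i++)
        acc += fn(i * 0x9E3779B97F4A7C15ULL); /* spread the inputs around */
    clock_t end = clock();
    *checksum = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_routine once with digit_sum_div and once with digit_sum_shift, using the same iteration count, gives two numbers to compare. The two checksums should match, and the timings may well differ from one CPU or compiler to the next.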
18 Jan 2018, 18:04
Furs

Joined: 04 Mar 2016
Posts: 1445
Actually there's no situation in which div can be faster because it's way too slow, but it can probably be just as fast though.

Regardless, div requires more work from the CPU, so it wastes more energy for no reason in this case.
18 Jan 2018, 20:29
fasmnewbie

Joined: 01 Mar 2011
Posts: 553
The beauty of DIV is that it opens up opportunities for various kinds of num-to-string conversions, from base-2 up to base-36. If a learner can see clearly how DIV works while converting to one base, there's a chance he/she can convert a number to any base he/she wants, performance or not.

Besides, nobody wants to see a converted number appear on the screen 'in the blink of an eye'. Any text-based output slows everything down by default. Performance is not crucial because a human takes time to interpret what is printed on the screen before he/she hits the ENTER key. That's 3 seconds minimum. Your nanosecond display routines make no difference. LOL.
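To illustrate the "any base from 2 to 36" point about DIV, a short sketch in C. The function name to_base and the digit alphabet (0-9 followed by A-Z) are assumptions made for illustration; only the divisor changes from one base to the next.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert value to the given base (2..36) by repeated division.
   out must hold at least 65 bytes (64 binary digits + NUL).
   Returns the digit count, or 0 if the base is out of range. */
size_t to_base(uint64_t value, unsigned base, char *out)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char tmp[64];
    size_t n = 0;

    if (base < 2 || base > 36) {
        out[0] = '\0';
        return 0;
    }

    do {
        tmp[n++] = digits[value % base]; /* remainder is the next digit */
        value /= base;                   /* quotient carries on */
    } while (value != 0);

    /* Reverse: the least-significant digit was produced first. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```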
18 Jan 2018, 21:17
fasmnewbie

Joined: 01 Mar 2011
Posts: 553
Ali.A wrote:
Thanks fasmnewbie, I'll take a look at that code. Also thanks for reminding me of ASCII; I've worked with the ASCII instructions too (AAA, etc.).

revolution, I prefer your way for faster execution, and I'm pretty comfortable with shr and shl.

Thanks guys.

You're in such luck. The "prnreg" routine (which is base-16) of BASELIB is using SHLD. The "prnhexu" routine uses SHRD. Enjoy your shifting ;D
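For anyone curious why shifting from the top is handy: taking the highest nibble each time yields the digits already in printing order, so no reversal is needed. The sketch below is C, is not taken from BASELIB, and the name to_hex_fixed is an assumption; it only shows the general idea behind a left-shifting routine.

```c
#include <stdint.h>

/* Write all 16 hex digits of value, most-significant first.
   out must hold at least 17 bytes. */
void to_hex_fixed(uint64_t value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";

    for (int i = 0; i < 16; i++) {
        out[i] = digits[value >> 60]; /* top nibble is the next digit to print */
        value <<= 4;                  /* bring the following nibble to the top */
    }
    out[16] = '\0';
}
```

The price is a fixed width: to_hex_fixed(0x34566c, buf) produces 000000000034566C, leading zeros included.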
18 Jan 2018, 21:22
Furs

Joined: 04 Mar 2016
Posts: 1445
fasmnewbie wrote:
The beauty of DIV is that it opens up opportunities for various kinds of num-to-string conversions, from base-2 up to base-36. If a learner can see clearly how DIV works while converting to one base, there's a chance he/she can convert a number to any base he/she wants, performance or not.
It's similar with arithmetic coding, btw. Let's say, for simplicity, that you want to pack multiple values into a byte.

Most people use bit fields, e.g. they use 4 bits for one value and 4 bits for another. But what if your values only range from, say, 0-10? 4 bits covers 0-15, which is a bit of a waste.

Simple. Use mod instead of and (to extract a value) and div instead of shr; when encoding, use mul instead of shl and add instead of or. You can also replace a div by any constant (not just powers of 2) with a mul, which is very fast.

e.g. to decode a byte with 0-10 values, you'd use x / 11 to get the second value (instead of x>>4), and x % 11 to get the first (instead of x&15).

You can "store" values with arbitrary ranges this way, which is better than using bit fields (and only really need 'mul' instruction since divs are by constants) -- unless you need fast speed AND storage, but that is not very often (and mul isn't so slow).

(of course in this example there's no difference in storage room, but it's just for example purposes, please don't nitpick)

In this case, arithmetic coding is simply the generalization of bit fields, which are really just arithmetic coding on powers of 2 (exact same thing as the printing hex).
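A small sketch of that packing scheme in C. All names here (RADIX, pack_pair, unpack_first, unpack_second, pack_triple, unpack_triple) are hypothetical. The second pair of functions is an added illustration of a case where storage does differ: three values in the range 0-5 fit in one byte (6*6*6 = 216 combinations), whereas three 3-bit fields would need 9 bits.

```c
#include <stdint.h>

enum { RADIX = 11 }; /* each field holds 0..10, so 11 possible values */

/* Encode: mul and add take the place of shl and or. */
uint8_t pack_pair(unsigned first, unsigned second)
{
    return (uint8_t)(second * RADIX + first); /* at most 10*11 + 10 = 120 */
}

unsigned unpack_first(uint8_t x)  { return x % RADIX; } /* instead of x & 15 */
unsigned unpack_second(uint8_t x) { return x / RADIX; } /* instead of x >> 4 */

/* Three fields of 0..5 in one byte; the largest code is 215. */
uint8_t pack_triple(unsigned a, unsigned b, unsigned c)
{
    return (uint8_t)((c * 6 + b) * 6 + a);
}

void unpack_triple(uint8_t x, unsigned *a, unsigned *b, unsigned *c)
{
    *a = x % 6;
    x /= 6;
    *b = x % 6;
    *c = x / 6;
}
```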

Last edited by Furs on 19 Jan 2018, 14:02; edited 1 time in total
18 Jan 2018, 21:49
fasmnewbie

Joined: 01 Mar 2011
Posts: 553
Furs, you are talking about lots of CPU cycles there. Show some code, for example converting to Base-26 (Hexavigesimal) without using any DIV. I am interested to know the optimized way of doing it. That sounds interesting.
18 Jan 2018, 22:10
yeohhs

Joined: 19 Jan 2004
Posts: 195
Location: N 5.43564° E 100.3091°
Ali.A wrote:
and honestly i dont wanna create a c/c++ program and reverse engineer it to find what function is that.

I just call printf with the hex formatting string.
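In C terms that is printf("%X\n", value) for an unsigned int. A slightly more general sketch, assuming a 64-bit value and writing into a buffer rather than the console (the wrapper name format_hex is an assumption):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format value as uppercase hex into out.
   Returns the number of characters that the full result needs. */
int format_hex(uint64_t value, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%" PRIX64, value);
}
```

With a 17-byte buffer, format_hex(0x34566c, buf, sizeof buf) leaves 34566C in buf; using PRIx64 instead gives lowercase digits.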
18 Jan 2018, 23:54
Furs

Joined: 04 Mar 2016
Posts: 1445
fasmnewbie wrote:
Furs, you are talking about lots of CPU cycles there. Show some code, for example converting to Base-26 (Hexavigesimal) without using any DIV. I am interested to know the optimized way of doing it. That sounds interesting.
You mean using MUL instead of DIV? Or what?

Using MUL is easy; there are even online converters and write-ups on the method, but I do it the lazy way. I just compile a function with GCC with optimizations on and see what constant it spits out for the mul. (You can also use an online compiler like Godbolt's Compiler Explorer.)

For your request (assuming int, 32-bit):
Code:
```
auto blah(unsigned int x) { return x/26; }
```
Turns into:
Code:
```
mov eax, 0x4EC4EC4F
mul <reg>
shr edx, 3
; edx is now <reg>/26; multiply by 26 and sub from the original to get the remainder as well
```

I'm sure there are better ways if you need both div and mod, as in this case. But this is the lazy way. I guess I could make GCC compile both the div and the mod and see how it does it. (For example, do a single imul, then check if the result is > divisor, and sub another 1.)

Yes, this is the lazy way but who said lazy is bad? You can obviously factor it manually, after all, just treat them as fixed point numbers... but meh.
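Here is how the mul/shr pair above could be used for a whole base-26 conversion, sketched in C. The constant 0x4EC4EC4F is ceil(2^35 / 26), so taking the high 32 bits of the product and shifting right by 3 is the same as shifting the 64-bit product right by 35. The names div26 and to_base26 and the digit alphabet (0-9 followed by A-P) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* x / 26 without a divide instruction: multiply by ceil(2^35 / 26)
   and keep the bits from position 35 upward. */
uint32_t div26(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0x4EC4EC4Fu; /* mul: 64-bit result in EDX:EAX */
    return (uint32_t)(product >> 35);             /* high half (EDX), then shr 3 */
}

/* Convert value to base 26. out must hold at least 8 bytes,
   since 26^7 exceeds 2^32 and so 7 digits always suffice.
   Returns the number of digits written. */
size_t to_base26(uint32_t value, char *out)
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOP";
    char tmp[7];
    size_t n = 0;

    do {
        uint32_t q = div26(value);
        tmp[n++] = digits[value - q * 26]; /* remainder = value - quotient * 26 */
        value = q;
    } while (value != 0);

    /* The least-significant digit came out first, so reverse. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```

With this alphabet, to_base26(126, buf) gives 4M, because 126 = 4*26 + 22 and 22 maps to M. Whether this beats a plain DIV loop is something to measure rather than assume.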
19 Jan 2018, 01:28
revolution

Joined: 24 Aug 2004
Posts: 16951
Furs wrote:
Actually there's no situation in which div can be faster because it's way too slow, but it can probably be just as fast though.
In OOO CPUs it is possible to execute div in parallel with other instructions. In some situations it is possible to hide the latency and take advantage of the single issue cycle. Anyhow this type of optimisation is very tricky to get right and is deeply dependent upon the CPU in use. Someone could write code that accidentally uses div in an optimal manner without realising it and discover that it is "faster". This is why testing is the final answer rather than guessing.
19 Jan 2018, 02:49
fasmnewbie

Joined: 01 Mar 2011
Posts: 553
Furs, that C technique is similar to Agner Fog's, which is optimized only for finding the most-significant digit. For conversion to a full string, it is just not any better than a simple DIV operation. If you look into my "sbase32w.asm" source, I included Agner Fog's technique in "prnintu". When I timed it against a regular DIV, it clocked similar and at times slower than DIV. That's why I ditched that technique altogether for my 64-bit sources.
19 Jan 2018, 03:12
fasmnewbie

Joined: 01 Mar 2011
Posts: 553
revolution wrote:
Furs wrote:
Actually there's no situation in which div can be faster because it's way too slow, but it can probably be just as fast though.
In OOO CPUs it is possible to execute div in parallel with other instructions. In some situations it is possible to hide the latency and take advantage of the single issue cycle. Anyhow this type of optimisation is very tricky to get right and is deeply dependent upon the CPU in use. Someone could write code that accidentally uses div in an optimal manner without realising it and discover that it is "faster". This is why testing is the final answer rather than guessing.

But that's the problem though; a DIV needs a more careful design to avoid that heavy latency. For a string conversion, using string instructions (STOSB, LODSB etc) may make things worse for a DIV construct. At least on a 32-bit CPU.
19 Jan 2018, 03:18
revolution

Joined: 24 Aug 2004
Posts: 16951
I have yet to see a situation where converting to decimal or hex (or whatever) is a time critical procedure. But that is just me. I guess it is possible someone out there has to do it at the speed of lightning? Anyhow it can be fun to create different versions of the same function and get a feel for how things are working.
19 Jan 2018, 03:28
fasmnewbie

Joined: 01 Mar 2011
Posts: 553
revolution wrote:
I guess it is possible someone out there has to do it at the speed of lightning?
Btw, how fast is the lightning? ;D
19 Jan 2018, 03:58
revolution

Joined: 24 Aug 2004
Posts: 16951
fasmnewbie wrote:
Btw, how fast is the lightning? ;D
Faster than the speed of my typing. And faster than the speed of my reading. Anything faster than that and I usually don't care about the extra conversion speed ability.
19 Jan 2018, 05:45
Furs

Joined: 04 Mar 2016
Posts: 1445
fasmnewbie wrote:
Furs, that C technique is similar to Agner Fog's, which is optimized only for finding the most-significant digit.
You mean least significant? The first remainder will get the least significant digit and that's what you start with (each mul will divide it by 10, so you start with least significant digit). But, I thought you were asking about the arithmetic coding instead of bit fields I was talking about, I'll have a look at the printing (it uses the exact same principle though). (of course, this isn't useful for hex since you can just use bitwise operators there, but it works on base26)
19 Jan 2018, 14:02
fasmnewbie

Joined: 01 Mar 2011
Posts: 553
Furs wrote:
fasmnewbie wrote:
Furs, that C technique is similar to Agner Fog's, which is optimized only for finding the most-significant digit.
You mean least significant? The first remainder will get the least significant digit and that's what you start with (each mul will divide it by 10, so you start with least significant digit). But, I thought you were asking about the arithmetic coding instead of bit fields I was talking about, I'll have a look at the printing (it uses the exact same principle though). (of course, this isn't useful for hex since you can just use bitwise operators there, but it works on base26)

Those techniques actually start by finding the most-significant digit first. For example, with 4M in base-26, the first digit that comes out after the first iteration is '4'. Then you need another MUL to get the closest multiple below the original and a SUB to get the remainder. Then repeat the whole process until you get to the least-significant digit. It's the reverse of DIV.

It is not any faster than DIV. It involves a lot more CPU cycles than a DIV, to the point that it makes no significant difference once the conversion loops are involved. Those techniques are good (optimized) only for finding the result of the first division. It is much faster than DIV only in that regard. You should time it sometime.
19 Jan 2018, 15:41