flat assembler
Message board for the users of flat assembler.
Tutorials and Examples > BASELIB: General purpose libs for beginners
fasmnewbie 27 Sep 2015, 13:52
JohnFound wrote:
You should have told me about it much earlier, John. I have my sources scattered all over the place on multiple PCs, at work, at home, etc., and at times I forgot which file and/or routine had just been updated. Sometimes I saved the newer routines in the older source file on another PC at work, thinking it was the same file I had updated earlier on my home laptop, and vice versa. It was a total mess!

JohnFound 27 Sep 2015, 16:27
fasmnewbie wrote: "You should have told me about it much earlier, John. I have my sources scattered all over the place on multiple PCs, at work, at home, etc., and at times I forgot which file and/or routine had just been updated. Sometimes I saved the newer routines in the older source file on another PC at work, thinking it was the same file I had updated earlier on my home laptop, and vice versa. It was a total mess!"
Well, actually I have told you several times. For example here and here.

revolution 28 Sep 2015, 04:59
Just a comment about float printing functions. If you use the FPU/SSE to do conversion and scaling etc. then you can lose accuracy. This may or may not be important for a particular application, but if you have a requirement for accurate production of decimal values of binary FP numbers then a multi-precision integer code implementation can provide full accuracy.
Also be mindful of non-number values like NaN and infinity.

fasmnewbie 28 Sep 2015, 05:14
For the sake of completeness, I changed a few lines, but the E-notation engine is still missing.
Code:
;----------------------------
;prnfloat(1)
;Simple print float without E
;----------------------------
;EAX : FP value
;----------------------------
;Ret : -
;Note : No E notation
;     : Deals with normals only
;----------------------------
prnfloat:
        jmp .start
        align 16                ;for EXE only
.save   rb 512                  ;don't turn to object
.strflt rb 16
.start:
        push rax rbx rcx rdx
        push rsi rdi r8
        mov rdi,.save
        fxsave [rdi]
        mov rbx,8               ;Total digits
        cld
        mov rdi,.strflt
        test eax,eax
        jns .nxt
        mov byte[rdi],'-'
        add rdi,1
        btr eax,31
.nxt:
        test eax,eax
        jnz .proceed
        mov eax,'0.0'
        stosd
        sub rdi,1
        jmp .done
.proceed:
        mov edx,eax
        ;shl eax,1
        ;shr eax,24
        ;mov ecx,127
        ;sub eax,ecx            ;exponent
        call sse_round.down
;--------------------------
.get_parts:
        movd xmm0,edx
        movd xmm1,edx
        cvtss2si eax,xmm0
        cvtsi2ss xmm0,eax
        subss xmm1,xmm0
        cvtss2si eax,xmm0       ;integer
        movd ecx,xmm1           ;fraction
;--------------------------
.tmp1:
        mov r8,10
        xor esi,esi
;--------------------------
.int:
        xor rdx,rdx
        div r8
        add dl,30h
        push rdx
        inc esi
        test rax,rax
        jnz .int
.rep1:
        pop rax
        stosb
        sub rbx,1
        jz .err
        dec esi
        jnz .rep1
;--------------------------
.tmp2:
        mov al,'.'
        stosb
        mov edx,1.0
        movd xmm3,edx
        mov edx,10.0
        movd xmm0,ecx
        movd xmm1,edx
        movd xmm2,ecx
;--------------------------
.frac:
        mulss xmm0,xmm1
        mulss xmm2,xmm1
        cvtss2si eax,xmm2
        add al,30h
        stosb
        sub ebx,1
        jz .done
        sub al,30h
        cvtsi2ss xmm2,eax
        subss xmm0,xmm2
        movq xmm2,xmm0
        comiss xmm0,xmm3        ;operands are singles, so compare as singles
        jc .frac
        jmp .done
;--------------------------
.err:                           ;whatever comes through here
        mov al,'#'
        stosb
;--------------------------
.done:
        xor al,al
        stosb
        mov rax,.strflt
        call prtstrz
        mov rdi,.save
        fxrstor [rdi]
        pop r8 rdi rsi
        pop rdx rcx rbx rax
        ret
NOTE: prnfloat is not part of the zipped library. It is just for testing purposes, but you can include it if you want.
Revo, how does scaling result in precision loss? My old prtflt does use scaling, but the final result still complies with IEEE Real4. I did it bit by bit. Can I see your code?
Last edited by fasmnewbie on 28 Sep 2015, 06:08; edited 1 time in total
fasmnewbie 28 Sep 2015, 05:27
A note on all FP routines (thanks to revo): all special values (NaN, infinity, DBZ) are reported with a '#' error marker for the sake of simplicity. This library isn't recommended for production use anyway (as stated in the documentation).
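NaN and infinity can also be recognized from the raw bits alone, before any FPU or SSE instruction touches the value. The following is a minimal sketch, assuming a Linux x86-64 target so that the program is self-contained; `classify_f32` and its return codes are hypothetical and not part of BASELIB. It relies only on the IEEE 754 single-precision layout: an all-ones exponent field means infinity (zero mantissa) or NaN (non-zero mantissa), and an all-zero exponent field means zero or a denormal.

```asm
format ELF64 executable 3
entry start

segment readable executable

; classify_f32 (hypothetical helper, not a BASELIB routine)
; In : EAX = raw bits of a single-precision value
; Out: EAX = 0 zero, 1 denormal, 2 normal, 3 infinity, 4 NaN
; Clobbers EDX and flags
classify_f32:
        mov     edx,eax
        and     edx,007FFFFFh           ; EDX = 23-bit mantissa field
        shr     eax,23
        and     eax,0FFh                ; EAX = 8-bit exponent field (sign dropped)
        jz      .exp_zero               ; exponent 0: zero or denormal
        cmp     eax,0FFh
        je      .exp_ones               ; exponent 255: infinity or NaN
        mov     eax,2                   ; anything else is a normal number
        ret
.exp_zero:
        test    edx,edx
        setnz   al                      ; EAX was 0, so the result is 0 or 1
        ret
.exp_ones:
        mov     eax,3                   ; assume infinity
        test    edx,edx
        jz      .done
        inc     eax                     ; non-zero mantissa: NaN
.done:
        ret

start:
        mov     eax,7FC00000h           ; a quiet NaN (assumed test input)
        call    classify_f32
        mov     edi,eax                 ; exit status = class code
        mov     eax,60                  ; sys_exit
        syscall
```

Assembled with fasm and run on Linux, the exit status (`echo $?`) should be 4 for this input. A print routine could take its '#' path for codes 3 and 4 and handle the rest normally.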
revolution 28 Sep 2015, 05:48
You appear to be using single precision SSE instructions to print single precision numbers in decimal. Accuracy loss will happen for many numbers. For the most part it will only affect the 7th and higher digits but it can also affect the 6th digit for some values. If accuracy is important to an application then knowing the limitations is necessary. Otherwise you could limit the output to 5 digits only, and with correct rounding of the last digit, you could state that it is accurate for all input values within the range of numbers it is designed to handle.
If you needed accurate output for double precision numbers then you would, at a minimum, be required to carefully use extended precision from the FPU. And if you need to print accurate extended precision numbers then you can't use the hardware floating point instructions at all.
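To illustrate the integer-only alternative, here is a minimal sketch that prints the exact decimal expansion of a single-precision value using nothing but integer instructions. It is a simplified stand-in, not anyone's posted code: it assumes Linux x86-64, a positive normal input, and an unbiased exponent between 0 and 22 (that is, 1.0 <= x < 2^23), so every intermediate fits in a 32-bit register and no multi-precision arithmetic is needed. Values outside that range would need wider integers.

```asm
format ELF64 executable 3
entry start

segment readable executable

start:
        cld
        sub     rsp,64                  ; output buffer lives on the stack
        mov     rdi,rsp                 ; RDI = write pointer
        mov     eax,40D00000h           ; raw bits of 6.5 (assumed input)

        mov     edx,eax
        shr     edx,23
        and     edx,0FFh
        sub     edx,127                 ; EDX = unbiased exponent e, assumed 0..22
        mov     ecx,23
        sub     ecx,edx                 ; CL = number of fraction bits (23 - e)
        and     eax,007FFFFFh
        or      eax,00800000h           ; 24-bit significand with the implicit 1
        mov     ebx,eax
        shr     eax,cl                  ; EAX = integer part
        mov     esi,1
        shl     esi,cl
        dec     esi                     ; ESI = mask of the fraction bits
        and     ebx,esi                 ; EBX = fraction numerator over 2^CL

        mov     r8d,10
        xor     r9d,r9d                 ; R9D = digit count
.int_split:
        xor     edx,edx
        div     r8d                     ; EAX = quotient, EDX = next digit
        add     dl,'0'
        push    rdx
        inc     r9d
        test    eax,eax
        jnz     .int_split
.int_emit:
        pop     rax                     ; digits come back most significant first
        stosb
        dec     r9d
        jnz     .int_emit

        mov     al,'.'
        stosb
.frac:
        lea     ebx,[rbx+rbx*4]
        add     ebx,ebx                 ; EBX = EBX * 10, no overflow: EBX < 2^23
        mov     eax,ebx
        shr     eax,cl                  ; bits above the fraction = next digit
        and     ebx,esi                 ; keep the remaining fraction
        add     al,'0'
        stosb
        test    ebx,ebx
        jnz     .frac                   ; stops once the fraction is exhausted

        mov     al,10
        stosb                           ; trailing newline
        mov     rsi,rsp                 ; buffer start
        mov     rdx,rdi
        sub     rdx,rsi                 ; length
        mov     edi,1                   ; stdout
        mov     eax,1                   ; sys_write
        syscall
        xor     edi,edi
        mov     eax,60                  ; sys_exit
        syscall
```

For the input shown this prints 6.5. Swapping in 4101999Ah, the single closest to 8.1, prints 8.1000003814697265625, which is the exact value stored in the float. The digits beyond the 7th show why output from single-precision scaling can drift in the last places.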
fasmnewbie 28 Sep 2015, 06:05
revolution wrote: "You appear to be using single precision SSE instructions to print single precision numbers in decimal. Accuracy loss will happen for many numbers. For the most part it will only affect the 7th and higher digits but it can also affect the 6th digit for some values. If accuracy is important to an application then knowing the limitations is necessary. Otherwise you could limit the output to 5 digits only, and with correct rounding of the last digit, you could state that it is accurate for all input values within the range of numbers it is designed to handle."
Oh, I get what you mean. I did try using extended precision, but its accuracy turned out worse than the double version's. I don't know why. I had it in the zip file as prtdblx (80-bit version), but I had to withdraw it because it offered no accuracy advantage over its double sibling. Maybe I need to re-check it.

revolution 28 Sep 2015, 06:10
You might need to enable the extended precision mode in the control word for the FPU. IIRC the default is to use only 64-bit precision.
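A minimal sketch of that control-word change, assuming a Linux x86-64 target so the program is self-contained. Bits 8-9 of the x87 control word form the precision-control field: 00b selects a 24-bit significand, 10b selects 53 bits, and 11b selects the full 64 bits of extended precision. The program sets the field to 11b, reads the control word back, and returns the field as its exit status.

```asm
format ELF64 executable 3
entry start

segment readable executable

start:
        sub     rsp,16                  ; scratch space for the control word
        fnstcw  word [rsp]              ; store the current FPU control word
        or      word [rsp],0300h        ; PC field (bits 8-9) = 11b
        fldcw   word [rsp]              ; load the modified control word

        fnstcw  word [rsp]              ; read it back to verify
        movzx   edi,word [rsp]
        shr     edi,8
        and     edi,3                   ; EDI = PC field, used as exit status
        add     rsp,16
        mov     eax,60                  ; sys_exit
        syscall
```

After running it, `echo $?` should report 3. Inside a library routine it is probably safer to save the caller's control word first and restore it on exit, since other code may depend on the precision it was started with.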
fasmnewbie 28 Sep 2015, 06:23
revolution wrote: "You might need to enable the extended precision mode in the control word for the FPU. IIRC the default is to use only 64-bit precision."
Oh-eM-G!! I completely missed that one. I'll re-check it tonight and tell you the result. I need a summoning ritual to call prtdblx back from the grave. Thanks.

fasmnewbie 28 Sep 2015, 21:09
This is the completed version of prnfloat that uses SSE instead of the FPU. You can manually include it in the 64-bit source as an alternative to the current FPU-based prtflt routine. It now has the E-notation engine.
Code:
prnfloat:
        jmp .start
.val    dd 0.0
.strflt rb 16
.start:
        push rbp
        mov rbp,rsp
        sub rsp,512
        and rsp,-16
        fxsave [rsp]
        push rax
        push rbx
        push rcx
        push rdx
        push rsi
        push rdi
        push r8
        push r9
        push r10
        xor r10,r10
        xor r9,r9
        mov rbx,7               ;8?
        cld
        mov rdi,.strflt
        mov edx,eax
        mov [.val],eax
        xor eax,eax
        fld dword[.val]
        fxam
        fstsw ax
        and eax,4500h           ;Mask C3, C2 and C0
        cmp eax,500h            ;INF (C2 and C0 set)
        je .err
        cmp eax,100h            ;NaN (C0 only)
        je .err
        cmp eax,0               ;Unsupported
        je .err
        mov eax,edx
        test eax,eax
        jns .nxt
        mov byte[rdi],'-'
        add rdi,1
        btr eax,31
.nxt:
        test eax,eax
        jnz .proceed
        mov eax,'0.0'
        stosd
        sub rdi,1
        jmp .done
.proceed:
        mov edx,eax
        shl eax,1
        shr eax,24
        mov ecx,127
        sub eax,ecx             ;exponent
        test eax,eax
        js .small
        cmp eax,23
        jae .big
;--------------------------
.get_parts:
        call sse_round.down
        movd xmm0,edx
        movd xmm1,edx
        cvtss2si eax,xmm0
        cvtsi2ss xmm0,eax
        subss xmm1,xmm0
        cvtss2si eax,xmm0       ;integer
        movd ecx,xmm1           ;fraction
        jmp .tmp1
;---------------------------
.small:
        mov ecx,edx
        movd xmm0,edx
        mov edx,10.0
        movd xmm1,edx
        mov edx,1.0
        movd xmm2,edx
.again:
        mulss xmm0,xmm1
        inc r9
        comiss xmm0,xmm2        ;operands are singles, so compare as singles
        jc .again
        cmp r9,7
        ja .nxt3
        mov edx,ecx
        jmp .get_parts
.nxt3:
        mov r10,1
        movd edx,xmm0
        jmp .get_parts
;--------------------------
.big:
        movd xmm0,edx
        mov edx,10.0
        movd xmm1,edx
.again1:
        divss xmm0,xmm1
        inc r9
        comiss xmm0,xmm1
        jnc .again1
        movd edx,xmm0
        mov r10,2
        jmp .get_parts
;--------------------------
.tmp1:
        mov r8,10
        xor esi,esi
;--------------------------
.int:
        xor rdx,rdx
        div r8
        add dl,30h
        push rdx
        inc esi
        test rax,rax
        jnz .int
.rep1:
        pop rax
        stosb
        sub rbx,1
        jz .err
        dec esi
        jnz .rep1
;--------------------------
.tmp2:
        test r10,r10
        js .done
        mov al,'.'
        stosb
        mov edx,1.0
        movd xmm3,edx
        mov edx,10.0
        movd xmm0,ecx
        movd xmm1,edx
        movd xmm2,ecx
;--------------------------
.frac:
        mulss xmm0,xmm1
        mulss xmm2,xmm1
        cvtss2si eax,xmm2
        add al,30h
        stosb
        sub ebx,1
        jz .predone
        sub al,30h
        cvtsi2ss xmm2,eax
        subss xmm0,xmm2
        movq xmm2,xmm0
        comiss xmm0,xmm3
        jc .frac
;--------------------------
.predone:
        cmp r10,0
        jle .done
        mov ax,'e-'
        cmp r10,1
        je .nxt4
        cmp r10,2
        jne .done
        mov ax,'e+'
.nxt4:
        stosw
        mov eax,r9d
        mov r10,-1
        jmp .tmp1
;--------------------------
.err:
        mov al,'#'
        stosb
;--------------------------
.done:
        xor al,al
        stosb
        mov rax,.strflt
        call prtstrz
        pop r10
        pop r9
        pop r8
        pop rdi
        pop rsi
        pop rdx
        pop rcx
        pop rbx
        pop rax
        fxrstor [rsp]
        mov rsp,rbp
        pop rbp
        ret

HaHaAnonymous 29 Sep 2015, 02:04
Quote:
Just want to apologize about that. I know I was stupid.

fasmnewbie 04 Oct 2015, 21:41
Since I had more spare time last week, I made lots of changes and modifications to the current library.
Added:
--------
1) pow2 - calculate 2^n (powint does the same but I made this anyway)
2) stackviewf - the ability to view the stack by flat bytes
3) sincos - self-explanatory
4) fpu_precision - set the FPU precision control
5) prtstreg - display a short string off RAX. Handy for short strings.
6) str_trim - trim a 0-ended string

Improved/Edited/Corrected:
---------
1) pow - now can calculate x^n for precision values
2) powint - introduced a skip to base 2 to improve speed. Corrected for signed base
3) dumpreg/d/u - now RBP shows surface / caller's RBP instead.
4) stackview - now with options to select multiple formats
5) prtintd - now with signed/unsigned options
6) prtintw - same as above
7) prtintb - same as above
8) prtxmm - code shortened due to the deletion of prtintdu, prtintwu, prtintbu. Changed the selection sequence too.
9) getflt/getdbl - corrected the size of the buffer to make it more permissive for keyboard kittens to enter very large or small values like 0.00000000000000000000045876
10) corrected a log10 stack balancing typo (add rax,8 instead of add rsp,8)

Deleted:
---------
1) stackviewd - this option is available in stackview
2) stackviewdu - same as above
3) prtintdu - this option is now available in prtintd
4) prtintwu - this option is now available in prtintw
5) prtintbu - this option is now available in prtintb

Update October 6th, 2015
Last edited by fasmnewbie on 05 Oct 2015, 21:05; edited 1 time in total

Picnic 05 Oct 2015, 06:29
fasmnewbie wrote: "I think this library is complete now and has almost every routine required by beginners to start learning assembly programming quite comfortably."
It's hard not to find something useful. I will test the library extensively this week and let you know about my impressions. Thanks for your effort, fasmnewbie.

fasmnewbie 05 Oct 2015, 21:10
Picnic wrote: "It's hard not to find something useful. I will test the library extensively this week and let you know about my impressions."
I don't know why I get this guilty feeling every time a senior wants to check out my code. It feels like I am 10 years old, in the principal's office waiting to be punished for not paying attention in class xD

fasmnewbie 05 Oct 2015, 21:26
I still have no idea how to develop a half-precision routine, since there's no verification tool available to test the accuracy/correctness of the result. If anybody happens to know one, please let me know.

fasmnewbie 07 Oct 2015, 17:32
Did a final touch-up here and there. It tested bug-free, but it has not been fully tested for logic errors.
Not perfect, but it should be enough to get beginners through those difficult times. Happy coding.

fasmnewbie 12 Oct 2015, 16:45
I added 3 more routines (str_wordcnt, dumpseg, ascii), which are not too important but it gives me a sense of completion of this basic library. With these latest additions, a beginner who chooses this utility to accompany him/her learning basic assembly programming with FASM is now armed to the teeth to face most of the learning challenges at the basic level - flat style, no C, no linker, no debugger required.
Enjoy it and happy coding.

idle 13 Oct 2015, 16:21
Hi fasmnewbie!
Just wanted to warn you of the rakes I've met on my way:
- Are the routines limited to a specific SSE version?
- One of your routines may end up included in a write-protected section. Avoid ...
Code:
...
section '' code readable
...
prnfloat:
        jmp .start
.strflt rb 16
...
;... avoid writes to .strflt etc., use the stack or a user-passed buffer
- Debuggers should be your friends.
Great job!
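The "use stack or user-passed buffer" advice could look like the following minimal sketch. It assumes a Linux x86-64 target so the program is self-contained, and `u32_to_dec` is a hypothetical helper, not a BASELIB routine. The code segment is declared readable and executable only, so the routine keeps no variables next to its code: the caller reserves the buffer on its own stack and passes a pointer in RDI.

```asm
format ELF64 executable 3
entry start

segment readable executable

; u32_to_dec (hypothetical helper, not a BASELIB routine)
; In : EAX = unsigned value, RDI = caller-supplied buffer (at least 10 bytes)
; Out: RAX = number of characters written; RCX is clobbered
u32_to_dec:
        push    rbx
        push    rdx
        push    rdi
        mov     ebx,10
        xor     ecx,ecx                 ; ECX = digit count
.split:
        xor     edx,edx
        div     ebx                     ; EAX = quotient, EDX = remainder
        add     dl,'0'
        push    rdx                     ; park digits on the stack (reverse order)
        inc     ecx
        test    eax,eax
        jnz     .split
        mov     eax,ecx                 ; return value = length
.emit:
        pop     rdx
        mov     [rdi],dl                ; write only into the caller's buffer
        inc     rdi
        dec     ecx
        jnz     .emit
        pop     rdi
        pop     rdx
        pop     rbx
        ret

start:
        sub     rsp,16                  ; 16-byte scratch buffer on the stack
        mov     rdi,rsp
        mov     eax,2015
        call    u32_to_dec              ; RAX = 4
        mov     byte [rsp+rax],10       ; append a newline
        lea     edx,[rax+1]             ; EDX = length including the newline
        mov     rsi,rsp
        mov     edi,1                   ; stdout
        mov     eax,1                   ; sys_write
        syscall
        xor     edi,edi
        mov     eax,60                  ; sys_exit
        syscall
```

This prints 2015 followed by a newline. Because nothing is written next to the code, the same routine would keep working if it were placed in a read-only section or called from more than one thread.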
fasmnewbie 14 Oct 2015, 11:13
Hi idle. Thanks for stopping by. I agree with your points except for this one:
idle wrote: "- debuggers should be your friends"
At beginner level, IMHO, a debugger is counter-productive. This library is specifically tailored to reproduce debugger output on the fly. It saves time, the user doesn't have to stop coding just to use the debugger, and a user can choose what information he wants at any point of execution. Let's say a beginner wants to learn about the EXTRACTPS instruction.
Code:
format PE64 console
include 'win64axp.inc'
entry Begin

section '.data' data readable writeable
x dd 5.0,8.0,6.5,-0.6

section '.code' code readable executable
Begin:
        movdqa xmm0,dqword[x]
        extractps eax,xmm0,1    ;learn this
        movd xmm1,eax           ;copy the result to xmm1
        mov eax,8               ;shows xmm dump with singles output
        call dumpxmm            ;use this routine instead of calling debugger
        call exitp
The output:
Code:
XMM0 : -0.6|6.5|8.0|5.0
XMM1 : 0.0|0.0|0.0|8.0 ;visually proven
XMM2 : 0.0|0.0|0.0|0.0
XMM3 : 0.0|0.0|0.0|0.0
XMM4 : 0.0|0.0|0.0|0.0
XMM5 : 0.0|0.0|0.0|0.0
XMM6 : 0.0|0.0|0.0|0.0
XMM7 : 0.0|0.0|0.0|0.0
XMM8 : 0.0|0.0|0.0|0.0
XMM9 : 0.0|0.0|0.0|0.0
XMM10: 0.0|0.0|0.0|0.0
XMM11: 0.0|0.0|0.0|0.0
XMM12: 0.0|0.0|0.0|0.0
XMM13: 0.0|0.0|0.0|0.0
XMM14: 0.0|0.0|0.0|0.0
XMM15: 0.0|0.0|0.0|0.0
Once satisfied, a user can delete the call to dumpxmm or try another imm8 for EXTRACTPS. This way he learns efficiently, avoids bugs line by line, and saves time. Most importantly, it can be done while coding, interactively and visually. These routines are particularly designed to mimic debugger output:
dumpreg/d/u
prtreg/d/u
flags
fpu_stack
fpu_sflag
fpu_cflag
prtxmm
dumpxmm
sse_flags
fpu_reg
stackview
stackviewf
and probably more...
But anyway, it's just a matter of personal convenience.

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.