flat assembler
Message board for the users of flat assembler.
Main > printf
sylware 01 Nov 2023, 18:39
I know I already asked about this, but I'll try again: does anyone know of a sane, well-featured "printf" function written in assembly?
sylware 02 Nov 2023, 10:29
Wow, this looks like great work.
There is a lot to cherry-pick from to compose a well-featured printf function. BTW, "modern" printfs now let conversions, widths and precisions pick their arguments by position with a '$' marker. With the ABI being a mess when mixing floats and integers, and variable arguments (va_list) being shabby, there is heavy work ahead. I guess a compromise would have to be made with the user code calling such a function.
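For readers who have not met the '$' marker: in POSIX-style format strings, `%2$s` means "format argument number 2 as a string", so the format string can consume arguments in any order. The sketch below only illustrates that idea; `format_positional` is a hypothetical helper that handles just `%n$s` and `%n$d`, not widths, precisions or floats.

```python
import re

# Matches only the tiny subset "%<n>$s" and "%<n>$d",
# where <n> is a 1-based argument index.
_POSITIONAL = re.compile(r"%(\d+)\$([sd])")

def format_positional(fmt, *args):
    """Substitute positional conversions in fmt with the selected arguments."""
    def substitute(match):
        index = int(match.group(1)) - 1          # convert to 0-based
        value = args[index]
        if match.group(2) == "d":
            return str(int(value))
        return str(value)
    return _POSITIONAL.sub(substitute, fmt)

print(format_positional("%2$s has %1$d items", 3, "cart"))
```

Because every conversion names its argument, the callee needs random access to the argument list, which is one reason a memory-based argument array is easier to support from assembly than a register-based va_list.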
AsmGuru62 02 Nov 2023, 16:19
Just curious -- why not use a standard?
Code:
; ---------------------------------------------------------------------------
section '.idata' import data readable writeable

    library kernel32,'KERNEL32.DLL',user32,'USER32.DLL',gdi32,'GDI32.DLL',msvcrt,'MSVCRT.DLL'

    include 'API\Kernel32.Inc'
    include 'API\User32.Inc'
    include 'API\Gdi32.Inc'

    import msvcrt, \
        atof,'atof', \
        atoi,'atoi',\
        printf,'printf',\
        wtoi,'_wtoi'

Tested by time, and MSVCRT has been installed on Windows since the time of the Vikings!
revolution 02 Nov 2023, 19:16
The MS version of printf is inefficient, has limited format options, doesn't report when the buffer overflows, does "smart" float rounding (it changes the value!) and can't be used to produce a float that can be reliably reconstructed from the text.
Just because it is there doesn't mean it is suitable. If you need consistent behaviour across all implementations of printf, that isn't possible. You must have your own implementation to ensure it works the same everywhere.
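The "reliably reconstructed from the text" requirement can be illustrated with a small round-trip check. This is a sketch using Python floats (IEEE 754 binary64) rather than any particular C runtime; `round_trips` is a hypothetical helper name. For binary64, 17 significant decimal digits are always enough to recover the exact value, while 15 are not.

```python
def round_trips(x, digits):
    """Format x with the given number of significant digits, parse it back,
    and report whether the original value was recovered exactly."""
    text = "%.*g" % (digits, x)
    return text, float(text) == x

x = 0.1 + 0.2   # not exactly 0.3 in binary64

print(round_trips(x, 15))   # too few digits: parses back to a different float
print(round_trips(x, 17))   # enough digits to identify the value uniquely
```

A printf that silently rounds or truncates digits behaves like the 15-digit case: the text looks nicer, but the number read back is no longer the one that was printed.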
Furs 03 Nov 2023, 17:19
revolution wrote:
The MS version of printf is inefficient
And yes, the C++ standard library definitely has "zero cost" abstractions. Trust the standard library devs, guys! They optimize to the max!
ProMiNick 22 Nov 2023, 15:13
I test printf on
Code: szEnv db "%0.100tf",0 valu dt 1.0E-1 so I get subcall tree something like: Code: PRINTF_decode_table PRINTF_get_float_parameters PRINTF_triple_precision_upscale PRINTF_triple_precision_floor_log10 PRINTF_get_triple_precision_10_power_y PRINTF_check_triple_precision_power_table PRINTF_make_triple_precision_power_table PRINTF_variable_precision_mul PRINTF_variable_precision_mul ; until here no convertion, fp just transfered to TRIPLE_PRECISION form with exponent biased and mantissa shifted to high bits ; other stuff related to how exactly will be outputted is filled ; between PRINTF_make_triple_precision_power_table & PRINTF_variable_precision_mul builded table above PRINTF_generate_f PRINTF_triple_precision_scale PRINTF_get_triple_precision_10_power_y PRINTF_decimalise_unsigned ; after here all must be converted, just printing according to flags PRINTF_elements_print PRINTF_elements_print_character PRINTF_elements_print_string PRINTF_elements_print_decimal_repeated_character PRINTF_elements_print_decimal_string what goes from PRINTF_get_float_parameters: Code: PRINTF_ELEMENTS: .flags = PRINTF_ELEMENTS_FLAG.precision_specified or PRINTF_ELEMENTS_FLAG.size_specified or PRINTF_ELEMENTS_FLAG.zero_pad or PRINTF_ELEMENTS_SIZE.triple .precision = 100 .arg_value = $3FFB:CCCCCCCCCCCCCCCD FDT[4]: .sign_bit_position = 16 ; .exponent_width = 15 ;bit .implicit_bit_flag = 0 .exponent_bias = $3FFE call PRINTF_get_float_parameters shift left .arg_value with .sign_bit_position them add & adc float parts twice(shl 1 analog) with shifting high bit to carry & stroring it to [negative_flag] then shld exponent with .exponent_width, increase it, then shr it back, if nonzero - we get infinity_NaN shift .arg_value with .exponent_width if .arg_value.exponent >0 then jmp .normal if .arg_value.mantissa_low or .arg_value.mantissa_low or .arg_value.mantissa_high = 0 then jmp .zero .normal: if .implicit_bit_flag <> 0 then shrd .arg_value.mantissa with 1 where .mantissa_high forced to shift 
1: stc,rcr ?,1 sub .arg_value.exponent,.exponent_bias if high bit(js) of .arg_value.mantissa_high then jmp .normalized if denormalized shld .arg_value.mantissa and increase .arg_value.exponent on shift bit amount elements.flags set ecx filtered with PRINTF_ELEMENTS_FLAG.negative (ecx=-1 mean negate from prev subcall) elements.significant_bits = 64 and than builded this table Code: .datapf:00404240 stru_404240 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 80000000h, 1>; 0 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0A0000000h, 4>; 1 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0C8000000h, 7>; 2 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0FA000000h, 0Ah>; 3 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 9C400000h, 0Eh>; 4 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0C3500000h, 11h>; 5 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0F4240000h, 14h>; 6 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 98968000h, 18h>; 7 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0BEBC2000h, 1Bh>; 8 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0EE6B2800h, 1Eh>; 9 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 9502F900h, 22h>; 10 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0BA43B740h, 25h>; 11 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 0E8D4A510h, 28h>; 12 .datapf:00404240 TRIPLE_PRECISION < 0, 0, 9184E72Ah, 2Ch>; 13 .datapf:00404240 TRIPLE_PRECISION < 0, 80000000h, 0B5E620F4h, 2Fh>; 14 .datapf:00404240 TRIPLE_PRECISION < 0, 0A0000000h, 0E35FA931h, 32h>; 15 .datapf:00404240 TRIPLE_PRECISION < 0, 4000000h, 8E1BC9BFh, 36h>; 16 .datapf:00404240 TRIPLE_PRECISION <0F0200000h, 2B70B59Dh, 9DC5ADA8h, 6Bh>; 17 .datapf:00404240 TRIPLE_PRECISION <9670B12Bh, 0E4395D6h, 0AF298D05h, 0A0h>; 18 .datapf:00404240 TRIPLE_PRECISION <3CBF6B72h, 0FFCFA6D5h, 0C2781F49h, 0D5h>; 19 .datapf:00404240 TRIPLE_PRECISION <0DC33745Fh, 87DAF7FBh, 0D7E77A8Fh, 10Ah>; 20 .datapf:00404240 TRIPLE_PRECISION <0C5CFE94Fh, 0C59B14A2h, 0EFB3AB16h, 13Fh>; 21 .datapf:00404240 TRIPLE_PRECISION <3E2CF6Ch, 9923329Eh, 850FADC0h, 175h>; 22 .datapf:00404240 TRIPLE_PRECISION 
<0C66F336Ch, 80E98CDFh, 93BA47C9h, 1AAh>; 23 .datapf:00404240 TRIPLE_PRECISION <5F16206Dh, 0A8D3A6E7h, 0A402B9C5h, 1DFh>; 24 .datapf:00404240 TRIPLE_PRECISION <577B986Bh, 7FE617AAh, 0B616A12Bh, 214h>; 25 .datapf:00404240 TRIPLE_PRECISION <7D7B8F75h, 859BBF93h, 0CA28A291h, 249h>; 26 .datapf:00404240 TRIPLE_PRECISION <85BBE254h, 3927556Ah, 0E070F78Dh, 27Eh>; 27 .datapf:00404240 TRIPLE_PRECISION <0A7709A57h, 37826145h, 0F92E0C35h, 2B3h>; 28 .datapf:00404240 TRIPLE_PRECISION <82BD6B71h, 0E33CC92Fh, 8A5296FFh, 2E9h>; 29 .datapf:00404240 TRIPLE_PRECISION <0ACCA6DA2h, 0D6BF1765h, 9991A6F3h, 31Eh>; 30 .datapf:00404240 TRIPLE_PRECISION <0DDBB901Ch, 9DF9DE8Dh, 0AA7EEBFBh, 353h>; 31 .datapf:00404240 TRIPLE_PRECISION <0CC655C55h, 0A60E91C6h, 0E319A0AEh, 6A5h>; 32 .datapf:00404240 TRIPLE_PRECISION <6C8D3FCAh, 0CD00A68Ch, 973F9CA8h, 9F8h>; 33 .datapf:00404240 TRIPLE_PRECISION <650D3D29h, 81750C17h, 0C9767586h, 0D4Ah>; 34 .datapf:00404240 TRIPLE_PRECISION <85BCCD6h, 0EB856ECBh, 862C8C0Eh, 109Dh>; 35 .datapf:00404240 TRIPLE_PRECISION <4257AC3Bh, 3993A7E4h, 0B2B8353Bh, 13EFh>; 36 .datapf:00404240 TRIPLE_PRECISION <2D4070F3h, 924AB88Ch, 0EE0DDD84h, 1741h>; 37 .datapf:00404240 TRIPLE_PRECISION <0A74D28CEh, 0C53D5DE4h, 9E8B3B5Dh, 1A94h>; 38 .datapf:00404240 TRIPLE_PRECISION <3F50C802h, 41F4806Fh, 0D32E2032h, 1DE6h>; 39 .datapf:00404240 TRIPLE_PRECISION <5DFED099h, 20A1F0A6h, 8CA554C0h, 2139h>; 40 .datapf:00404240 TRIPLE_PRECISION <4C808754h, 9BD977CCh, 0BB570A9Ah, 248Bh>; 41 .datapf:00404240 TRIPLE_PRECISION <0FDD08C4Eh, 0D88B5A8Ah, 0F9895D25h, 27DDh>; 42 .datapf:00404240 TRIPLE_PRECISION <50E36602h, 5699FE45h, 0A630EF7Dh, 2B30h>; 43 .datapf:00404240 TRIPLE_PRECISION <95AA118Fh, 0BF27F3F7h, 0DD5DC8A2h, 2E82h>; 44 .datapf:00404240 TRIPLE_PRECISION <8C474BB6h, 7DC64F6Dh, 936E0773h, 31D5h>; 45 .datapf:00404240 TRIPLE_PRECISION <0C94C1540h, 8A20979Ah, 0C4605202h, 3527h>; 46 .datapf:00404240 TRIPLE_PRECISION <0CCCCCCCDh, 0CCCCCCCCh, 0CCCCCCCCh, 0FFFFFFFDh>; 47 .datapf:00404240 
TRIPLE_PRECISION <3D70A3D7h, 70A3D70Ah, 0A3D70A3Dh, 0FFFFFFFAh>; 48 .datapf:00404240 TRIPLE_PRECISION <645A1CACh, 8D4FDF3Bh, 83126E97h, 0FFFFFFF7h>; 49 .datapf:00404240 TRIPLE_PRECISION <0D3C36113h, 0E219652Bh, 0D1B71758h, 0FFFFFFF3h>; 50 .datapf:00404240 TRIPLE_PRECISION <0FCF80DCh, 1B478423h, 0A7C5AC47h, 0FFFFFFF0h>; 51 .datapf:00404240 TRIPLE_PRECISION <0A63F9A4Ah, 0AF6C69B5h, 8637BD05h, 0FFFFFFEDh>; 52 .datapf:00404240 TRIPLE_PRECISION <3D329076h, 0E57A42BCh, 0D6BF94D5h, 0FFFFFFE9h>; 53 .datapf:00404240 TRIPLE_PRECISION <0FDC20D2Bh, 8461CEFCh, 0ABCC7711h, 0FFFFFFE6h>; 54 .datapf:00404240 TRIPLE_PRECISION <31680A89h, 36B4A597h, 89705F41h, 0FFFFFFE3h>; 55 .datapf:00404240 TRIPLE_PRECISION <0B573440Eh, 0BDEDD5BEh, 0DBE6FECEh, 0FFFFFFDFh>; 56 .datapf:00404240 TRIPLE_PRECISION <0F78F69A5h, 0CB24AAFEh, 0AFEBFF0Bh, 0FFFFFFDCh>; 57 .datapf:00404240 TRIPLE_PRECISION <0F93F87B7h, 6F5088CBh, 8CBCCC09h, 0FFFFFFD9h>; 58 .datapf:00404240 TRIPLE_PRECISION <2865A5F2h, 4BB40E13h, 0E12E1342h, 0FFFFFFD5h>; 59 .datapf:00404240 TRIPLE_PRECISION <538484C2h, 95CD80Fh, 0B424DC35h, 0FFFFFFD2h>; 60 .datapf:00404240 TRIPLE_PRECISION <0F9D3701h, 3AB0ACD9h, 901D7CF7h, 0FFFFFFCFh>; 61 .datapf:00404240 TRIPLE_PRECISION <4C2EBE68h, 0C44DE15Bh, 0E69594BEh, 0FFFFFFCBh>; 62 .datapf:00404240 TRIPLE_PRECISION <67DE18EEh, 453994BAh, 0CFB11EADh, 0FFFFFF96h>; 63 .datapf:00404240 TRIPLE_PRECISION <5560C018h, 0B17EC159h, 0BB127C53h, 0FFFFFF61h>; 64 .datapf:00404240 TRIPLE_PRECISION <3F2398D7h, 0A539E9A5h, 0A87FEA27h, 0FFFFFF2Ch>; 65 .datapf:00404240 TRIPLE_PRECISION <0DCCD87A0h, 6B0919A5h, 97C560BAh, 0FFFFFEF7h>; 66 .datapf:00404240 TRIPLE_PRECISION <11DBCB02h, 0FD75539Bh, 88B402F7h, 0FFFFFEC2h>; 67 .datapf:00404240 TRIPLE_PRECISION <4D4617B6h, 0F065D37Dh, 0F64335BCh, 0FFFFFE8Ch>; 68 .datapf:00404240 TRIPLE_PRECISION <0AC7CB3F7h, 64BCE4A0h, 0DDD0467Ch, 0FFFFFE57h>; 69 .datapf:00404240 TRIPLE_PRECISION <0FE64A52Fh, 7C5382C8h, 0C7CABA6Eh, 0FFFFFE22h>; 70 .datapf:00404240 TRIPLE_PRECISION <59ED2167h, 
0DB73A093h, 0B3F4E093h, 0FFFFFDEDh>; 71 .datapf:00404240 TRIPLE_PRECISION <0B8ADA00Eh, 38CB002Fh, 0A21727DBh, 0FFFFFDB8h>; 72 .datapf:00404240 TRIPLE_PRECISION <7B6306A3h, 5423CC06h, 91FF8377h, 0FFFFFD83h>; 73 .datapf:00404240 TRIPLE_PRECISION <4247CB9Eh, 3DA4BC60h, 8380DEA9h, 0FFFFFD4Eh>; 74 .datapf:00404240 TRIPLE_PRECISION <0A4F8BF56h, 4A314EBDh, 0ECE53CECh, 0FFFFFD18h>; 75 .datapf:00404240 TRIPLE_PRECISION <0FB1E4A9Bh, 0CF32E1D6h, 0D5605FCDh, 0FFFFFCE3h>; 76 .datapf:00404240 TRIPLE_PRECISION <0FA911156h, 637A1939h, 0C0314325h, 0FFFFFCAEh>; 77 .datapf:00404240 TRIPLE_PRECISION <7132D333h, 0DB23D21Ch, 9049EE32h, 0FFFFF95Ch>; 78 .datapf:00404240 TRIPLE_PRECISION <5AE1B259h, 505DE96Bh, 0D8A66D4Ah, 0FFFFF609h>; 79 .datapf:00404240 TRIPLE_PRECISION <87A60158h, 0DA57C0BDh, 0A2A682A5h, 0FFFFF2B7h>; 80 .datapf:00404240 TRIPLE_PRECISION <1F4BF665h, 75EDBABEh, 0F4385D09h, 0FFFFEF64h>; 81 .datapf:00404240 TRIPLE_PRECISION <68E1EB75h, 52A711B2h, 0B759449Fh, 0FFFFEC12h>; 82 .datapf:00404240 TRIPLE_PRECISION <6C83AD12h, 0C497B50Eh, 89A63BA4h, 0FFFFE8C0h>; 83 .datapf:00404240 TRIPLE_PRECISION <492512D5h, 34362DE4h, 0CEAE534Fh, 0FFFFE56Dh>; 84 .datapf:00404240 TRIPLE_PRECISION <0E393A9C0h, 28A1638Fh, 9B2A840Fh, 0FFFFE21Bh>; 85 .datapf:00404240 TRIPLE_PRECISION <598EEC7Dh, 0DEC0A404h, 0E8FB7DC2h, 0FFFFDEC8h>; 86 .datapf:00404240 TRIPLE_PRECISION <0E3187C34h, 1228ABCAh, 0AEE97391h, 0FFFFDB76h>; 87 .datapf:00404240 TRIPLE_PRECISION <0E79E236Ch, 91575A87h, 8350BF3Ch, 0FFFFD824h>; 88 .datapf:00404240 TRIPLE_PRECISION <9E98CB98h, 0AEB15D92h, 0C52BA8A6h, 0FFFFD4D1h>; 89 .datapf:00404240 TRIPLE_PRECISION <4B4DE34Eh, 83FD6265h, 9406AF8Fh, 0FFFFD17Fh>; 90 .datapf:00404240 TRIPLE_PRECISION <1463EF49h, 37CAD87Fh, 0DE42FF8Dh, 0FFFFCE2Ch>; 91 .datapf:00404240 TRIPLE_PRECISION <2DE38124h, 0D2CE9FDEh, 0A6DD04C8h, 0FFFFCADAh>; 92 .datapf:00404240 TRIPLE_PRECISION 7 dup(<0>) could anyone describe logic from here - PRINTF_make_triple_precision_power_table - how they build and what for? 
I can tell that this is a table of powers of ten (1, 10, 100, etc. and 0.1, 0.01, 0.001, etc.) in TRIPLE_PRECISION form, but the question remains open: what is it for? Thanks.
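A sketch of how such entries can be derived, under stated assumptions. The dump above is consistent with each entry storing a 96-bit mantissa with its top bit set (low, mid, high dwords) plus a binary exponent, so that the value equals mantissa / 2^96 * 2^exponent. The original builds the table at run time by repeated multiplication (PRINTF_make_triple_precision_power_table calling PRINTF_variable_precision_mul in the call tree), so its last bits may differ from exactly rounded values; the code below swaps in exact integer arithmetic with round-to-nearest instead. `triple_precision` and `as_dwords` are hypothetical names.

```python
def triple_precision(power):
    """Return (mantissa, exponent) with
    10**power ~= mantissa / 2**96 * 2**exponent and bit 95 of mantissa set."""
    if power >= 0:
        value = 10 ** power
        exponent = value.bit_length()
        shift = exponent - 96
        if shift <= 0:
            mantissa = value << -shift                        # exact, pad with zeros
        else:
            mantissa = (value + (1 << (shift - 1))) >> shift  # round to nearest
    else:
        divisor = 10 ** -power
        bits = divisor.bit_length()
        exponent = 1 - bits
        scale = 1 << (95 + bits)
        mantissa = (scale + divisor // 2) // divisor          # round to nearest
    if mantissa >> 96:            # rounding carried out of 96 bits
        mantissa >>= 1
        exponent += 1
    return mantissa, exponent

def as_dwords(power):
    """Split an entry into (low, mid, high, exponent) 32-bit fields."""
    mantissa, exponent = triple_precision(power)
    return (mantissa & 0xFFFFFFFF,
            (mantissa >> 32) & 0xFFFFFFFF,
            mantissa >> 64,
            exponent & 0xFFFFFFFF)

for power in (0, 1, 32, -1):
    print(power, ["%Xh" % field for field in as_dwords(power)])
```

With these assumptions the sketch reproduces entries 0, 1, 17 and 47 of the dump (10^0, 10^1, 10^32 and 10^-1). The index layout (10^0 to 10^16, then steps of 16, then steps of 256, then the same for negative powers) suggests that an arbitrary power of ten is assembled from at most three table factors, four exponent bits at a time, which would match TRIPLE_PRECISION_EXPONENT_TABLE_SCALE = 4 in the source posted later in the thread.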
revolution 22 Nov 2023, 21:53
It is a multiplication table to shift the mantissa into an integer in the range 0 to 2^96-1.
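One way to read this, as a sketch rather than a description of the actual routine: to print a value with p digits after the decimal point, multiply it by 10^p, round the product to an integer, convert that integer to decimal, and place the decimal point p digits from the right. The code below uses exact rational arithmetic on Python floats instead of 96-bit mantissas and table lookups; `format_fixed` is a hypothetical name.

```python
from fractions import Fraction

def format_fixed(x, precision):
    """Format x like "%.<precision>f" by scaling it to an integer first."""
    scaled = Fraction(x) * 10 ** precision       # exact value of x * 10^p
    # round() on a Fraction rounds to the nearest integer, ties to even.
    digits = str(round(abs(scaled))).zfill(precision + 1)
    sign = "-" if scaled < 0 else ""
    if precision == 0:
        return sign + digits
    return sign + digits[:-precision] + "." + digits[-precision:]

print(format_fixed(12.55, 3))
print(format_fixed(0.1, 20))
```

In the assembly version the multiplier 10^p is not computed on the fly but looked up in the power table, and the scaled result has to fit into 96 bits, which limits how many significant decimal digits a single pass can yield.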
ProMiNick 23 Nov 2023, 00:06
I understand how to get the first 17 elements of the TRIPLE_PRECISION multiplier table at compile time:
Code:
macro tpc a{
    local %'
    match base ** exp,a\{
        if exp<17 & exp>=0
            %' = 1
            if exp>0
                repeat exp
                    %'=%'*10
                end repeat
            end if
            b = bsr (%')+1
            c = (%') shl (64-b)
            ;display (b/100) mod 100 +'0',(b/10) mod 10 +'0',b mod 10+'0',13,10
            dd 0, c and $FFFFFFFF,c shr 32,b
        end if
    \}
}

;powers of ten
tpc 10 **$0000 ;1
tpc 10 **$0001 ;10
tpc 10 **$0002 ;100
tpc 10 **$0003 ;...
tpc 10 **$0004
tpc 10 **$0005
tpc 10 **$0006
tpc 10 **$0007
tpc 10 **$0008
tpc 10 **$0009
tpc 10 **$000A
tpc 10 **$000B
tpc 10 **$000C
tpc 10 **$000D
tpc 10 **$000E
tpc 10 **$000F
tpc 10 **$0010
tpc 10 **$0020 ; from here the calculation becomes more complex
tpc 10 **$0030
tpc 10 **$0040
tpc 10 **$0050
tpc 10 **$0060
tpc 10 **$0070
tpc 10 **$0080
tpc 10 **$0090
tpc 10 **$00A0
tpc 10 **$00B0
tpc 10 **$00C0
tpc 10 **$00D0
tpc 10 **$00E0
tpc 10 **$00F0
tpc 10 **$0100
tpc 10 **$0200
tpc 10 **$0300
tpc 10 **$0400
tpc 10 **$0500
tpc 10 **$0600
tpc 10 **$0700
tpc 10 **$0800
tpc 10 **$0900
tpc 10 **$0A00
tpc 10 **$0B00
tpc 10 **$0C00
tpc 10 **$0D00
tpc 10 **$0E00
tpc 10 **$0F00
tpc 10 **$1000
tpc 10 **-$0001
tpc 10 **-$0002
tpc 10 **-$0003
tpc 10 **-$0004
tpc 10 **-$0005
tpc 10 **-$0006
tpc 10 **-$0007
tpc 10 **-$0008
tpc 10 **-$0009
tpc 10 **-$000A
tpc 10 **-$000B
tpc 10 **-$000C
tpc 10 **-$000D
tpc 10 **-$000E
tpc 10 **-$000F
tpc 10 **-$0010
tpc 10 **-$0020
tpc 10 **-$0030
tpc 10 **-$0040
tpc 10 **-$0050
tpc 10 **-$0060
tpc 10 **-$0070
tpc 10 **-$0080
tpc 10 **-$0090
tpc 10 **-$00A0
tpc 10 **-$00B0
tpc 10 **-$00C0
tpc 10 **-$00D0
tpc 10 **-$00E0
tpc 10 **-$00F0
tpc 10 **-$0100
tpc 10 **-$0200
tpc 10 **-$0300
tpc 10 **-$0400
tpc 10 **-$0500
tpc 10 **-$0600
tpc 10 **-$0700
tpc 10 **-$0800
tpc 10 **-$0900
tpc 10 **-$0A00
tpc 10 **-$0B00
tpc 10 **-$0C00
tpc 10 **-$0D00
tpc 10 **-$0E00
tpc 10 **-$0F00
tpc 10 **-$1000
dd 0,0,0,0

After the seventeenth element ("tpc 10 **16") the calculation formula is:
Code:
;hi1:mid1:lo1 * hi2:mid2:lo2 =
;    hi1*hi2 + adc
;  : hi1*mid2 + mid1*hi2 + adc
;  : hi1*lo2 + mid1*mid2 + lo1*hi2 + adc
;  : high_bit_or_couple_for_rounding(mid1*lo2 + lo1*mid2)
;  : ignore(lo1*lo2)

revolution, could you give an example of how some float is multiplied by a value from that table so that it becomes an integer?
ProMiNick 23 Nov 2023, 07:15
By the way, there is a mistake in the rounding algorithm:
Code:
.round_and_copy_result:
        add     ebx,sizeof.TRIPLE_PRECISION
        bt      [next_value.exponent-4*4],31
        mov     eax,[next_value.exponent-4*3]
        mov     edx,[next_value.exponent-4*2]
        mov     ecx,[next_value.exponent-4*1]
        adc     eax,0
        adc     edx,0
        adc     ecx,0
        mov     [ebx+TRIPLE_PRECISION.mantissa_low],eax
        mov     eax,[next_value.exponent-4*0]
        mov     [ebx+TRIPLE_PRECISION.mantissa_mid],edx
        mov     [ebx+TRIPLE_PRECISION.mantissa_high],ecx
        mov     [ebx+TRIPLE_PRECISION.exponent],eax
        retn

According to the IEEE standard it must be:
Code:
.round_and_copy_result:
        add     ebx,sizeof.TRIPLE_PRECISION
        bt      [next_value.exponent-4*4],31
        jnc     .skip_accuracy_test
        bt      [next_value.exponent-4*4],30
    .skip_accuracy_test:
        mov     eax,[next_value.exponent-4*3]
        mov     edx,[next_value.exponent-4*2]
        mov     ecx,[next_value.exponent-4*1]
        adc     eax,0
        adc     edx,0
        adc     ecx,0
        mov     [ebx+TRIPLE_PRECISION.mantissa_low],eax
        mov     eax,[next_value.exponent-4*0]
        mov     [ebx+TRIPLE_PRECISION.mantissa_mid],edx
        mov     [ebx+TRIPLE_PRECISION.mantissa_high],ecx
        mov     [ebx+TRIPLE_PRECISION.exponent],eax
        retn

Without it, the float 0.1 is rounded inaccurately: the last significant digit is greater by 1, and the same happens for many other numbers.
The more I explore this code, the more I like it.
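To make the difference between the two listings concrete, here is a sketch that models them on a kept mantissa plus one discarded 32-bit word. It rests on a reading of the posted code: BT copies the tested bit into the carry flag, MOV leaves flags alone, and ADC adds the carry. The function names are hypothetical. For comparison, the third function implements round-to-nearest with ties to even, which is the default rounding mode in IEEE 754.

```python
HALF = 1 << 31   # value of bit 31 in the discarded dword (exactly half a unit)

def round_bit31(kept, discarded):
    """First listing: carry = bit 31 of the discarded dword (round half up)."""
    return kept + (discarded >> 31)

def round_bits31_and_30(kept, discarded):
    """Second listing: carry = bit 30, tested only when bit 31 is set,
    so the mantissa is incremented only when both bits are 1."""
    return kept + (1 if (discarded >> 30) == 3 else 0)

def round_half_even(kept, discarded):
    """Round to nearest, ties to even."""
    if discarded > HALF or (discarded == HALF and kept & 1):
        return kept + 1
    return kept

for discarded in (0x40000000, 0x80000000, 0xA0000000, 0xC0000000):
    print("%08X" % discarded,
          round_bit31(2, discarded),
          round_bits31_and_30(2, discarded),
          round_half_even(2, discarded))
```

Under this reading, the second listing rounds up only when the discarded part is at least three quarters of a unit, so it differs from both of the other rules for values such as 0xA0000000. Which rule keeps the number -> text -> number round trip exact has to be checked against the whole conversion, not just this step.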
revolution 23 Nov 2023, 09:23
If you change the rounding then is it still possible to complete a full round trip num->text->num and always get back the exact same value?
I validated the code as given to comply with that requirement, so changing the internal operations would need revalidating.
Roman 23 Nov 2023, 09:24
ProMiNick, how do I use your code?
Say eax = 12.55. Please show an example.
ProMiNick 23 Nov 2023, 11:32
Roman, my code is that part:
Code: format PE GUI 4.0 entry start include 'win32a.inc' section '.text' code readable executable start: RTL_C ; cut off RTL_C for official fasm stdcall PRINTF,INVALID_HANDLE_VALUE,buffer,-1,szEnv,valu ;cinvoke sprintf,buffer,szEnv,[valu],[valu+4] invoke MessageBox,0,buffer,esp,0 invoke ExitProcess,0 flush_locals ; cut off flush_locals for official fasm section '.data' data readable writeable szEnv db "%0.20tf",0 ;db "%0.20f",0 valu dt 12.55 ;dq 12.55 buffer db "1 ",0 section '.idata' import data readable writeable library kernel32,'KERNEL32.DLL',\ user32,'USER32.DLL';,\ ;msvcrt,'msvcrt.dll' include 'api\kernel32.inc' include 'api\user32.inc' ;import msvcrt,\ ;sprintf,'sprintf' all the rest I reduced to float conversion needs only (but there is more to cut off) Code: DEFAULT_FLOAT_PRECISION = 6 DEFAULT_INTEGER_PRECISION = 1 DEFAULT_CHARACTER_PRECISION = 1 DEFAULT_STRING_PRECISION = -1 ;print all characters DEFAULT_FLOAT_SIZE = PRINTF_ELEMENTS_SIZE.dword DEFAULT_INTEGER_SIZE = PRINTF_ELEMENTS_SIZE.word ;word for 32-bit process MAXIMUM_CONVERSION_LENGTH = 96 ;enough space for triple sized binary at 96 bits MAXIMUM_EXPONENT_LENGTH = 8 ;a and A formats produce the longest ('p+16383') at 7 characters PRINTF_BUFFER_SIZE = 1 shl 12 PRINTF_CLASS_ZERO = 0 PRINTF_CLASS_NORMAL = 1 PRINTF_CLASS_INFINITY = 2 PRINTF_CLASS_SNAN = 3 PRINTF_CLASS_QNAN = 4 EXPONENT_LENGTH_FORMAT_A = 5 EXPONENT_LENGTH_FORMAT_E = 3 EXPONENT_LENGTH_FORMAT_G = 1 TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT = 12 ;10^4096 enough for 80-bit extended reals up to 10^+-4933 TRIPLE_PRECISION_EXPONENT_TABLE_SCALE = 4 ;exponent reduction in bits per pass. supports using 1, 2, 3, 4, 6 or 12 only struct TRIPLE_PRECISION mantissa_low dd ? mantissa_mid dd ? mantissa_high dd ? exponent dd ? ends sizeof.TRIPLE_PRECISION.mantissa = 32*3 struct FLOAT_DESCRIPTION sign_bit_position db ? ;number of leading bits to skip to find the sign bit exponent_width db ? implicit_bit_flag db ? db ? ;align exponent_bias dd ? 
ends struct PRINTF_OUTPUT buffer dd ? length dd ? handle dd ? sent dd ? error dd ? ends ;elements of a number ; [leading spaces] [sign] [prefix] [padding zeros] [precision zeros] [number] [magnitude zeros] [exponent] [trailing spaces] ; with a decimal point placed somewhere within [precision zeros], [number] or [magnitude zeros] struct PRINTF_ELEMENTS flags dd ? ;as specified in the format string width dd ? ;as specified in the format string precision dd ? ;as specified in the format string sign_length dd ? ;0 or 1 prefix_length dd ? ;0 to 2 precision_zeros dd ? ;0 to many number_length dd ? ;0 to MAXIMUM_CONVERSION_LENGTH magnitude_zeros dd ? ;0 to many exponent_length dd ? ;0 to MAXIMUM_EXPONENT_LENGTH decimal_point dd ? ;0 to many. 0 means no decimal point, 1 means after the first digit significant_bits dd ? ;0 to 64. count of mantissa bits in a float written dd ? ;running count of total characters in output separator_modulus dd ? ;0, 3 or 4. 0 means no separators arg_value TRIPLE_PRECISION prefix_string rb 2 ;'0' or '0x' sign_character rb 1 ;'-', '+' or ' ' separator_character rb 1 ;',' or ' ' number_string rb MAXIMUM_CONVERSION_LENGTH exponent_string rb MAXIMUM_EXPONENT_LENGTH ends PRINTF_ELEMENTS_FLAG.argument_size_mask = 7 shl 0 ;byte, hword, word, dword, triple PRINTF_ELEMENTS_FLAG.left_justify = 1 shl 3 ;justify left PRINTF_ELEMENTS_FLAG.show_sign = 1 shl 4 ;prefix option: show + or - sign PRINTF_ELEMENTS_FLAG.blank_sign = 1 shl 5 ;prefix option: show <space> or - sign PRINTF_ELEMENTS_FLAG.hash_option = 1 shl 6 ;prefix option: print 0x (hex), 0 (octal) or decimal point (float) PRINTF_ELEMENTS_FLAG.zero_pad = 1 shl 7 ;pad left with zeros PRINTF_ELEMENTS_FLAG.uppercase = 1 shl 8 ;push to upper case PRINTF_ELEMENTS_FLAG.pointer = 1 shl 9 ;pointer to the argument, not the argument itself PRINTF_ELEMENTS_FLAG.separator = 1 shl 10 ;print separator characters (space or comma) within the number PRINTF_ELEMENTS_FLAG.size_specified = 1 shl 11 ;don't use default 
argument size PRINTF_ELEMENTS_FLAG.precision_specified= 1 shl 12 ;don't use default precision PRINTF_ELEMENTS_FLAG.negative = 1 shl 13 ;set if arg_value is negative PRINTF_ELEMENTS_FLAG.zero = 1 shl 14 ;set if arg_value is zero PRINTF_ELEMENTS_SIZE.byte = 0 ;8-bit PRINTF_ELEMENTS_SIZE.hword = 1 ;16-bit PRINTF_ELEMENTS_SIZE.word = 2 ;32-bit PRINTF_ELEMENTS_SIZE.dword = 3 ;64-bit PRINTF_ELEMENTS_SIZE.triple = 4 ;96/80-bit PRINTF_ELEMENTS_SIZE.pointer = -2 PRINTF_ELEMENTS_SIZE.null = -1 struct PRINTF_FLAG_TABLE char db ? flag dd ? ends struct PRINTF_DECODE_TABLE function dd ? default_size dd ? ends section '.datapf' data readable writeable if used PRINTF_flag_table PRINTF_flag_table: PRINTF_FLAG_TABLE '-',PRINTF_ELEMENTS_FLAG.left_justify PRINTF_FLAG_TABLE '+',PRINTF_ELEMENTS_FLAG.show_sign PRINTF_FLAG_TABLE ' ',PRINTF_ELEMENTS_FLAG.blank_sign PRINTF_FLAG_TABLE '#',PRINTF_ELEMENTS_FLAG.hash_option PRINTF_FLAG_TABLE '0',PRINTF_ELEMENTS_FLAG.zero_pad PRINTF_FLAG_TABLE '^',PRINTF_ELEMENTS_FLAG.pointer PRINTF_FLAG_TABLE ',',PRINTF_ELEMENTS_FLAG.separator PRINTF_FLAG_TABLE 0 end if if used PRINTF_size_table PRINTF_size_table: PRINTF_FLAG_TABLE 'b',PRINTF_ELEMENTS_SIZE.byte + PRINTF_ELEMENTS_FLAG.size_specified PRINTF_FLAG_TABLE 'h',PRINTF_ELEMENTS_SIZE.hword + PRINTF_ELEMENTS_FLAG.size_specified PRINTF_FLAG_TABLE 'w',PRINTF_ELEMENTS_SIZE.word + PRINTF_ELEMENTS_FLAG.size_specified PRINTF_FLAG_TABLE 'd',PRINTF_ELEMENTS_SIZE.dword + PRINTF_ELEMENTS_FLAG.size_specified PRINTF_FLAG_TABLE 't',PRINTF_ELEMENTS_SIZE.triple + PRINTF_ELEMENTS_FLAG.size_specified PRINTF_FLAG_TABLE 0 end if if used PRINTF_decode_table align 4 PRINTF_decode_table: PRINTF_DECODE_TABLE PRINTF_decode_a,DEFAULT_FLOAT_SIZE PRINTF_DECODE_TABLE PRINTF_decode_b,DEFAULT_INTEGER_SIZE PRINTF_DECODE_TABLE PRINTF_decode_c,PRINTF_ELEMENTS_SIZE.byte PRINTF_DECODE_TABLE PRINTF_decode_d,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_e,DEFAULT_FLOAT_SIZE PRINTF_DECODE_TABLE PRINTF_decode_f,DEFAULT_FLOAT_SIZE 
PRINTF_DECODE_TABLE PRINTF_decode_g,DEFAULT_FLOAT_SIZE PRINTF_DECODE_TABLE PRINTF_decode_h,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_i,DEFAULT_INTEGER_SIZE PRINTF_DECODE_TABLE PRINTF_decode_j,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_k,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_l,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_m,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_n,DEFAULT_INTEGER_SIZE PRINTF_DECODE_TABLE PRINTF_decode_o,DEFAULT_INTEGER_SIZE PRINTF_DECODE_TABLE PRINTF_decode_p,PRINTF_ELEMENTS_SIZE.pointer PRINTF_DECODE_TABLE PRINTF_decode_q,DEFAULT_INTEGER_SIZE PRINTF_DECODE_TABLE PRINTF_decode_r,DEFAULT_FLOAT_SIZE PRINTF_DECODE_TABLE PRINTF_decode_s,DEFAULT_INTEGER_SIZE PRINTF_DECODE_TABLE PRINTF_decode_t,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_u,DEFAULT_INTEGER_SIZE PRINTF_DECODE_TABLE PRINTF_decode_v,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_w,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_x,DEFAULT_INTEGER_SIZE PRINTF_DECODE_TABLE PRINTF_decode_y,PRINTF_ELEMENTS_SIZE.null PRINTF_DECODE_TABLE PRINTF_decode_z,PRINTF_ELEMENTS_SIZE.null end if if used float_description_table align 8 float_description_table: FLOAT_DESCRIPTION sizeof.TRIPLE_PRECISION.mantissa- 8, 4,-1,,1 shl 3 - 2 ;1. 4. 3 format FLOAT_DESCRIPTION sizeof.TRIPLE_PRECISION.mantissa-16, 5,-1,,1 shl 4 - 2 ;1. 5.10 format FLOAT_DESCRIPTION sizeof.TRIPLE_PRECISION.mantissa-32, 8,-1,,1 shl 7 - 2 ;1. 
8.23 format FLOAT_DESCRIPTION sizeof.TRIPLE_PRECISION.mantissa-64,11,-1,,1 shl 10 - 2 ;1.11.52 format FLOAT_DESCRIPTION sizeof.TRIPLE_PRECISION.mantissa-80,15, 0,,1 shl 14 - 2 ;1.15.63 format + explicit mantissa MSb end if if used PRINTF_integer_base_10_multiples_32 align 4 PRINTF_integer_base_10_multiples_32: dd 1-1 dd 10-1 dd 100-1 dd 1000-1 dd 10000-1 dd 100000-1 dd 1000000-1 dd 10000000-1 dd 100000000-1 dd 1000000000-1 dd -1 PRINTF_integer_base_10_multiples_64: dq 10000000000-1 dq 100000000000-1 dq 1000000000000-1 dq 10000000000000-1 dq 100000000000000-1 dq 1000000000000000-1 dq 10000000000000000-1 dq 100000000000000000-1 dq 1000000000000000000-1 dq 10000000000000000000-1 dq -1 PRINTF_integer_base_10_multiples_96: .l = 10000000000000000000 and (1 shl 32 - 1) .h = 10000000000000000000 shr 32 dd (10 * .l) and (1 shl 32 - 1)-1, (10 * .l) shr 32 + (10 * .h) and (1 shl 32 - 1), (10 * .h) shr 32 dd (100 * .l) and (1 shl 32 - 1)-1, (100 * .l) shr 32 + (100 * .h) and (1 shl 32 - 1), (100 * .h) shr 32 dd (1000 * .l) and (1 shl 32 - 1)-1, (1000 * .l) shr 32 + (1000 * .h) and (1 shl 32 - 1), (1000 * .h) shr 32 dd (10000 * .l) and (1 shl 32 - 1)-1, (10000 * .l) shr 32 + (10000 * .h) and (1 shl 32 - 1), (10000 * .h) shr 32 dd (100000 * .l) and (1 shl 32 - 1)-1, (100000 * .l) shr 32 + (100000 * .h) and (1 shl 32 - 1), (100000 * .h) shr 32 dd (1000000 * .l) and (1 shl 32 - 1)-1, (1000000 * .l) shr 32 + (1000000 * .h) and (1 shl 32 - 1), (1000000 * .h) shr 32 dd (10000000 * .l) and (1 shl 32 - 1)-1, (10000000 * .l) shr 32 + (10000000 * .h) and (1 shl 32 - 1), (10000000 * .h) shr 32 dd (100000000 * .l) and (1 shl 32 - 1)-1, (100000000 * .l) shr 32 + (100000000 * .h) and (1 shl 32 - 1), (100000000 * .h) shr 32 dd (1000000000 * .l) and (1 shl 32 - 1)-1, (1000000000 * .l) shr 32 + (1000000000 * .h) and (1 shl 32 - 1), (1000000000 * .h) shr 32 dd -1,-1,-1 end if if used PRINTF_integer_reciprocals align 4 PRINTF_integer_reciprocals: dd (1 shl 24 + 0)/1 ;ceiling(2^24/1) binary dd (1 
shl 24 + 1)/2 ;ceiling(2^24/2) quaternary dd (1 shl 24 + 2)/3 ;ceiling(2^24/3) octal dd (1 shl 24 + 3)/4 ;ceiling(2^24/4) hexadecimal dd (1 shl 24 + 4)/5 ;ceiling(2^24/5) base-32 end if if used PRINTF_triple_precision_power_table align 16 PRINTF_triple_precision_power_table rb TRIPLE_PRECISION_EXPONENT_TABLE_SIZE*sizeof.TRIPLE_PRECISION end if section '.codepf' code executable PRINTF_PROC_FLAG_RESTORE_ESP = 1 shl 32 PRINTF_PROC_FLAG_RESTORE_EBP = 1 shl 33 macro proc_leaf [args] { common prologue@proc equ PRINTF_prologue_leaf proc args restore prologue@proc } macro PRINTF_prologue_leaf procname,flag,parambytes,localbytes,reglist { PRINTF_prologue procname,flag,parambytes,localbytes,reglist,1 } macro PRINTF_prologue procname,flag,parambytes,localbytes,reglist,leaf { local varsize,regsize varsize = (localbytes + 3) and (not 3) match x,leaf \{ localbase@proc equ esp-varsize parmbase@proc equ esp+4+regsize rept 0 \{ \} rept 1 \{ localbase@proc equ ebp-varsize parmbase@proc equ ebp+4+regsize \} regsize = 0 irps reg,reglist \{ push reg regsize = regsize + 4 \} if (parambytes | localbytes) & ~ leaf+0 regsize = regsize + 4 flag = flag or PRINTF_PROC_FLAG_RESTORE_EBP push ebp mov ebp,esp if localbytes flag = flag or PRINTF_PROC_FLAG_RESTORE_ESP add esp,-varsize end if end if } macro PRINTF_epilogue procname,flag,parambytes,localbytes,reglist { if flag and PRINTF_PROC_FLAG_RESTORE_ESP sub esp,-((localbytes + 3) and (not 3)) end if if flag and PRINTF_PROC_FLAG_RESTORE_EBP pop ebp end if irps reg,reglist \{ reverse pop reg \} if flag and 10000b retn ;c call else retn parambytes ;standard call end if } prologue@proc equ PRINTF_prologue epilogue@proc equ PRINTF_epilogue proc PRINTF uses ebx esi edi,handle,dest,size,format,arglist locals output PRINTF_OUTPUT elements PRINTF_ELEMENTS endl mov edx,[dest] mov ecx,[size] mov eax,[handle] mov esi,[format] xor ebx,ebx mov [output.buffer],edx mov [output.length],ecx mov [output.handle],eax mov [output.sent],ebx mov [output.error],ebx 
.next:
    xor eax,eax
    lea edi,[elements]
    mov ecx,sizeof.PRINTF_ELEMENTS/4
    rep stosd
.next_char:
    movzx eax,byte[esi]
    inc esi
    test eax,eax
    jz .done
    cmp al,'%'
    jz .process_flags
.print_char:
    stdcall PRINTF_elements_print_character,addr elements,eax,addr output
    jmp .next_char
.process_flags:
    movzx eax,byte[esi]
    inc esi
    test eax,eax
    jz .done
.next_flag:
    mov ebx,PRINTF_flag_table
    call .convert_table
    test eax,eax
    jz .done
    cmp [ebx+PRINTF_FLAG_TABLE.char],0
    jnz .next_flag
.process_width:
    cmp al,'*'
    jz .width_from_argument
    cmp al,'1'
    jb .process_precision
    cmp al,'9'
    ja .process_precision
.width_from_argument:
    call .convert_decimal
    mov [elements.width],edi
    test eax,eax
    jz .done
.process_precision:
    cmp al,'.'
    jnz .process_size
    movzx eax,byte[esi]
    inc esi
    test eax,eax
    jz .done
    xor edi,edi
    cmp al,'*'
    jz .precision_from_argument
    cmp al,'0'
    jb .set_precision
    cmp al,'9'
    ja .set_precision
.precision_from_argument:
    call .convert_decimal
.set_precision:
    mov [elements.precision],edi
    or [elements.flags],PRINTF_ELEMENTS_FLAG.precision_specified
    test eax,eax
    jz .done
.process_size:
    mov ebx,PRINTF_size_table
    call .convert_table
    test eax,eax
    jz .done
.process_decode:
    movzx ebx,al
    cmp bl,'A'
    jbe .check_type
    cmp bl,'Z'
    ja .check_type
    add bl,'a'-'A'
    or [elements.flags],PRINTF_ELEMENTS_FLAG.uppercase
.check_type:
    cmp bl,'a'
    jb .print_char
    cmp bl,'z'
    ja .print_char
    ;compute current output length
    mov edi,[output.buffer]
    sub edi,[dest]
    mov [elements.written],edi
    ;get the argument value
    stdcall PRINTF_read_arg,addr elements.arg_value,[arglist],\
        [PRINTF_decode_table+(ebx-'a')*sizeof.PRINTF_DECODE_TABLE+PRINTF_DECODE_TABLE.default_size],\
        [elements.flags]
    or [elements.flags],edx ;set the size
    mov [arglist],ecx
    ;call the decoder
    mov eax,ebx ;we pass the character type to the function in eax (for printing the unknown type)
    stdcall [PRINTF_decode_table+(ebx-'a')*sizeof.PRINTF_DECODE_TABLE+PRINTF_DECODE_TABLE.function],addr elements,addr output
    jmp .next
.done:
    cmp [output.handle],INVALID_HANDLE_VALUE
    jne .flush
    stdcall PRINTF_elements_print_character,addr elements,0,addr output
    mov edx,[output.buffer]
    cmp [output.length],0
    jnz .return_no_overflow
    mov eax,edx
    sub eax,[dest]
    cmp eax,[size]
    jz .return_no_overflow
    sub eax,[size]
    neg eax ;return negative the number of extra bytes required
    cmp [size],0
    jz .ret
    mov byte[edx+eax-1],0 ;always return a null terminated string
.ret:
    ret
.return_no_overflow:
    lea eax,[edx-1] ;don't include the terminating null
    sub eax,[dest] ;return number of characters stored
    ret
.flush:
    ;stdcall PRINTF_flush_buffer,addr output
    mov eax,[output.sent] ;return number of characters output
    mov ecx,[output.error] ;return non-zero if there was an error
    ret
.convert_table:
    cmp [ebx+PRINTF_FLAG_TABLE.char],0
    jz .table_done
    cmp [ebx+PRINTF_FLAG_TABLE.char],al
    jz .set_flag
    add ebx,sizeof.PRINTF_FLAG_TABLE
    jmp .convert_table
.set_flag:
    mov eax,[ebx+PRINTF_FLAG_TABLE.flag]
    or [elements.flags],eax
    movzx eax,byte[esi]
    inc esi
.table_done:
    retn
.convert_decimal:
    cmp al,'*'
    jz .decimal_from_argument
    lea edi,[eax-'0']
.decimal_next:
    movzx eax,byte[esi]
    inc esi
    test eax,eax
    jz .decimal_last
    sub eax,'0'
    cmp eax,9
    ja .decimal_done
    lea edi,[edi*5]
    lea edi,[edi*2+eax]
    jmp .decimal_next
.decimal_from_argument:
    mov eax,[arglist]
    mov edi,[eax]
    add eax,4
    mov [arglist],eax
    movzx eax,byte[esi]
    inc esi
    jmp .decimal_last
.decimal_done:
    add eax,'0'
.decimal_last:
    retn
endp

proc_leaf PRINTF_read_arg uses edi esi ebx,dest,arg_pointer,default_size,flags
    ;return ecx=new arg pointer, edx=argument size
    mov edx,[default_size]
    mov ecx,[arg_pointer]
    cmp edx,PRINTF_ELEMENTS_SIZE.null
    jz .null
    mov ebx,[flags]
    mov edi,[dest]
    mov esi,ecx
    add ecx,1 shl DEFAULT_INTEGER_SIZE
    test ebx,PRINTF_ELEMENTS_FLAG.pointer
    jz .address_known
    mov esi,[esi]
.address_known:
    cmp edx,PRINTF_ELEMENTS_SIZE.pointer
    jz .pointer
    test ebx,PRINTF_ELEMENTS_FLAG.size_specified
    jz .size_known
    mov edx,ebx
    and edx,PRINTF_ELEMENTS_FLAG.argument_size_mask
.size_known:
    push edx
    cmp edx,PRINTF_ELEMENTS_SIZE.byte
    jz .read_byte
    cmp edx,PRINTF_ELEMENTS_SIZE.hword
    jz .read_hword
    cmp edx,PRINTF_ELEMENTS_SIZE.word
    jz .read_word
    cmp edx,PRINTF_ELEMENTS_SIZE.dword
    jz .read_dword
    cmp edx,PRINTF_ELEMENTS_SIZE.triple
    jz .read_triple
    int3
.read_byte:
    movzx eax,byte[esi]
    jmp .store_word
.read_hword:
    movzx eax,word[esi]
    jmp .store_word
.read_word:
    mov eax,[esi]
    jmp .store_word
.read_dword:
    mov eax,[esi]
    mov edx,[esi+4]
    jmp .store_dword
.read_triple:
    mov eax,[esi]
    mov edx,[esi+4]
    mov ebx,[esi+8]
    mov [edi+8],ebx
    add esi,4
.store_dword:
    mov [edi+4],edx
    add esi,4
.store_word:
    mov [edi+0],eax
    add esi,4
    pop edx
    test [flags],PRINTF_ELEMENTS_FLAG.pointer
    jnz .done
    mov ecx,esi
.done:
    ret
.pointer:
    mov [edi+0],esi
.null:
    xor edx,edx ;return no size value
    ret
endp

PRINTF_decode_a = PRINTF_decode_unknown
PRINTF_decode_b = PRINTF_decode_unknown
PRINTF_decode_c = PRINTF_decode_unknown
PRINTF_decode_d = PRINTF_decode_unknown
PRINTF_decode_e = PRINTF_decode_unknown

proc PRINTF_decode_f uses ebx,elements,output
    ;output a decimal floating point number "-dddd.dddddd"
    ;digits are in decimal
    ;the precision field specifies the number of digits after the decimal point
    mov ebx,[elements]
    stdcall PRINTF_get_float_parameters,ebx,-1
    test edx,edx
    jnz .print
    stdcall PRINTF_generate_f,ebx,addr ebx+PRINTF_ELEMENTS.arg_value,eax
.print:
    stdcall PRINTF_elements_print,ebx,[output]
    ret
endp

PRINTF_decode_g = PRINTF_decode_unknown
PRINTF_decode_h = PRINTF_decode_unknown
PRINTF_decode_i = PRINTF_decode_unknown
PRINTF_decode_j = PRINTF_decode_unknown
PRINTF_decode_k = PRINTF_decode_unknown
PRINTF_decode_l = PRINTF_decode_unknown
PRINTF_decode_m = PRINTF_decode_unknown
PRINTF_decode_n = PRINTF_decode_unknown
PRINTF_decode_o = PRINTF_decode_unknown
PRINTF_decode_p = PRINTF_decode_unknown
PRINTF_decode_q = PRINTF_decode_unknown
PRINTF_decode_r = PRINTF_decode_unknown
PRINTF_decode_s = PRINTF_decode_unknown
PRINTF_decode_t = PRINTF_decode_unknown
PRINTF_decode_u = PRINTF_decode_unknown
PRINTF_decode_v = PRINTF_decode_unknown
PRINTF_decode_w = PRINTF_decode_unknown
PRINTF_decode_x = PRINTF_decode_unknown
PRINTF_decode_y = PRINTF_decode_unknown
PRINTF_decode_z = PRINTF_decode_unknown

proc PRINTF_decode_unknown elements,output
    ;the unknown character type is in eax
    ;output the type character
    ;this also allows to output a single % symbol by placing a pair (%%) in the format string
    stdcall PRINTF_elements_print_character,[elements],eax,[output]
    ret
endp

macro PRINTF_ceiling_log2_10 reg,bits,offset {
    ;computes: bits * log10(2) + roundup + offset
    ;result is valid for input values up to 2620 bits
    imul reg,bits,631306 ;ceiling(2^21 * log10(2))
    add reg,(offset+1) shl 21 - 1 ;round up and add the offset
    shr reg,21
}

proc PRINTF_generate_f uses ebx,elements,value,log10
    ;return eax=length of printed number
    locals
        expected_size dd ?
        scaled_value TRIPLE_PRECISION
        significant_decimal_digits_m1 dd ?
    endl
    mov ebx,[elements]
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
    jz .separator_okay
    mov [ebx+PRINTF_ELEMENTS.separator_character],','
    mov [ebx+PRINTF_ELEMENTS.separator_modulus],3
.separator_okay:
    PRINTF_ceiling_log2_10 eax,[ebx+PRINTF_ELEMENTS.significant_bits],1 ;+1 for last digit distinction
    dec eax
    mov [significant_decimal_digits_m1],eax
    mov eax,[log10]
    mov edx,[ebx+PRINTF_ELEMENTS.precision]
    ;eax=log10
    ;edx=specified precision
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero
    jz .non_zero
    mov eax,1 shl 31 ;the log of zero is extremely tiny
.non_zero:
    test eax,eax
    jns .positive_log
    mov ecx,eax
    neg ecx
    cmp ecx,edx
    jbe .store_precision_zeros
    mov ecx,edx
.store_precision_zeros:
    mov [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
    mov [ebx+PRINTF_ELEMENTS.decimal_point],1
    jmp .decimal_point_done
.positive_log:
    ;set the decimal point position
    lea ecx,[eax+1]
    mov [ebx+PRINTF_ELEMENTS.decimal_point],ecx
    ;check if a decimal is printed
    test edx,edx ;non-zero precisions always have a decimal point
    jnz .decimal_point_done
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
    jnz .decimal_point_done
    xor ecx,ecx
    mov [ebx+PRINTF_ELEMENTS.decimal_point],ecx
.decimal_point_done:
    add eax,edx ;compute most significant digit
    lea ecx,[eax+1]
    jns .store_expected_size
    xor ecx,ecx
.store_expected_size:
    mov [expected_size],ecx
    test ecx,ecx
    jnz .compute_trailing_zeros
    mov ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
    inc ecx
    mov [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
    cmp ecx,1
    jnz .scale
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
    jnz .scale
    mov [ebx+PRINTF_ELEMENTS.decimal_point],0
    jmp .scale
.compute_trailing_zeros:
    ;find the number of trailing zeros to print
    sub eax,[significant_decimal_digits_m1]
    jle .scale
    sub edx,eax
    mov [ebx+PRINTF_ELEMENTS.magnitude_zeros],eax
    sub [expected_size],eax
.scale:
    ;scale to fit
    stdcall PRINTF_triple_precision_scale,addr scaled_value,[value],edx
    stdcall PRINTF_decimalise_unsigned,addr ebx+PRINTF_ELEMENTS.number_string,addr scaled_value
    mov [ebx+PRINTF_ELEMENTS.number_length],eax
    sub eax,[expected_size]
    je .conversion_okay
    add [log10],eax
    ;if the number was rounded up (e.g. 99.9 to 100) then adjust
    mov ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
    sub ecx,1
    jnc .set_new_leading_zeros
    mov ecx,[ebx+PRINTF_ELEMENTS.decimal_point]
    test ecx,ecx
    jz .conversion_okay
    inc ecx
    mov [ebx+PRINTF_ELEMENTS.decimal_point],ecx
    jmp .conversion_okay
.set_new_leading_zeros:
    mov [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
.conversion_okay:
    mov eax,[ebx+PRINTF_ELEMENTS.number_length]
    ret
endp

proc PRINTF_get_float_parameters uses ebx,elements,log10_flag
    ;return eax=log10(value), edx=conversion done flag (for NaN and infinity)
    ;upscales the argument
    ;sets the negative flag
    ;sets the precision
    ;converts infinity and NaN to standard formats
    mov ebx,[elements]
    mov edx,[ebx+PRINTF_ELEMENTS.flags]
    and edx,PRINTF_ELEMENTS_FLAG.argument_size_mask
    lea edx,[edx*sizeof.FLOAT_DESCRIPTION+float_description_table]
    lea ecx,[ebx+PRINTF_ELEMENTS.arg_value]
    stdcall PRINTF_triple_precision_upscale,ecx,ecx,edx
    and ecx,PRINTF_ELEMENTS_FLAG.negative
    or [ebx+PRINTF_ELEMENTS.flags],ecx
    mov [ebx+PRINTF_ELEMENTS.significant_bits],edx
    cmp eax,PRINTF_CLASS_INFINITY
    je .infinity
    cmp eax,PRINTF_CLASS_SNAN
    je .SNaN
    cmp eax,PRINTF_CLASS_QNAN
    je .QNaN
    cmp eax,PRINTF_CLASS_ZERO
    je .zero
    cmp [log10_flag],0
    jz .log10_okay
    stdcall PRINTF_triple_precision_floor_log10,addr ebx+PRINTF_ELEMENTS.arg_value
.log10_okay:
    ;set the precision
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.precision_specified
    jnz .precision_okay
    mov [ebx+PRINTF_ELEMENTS.precision],DEFAULT_FLOAT_PRECISION
.precision_okay:
    xor edx,edx ;indicate not yet converted
.done:
    ret
.zero:
    or [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero
    jmp .log10_okay
.infinity:
    mov eax,3
    mov dword[ebx+PRINTF_ELEMENTS.number_string],'inf'
    jmp .converted
.SNaN:
    mov ecx,'SNaN'
    jmp .NaN
.QNaN:
    mov ecx,'QNaN'
.NaN:
    mov dword[ebx+PRINTF_ELEMENTS.number_string+0],ecx
    mov dword[ebx+PRINTF_ELEMENTS.number_string+4],'0x0'
    ;stdcall PRINTF_base2_radix_unsigned,addr ebx+PRINTF_ELEMENTS.number_string+6,addr ebx+PRINTF_ELEMENTS.arg_value,16
    cmp eax,1
    adc eax,6
.converted:
    mov [ebx+PRINTF_ELEMENTS.number_length],eax
    or edx,-1 ;indicate that conversion is complete
    jmp .done
endp

proc_leaf PRINTF_triple_precision_upscale uses ebp edi esi ebx,dest,source,description
    ;returns eax=number class, ecx=negative flag, edx=number of significant bits
    ;convert incoming float to triple precision format
    locals
        negative_flag dd ?
        number_class dd ?
    endl
    mov esi,[description]
    mov ebp,[source]
    mov edx,sizeof.TRIPLE_PRECISION.mantissa ;edx = number of significant bits
    mov [number_class],PRINTF_CLASS_NORMAL
    movzx ecx,[esi+FLOAT_DESCRIPTION.sign_bit_position]
    mov eax,[ebp+0]
    mov ebx,[ebp+4]
    mov ebp,[ebp+8]
    ;shift from 16 to 88 bits left
    sub ecx,32
    jb .shift_in_sign
.find_sign:
    sub edx,32
    mov ebp,ebx
    mov ebx,eax
    xor eax,eax
    sub ecx,32
    jae .find_sign
.shift_in_sign:
    and ecx,0x1f
    sub edx,ecx
    shld ebp,ebx,cl
    shld ebx,eax,cl
    shl eax,cl
    ;extract the sign
    dec edx
    add eax,eax
    adc ebx,ebx
    adc ebp,ebp
    sbb ecx,ecx
    mov [negative_flag],ecx
    ;extract the exponent
    movzx ecx,[esi+FLOAT_DESCRIPTION.exponent_width]
    sub edx,ecx
    xor edi,edi
    shld edi,ebp,cl
    ;check for infinity and NaN
    inc edi
    shr edi,cl
    jnz .infinity_NaN
    shld edi,ebp,cl
    shld ebp,ebx,cl
    shld ebx,eax,cl
    shl eax,cl
    test edi,edi
    jnz .normal
    ;check for zero
    mov ecx,ebp
    or ecx,ebx
    or ecx,eax
    jz .zero
    ;adjust denormal exponent
    cmp [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
    setz cl
    movzx ecx,cl
    add edi,ecx ;restore exponent extra
    jmp .implied_bit_okay
.normal:
    ;add the implied bit
    cmp [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
    jz .implied_bit_okay
    inc edx
    shrd eax,ebx,1
    shrd ebx,ebp,1
    stc
    rcr ebp,1
.implied_bit_okay:
    sub edi,[esi+FLOAT_DESCRIPTION.exponent_bias]
    ;normalise
    test ebp,ebp
    js .normalised
    bsr ecx,ebp
    jnz .shift_in_denormal
    ;check for unnormal zero
    mov ecx,ebp
    or ecx,ebx
    or ecx,eax
    jz .unnormal_zero
.macro_shift_denormal:
    sub edi,32
    sub edx,32
    mov ebp,ebx
    mov ebx,eax
    xor eax,eax
    bsr ecx,ebp
    jz .macro_shift_denormal
.shift_in_denormal:
    not ecx
    and ecx,0x1f
    sub edi,ecx
    sub edx,ecx
    shld ebp,ebx,cl
    shld ebx,eax,cl
    shl eax,cl
.normalised:
    mov ecx,[dest]
    mov [ecx+TRIPLE_PRECISION.exponent],edi
    mov [ecx+TRIPLE_PRECISION.mantissa_high],ebp
    mov [ecx+TRIPLE_PRECISION.mantissa_mid],ebx
    mov [ecx+TRIPLE_PRECISION.mantissa_low],eax
    mov ecx,[negative_flag]
    mov eax,[number_class]
    ret
.unnormal_zero:
    xor edi,edi
.zero:
    xor edx,edx
    mov [number_class],PRINTF_CLASS_ZERO
    jmp .normalised
.infinity_NaN:
    xor edi,edi
    mov [number_class],PRINTF_CLASS_INFINITY
    shld ebp,ebx,cl
    shld ebx,eax,cl
    shl eax,cl
    ;if there is an explicit bit then we shift it out and ignore it
    ;this means all pseudo NaNs and pseudo infinity become normal NaNs and normal infinity
    cmp [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
    jnz .explicit_bit_okay
    shld ebp,ebx,1
    shld ebx,eax,1
    shl eax,1
    dec edx
.explicit_bit_okay:
    ;check for infinity
    mov ecx,ebp
    or ecx,ebx
    or ecx,eax
    xchg edx,ecx
    jz .normalised
    dec ecx
    mov edx,ecx
    ;get the Q/S bit
    btr ebp,31
    mov esi,PRINTF_CLASS_QNAN
    jc .NaN_class_okay
    mov esi,PRINTF_CLASS_SNAN
.NaN_class_okay:
    mov [number_class],esi
    ;shift back to the LSb
    sub ecx,sizeof.TRIPLE_PRECISION.mantissa
    not ecx
    sub ecx,32
    jb .final_shift_back_mantissa
.shift_back_mantissa:
    mov eax,ebx
    mov ebx,ebp
    xor ebp,ebp
    sub ecx,32
    jae .shift_back_mantissa
.final_shift_back_mantissa:
    shrd eax,ebx,cl
    shrd ebx,ebp,cl
    shr ebp,cl
    jmp .normalised
endp

proc PRINTF_triple_precision_floor_log10 uses esi ebx,source
    ;compute eax=floor(log10(source))
    locals
        temp_power TRIPLE_PRECISION
    endl
    mov esi,[source]
    mov eax,0x4D104D43 ;ceiling(2^32 * log10(2))
    imul dword[esi+TRIPLE_PRECISION.exponent] ;edx=approximate log. always equal to, or one higher, than the required value
    mov ebx,edx
    stdcall PRINTF_get_triple_precision_10_power_y,addr temp_power,0,edx
    assert sizeof.TRIPLE_PRECISION = 16
    mov eax,[esi+4*0]
    mov ecx,[esi+4*1]
    mov edx,[esi+4*2]
    mov esi,[esi+4*3]
    sub eax,[temp_power+4*0]
    sbb ecx,[temp_power+4*1]
    sbb edx,[temp_power+4*2]
    sbb esi,[temp_power+4*3]
    setl cl ;adjust log to the correct value
    movzx ecx,cl
    neg ecx
    lea eax,[ebx+ecx]
    ret
endp

proc PRINTF_triple_precision_scale uses ebx esi,dest,source,scale
    ;compute dest=round(source*10^scale)
    locals
        scaled_value TRIPLE_PRECISION
    endl
    stdcall PRINTF_get_triple_precision_10_power_y,addr scaled_value,[source],[scale]
    ;integerise
    mov ecx,[scaled_value.exponent]
    xor edx,edx
    xor ebx,ebx
    xor eax,eax
    test ecx,ecx
    js .store
    mov edx,[scaled_value.mantissa_high]
    mov ebx,[scaled_value.mantissa_mid]
    mov eax,[scaled_value.mantissa_low]
    sub ecx,sizeof.TRIPLE_PRECISION.mantissa
    ;jg .overflow
    jge .store
    xor esi,esi
    neg ecx
    sub ecx,32
    jb .final_shift
.shifter_loop:
    mov esi,eax
    mov eax,ebx
    mov ebx,edx
    xor edx,edx
    sub ecx,32
    jae .shifter_loop
.final_shift:
    shrd esi,eax,cl
    shrd eax,ebx,cl
    shrd ebx,edx,cl
    shr edx,cl
    ;round up
    add esi,esi
    adc eax,0
    adc ebx,0
    adc edx,0
.store:
    mov esi,[dest]
    mov [esi+TRIPLE_PRECISION.mantissa_low],eax
    mov [esi+TRIPLE_PRECISION.mantissa_mid],ebx
    mov [esi+TRIPLE_PRECISION.mantissa_high],edx
    ret
endp

TRIPLE_PRECISION_EXPONENT_TABLE_PASSES = (TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT+TRIPLE_PRECISION_EXPONENT_TABLE_SCALE)/\
    TRIPLE_PRECISION_EXPONENT_TABLE_SCALE
TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS= 1 + TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT - \
    TRIPLE_PRECISION_EXPONENT_TABLE_SCALE * (TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1)
TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH = ((1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1) * \
    (TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1) + \
    (1 shl TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS - 1))
TRIPLE_PRECISION_EXPONENT_TABLE_SIZE = 1 + 2 * TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH
TRIPLE_PRECISION_EXPONENT_10_TO_M4096 = 0xffffcada ;the last value written to the power table, to ensure this is thread safe

proc PRINTF_get_triple_precision_10_power_y uses edi esi ebx,dest,source,y
    ;compute dest=source*10^y
    mov esi,PRINTF_triple_precision_power_table
    mov ebx,[source]
    mov edi,[dest]
    stdcall PRINTF_check_triple_precision_power_table,esi
    ;either start with 10^0 in esi, ...
    test ebx,ebx
    jz .copy
    ;... or start with the source value
    mov esi,ebx
.copy:
    mov eax,[esi+TRIPLE_PRECISION.mantissa_low]
    mov ecx,[esi+TRIPLE_PRECISION.mantissa_mid]
    mov edx,[esi+TRIPLE_PRECISION.mantissa_high]
    mov ebx,[esi+TRIPLE_PRECISION.exponent]
    mov [edi+TRIPLE_PRECISION.mantissa_low],eax
    mov [edi+TRIPLE_PRECISION.mantissa_mid],ecx
    mov [edi+TRIPLE_PRECISION.mantissa_high],edx
    mov [edi+TRIPLE_PRECISION.exponent],ebx
    mov ebx,[y]
    mov esi,PRINTF_triple_precision_power_table ;10^1
    test ebx,ebx
    jz .done
    jns .raise_loop
    add esi,TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH * sizeof.TRIPLE_PRECISION ;10^-1
    neg ebx
.raise_loop:
    mov eax,ebx
    and eax,1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1
    jz .raise_next
    assert sizeof.TRIPLE_PRECISION=16
    shl eax,4
    stdcall PRINTF_variable_precision_mul,edi,edi,addr eax+esi,sizeof.TRIPLE_PRECISION.mantissa / 32
.raise_next:
    add esi,(1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1) * sizeof.TRIPLE_PRECISION
    shr ebx,TRIPLE_PRECISION_EXPONENT_TABLE_SCALE
    jnz .raise_loop
.done:
    ret
endp

struct QUAD_PRECISION
    mantissa rd sizeof.TRIPLE_PRECISION.mantissa / 32 + 1
    exponent rd 1
ends
sizeof.QUAD_PRECISION.mantissa = sizeof.TRIPLE_PRECISION.mantissa + 32

proc_leaf PRINTF_check_triple_precision_power_table table
    mov eax,[table]
    cmp [eax+(TRIPLE_PRECISION_EXPONENT_TABLE_SIZE-1)*sizeof.TRIPLE_PRECISION+TRIPLE_PRECISION.exponent],TRIPLE_PRECISION_EXPONENT_10_TO_M4096
    jnz PRINTF_make_triple_precision_power_table
    ret
endp

proc PRINTF_make_triple_precision_power_table uses edi esi ebx,table
    locals
        base_multiplier QUAD_PRECISION
        next_value QUAD_PRECISION
    endl
    assert TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS = 1
    mov ebx,[table]
    ;start with 10^0 at the beginning
    xor ecx,ecx
    mov [ebx+TRIPLE_PRECISION.mantissa_low],ecx
    mov [ebx+TRIPLE_PRECISION.mantissa_mid],ecx
    mov [ebx+TRIPLE_PRECISION.mantissa_high],1 shl 31
    mov [ebx+TRIPLE_PRECISION.exponent],+1
    ;then the first block starts with 10^1
    repeat sizeof.QUAD_PRECISION.mantissa / 32 - 1
        mov [next_value.mantissa + 4 * (%-1)],ecx
    end repeat
    mov [next_value.mantissa + 4 * (sizeof.QUAD_PRECISION.mantissa / 32 - 1)],10 shl 28
    mov [next_value.exponent],+4
    call .build
    ;then the next block starts with 10^-1
    mov ecx,0xcccccccd
    mov [next_value.mantissa + 4 * 0],ecx
    dec ecx
    repeat sizeof.QUAD_PRECISION.mantissa / 32 - 1
        mov [next_value.mantissa + 4 * %],ecx
    end repeat
    mov [next_value.exponent],-3
    call .build
    ret
.build:
    call .round_and_copy_result
    mov edi,TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1
.build_scale_loop:
    ;transfer next value to base
    mov edx,edi
    lea esi,[next_value]
    lea edi,[base_multiplier]
    mov ecx,sizeof.QUAD_PRECISION / 4
    rep movsd
    mov edi,edx
    mov esi,1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1
.build_bit_loop:
    lea ecx,[next_value]
    stdcall PRINTF_variable_precision_mul,ecx,ecx,addr base_multiplier,sizeof.QUAD_PRECISION.mantissa / 32
    call .round_and_copy_result
    dec esi
    jnz .build_bit_loop
    dec edi
    jnz .build_scale_loop
    retn
.round_and_copy_result:
    add ebx,sizeof.TRIPLE_PRECISION
    bt [next_value.exponent-4*4],31
    jnc .skip_accuracy_test
    bt [next_value.exponent-4*4],30
.skip_accuracy_test:
    mov eax,[next_value.exponent-4*3]
    mov edx,[next_value.exponent-4*2]
    mov ecx,[next_value.exponent-4*1]
    adc eax,0
    adc edx,0
    adc ecx,0
    mov [ebx+TRIPLE_PRECISION.mantissa_low],eax
    mov eax,[next_value.exponent-4*0]
    mov [ebx+TRIPLE_PRECISION.mantissa_mid],edx
    mov [ebx+TRIPLE_PRECISION.mantissa_high],ecx
    mov [ebx+TRIPLE_PRECISION.exponent],eax
    retn
endp

proc_leaf PRINTF_variable_precision_mul uses ebp esi edi ebx,dest,source1,source2,mantissa_length
    std
    mov esi,[source1]
    mov ebp,[source2]
    mov ebx,[mantissa_length]
    lea edi,[esp-4]
    lea esi,[esi+ebx*4]
    lea ebp,[ebp+ebx*4]
    lea ecx,[ebx*2]
    xor eax,eax
    rep stosd
    neg ebx
.loop_multiplier:
    mov ecx,[mantissa_length]
    neg ecx
.loop_multiplicand:
    lea edi,[ebx+ecx]
    ;multiply esi+ebx * ebp+ecx ---> esp+edi
    mov eax,[esi+ebx*4]
    mul dword[ebp+ecx*4]
    add [esp+edi*4+0],eax
    adc [esp+edi*4+4],edx
    jnc .next_multiplicand
    lea eax,[edi+1]
.add_in_carry:
    inc eax
    adc dword[esp+eax*4],0
    jc .add_in_carry
.next_multiplicand:
    inc ecx
    jnz .loop_multiplicand
    inc ebx
    jnz .loop_multiplier
    ;add the exponents into edx
    mov edx,[esi]
    add edx,[ebp]
    ;round to the destination length
    mov ecx,[mantissa_length]
    not ecx
    ;find the rounding bit
    bsr ebp,[esp-4] ;will be either 31 or 30
    mov ebx,ecx
    xor esi,esi
    bts esi,ebp ;carry is zeroed here
.ripple_carry_through_result:
    adc [esp+ebx*4],esi
    mov esi,0
    inc ebx
    jnz .ripple_carry_through_result
    ;test the MSb and scale up if necessary
    test byte[esp-1],-1 ;carry is zeroed here
    js .MSb_okay
    mov ebx,ecx
.normalise_result:
    mov eax,[esp+ebx*4]
    adc eax,eax
    mov [esp+ebx*4],eax
    inc ebx
    jnz .normalise_result
    dec edx ;adjust exponent
.MSb_okay:
    ;store the result
    not ecx
    mov edi,[dest]
    lea esi,[esp-4]
    mov [edi+ecx*4],edx ;exponent
    lea edi,[edi+ecx*4-4]
    rep movsd
    cld
    ret
endp

proc_leaf PRINTF_decimalise_unsigned uses ebp esi edi ebx,dest,value
    ;return eax=length
    locals
        length dd ?
        pos dd ?
        mid dd ?
        high dd ?
    endl
    mov ecx,[value]
    mov ebx,[ecx+8]
    mov edx,[ecx+4]
    mov eax,[ecx]
    bsr edi,ebx
    jnz .reduce96
    bsr edi,edx
    jnz .reduce64
    bsr edi,eax
    jz .done ;a value of zero prints nothing
.reduce32:
    PRINTF_ceiling_log2_10 edi,edi
    mov ebp,[edi*4+PRINTF_integer_base_10_multiples_32]
    sub ebp,eax
    adc edi,0
    mov [length],edi
    add edi,[dest]
    jmp .start32
.reduce64:
    add edi,32
    PRINTF_ceiling_log2_10 edi,edi
    mov ebp,[(edi-10)*8+PRINTF_integer_base_10_multiples_64+0]
    mov esi,[(edi-10)*8+PRINTF_integer_base_10_multiples_64+4]
    sub ebp,eax
    sbb esi,edx
    adc edi,0
    mov [length],edi
    add edi,[dest]
    jmp .start64
.reduce96:
    add edi,64
    PRINTF_ceiling_log2_10 edi,edi
    lea ecx,[edi*4]
    mov ebp,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+0]
    mov esi,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+4]
    mov ecx,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+8]
    sub ebp,eax
    sbb esi,edx
    sbb ecx,ebx
    adc edi,0
    mov [length],edi
    add edi,[dest]
    mov esi,eax
    mov eax,edx
    mov edx,ebx
.loop96:
    mov [pos],edi
    mov [high],edx
    mov [mid],eax
    mov ecx,esi ; [high] [mid] ecx
    mov ebx,0xcccccccc ;floor(2^35/10)
    mov eax,esi
    mul ebx
    mov edi,eax
    mov esi,edx ; esi edi
    mov eax,ebx
    mul [mid]
    add esi,eax
    adc edx,0
    xchg ebx,edx ; ebx esi edi
    mov eax,[high]
    mul edx
    add eax,ebx
    adc edx,0 ;edx eax esi edi
    xor ebx,ebx ; [high] [mid] ecx =num*0x000000000000000000000001
    mov ebp,edi ; edx eax esi edi =num*0x0000000000000000cccccccc
    add ebp,ecx ; edx eax esi edi =num*0x00000000cccccccc00000000
    mov ebp,edi ; edx eax esi edi =num*0xcccccccc0000000000000000
    adc ebp,esi
    adc ebx,0
    add ebp,[mid]
    adc ebx,0
    xor ebp,ebp
    add edi,esi
    adc ebp,0
    add edi,eax
    adc ebp,0
    add edi,[high]
    adc ebp,0
    add edi,ebx
    mov edi,[pos]
    adc ebp,0
    xor ebx,ebx
    add esi,eax
    adc eax,edx
    adc edx,0
    add esi,edx
    adc ebx,0
    add esi,ebp
    adc eax,ebx
    adc edx,0
    and esi,-8
    sub ecx,esi
    shrd esi,eax,2
    shrd eax,edx,2
    shr edx,2
    sub ecx,esi
    shrd esi,eax,1
    shrd eax,edx,1
    dec edi
    add ecx,'0'
    mov [edi],cl
    shr edx,1
    jnz .loop96
    mov edx,eax
    mov eax,esi
.start64:
    ;edx:eax/esi
.loop64:
    mov esi,0xcccccccc ;floor(2^35/10)
    dec edi
    mov ecx,eax
    mov ebp,edx ;ebp:ecx=num
    mul esi
    mov ebx,eax
    xchg esi,edx
    mov eax,ebp
    mul edx
    add eax,esi
    adc edx,0 ;edx:eax:ebx:000=num*0xcccccccc00000000
    mov esi,ebx
    add esi,eax
    adc eax,edx
    adc edx,0 ;edx:eax:esi:ebx=num*0xcccccccccccccccc
    add ebx,ecx
    adc esi,ebp
    adc eax,0
    adc edx,0 ;edx:eax:esi:ebx=num*0xcccccccccccccccd
    and eax,-8
    sub ecx,eax
    shrd eax,edx,2
    shr edx,2
    sub ecx,eax ;ecx=remainder
    shrd eax,edx,1
    add ecx,'0'
    mov [edi],cl
    shr edx,1 ;edx:eax=quotient=num/10
    jnz .loop64
.start32:
    ;eax = eax / 10
    mov ebx,0xcccccccd ;ceiling(2^35/10)
.loop32:
    mov ecx,eax
    dec edi
    mul ebx
    and edx,-8
    mov eax,edx
    sub ecx,edx
    shr edx,2
    sub ecx,edx
    add ecx,'0'
    mov [edi],cl
    shr eax,3
    jnz .loop32
    mov eax,[length]
.done:
    ret
endp

proc PRINTF_elements_print uses ebx esi edi,elements,output
    mov ebx,[elements]
    mov esi,[output]
    ;format sign
    xor ecx,ecx
    mov al,'-'
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.negative
    jnz .initialise_sign
    mov al,'+'
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.show_sign
    jnz .initialise_sign
    mov al,' '
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.blank_sign
    jnz .initialise_sign
    not ecx
.initialise_sign:
    inc ecx
    mov [ebx+PRINTF_ELEMENTS.sign_character],al
    mov [ebx+PRINTF_ELEMENTS.sign_length],ecx
    xor edi,edi ;edi = number of padding characters
    ;stdcall PRINTF_get_minimum_length,ebx
    sub eax,[ebx+PRINTF_ELEMENTS.width]
    jae .no_padding
    neg eax
    mov edi,eax
.no_padding:
    ;leading spaces
    test edi,edi
    jz .leading_spaces_done
    ;if zero padding then do nothing
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
    jnz .leading_spaces_done
    ;if left justified then do nothing
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.left_justify
    jnz .leading_spaces_done
    ;stdcall PRINTF_elements_print_repeated_character,ebx,+' ',edi,esi
.leading_spaces_done:
    ;sign
    mov ecx,[ebx+PRINTF_ELEMENTS.sign_length]
    test ecx,ecx
    jz .sign_done
    movzx eax,[ebx+PRINTF_ELEMENTS.sign_character]
    stdcall PRINTF_elements_print_character,ebx,eax,esi
.sign_done:
    ;prefix
    mov ecx,[ebx+PRINTF_ELEMENTS.prefix_length]
    test ecx,ecx
    jz .prefix_done
    stdcall PRINTF_elements_print_string,ebx,addr ebx+PRINTF_ELEMENTS.prefix_string,ecx,esi
.prefix_done:
    test edi,edi
    jz .padding_zeros_done
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
    jz .padding_zeros_done
    ;stdcall PRINTF_elements_print_padding_zeros,ebx,edi,esi
.padding_zeros_done:
    ;precision zeros
    xor eax,eax
    mov ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
    test ecx,ecx
    jz .precision_zeros_done
    stdcall PRINTF_elements_print_decimal_repeated_character,ebx,+'0',ecx,eax,esi
.precision_zeros_done:
    ;number
    mov ecx,[ebx+PRINTF_ELEMENTS.number_length]
    test ecx,ecx
    jz .number_done
    stdcall PRINTF_elements_print_decimal_string,ebx,addr ebx+PRINTF_ELEMENTS.number_string,ecx,eax,esi
.number_done:
    ;magnitude zeros
    mov ecx,[ebx+PRINTF_ELEMENTS.magnitude_zeros]
    test ecx,ecx
    jz .magnitude_zeros_done
    stdcall PRINTF_elements_print_decimal_repeated_character,ebx,+'0',ecx,eax,esi
.magnitude_zeros_done:
    ;exponent
    mov ecx,[ebx+PRINTF_ELEMENTS.exponent_length]
    test ecx,ecx
    jz .exponent_done
    stdcall PRINTF_elements_print_string,ebx,addr ebx+PRINTF_ELEMENTS.exponent_string,ecx,esi
.exponent_done:
    ;trailing spaces
    test edi,edi
    jz .trailing_spaces_done
    ;if zero padding then do nothing
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
    jnz .trailing_spaces_done
    ;if right justified then do nothing
    test [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.left_justify
    jz .trailing_spaces_done
    ;stdcall PRINTF_elements_print_repeated_character,ebx,+' ',edi,esi
.trailing_spaces_done:
    ret
endp

proc PRINTF_elements_print_decimal_repeated_character uses ebx edi esi,elements,char,repeats,current_position,output
    ;return eax=next position
    mov ebx,[elements]
    mov edi,[current_position]
    mov esi,[repeats]
    test esi,esi
    jz .done
.loop:
    stdcall PRINTF_elements_print_character,ebx,[char],[output]
    inc edi
    stdcall PRINTF_elements_print_separator,ebx,edi,[output]
    dec esi
    cmp edi,[ebx+PRINTF_ELEMENTS.decimal_point]
    jnz .decimal_done
    stdcall PRINTF_elements_print_character,ebx,+'.',[output]
.decimal_done:
    test esi,esi
    jnz .loop
.done:
    mov eax,edi
    ret
endp

;V
proc PRINTF_elements_print_decimal_string uses esi ebx edi,elements,string,length,current_position,output
    ;return eax=next position
    mov edi,[current_position]
    mov esi,[string]
    mov ebx,[length]
    test ebx,ebx
    jz .done
.loop:
    movzx eax,byte[esi]
    stdcall PRINTF_elements_print_character,[elements],eax,[output]
    inc edi
    stdcall PRINTF_elements_print_separator,[elements],edi,[output]
    inc esi
    dec ebx
    mov eax,[elements]
    cmp edi,[eax+PRINTF_ELEMENTS.decimal_point]
    jnz .decimal_done
    stdcall PRINTF_elements_print_character,eax,+'.',[output]
.decimal_done:
    test ebx,ebx
    jnz .loop
.done:
    mov eax,edi
    ret
endp

proc PRINTF_elements_print_separator uses ebx,elements,current_position,output
    mov ebx,[elements]
    cmp [ebx+PRINTF_ELEMENTS.separator_modulus],0
    jz .separator_skip
    stdcall PRINTF_get_digits_before_decimal_point,ebx
    sub eax,[current_position]
    jbe .separator_skip
    xor edx,edx
    div [ebx+PRINTF_ELEMENTS.separator_modulus]
    test edx,edx
    jnz .separator_skip
    movzx eax,[ebx+PRINTF_ELEMENTS.separator_character]
    stdcall PRINTF_elements_print_character,ebx,eax,[output]
.separator_skip:
    ret
endp

proc_leaf PRINTF_get_digits_before_decimal_point elements
    mov edx,[elements]
    mov eax,[edx+PRINTF_ELEMENTS.decimal_point]
    test eax,eax
    jnz .done
    mov eax,[edx+PRINTF_ELEMENTS.precision_zeros]
    add eax,[edx+PRINTF_ELEMENTS.number_length]
    add eax,[edx+PRINTF_ELEMENTS.magnitude_zeros]
.done:
    ret
endp

proc PRINTF_elements_print_string uses esi ebx,elements,string,length,output
    mov esi,[string]
    mov ebx,[length]
    test ebx,ebx
    jz .done
.loop:
    movzx eax,byte[esi]
    test eax,eax
    jz .done
    stdcall PRINTF_elements_print_character,[elements],eax,[output]
    inc esi
    dec ebx
    jnz .loop
.done:
    ret
endp

proc PRINTF_elements_print_character uses ebx,elements,char,output
    ;applies uppercase flag
    mov ebx,[output]
    mov eax,[elements]
    mov ecx,[ebx+PRINTF_OUTPUT.length]
    mov edx,[ebx+PRINTF_OUTPUT.buffer]
    test [eax+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.uppercase
    mov eax,[char]
    jz .case_change_done
    cmp al,'a'
    jb .case_change_done
    cmp al,'z'
    ja .case_change_done
    sub al,'a'-'A'
.case_change_done:
    test ecx,ecx
    jne .add
    cmp [ebx+PRINTF_OUTPUT.handle],INVALID_HANDLE_VALUE
    je .next
    push eax
    ;stdcall PRINTF_flush_buffer,ebx
    pop eax
    mov edx,[ebx+PRINTF_OUTPUT.buffer]
    mov ecx,[ebx+PRINTF_OUTPUT.length]
.add:
    mov [edx],al
    dec ecx
    mov [ebx+PRINTF_OUTPUT.length],ecx
.next:
    inc edx
    mov [ebx+PRINTF_OUTPUT.buffer],edx
    ret
endp

restore prologue@proc,epilogue@proc
_________________
I don`t like to refer by "you" to one person. My soul requires acronim "thou" instead.
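For readers who prefer to check the arithmetic outside the assembler, the `.loop32` digit loop above can be modelled in C. This is only a sketch of the reciprocal-multiply idea (multiply by ceiling(2^35/10), keep the high dword, mask off the low three bits), not a translation of the posted routine: the function name `u32_to_decimal` is hypothetical, and unlike `PRINTF_decimalise_unsigned` it prints "0" for a zero input and works out the length as it goes rather than from a table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the posted code): writes the decimal
   digits of n into out, which must hold at least 11 bytes, NUL-terminates
   the string and returns the number of digits. */
size_t u32_to_decimal(uint32_t n, char *out)
{
    char tmp[10];               /* a 32-bit value has at most 10 digits */
    size_t pos = sizeof tmp;    /* fill from the end, like "dec edi" */
    size_t len;

    assert(out != NULL);
    do {
        /* high dword of n * ceiling(2^35/10), i.e. what "mul ebx" leaves in edx */
        uint32_t hi = (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 32);
        uint32_t q8 = hi & ~UINT32_C(7);    /* "and edx,-8": 8 * (n / 10) */
        uint32_t r  = n - q8 - (q8 >> 2);   /* n - 8q - 2q = n mod 10 */
        tmp[--pos] = (char)('0' + r);
        n = q8 >> 3;                        /* "shr eax,3": n / 10 */
    } while (n != 0);

    len = sizeof tmp - pos;
    memcpy(out, tmp + pos, len);
    out[len] = '\0';
    return len;
}
```

The remainder falls out of the same product as the quotient, which is why the loop needs one `mul` per digit and no `div`.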
|||
23 Nov 2023, 11:32 |
|
ProMiNick 24 Nov 2023, 21:03
I tried to test a multiplication algorithm (not the current one, but the same idea as I understand it) for extracting decimal digits.
Test data: any dword. The multiplier is $1000000000000000000000000/10000000000, which is equal to $6DF37F675EF6EADF. What I got: the extracted decimal digits differ from the actual value by ±0..2.
Code:
format PE GUI 4.0
entry start
include 'win32a.inc'

section '.text' code readable executable

start:
    RTL_C ; cut off RTL_C for official fasm
    push ebx
    push ebp
    push esi
    push edi
    mov esi, 10
    mov edi, buffer
    mov ecx, [a]
    mov eax, $6DF37F675EF6EADF shr 32 ;$6DF37F675EF6EADE=1/$100000000
    mul ecx
    mov ebp, edx
    mov ebx, 0;eax
    mov eax, $6DF37F675EF6EADF and $FFFFFFFF
    mul ecx
    add ebx, edx
    adc ebp, 0
    mov ecx, 10
.loop:
    mul ecx
    xchg eax, edx
    mov ebx, edx
    xchg eax, ebp
    mul ecx
    add ebp, eax
    mov al, dl
    adc al, '0'
    stosb
    mov eax, ebx
    dec esi
    jnz .loop
    pop edi
    pop esi
    pop ebp
    pop ebx
    invoke MessageBox,0,buffer,esp,0
    invoke ExitProcess,0
    flush_locals ; cut off flush_locals for official fasm

section '.data' data readable writeable
a dd 9
buffer db 11 dup 0

section '.idata' import data readable writeable
library kernel32,'KERNEL32.DLL',\
    user32,'USER32.DLL'
include 'api\kernel32.inc'
include 'api\user32.inc'
_________________
I don`t like to refer by "you" to one person. My soul requires acronim "thou" instead.
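The idea in this post (treat n * 2^96/10^10 as a binary fraction and pull one digit out per multiplication by 10) can be modelled in C to see how the rounding of the multiplier affects the digits. The sketch below is an assumption-laden model, not a translation of the posted loop: the name `u32_digits_by_multiply` and the two macros are hypothetical, the full 96-bit fraction is kept in three limbs, and the multiplier is a parameter so the truncated constant from the post can be compared with the same constant rounded up. Whether truncation explains the ±0..2 seen in the posted code is not established here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* floor(2^96 / 10^10), the constant quoted in the post, and the same
   value rounded up instead of down (an assumption of this sketch). */
#define RECIP_FLOOR UINT64_C(0x6DF37F675EF6EADF)
#define RECIP_CEIL  UINT64_C(0x6DF37F675EF6EAE0)

/* Hypothetical helper: writes exactly 10 digits of n (leading zeros
   included) into out, which must hold 11 bytes. n * recip is treated
   as a 96-bit binary fraction approximating n / 10^10. */
void u32_digits_by_multiply(uint32_t n, uint64_t recip, char *out)
{
    uint64_t lo, hi, mid;
    uint32_t f[3];              /* f[0] = least significant limb */
    int i, k;

    assert(out != NULL);
    /* 32 x 64 -> 96-bit product */
    lo  = (uint64_t)n * (uint32_t)recip;
    hi  = (uint64_t)n * (uint32_t)(recip >> 32);
    mid = (lo >> 32) + (uint32_t)hi;
    f[0] = (uint32_t)lo;
    f[1] = (uint32_t)mid;
    f[2] = (uint32_t)((hi >> 32) + (mid >> 32));

    for (i = 0; i < 10; i++) {
        /* fraction *= 10; whatever overflows past 96 bits is the next digit */
        uint64_t carry = 0;
        for (k = 0; k < 3; k++) {
            carry += (uint64_t)f[k] * 10u;
            f[k] = (uint32_t)carry;
            carry >>= 32;
        }
        out[i] = (char)('0' + (int)carry);
    }
    out[10] = '\0';
}
```

With the test value from the post (a dd 9), the truncated multiplier makes the fraction slightly too small, so the last digit comes out as 8; the rounded-up multiplier gives 0000000009. In this model the rounded-up constant is exact for every 32-bit input, because the accumulated error n * (RECIP_CEIL - 2^96/10^10) stays far below 2^96/10^10.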
|||
24 Nov 2023, 21:03 |
|
sylware 08 Dec 2023, 19:46
I am working on my own version (64-bit, though).
I wonder how different it will be in the end (mine will be incomplete and quite limited, though). I am finishing the conversion directive decoding stage (numbered argument mode and auto-incremented argument mode). Since I am on and off on that piece of code, I could test something: I had a significant break in the middle of its development, and I wanted to see how much time it would take me to recall everything and move forward again, compared to C. Well, it felt easier and simpler with assembly than with C. Weird.
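To make the two argument modes concrete, here is a small C sketch of the first decoding decision: after a '%', a run of digits followed by '$' selects a numbered argument (the POSIX "%n$" form), and anything else means the directive takes the next argument in sequence. It is not sylware's code; the struct and function names are hypothetical, and a real decoder also has to handle flags, width, precision and size after this step. POSIX leaves the result of mixing numbered and unnumbered directives in one format string undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: arg_index is 1-based for "%n$..." directives
   and 0 when the directive uses the next argument in sequence. */
struct arg_select {
    unsigned arg_index;
    size_t   consumed;      /* characters consumed after the '%' */
};

/* p points at the first character after the '%'. */
struct arg_select parse_arg_select(const char *p)
{
    struct arg_select r = { 0, 0 };
    unsigned value = 0;
    size_t i = 0;

    assert(p != NULL);
    while (p[i] >= '0' && p[i] <= '9') {
        value = value * 10 + (unsigned)(p[i] - '0');
        i++;
    }
    /* The digits are an argument number only when a '$' follows them;
       otherwise they belong to the flags/width and stay unconsumed. */
    if (i > 0 && value > 0 && p[i] == '$') {
        r.arg_index = value;
        r.consumed = i + 1;
    }
    return r;
}
```

For example, the text after '%' in "%2$x" yields argument 2 with two characters consumed, while "%10d" yields index 0 and leaves "10" to be read later as a width.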
|||
08 Dec 2023, 19:46 |
|
sylware 05 Jan 2024, 16:12
All right, here is my take. It is a partial implementation, though (string and hexadecimal only, but the current conversions can be used as code templates for the others).
Conclusion: printf family functions should be avoided like hell, and a brutal put_string with string conversion functions and dynamic stack allocation should be used instead, even if it means a bit more work on our side. I now understand why some busybox people are/were so hostile to printf functions. It assembles with fasmg (duh!), binutils gas, and nasm (and then probably yasm). A vim color syntax file is provided. It is there: https://www.rocketgit.com/user/sylware/nyanvsnprintf (I recently made a modification for a better fatal error code path)
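As an illustration of that conclusion, the printf-free style can be sketched in C as one output primitive plus standalone conversion functions that write into a caller-provided buffer. The names `put_string` and `u64_to_hex` are hypothetical and are not taken from the linked repository; a fixed-size local array stands in for the dynamic stack allocation mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical conversion helper: writes v as lowercase hexadecimal into
   out (at least 17 bytes), NUL-terminates it, returns the digit count. */
size_t u64_to_hex(uint64_t v, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t len = 1;
    size_t i;
    uint64_t t;

    assert(out != NULL);
    for (t = v >> 4; t != 0; t >>= 4)   /* count digits beyond the first */
        len++;
    out[len] = '\0';
    for (i = len; i-- > 0; v >>= 4)     /* fill from the last digit backwards */
        out[i] = digits[v & 0xF];
    return len;
}

/* Hypothetical output primitive: one string, no format parsing.
   Returns 0 on success and -1 on a short write. */
int put_string(const char *s)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);
    return fwrite(s, 1, len, stdout) == len ? 0 : -1;
}
```

A call site then spells out what a format string would otherwise hide: convert into a local `char buf[17]`, then call `put_string("addr=0x")` followed by `put_string(buf)`.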
|||
05 Jan 2024, 16:12 |
|