flat assembler
Message board for the users of flat assembler.

Index > Main > printf

sylware 01 Nov 2023, 18:39
I know I already asked about this, but I'll try again: does anyone know of a sane, well-featured "printf" function written in assembly?
revolution 01 Nov 2023, 19:35
I posted this a long time ago. But here it is again.
Code:
;Console output formatted text print functions:
;
;proc printf    c       format,[arglist*]
;proc vprintf   stdcall format,arglist
;
;File output formatted text print functions:
;
;proc fprintf   c       handle,format,[arglist*]
;proc vfprintf  stdcall handle,format,arglist
;
;Memory output formatted text print functions:
;
;proc sprintf   c       dest,format,[arglist*]
;proc vsprintf  stdcall dest,format,arglist
;
;Length controlled memory output formatted text print functions:
;
;proc snprintf  c       dest,max_length,format,[arglist*]
;proc vsnprintf stdcall dest,max_length,format,arglist
;
;
;Format specifications always begin with a percent sign (%). If a
;percent sign is followed by a sequence that has no meaning as a format
;field then the invalid character is copied to the output and scanning
;resumes at the next character.
;
;The format specification has the following form.
;See below for a description of each of the fields.
;
;       %[flags*][width][.precision][size]type
;
;Return Values:
;
;For memory output functions (sprintf, vsprintf, snprintf, vsnprintf):
;
;If the function succeeds, the return value in eax is the number of
;characters stored in the output buffer, not counting the terminating
;null character.
;
;If the function fails, the return value in eax is the negative value of
;the number of extra bytes required to fully print the string including
;a null character. The output buffer has been filled completely with as
;much output as could fit and a terminating null is the last
;character.
;
;For stream output functions (printf, vprintf, fprintf, vfprintf):
;
;If the function succeeds, the return value in eax is the number of
;characters sent to the stream and ecx is zero.
;
;If the function fails, the return value in eax is the number of
;characters sent to the stream and ecx is the most recent value returned
;by GetLastError.
;
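;Usage sketch (an illustration added in editing, not part of the original
;listing): checking the snprintf return value described above. It assumes
;the ccall macro from the standard fasm win32 macro package; buf,
;BUF_SIZE, fmt, value and the .truncated label are hypothetical names.
;
;       ccall   snprintf,buf,BUF_SIZE,fmt,[value]
;       test    eax,eax
;       js      .truncated      ;negative: the output did not fit
;       ;here eax = number of characters stored, excluding the null
;       jmp     .continue
;   .truncated:
;       neg     eax             ;eax = extra bytes needed for the full string
;       add     eax,BUF_SIZE    ;eax = buffer size that would have been enough
;   .continue:
;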
;
;flags field
;The first optional field is flags. A flag directive is a character that
;specifies output justification and emission of signs, blanks, leading
;zeros, decimal points and octal and hexadecimal prefixes. More than one
;flag directive may appear in a format specification, and flags can
;appear in any order.
;
;-      Pad the output with blanks or zeros to the right to fill the
;       field width, justifying output to the left. If this field is
;       omitted, the output is padded to the left, justifying it to the
;       right.
;
;#      Used with o, x or X specifiers the value is preceded with 0, 0x
;       or 0X respectively.
;       Used with a, A, e, E, f, F, g, G, r or R it forces the written
;       output to contain a decimal point even if no more digits follow.
;       By default, if no digits follow, no decimal point is written.
;
;0      Left-pads a number value with zeros to fill the field width. If
;       this field is omitted, the output value is padded with blank
;       spaces.
;
;+      For number values a plus or minus sign (+ or -) precedes the
;       number even for positive numbers. By default, only negative
;       numbers are preceded with a - sign.
;
;^      Argument is a pointer to the argument value.
;
;,      Includes separator characters. For decimal formats f (including
;       g), r, i and u a comma character is placed between every third
;       digit that appears before the decimal point. For formats B, q, o
;       and x a space character is placed between every fourth digit.
;
;space  If no sign is going to be written, a blank space is inserted
;       before the value.
;
;
;width field
;Copy the specified minimum number of characters to the output buffer.
;The width field is a nonnegative integer. The width specification never
;causes a value to be truncated. If the number of characters in the
;output value is greater than the specified width, or if the width field
;is not present, all characters of the value are printed, subject to the
;precision specification. Default width is 0.
;
;number Minimum number of characters to be printed. If the value to be
;       printed is shorter than this number, the result is padded with
;       either blank spaces or zeros. The value is not truncated even if
;       the result is larger.
;
;*      The width is not specified in the format string, but as an
;       additional integer argument value preceding the argument that
;       has to be formatted.
;
;
;.precision field
;For numbers, copy the specified minimum number of digits to the output
;buffer. If the number of digits in the argument is less than the
;specified precision, the output value is padded on the left with zeros.
;The value is not truncated when the number of digits exceeds the
;specified precision. If the dot (.) appears without a number following
;it, the precision is set to 0. For strings, copy the specified maximum
;number of characters to the output buffer, thus truncating the string
;if appropriate.
;
;.number
;       For integer specifiers (i, o, u, x, X): specifies the minimum
;       number of digits to be written. If the value to be written is
;       shorter than this number, the result is padded with leading
;       zeros. The value is not truncated even if the result is longer.
;       A precision of 0 means that no character is written for the
;       value 0. Default precision for integer specifiers is 1.
;       For a, A, e, E, f, F, r and R specifiers: this is the number of
;       digits to be printed after the decimal point. By default, this
;       is 6.
;       For s: this is the maximum number of characters to be printed.
;       By default all characters are printed until the ending null
;       character is encountered.
;       If the dot is specified without an explicit value for precision,
;       0 is assumed.
;
;.*     The precision is not specified in the format string, but as an
;       additional integer value argument preceding the argument that
;       has to be formatted.
;
;
;size field
;If size is omitted a default size is used: for integer specifiers it is
;the "natural" size based upon the process mode (32-bit or 64-bit); for
;float specifiers it is double precision (64-bit, dword sized).
;Byte sized floats use the 1.4.3 format. Half word sized
;floats use 1.5.10 format. Word sized floats use 1.8.23 format (standard
;IEEE single precision). Dword sized floats use 1.11.52 format (standard
;IEEE double precision). Triple (tword) sized floats use 1.15.63 format
;(standard IEEE extended precision). Integer types are sign extended for
;the i type and are zero extended for other integer types.
;
;b      Byte size. 8-bit values are loaded.
;
;h      Half word size. 16-bit values are loaded.
;
;w      Word (single) size. 32-bit values are loaded.
;
;d      Dword (double) size. 64-bit values are loaded. If using a 32-bit
;       process two word arguments are consumed.
;
;t      Triple size. 96-bit integer values or 80-bit extended precision
;       float values are loaded. If using a 32-bit process three word
;       arguments are consumed. If using a 64-bit process two dword
;       arguments are consumed.
;
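;Usage sketch (an illustration added in editing, not part of the original
;listing): passing a 64-bit integer in a 32-bit process, where the d size
;consumes two word arguments. The low half is passed first because the
;reader loads the low 32 bits from the lower address. It assumes the
;ccall macro from the standard fasm win32 macro package; fmt64 and big
;are hypothetical names. Note that fasm's "dword" operator is 32 bits,
;whereas the d size letter in the format string means 64 bits.
;
;       ccall   printf,fmt64,dword [big],dword [big+4]
;
;       fmt64   db '%di',13,10,0
;       big     dq -1234567890123
;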
;
;type field
;Output the corresponding argument as a character, a string, or a
;number. If this field is an uppercase character then the output is
;changed to uppercase. This field can be any of the following
;characters:
;
;a or A Hexadecimal floating point in scientific notation.
;     B Unsigned binary integer. Note: b (lowercase) can't be used.
;c or C Repetition of a single character. This function doesn't print
;       character arguments with a numeric value of zero.
;e or E Decimal floating point in scientific notation
;       (sign+mantissa+exponent).
;f or F Decimal floating point.
;g or G Use the shortest representation: e, E, f or F where the
;       precision field specifies the number of significant digits.
;i or I Signed decimal integer.
;n or N Nothing printed. The corresponding argument must be a pointer to
;       an unsigned integer. The number of characters written so far is
;       stored in the pointed location.
;o or O Unsigned octal integer.
;p or P Argument pointer address in hexadecimal.
;q or Q Unsigned quaternary integer.
;r or R Decimal floating point without trailing zeros.
;s or S Null terminated string of characters.
;u or U Unsigned decimal integer.
;x or X Unsigned hexadecimal integer.
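;
;Usage sketch (an illustration added in editing, not part of the original
;listing): a few of the flags described above. It assumes the ccall macro
;from the standard fasm win32 macro package and a 32-bit console program;
;fmt1, fmt2, fmt3 and pi are hypothetical names.
;
;       ccall   printf,fmt1,1234567     ;',' flag: digit separators
;       ccall   printf,fmt2,255         ;'#' and '0' flags with width 8
;       ccall   printf,fmt3,pi          ;'^' flag: pi is passed by address
;
;       fmt1    db '%,u',13,10,0
;       fmt2    db '%#08x',13,10,0
;       fmt3    db '%^.3f',13,10,0
;       pi      dq 3.141592653589793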

DEFAULT_FLOAT_PRECISION                 = 6
DEFAULT_INTEGER_PRECISION               = 1
DEFAULT_CHARACTER_PRECISION             = 1
DEFAULT_STRING_PRECISION                = -1    ;print all characters
DEFAULT_FLOAT_SIZE                      = PRINTF_ELEMENTS_SIZE.dword
DEFAULT_INTEGER_SIZE                    = PRINTF_ELEMENTS_SIZE.word     ;word for 32-bit process
MAXIMUM_CONVERSION_LENGTH               = 96    ;enough space for triple sized binary at 96 bits
MAXIMUM_EXPONENT_LENGTH                 = 8     ;a and A formats produce the longest ('p+16383') at 7 characters
PRINTF_BUFFER_SIZE                      = 1 shl 12
PRINTF_CLASS_ZERO                       = 0
PRINTF_CLASS_NORMAL                     = 1
PRINTF_CLASS_INFINITY                   = 2
PRINTF_CLASS_SNAN                       = 3
PRINTF_CLASS_QNAN                       = 4
EXPONENT_LENGTH_FORMAT_A                = 5
EXPONENT_LENGTH_FORMAT_E                = 3
EXPONENT_LENGTH_FORMAT_G                = 1

TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT  = 12    ;10^4096 enough for 80-bit extended reals up to 10^+-4933
TRIPLE_PRECISION_EXPONENT_TABLE_SCALE   = 4     ;exponent reduction in bits per pass. supports using 1, 2, 3, 4, 6 or 12 only

struct TRIPLE_PRECISION
        mantissa_low            dd ?
        mantissa_mid            dd ?
        mantissa_high           dd ?
        exponent                dd ?
ends
sizeof.TRIPLE_PRECISION.mantissa = 32*3

struct FLOAT_DESCRIPTION
        sign_bit_position       db ?    ;number of leading bits to skip to find the sign bit
        exponent_width          db ?
        implicit_bit_flag       db ?
                                db ?    ;align
        exponent_bias           dd ?
ends

struct PRINTF_OUTPUT
        buffer                  dd ?
        length                  dd ?
        handle                  dd ?
        sent                    dd ?
        error                   dd ?
ends

;elements of a number
; [leading spaces] [sign] [prefix] [padding zeros] [precision zeros] [number] [magnitude zeros] [exponent] [trailing spaces]
; with a decimal point placed somewhere within [precision zeros], [number] or [magnitude zeros]

struct PRINTF_ELEMENTS
        flags                   dd ?    ;as specified in the format string
        width                   dd ?    ;as specified in the format string
        precision               dd ?    ;as specified in the format string
        sign_length             dd ?    ;0 or 1
        prefix_length           dd ?    ;0 to 2
        precision_zeros         dd ?    ;0 to many
        number_length           dd ?    ;0 to MAXIMUM_CONVERSION_LENGTH
        magnitude_zeros         dd ?    ;0 to many
        exponent_length         dd ?    ;0 to MAXIMUM_EXPONENT_LENGTH
        decimal_point           dd ?    ;0 to many. 0 means no decimal point, 1 means after the first digit
        significant_bits        dd ?    ;0 to 64. count of mantissa bits in a float
        written                 dd ?    ;running count of total characters in output
        separator_modulus       dd ?    ;0, 3 or 4. 0 means no separators
        arg_value               TRIPLE_PRECISION
        prefix_string           rb 2    ;'0' or '0x'
        sign_character          rb 1    ;'-', '+' or ' '
        separator_character     rb 1    ;',' or ' '
        number_string           rb MAXIMUM_CONVERSION_LENGTH
        exponent_string         rb MAXIMUM_EXPONENT_LENGTH
ends

PRINTF_ELEMENTS_FLAG.argument_size_mask = 7 shl 0       ;byte, hword, word, dword, triple
PRINTF_ELEMENTS_FLAG.left_justify       = 1 shl 3       ;justify left
PRINTF_ELEMENTS_FLAG.show_sign          = 1 shl 4       ;prefix option: show + or - sign
PRINTF_ELEMENTS_FLAG.blank_sign         = 1 shl 5       ;prefix option: show <space> or - sign
PRINTF_ELEMENTS_FLAG.hash_option        = 1 shl 6       ;prefix option: print 0x (hex), 0 (octal) or decimal point (float)
PRINTF_ELEMENTS_FLAG.zero_pad           = 1 shl 7       ;pad left with zeros
PRINTF_ELEMENTS_FLAG.uppercase          = 1 shl 8       ;push to upper case
PRINTF_ELEMENTS_FLAG.pointer            = 1 shl 9       ;pointer to the argument, not the argument itself
PRINTF_ELEMENTS_FLAG.separator          = 1 shl 10      ;print separator characters (space or comma) within the number
PRINTF_ELEMENTS_FLAG.size_specified     = 1 shl 11      ;don't use default argument size
PRINTF_ELEMENTS_FLAG.precision_specified= 1 shl 12      ;don't use default precision
PRINTF_ELEMENTS_FLAG.negative           = 1 shl 13      ;set if arg_value is negative
PRINTF_ELEMENTS_FLAG.zero               = 1 shl 14      ;set if arg_value is zero

PRINTF_ELEMENTS_SIZE.byte               = 0     ;8-bit
PRINTF_ELEMENTS_SIZE.hword              = 1     ;16-bit
PRINTF_ELEMENTS_SIZE.word               = 2     ;32-bit
PRINTF_ELEMENTS_SIZE.dword              = 3     ;64-bit
PRINTF_ELEMENTS_SIZE.triple             = 4     ;96/80-bit
PRINTF_ELEMENTS_SIZE.pointer            = -2
PRINTF_ELEMENTS_SIZE.null               = -1

struct PRINTF_FLAG_TABLE
        char                    db ?
        flag                    dd ?
ends

struct PRINTF_DECODE_TABLE
        function                dd ?
        default_size            dd ?
ends

section '.datapf' data readable writeable

        if used PRINTF_flag_table
                PRINTF_flag_table:
                        PRINTF_FLAG_TABLE       '-',PRINTF_ELEMENTS_FLAG.left_justify
                        PRINTF_FLAG_TABLE       '+',PRINTF_ELEMENTS_FLAG.show_sign
                        PRINTF_FLAG_TABLE       ' ',PRINTF_ELEMENTS_FLAG.blank_sign
                        PRINTF_FLAG_TABLE       '#',PRINTF_ELEMENTS_FLAG.hash_option
                        PRINTF_FLAG_TABLE       '0',PRINTF_ELEMENTS_FLAG.zero_pad
                        PRINTF_FLAG_TABLE       '^',PRINTF_ELEMENTS_FLAG.pointer
                        PRINTF_FLAG_TABLE       ',',PRINTF_ELEMENTS_FLAG.separator
                        PRINTF_FLAG_TABLE       0
        end if
        if used PRINTF_size_table
                PRINTF_size_table:
                        PRINTF_FLAG_TABLE       'b',PRINTF_ELEMENTS_SIZE.byte   + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       'h',PRINTF_ELEMENTS_SIZE.hword  + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       'w',PRINTF_ELEMENTS_SIZE.word   + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       'd',PRINTF_ELEMENTS_SIZE.dword  + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       't',PRINTF_ELEMENTS_SIZE.triple + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       0
        end if
        if used PRINTF_decode_table
                align 4
                PRINTF_decode_table:
                        PRINTF_DECODE_TABLE     PRINTF_decode_a,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_b,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_c,PRINTF_ELEMENTS_SIZE.byte
                        PRINTF_DECODE_TABLE     PRINTF_decode_d,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_e,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_f,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_g,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_h,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_i,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_j,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_k,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_l,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_m,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_n,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_o,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_p,PRINTF_ELEMENTS_SIZE.pointer
                        PRINTF_DECODE_TABLE     PRINTF_decode_q,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_r,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_s,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_t,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_u,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_v,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_w,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_x,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_y,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_z,PRINTF_ELEMENTS_SIZE.null
        end if
        if used float_description_table
                align 8
                float_description_table:
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa- 8, 4,-1,,1 shl  3 - 2 ;1. 4. 3 format
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa-16, 5,-1,,1 shl  4 - 2 ;1. 5.10 format
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa-32, 8,-1,,1 shl  7 - 2 ;1. 8.23 format
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa-64,11,-1,,1 shl 10 - 2 ;1.11.52 format
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa-80,15, 0,,1 shl 14 - 2 ;1.15.63 format + explicit mantissa MSb
        end if
        if used PRINTF_integer_base_10_multiples_32
                align 4
                PRINTF_integer_base_10_multiples_32:
                        dd      1-1
                        dd      10-1
                        dd      100-1
                        dd      1000-1
                        dd      10000-1
                        dd      100000-1
                        dd      1000000-1
                        dd      10000000-1
                        dd      100000000-1
                        dd      1000000000-1
                        dd      -1
                PRINTF_integer_base_10_multiples_64:
                        dq      10000000000-1
                        dq      100000000000-1
                        dq      1000000000000-1
                        dq      10000000000000-1
                        dq      100000000000000-1
                        dq      1000000000000000-1
                        dq      10000000000000000-1
                        dq      100000000000000000-1
                        dq      1000000000000000000-1
                        dq      10000000000000000000-1
                        dq      -1
                PRINTF_integer_base_10_multiples_96:
                    .l = 10000000000000000000 and (1 shl 32 - 1)
                    .h = 10000000000000000000 shr 32
                        dd      (10         * .l) and (1 shl 32 - 1)-1, (10         * .l) shr 32 + (10         * .h) and (1 shl 32 - 1), (10         * .h) shr 32
                        dd      (100        * .l) and (1 shl 32 - 1)-1, (100        * .l) shr 32 + (100        * .h) and (1 shl 32 - 1), (100        * .h) shr 32
                        dd      (1000       * .l) and (1 shl 32 - 1)-1, (1000       * .l) shr 32 + (1000       * .h) and (1 shl 32 - 1), (1000       * .h) shr 32
                        dd      (10000      * .l) and (1 shl 32 - 1)-1, (10000      * .l) shr 32 + (10000      * .h) and (1 shl 32 - 1), (10000      * .h) shr 32
                        dd      (100000     * .l) and (1 shl 32 - 1)-1, (100000     * .l) shr 32 + (100000     * .h) and (1 shl 32 - 1), (100000     * .h) shr 32
                        dd      (1000000    * .l) and (1 shl 32 - 1)-1, (1000000    * .l) shr 32 + (1000000    * .h) and (1 shl 32 - 1), (1000000    * .h) shr 32
                        dd      (10000000   * .l) and (1 shl 32 - 1)-1, (10000000   * .l) shr 32 + (10000000   * .h) and (1 shl 32 - 1), (10000000   * .h) shr 32
                        dd      (100000000  * .l) and (1 shl 32 - 1)-1, (100000000  * .l) shr 32 + (100000000  * .h) and (1 shl 32 - 1), (100000000  * .h) shr 32
                        dd      (1000000000 * .l) and (1 shl 32 - 1)-1, (1000000000 * .l) shr 32 + (1000000000 * .h) and (1 shl 32 - 1), (1000000000 * .h) shr 32
                        dd      -1,-1,-1
        end if
        if used PRINTF_integer_reciprocals
                align 4
                PRINTF_integer_reciprocals:
                        dd      (1 shl 24 + 0)/1        ;ceiling(2^24/1) binary
                        dd      (1 shl 24 + 1)/2        ;ceiling(2^24/2) quaternary
                        dd      (1 shl 24 + 2)/3        ;ceiling(2^24/3) octal
                        dd      (1 shl 24 + 3)/4        ;ceiling(2^24/4) hexadecimal
                        dd      (1 shl 24 + 4)/5        ;ceiling(2^24/5) base-32
        end if


        if used PRINTF_triple_precision_power_table
                align 16
                PRINTF_triple_precision_power_table rb TRIPLE_PRECISION_EXPONENT_TABLE_SIZE*sizeof.TRIPLE_PRECISION
        end if

section '.codepf' code executable

PRINTF_PROC_FLAG_RESTORE_ESP = 1 shl 32
PRINTF_PROC_FLAG_RESTORE_EBP = 1 shl 33

macro proc_leaf [args] { common
        prologue@proc equ PRINTF_prologue_leaf
        proc args
        restore prologue@proc
}
macro PRINTF_prologue_leaf procname,flag,parambytes,localbytes,reglist {
        PRINTF_prologue procname,flag,parambytes,localbytes,reglist,1
}

macro PRINTF_prologue procname,flag,parambytes,localbytes,reglist,leaf {
        local   varsize,regsize
        varsize = (localbytes + 3) and (not 3)
        match x,leaf \{
                localbase@proc  equ esp-varsize
                parmbase@proc   equ esp+4+regsize
                rept 0 \{
        \} rept 1 \{
                localbase@proc  equ ebp-varsize
                parmbase@proc   equ ebp+4+regsize
        \}
        regsize = 0
        irps reg,reglist \{
                push    reg
                regsize = regsize + 4
        \}
        if (parambytes | localbytes) & ~ leaf+0
                regsize = regsize + 4
                flag = flag or PRINTF_PROC_FLAG_RESTORE_EBP
                push    ebp
                mov     ebp,esp
                if localbytes
                        flag = flag or PRINTF_PROC_FLAG_RESTORE_ESP
                        add     esp,-varsize
                end if
        end if
}

macro PRINTF_epilogue procname,flag,parambytes,localbytes,reglist {
        if flag and PRINTF_PROC_FLAG_RESTORE_ESP
                sub     esp,-((localbytes + 3) and (not 3))
        end if
        if flag and PRINTF_PROC_FLAG_RESTORE_EBP
                pop     ebp
        end if
        irps reg,reglist \{ reverse
                pop     reg
        \}
        if flag and 10000b
                retn                    ;c call
        else
                retn    parambytes      ;standard call
        end if
}

prologue@proc equ PRINTF_prologue
epilogue@proc equ PRINTF_epilogue

proc printf c format,arglist
        local   buffer[PRINTF_BUFFER_SIZE]:BYTE
        invoke  GetStdHandle,STD_OUTPUT_HANDLE
        stdcall PRINTF,eax,addr buffer,PRINTF_BUFFER_SIZE,[format],addr arglist
        ret
endp

proc vprintf format,arglist
        local   buffer[PRINTF_BUFFER_SIZE]:BYTE
        invoke  GetStdHandle,STD_OUTPUT_HANDLE
        stdcall PRINTF,eax,addr buffer,PRINTF_BUFFER_SIZE,[format],[arglist]
        ret
endp

proc fprintf c handle,format,arglist
        local   buffer[PRINTF_BUFFER_SIZE]:BYTE
        stdcall PRINTF,[handle],addr buffer,PRINTF_BUFFER_SIZE,[format],addr arglist
        ret
endp

proc vfprintf handle,format,arglist
        local   buffer[PRINTF_BUFFER_SIZE]:BYTE
        stdcall PRINTF,[handle],addr buffer,PRINTF_BUFFER_SIZE,[format],[arglist]
        ret
endp

proc sprintf c dest,format,arglist
        stdcall PRINTF,INVALID_HANDLE_VALUE,[dest],-1,[format],addr arglist
        ret
endp

proc vsprintf dest,format,arglist
        stdcall PRINTF,INVALID_HANDLE_VALUE,[dest],-1,[format],[arglist]
        ret
endp

proc snprintf c dest,size,format,arglist
        stdcall PRINTF,INVALID_HANDLE_VALUE,[dest],[size],[format],addr arglist
        ret
endp

proc vsnprintf dest,size,format,arglist
        stdcall PRINTF,INVALID_HANDLE_VALUE,[dest],[size],[format],[arglist]
        ret
endp

proc PRINTF uses ebx esi edi,handle,dest,size,format,arglist
        locals
                output          PRINTF_OUTPUT
                elements        PRINTF_ELEMENTS
        endl
        mov     edx,[dest]
        mov     ecx,[size]
        mov     eax,[handle]
        mov     esi,[format]
        xor     ebx,ebx
        mov     [output.buffer],edx
        mov     [output.length],ecx
        mov     [output.handle],eax
        mov     [output.sent],ebx
        mov     [output.error],ebx
    .next:
        xor     eax,eax
        lea     edi,[elements]
        mov     ecx,sizeof.PRINTF_ELEMENTS/4
        rep     stosd
    .next_char:
        movzx   eax,byte[esi]
        inc     esi
        test    eax,eax
        jz      .done
        cmp     al,'%'
        jz      .process_flags
    .print_char:
        stdcall PRINTF_elements_print_character,addr elements,eax,addr output
        jmp     .next_char
    .process_flags:
        movzx   eax,byte[esi]
        inc     esi
        test    eax,eax
        jz      .done
    .next_flag:
        mov     ebx,PRINTF_flag_table
        call    .convert_table
        test    eax,eax
        jz      .done
        cmp     [ebx+PRINTF_FLAG_TABLE.char],0
        jnz     .next_flag
    .process_width:
        cmp     al,'*'
        jz      .width_from_argument
        cmp     al,'1'
        jb      .process_precision
        cmp     al,'9'
        ja      .process_precision
    .width_from_argument:
        call    .convert_decimal
        mov     [elements.width],edi
        test    eax,eax
        jz      .done
    .process_precision:
        cmp     al,'.'
        jnz     .process_size
        movzx   eax,byte[esi]
        inc     esi
        test    eax,eax
        jz      .done
        xor     edi,edi
        cmp     al,'*'
        jz      .precision_from_argument
        cmp     al,'0'
        jb      .set_precision
        cmp     al,'9'
        ja      .set_precision
    .precision_from_argument:
        call    .convert_decimal
    .set_precision:
        mov     [elements.precision],edi
        or      [elements.flags],PRINTF_ELEMENTS_FLAG.precision_specified
        test    eax,eax
        jz      .done
    .process_size:
        mov     ebx,PRINTF_size_table
        call    .convert_table
        test    eax,eax
        jz      .done
    .process_decode:
        movzx   ebx,al
        cmp     bl,'A'
        jb      .check_type             ;'A' itself is a valid uppercase type, so only skip below 'A'
        cmp     bl,'Z'
        ja      .check_type
        add     bl,'a'-'A'
        or      [elements.flags],PRINTF_ELEMENTS_FLAG.uppercase
    .check_type:
        cmp     bl,'a'
        jb      .print_char
        cmp     bl,'z'
        ja      .print_char
        ;compute current output length
        mov     edi,[output.buffer]
        sub     edi,[dest]
        mov     [elements.written],edi
        ;get the argument value
        stdcall PRINTF_read_arg,addr elements.arg_value,[arglist],\
                        [PRINTF_decode_table+(ebx-'a')*sizeof.PRINTF_DECODE_TABLE+PRINTF_DECODE_TABLE.default_size],\
                        [elements.flags]
        or      [elements.flags],edx    ;set the size
        mov     [arglist],ecx
        ;call the decoder
        mov     eax,ebx                 ;we pass the character type to the function in eax (for printing the unknown type)
        stdcall [PRINTF_decode_table+(ebx-'a')*sizeof.PRINTF_DECODE_TABLE+PRINTF_DECODE_TABLE.function],addr elements,addr output
        jmp     .next
    .done:
        cmp     [output.handle],INVALID_HANDLE_VALUE
        jne     .flush
        stdcall PRINTF_elements_print_character,addr elements,0,addr output
        mov     edx,[output.buffer]
        cmp     [output.length],0
        jnz     .return_no_overflow
        mov     eax,edx
        sub     eax,[dest]
        cmp     eax,[size]
        jz      .return_no_overflow
        sub     eax,[size]
        neg     eax                     ;return negative the number of extra bytes required
        cmp     [size],0
        jz      .ret
        mov     byte[edx+eax-1],0       ;always return a null terminated string
    .ret:
        ret
    .return_no_overflow:
        lea     eax,[edx-1]             ;don't include the terminating null
        sub     eax,[dest]              ;return number of characters stored
        ret
    .flush:
        stdcall PRINTF_flush_buffer,addr output
        mov     eax,[output.sent]       ;return number of characters output
        mov     ecx,[output.error]      ;return non-zero if there was an error
        ret

    .convert_table:
        cmp     [ebx+PRINTF_FLAG_TABLE.char],0
        jz      .table_done
        cmp     [ebx+PRINTF_FLAG_TABLE.char],al
        jz      .set_flag
        add     ebx,sizeof.PRINTF_FLAG_TABLE
        jmp     .convert_table
    .set_flag:
        mov     eax,[ebx+PRINTF_FLAG_TABLE.flag]
        or      [elements.flags],eax
        movzx   eax,byte[esi]
        inc     esi
    .table_done:
        retn

    .convert_decimal:
        cmp     al,'*'
        jz      .decimal_from_argument
        lea     edi,[eax-'0']
    .decimal_next:
        movzx   eax,byte[esi]
        inc     esi
        test    eax,eax
        jz      .decimal_last
        sub     eax,'0'
        cmp     eax,9
        ja      .decimal_done
        lea     edi,[edi*5]
        lea     edi,[edi*2+eax]
        jmp     .decimal_next
    .decimal_from_argument:
        mov     eax,[arglist]
        mov     edi,[eax]
        add     eax,4
        mov     [arglist],eax
        movzx   eax,byte[esi]
        inc     esi
        jmp     .decimal_last
    .decimal_done:
        add     eax,'0'
    .decimal_last:
        retn

endp

proc_leaf PRINTF_read_arg uses edi esi ebx,dest,arg_pointer,default_size,flags
        ;return ecx=new arg pointer, edx=argument size
        mov     edx,[default_size]
        mov     ecx,[arg_pointer]
        cmp     edx,PRINTF_ELEMENTS_SIZE.null
        jz      .null
        mov     ebx,[flags]
        mov     edi,[dest]
        mov     esi,ecx
        add     ecx,1 shl DEFAULT_INTEGER_SIZE
        test    ebx,PRINTF_ELEMENTS_FLAG.pointer
        jz      .address_known
        mov     esi,[esi]
    .address_known:
        cmp     edx,PRINTF_ELEMENTS_SIZE.pointer
        jz      .pointer
        test    ebx,PRINTF_ELEMENTS_FLAG.size_specified
        jz      .size_known
        mov     edx,ebx
        and     edx,PRINTF_ELEMENTS_FLAG.argument_size_mask
    .size_known:
        push    edx
        cmp     edx,PRINTF_ELEMENTS_SIZE.byte
        jz      .read_byte
        cmp     edx,PRINTF_ELEMENTS_SIZE.hword
        jz      .read_hword
        cmp     edx,PRINTF_ELEMENTS_SIZE.word
        jz      .read_word
        cmp     edx,PRINTF_ELEMENTS_SIZE.dword
        jz      .read_dword
        cmp     edx,PRINTF_ELEMENTS_SIZE.triple
        jz      .read_triple
        int3
    .read_byte:
        movzx   eax,byte[esi]
        jmp     .store_word
    .read_hword:
        movzx   eax,word[esi]
        jmp     .store_word
    .read_word:
        mov     eax,[esi]
        jmp     .store_word
    .read_dword:
        mov     eax,[esi]
        mov     edx,[esi+4]
        jmp     .store_dword
    .read_triple:
        mov     eax,[esi]
        mov     edx,[esi+4]
        mov     ebx,[esi+8]
        mov     [edi+8],ebx
        add     esi,4
    .store_dword:
        mov     [edi+4],edx
        add     esi,4
    .store_word:
        mov     [edi+0],eax
        add     esi,4
        pop     edx
        test    [flags],PRINTF_ELEMENTS_FLAG.pointer
        jnz     .done
        mov     ecx,esi
    .done:
        ret
    .pointer:
        mov     [edi+0],esi
    .null:
        xor     edx,edx         ;return no size value
        ret
endp

proc PRINTF_decode_a uses esi edi ebx,elements,output
        ;output a hexadecimal floating point number in scientific notation "-0x1.hhhhhhp+dddd"
        ;mantissa digits are in hexadecimal. exponent digits are in decimal
        ;the precision field specifies the number of mantissa digits
        ;the exponent part is the decimal value of the unbiased binary exponent (2^dddd)
        locals
                rounding                TRIPLE_PRECISION
                significant_hex_digits  dd ?
        endl
        mov     ebx,[elements]
        stdcall PRINTF_get_float_parameters,ebx,0
        test    edx,edx
        jnz     .converted
        mov     eax,[ebx+PRINTF_ELEMENTS.significant_bits]
        add     eax,2
        shr     eax,2
        mov     [significant_hex_digits],eax
        mov     eax,[ebx+PRINTF_ELEMENTS.precision]
        mov     ecx,eax
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero
        jz      .set_precision
        ;zero
        mov     [ebx+PRINTF_ELEMENTS.magnitude_zeros],eax
        jmp     .shifted
    .set_precision:
        sub     eax,[significant_hex_digits]
        jle     .scale
        mov     [ebx+PRINTF_ELEMENTS.magnitude_zeros],eax
        mov     ecx,[significant_hex_digits]
    .scale:
        ;round to 96-(4*number_length+2)
        xor     eax,eax
        mov     [rounding.mantissa_low],eax
        mov     [rounding.mantissa_mid],eax
        mov     [rounding.mantissa_high],eax
        shl     ecx,2
        neg     ecx
        add     ecx,sizeof.TRIPLE_PRECISION.mantissa-2
        bts     [rounding.mantissa_low],ecx
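        ;worked example, for illustration, assuming the 96-bit mantissa that the
        ;comment above refers to: with 6 mantissa hex digits ecx = 96-2-4*6 = 70,
        ;so bit 70 is set in the rounding value. the leading 1 of the normalised
        ;mantissa is bit 95 and the six digits occupy bits 94..71, which makes
        ;bit 70 the first discarded bit. adding it rounds half up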
        mov     eax,[ebx+PRINTF_ELEMENTS.arg_value.mantissa_low]
        mov     esi,[ebx+PRINTF_ELEMENTS.arg_value.mantissa_mid]
        mov     edx,[ebx+PRINTF_ELEMENTS.arg_value.mantissa_high]
        add     eax,[rounding.mantissa_low]
        adc     esi,[rounding.mantissa_mid]
        adc     edx,[rounding.mantissa_high]
        jnc     .shift
        ;if it overflowed then renormalise
        shrd    eax,esi,1
        shrd    esi,edx,1
        stc
        rcr     edx,1
        inc     [ebx+PRINTF_ELEMENTS.arg_value.exponent]
    .shift:
        sub     ecx,32-1
        jb      .final_shift
    .shifter_loop:
        mov     eax,esi
        mov     esi,edx
        xor     edx,edx
        sub     ecx,32
        jae     .shifter_loop
    .final_shift:
        xor     edi,edi
        shrd    edi,eax,cl
        shrd    eax,esi,cl
        shrd    esi,edx,cl
        shr     edx,cl
        mov     [ebx+PRINTF_ELEMENTS.arg_value.mantissa_low],eax
        mov     [ebx+PRINTF_ELEMENTS.arg_value.mantissa_mid],esi
        mov     [ebx+PRINTF_ELEMENTS.arg_value.mantissa_high],edx
        dec     [ebx+PRINTF_ELEMENTS.arg_value.exponent]
    .shifted:
        ;0x prefix
        mov     word[ebx+PRINTF_ELEMENTS.prefix_string],'0x'
        mov     [ebx+PRINTF_ELEMENTS.prefix_length],2
        ;decimal point
        mov     edx,[ebx+PRINTF_ELEMENTS.precision]
        test    edx,edx
        setnz   cl
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        setnz   ch
        or      cl,ch
        movzx   ecx,cl
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
        ;format exponent
        stdcall PRINTF_format_decimal_exponent,+'p',addr ebx+PRINTF_ELEMENTS.exponent_string,EXPONENT_LENGTH_FORMAT_A,[ebx+PRINTF_ELEMENTS.arg_value.exponent]
        mov     [ebx+PRINTF_ELEMENTS.exponent_length],eax
        ;format mantissa
        mov     [ebx+PRINTF_ELEMENTS.number_string],'0'
        stdcall PRINTF_base2_radix_unsigned,addr ebx+PRINTF_ELEMENTS.number_string,addr ebx+PRINTF_ELEMENTS.arg_value,16
        cmp     eax,1
        adc     eax,0
        mov     [ebx+PRINTF_ELEMENTS.number_length],eax
    .converted:
        stdcall PRINTF_elements_print,ebx,[output]
        ret
endp

proc PRINTF_decode_b elements,output
        ;output a binary unsigned integer "bbbb"
        ;digits are in binary
        ;the precision field specifies the minimum number of bits
        mov     ecx,[elements]
        test    [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
        jz      .separator_okay
        mov     [ecx+PRINTF_ELEMENTS.separator_character],' '
        mov     [ecx+PRINTF_ELEMENTS.separator_modulus],4
    .separator_okay:
        stdcall PRINTF_elements_print_integer,ecx,2,[output]
        ret
endp

proc PRINTF_decode_c uses ebx esi,elements,output
        ;output repetitions of a single character
        ;ignores zero value arguments and thus won't prematurely terminate the output string
        mov     ebx,[elements]
        movzx   eax,byte[ebx+PRINTF_ELEMENTS.arg_value]
        test    eax,eax
        jz      .no_character
        ;get precision to esi
        mov     esi,DEFAULT_CHARACTER_PRECISION
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.precision_specified
        jz      .precision_okay
        mov     esi,[ebx+PRINTF_ELEMENTS.precision]
    .precision_okay:
        ;print left padding
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.left_justify
        jnz     .leading_spaces_done
        mov     ecx,[ebx+PRINTF_ELEMENTS.width]
        sub     ecx,esi
        jbe     .leading_spaces_done
        stdcall PRINTF_elements_print_repeated_character,ebx,+' ',ecx,[output]
    .leading_spaces_done:
        movzx   eax,byte[ebx+PRINTF_ELEMENTS.arg_value]
        stdcall PRINTF_elements_print_repeated_character,ebx,eax,esi,[output]
        ;print right padding
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.left_justify
        jz      .trailing_spaces_done
        mov     ecx,[ebx+PRINTF_ELEMENTS.width]
        sub     ecx,esi
        jbe     .trailing_spaces_done
        stdcall PRINTF_elements_print_repeated_character,ebx,+' ',ecx,[output]
    .trailing_spaces_done:
        ret
    .no_character:
        stdcall PRINTF_elements_print_repeated_character,ebx,+' ',[ebx+PRINTF_ELEMENTS.width],[output]
        ret
endp

PRINTF_decode_d = PRINTF_decode_unknown

proc PRINTF_decode_e uses ebx,elements,output
        ;output a decimal floating point number in scientific notation "-d.dddddde+ddd"
        ;mantissa digits are in decimal. exponent digits are in decimal
        ;the precision field specifies the number of mantissa digits after the decimal point
        ;the exponent part is the decimal value of the unbiased decimal exponent (10^ddd)
        mov     ebx,[elements]
        stdcall PRINTF_get_float_parameters,ebx,-1
        test    edx,edx
        jnz     .print
        stdcall PRINTF_generate_e,ebx,addr ebx+PRINTF_ELEMENTS.arg_value,eax,EXPONENT_LENGTH_FORMAT_E
    .print:
        stdcall PRINTF_elements_print,ebx,[output]
        ret
endp

proc PRINTF_decode_f uses ebx,elements,output
        ;output a decimal floating point number "-dddd.dddddd"
        ;digits are in decimal
        ;the precision field specifies the number of digits after the decimal point
        mov     ebx,[elements]
        stdcall PRINTF_get_float_parameters,ebx,-1
        test    edx,edx
        jnz     .print
        stdcall PRINTF_generate_f,ebx,addr ebx+PRINTF_ELEMENTS.arg_value,eax
    .print:
        stdcall PRINTF_elements_print,ebx,[output]
        ret
endp

proc PRINTF_decode_g uses edi esi ebx,elements,output
        ;output the shortest representation of e, E, f or F format
        ;the precision field specifies the number of significant digits
        locals
                other_elements  PRINTF_ELEMENTS
        endl
        mov     ebx,[elements]
        stdcall PRINTF_get_float_parameters,ebx,-1
        test    edx,edx
        jnz     .print
        mov     esi,ebx
        lea     edi,[other_elements]
        mov     ecx,sizeof.PRINTF_ELEMENTS/4
        rep     movsd
        mov     esi,eax                 ;esi=log10
        mov     edi,[ebx+PRINTF_ELEMENTS.precision]     ;specified number of significant digits
        stdcall PRINTF_decode_g_format_e,ebx,addr ebx+PRINTF_ELEMENTS.arg_value,esi,edi,EXPONENT_LENGTH_FORMAT_G
        stdcall PRINTF_decode_g_format_f,addr other_elements,addr ebx+PRINTF_ELEMENTS.arg_value,esi,edi
        stdcall PRINTF_get_minimum_length,ebx
        mov     edi,eax
        stdcall PRINTF_get_minimum_length,addr other_elements
        cmp     eax,edi
        ja      .print
        lea     ebx,[other_elements]
    .print:
        stdcall PRINTF_elements_print,ebx,[output]
        ret
endp

PRINTF_decode_h = PRINTF_decode_unknown

proc PRINTF_decode_i uses ebx,elements,output
        ;output a decimal signed integer "-dddd"
        ;digits are in decimal
        ;the precision field specifies the minimum number of digits
        mov     ecx,[elements]
        test    [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
        jz      .separator_okay
        mov     [ecx+PRINTF_ELEMENTS.separator_character],','
        mov     [ecx+PRINTF_ELEMENTS.separator_modulus],3
    .separator_okay:
        mov     edx,[ecx+PRINTF_ELEMENTS.flags]
        mov     eax,dword[ecx+PRINTF_ELEMENTS.arg_value]
        and     edx,PRINTF_ELEMENTS_FLAG.argument_size_mask
        cmp     edx,PRINTF_ELEMENTS_SIZE.byte
        jz      .check_byte
        cmp     edx,PRINTF_ELEMENTS_SIZE.hword
        jz      .check_hword
        cmp     edx,PRINTF_ELEMENTS_SIZE.word
        jz      .check_word
        cmp     edx,PRINTF_ELEMENTS_SIZE.dword
        jz      .check_dword
        cmp     edx,PRINTF_ELEMENTS_SIZE.triple
        jz      .check_triple
        int3
    .check_byte:
        test    al,al
        jns     .print
        neg     al
        jmp     .store_word
    .check_hword:
        test    ax,ax
        jns     .print
        neg     ax
        jmp     .store_word
    .check_word:
        test    eax,eax
        jns     .print
        neg     eax
        jmp     .store_word
    .check_dword:
        mov     edx,dword[ecx+PRINTF_ELEMENTS.arg_value+4]
        test    edx,edx
        jns     .print
        not     edx
        neg     eax
        cmc
        adc     edx,0
        jmp     .store_dword
    .check_triple:
        mov     edx,dword[ecx+PRINTF_ELEMENTS.arg_value+4]
        mov     ebx,dword[ecx+PRINTF_ELEMENTS.arg_value+8]
        test    ebx,ebx
        jns     .print
        not     eax
        not     edx
        not     ebx
        add     eax,1
        adc     edx,0
        adc     ebx,0
        mov     dword[ecx+PRINTF_ELEMENTS.arg_value+8],ebx
    .store_dword:
        mov     dword[ecx+PRINTF_ELEMENTS.arg_value+4],edx
    .store_word:
        mov     dword[ecx+PRINTF_ELEMENTS.arg_value],eax
        or      [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.negative
    .print:
        stdcall PRINTF_elements_print_integer,ecx,10,[output]
        ret
endp
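
;worked example of the 64-bit negation in .check_dword above, for illustration:
;       edx:eax = 0xFFFFFFFF:0x00000000 (-2^32)
;       not edx         edx = 0x00000000
;       neg eax         eax = 0x00000000, CF = 0 because eax was zero
;       cmc             CF = 1
;       adc edx,0       edx = 0x00000001
;       result edx:eax = 0x00000001:0x00000000 (2^32)
;when eax is non-zero, neg sets CF = 1, cmc clears it and edx keeps its inverted value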

PRINTF_decode_j = PRINTF_decode_unknown
PRINTF_decode_k = PRINTF_decode_unknown
PRINTF_decode_l = PRINTF_decode_unknown
PRINTF_decode_m = PRINTF_decode_unknown

proc PRINTF_decode_n elements,output
        ;nothing printed. the argument is a pointer to an unsigned integer
        ;the number of characters written so far is stored in the pointed location
        mov     ecx,[elements]
        mov     eax,dword[ecx+PRINTF_ELEMENTS.arg_value]
        mov     edx,[ecx+PRINTF_ELEMENTS.written]
        mov     [eax],edx
        ret
endp

proc PRINTF_decode_o elements,output
        ;output an octal unsigned integer "oooo"
        ;digits are in octal
        ;the precision field specifies the minimum number of digits
        mov     ecx,[elements]
        test    [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
        jz      .separator_okay
        mov     [ecx+PRINTF_ELEMENTS.separator_character],' '
        mov     [ecx+PRINTF_ELEMENTS.separator_modulus],4
    .separator_okay:
        test    [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        jz      .print
        mov     [ecx+PRINTF_ELEMENTS.prefix_string],'0'
        mov     [ecx+PRINTF_ELEMENTS.prefix_length],1
    .print:
        stdcall PRINTF_elements_print_integer,ecx,8,[output]
        ret
endp

PRINTF_decode_p = PRINTF_decode_x

proc PRINTF_decode_q elements,output
        ;output a quaternary unsigned integer "qqqq"
        ;digits are in quaternary
        ;the precision field specifies the minimum number of digits
        mov     ecx,[elements]
        test    [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
        jz      .separator_okay
        mov     [ecx+PRINTF_ELEMENTS.separator_character],' '
        mov     [ecx+PRINTF_ELEMENTS.separator_modulus],4
    .separator_okay:
        stdcall PRINTF_elements_print_integer,ecx,4,[output]
        ret
endp

proc PRINTF_decode_r uses ebx,elements,output
        ;output a decimal floating point number without trailing zeros "-dddd.dddddd"
        ;digits are in decimal
        ;the precision field specifies the maximum number of significant digits after the decimal point
        mov     ebx,[elements]
        stdcall PRINTF_get_float_parameters,ebx,-1
        test    edx,edx
        jnz     .print
        stdcall PRINTF_generate_f,ebx,addr ebx+PRINTF_ELEMENTS.arg_value,eax
        stdcall PRINTF_get_base_length,ebx
        cmp     eax,1
        jbe     .print                  ;a single character is always printed
        cmp     [ebx+PRINTF_ELEMENTS.decimal_point],0
        jz      .print
        dec     eax
        ;count the number of trailing zeros
        mov     edx,[ebx+PRINTF_ELEMENTS.magnitude_zeros]
        mov     ecx,[ebx+PRINTF_ELEMENTS.number_length]
        test    ecx,ecx
        jz      .add_precision_zeros
    .number_loop:
        dec     ecx
        cmp     [ebx+PRINTF_ELEMENTS.number_string+ecx],'0'
        jnz     .trailing_zero_count_known
        inc     edx
        test    ecx,ecx
        jnz     .number_loop
    .add_precision_zeros:
        add     edx,[ebx+PRINTF_ELEMENTS.precision_zeros]
    .trailing_zero_count_known:
        ;eax=minimum length
        ;edx=trailing zeros
        sub     eax,[ebx+PRINTF_ELEMENTS.decimal_point] ;eax=maximum allowable number of zeros to remove
        jz      .print
        cmp     edx,eax
        jbe     .remove
        mov     edx,eax
    .remove:
    ;remove from trailing zeros
        sub     [ebx+PRINTF_ELEMENTS.magnitude_zeros],edx
        jae     .check_decimal_removal
        xor     eax,eax
        mov     edx,[ebx+PRINTF_ELEMENTS.magnitude_zeros]
        mov     [ebx+PRINTF_ELEMENTS.magnitude_zeros],eax
        neg     edx
    ;remove from formatted number
        sub     [ebx+PRINTF_ELEMENTS.number_length],edx
        jae     .check_decimal_removal
        mov     edx,[ebx+PRINTF_ELEMENTS.number_length]
        mov     [ebx+PRINTF_ELEMENTS.number_length],eax
        neg     edx
    ;remove from precision zeros
        sub     [ebx+PRINTF_ELEMENTS.precision_zeros],edx
    .check_decimal_removal:
        ;remove last decimal point if not required
        stdcall PRINTF_get_base_length,ebx
        dec     eax                     ;adjust for DP
        cmp     eax,[ebx+PRINTF_ELEMENTS.decimal_point]
        setz    cl
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        setz    ch
        and     cl,ch
        movzx   ecx,cl
        dec     ecx
        and     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
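        ;the setz/and/dec sequence above builds a mask without branching:
        ;       both tests true:  cl = 1, ecx = 1, dec gives 0  (decimal_point is cleared)
        ;       either one false: cl = 0, ecx = 0, dec gives -1 (decimal_point is unchanged)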
    .print:
        stdcall PRINTF_elements_print,ebx,[output]
        ret
endp

proc PRINTF_decode_s elements,output
        ;output a null terminated string of characters
        ;the precision specifies the maximum output length
        mov     eax,[elements]
        mov     ecx,DEFAULT_STRING_PRECISION
        test    [eax+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.precision_specified
        jz      .specified_precision_known
        mov     ecx,[eax+PRINTF_ELEMENTS.precision]
    .specified_precision_known:
        stdcall PRINTF_elements_print_string,eax,dword[eax+PRINTF_ELEMENTS.arg_value],ecx,[output]
        ret
endp

PRINTF_decode_t = PRINTF_decode_unknown

proc PRINTF_decode_u elements,output
        ;output a decimal unsigned integer "dddd"
        ;digits are in decimal
        ;the precision field specifies the minimum number of digits
        mov     ecx,[elements]
        test    [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
        jz      .separator_okay
        mov     [ecx+PRINTF_ELEMENTS.separator_character],','
        mov     [ecx+PRINTF_ELEMENTS.separator_modulus],3
    .separator_okay:
        stdcall PRINTF_elements_print_integer,ecx,10,[output]
        ret
endp

PRINTF_decode_v = PRINTF_decode_unknown
PRINTF_decode_w = PRINTF_decode_unknown

proc PRINTF_decode_x elements,output
        ;output a hexadecimal unsigned integer "hhhh"
        ;digits are in hexadecimal
        ;the precision field specifies the minimum number of digits
        mov     ecx,[elements]
        test    [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
        jz      .separator_okay
        mov     [ecx+PRINTF_ELEMENTS.separator_character],' '
        mov     [ecx+PRINTF_ELEMENTS.separator_modulus],4
    .separator_okay:
        test    [ecx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        jz      .print
        mov     word[ecx+PRINTF_ELEMENTS.prefix_string],'0x'
        mov     [ecx+PRINTF_ELEMENTS.prefix_length],2
    .print:
        stdcall PRINTF_elements_print_integer,ecx,16,[output]
        ret
endp

PRINTF_decode_y = PRINTF_decode_unknown
PRINTF_decode_z = PRINTF_decode_unknown

proc PRINTF_decode_unknown elements,output
        ;the unknown character type is in eax
        ;output the type character
        ;this also allows a single % symbol to be output by placing a pair (%%) in the format string
        stdcall PRINTF_elements_print_character,[elements],eax,[output]
        ret
endp

proc PRINTF_decode_g_format_e uses ebx,elements,value,log10,significant_digits,exponent_length
        ;return eax=number of significant digits printed
        ;       precision setting for E format significant digits
        ;       precision = SD-1
        mov     ebx,[elements]
        mov     ecx,[log10]
        mov     eax,[significant_digits]
        dec     eax
        jns     .precision_known
        xor     eax,eax
    .precision_known:
        mov     [ebx+PRINTF_ELEMENTS.precision],eax
        stdcall PRINTF_generate_e,ebx,[value],ecx,[exponent_length]
        cmp     eax,[significant_digits]
        jbe     .done
        dec     [ebx+PRINTF_ELEMENTS.precision]
        js      .done
        stdcall PRINTF_generate_e,ebx,[value],[log10],[exponent_length]
    .done:
        ret
endp

proc PRINTF_decode_g_format_f uses ebx,elements,value,log10,significant_digits
        ;return eax=number of significant digits printed
        ;       precision setting for F format significant digits
        ;       log10 < 0 then precision = SD-1-log10
        ;       log10 >= 0 then precision = max(SD-1-log10, 0)
        mov     ebx,[elements]
        mov     ecx,[log10]
        mov     eax,[significant_digits]
        sub     eax,ecx
        dec     eax
        jns     .precision_known
        xor     eax,eax
    .precision_known:
        mov     [ebx+PRINTF_ELEMENTS.precision],eax
        stdcall PRINTF_generate_f,ebx,[value],ecx
        cmp     eax,[significant_digits]
        jbe     .done
        dec     [ebx+PRINTF_ELEMENTS.precision]
        js      .done
        stdcall PRINTF_generate_f,ebx,[value],[log10]
    .done:
        ret
endp
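
;worked examples of the precision rule above with SD = 6 significant digits, for illustration:
;       log10 =  2 (a value such as 123.456):    precision = 6-1-2    = 3
;       log10 = -3 (a value such as 0.00123456): precision = 6-1-(-3) = 8
;       log10 =  7 (a value such as 12345678):   precision = 6-1-7    = -2, clamped to 0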

macro PRINTF_ceiling_log2_10 reg,bits,offset {
        ;computes: bits * log10(2) + roundup + offset
        ;result is valid for input values up to 2620 bits
        imul    reg,bits,631306                 ;ceiling(2^21 * log10(2))
        add     reg,(offset+1) shl 21 - 1       ;round up and add the offset
        shr     reg,21
}
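
;worked examples for PRINTF_ceiling_log2_10 with offset = 1, for illustration (2^21 = 2097152):
;       bits = 24: (24*631306 + 2*2097152 - 1) shr 21 = 19345647 shr 21 = 9
;                  check: ceiling(24 * 0.30103) + 1 = 8 + 1 = 9
;       bits = 64: (64*631306 + 2*2097152 - 1) shr 21 = 44597887 shr 21 = 21
;                  check: ceiling(64 * 0.30103) + 1 = 20 + 1 = 21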

proc PRINTF_generate_e uses ebx,elements,value,log10,exponent_length
        ;return eax=length of printed number
        locals
                scaled_value                    TRIPLE_PRECISION
                significant_decimal_digits      dd ?
        endl
        mov     ebx,[elements]
        mov     edx,[ebx+PRINTF_ELEMENTS.precision]
        test    edx,edx
        setnz   cl
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        setnz   ch
        or      cl,ch
        movzx   ecx,cl
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero
        jz      .non_zero
    ;zero
        xor     eax,eax
        inc     edx
        mov     [log10],eax
        mov     [ebx+PRINTF_ELEMENTS.magnitude_zeros],edx
        jmp     .conversion_okay
    .non_zero:
        PRINTF_ceiling_log2_10 eax,[ebx+PRINTF_ELEMENTS.significant_bits],1     ;+1 for last digit distinction
        mov     [significant_decimal_digits],eax
        ;edx=specified precision
        lea     ecx,[edx+1]                     ;ecx = output size
        sub     edx,[log10]                     ;scale by edx
        mov     eax,ecx
        sub     eax,[significant_decimal_digits]
        jle     .scale
        mov     [ebx+PRINTF_ELEMENTS.magnitude_zeros],eax
        sub     edx,eax
        mov     ecx,[significant_decimal_digits]
    .scale:
        mov     [ebx+PRINTF_ELEMENTS.number_length],ecx
        ;scale to fit
        stdcall PRINTF_triple_precision_scale,addr scaled_value,[value],edx
        stdcall PRINTF_decimalise_unsigned,addr ebx+PRINTF_ELEMENTS.number_string,addr scaled_value
        sub     eax,[ebx+PRINTF_ELEMENTS.number_length]
        add     [log10],eax             ;if the number was rounded up (e.g. 99.9 to 100) then adjust
    .conversion_okay:
        ;format exponent
        stdcall PRINTF_format_decimal_exponent,+'e',addr ebx+PRINTF_ELEMENTS.exponent_string,[exponent_length],[log10]
        mov     [ebx+PRINTF_ELEMENTS.exponent_length],eax
        mov     eax,[ebx+PRINTF_ELEMENTS.number_length]
        ret
endp

proc PRINTF_generate_f uses ebx,elements,value,log10
        ;return eax=length of printed number
        locals
                expected_size                   dd ?
                scaled_value                    TRIPLE_PRECISION
                significant_decimal_digits_m1   dd ?
        endl
        mov     ebx,[elements]
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
        jz      .separator_okay
        mov     [ebx+PRINTF_ELEMENTS.separator_character],','
        mov     [ebx+PRINTF_ELEMENTS.separator_modulus],3
    .separator_okay:
        PRINTF_ceiling_log2_10 eax,[ebx+PRINTF_ELEMENTS.significant_bits],1     ;+1 for last digit distinction
        dec     eax
        mov     [significant_decimal_digits_m1],eax
        mov     eax,[log10]
        mov     edx,[ebx+PRINTF_ELEMENTS.precision]
        ;eax=log10
        ;edx=specified precision
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero
        jz      .non_zero
        mov     eax,1 shl 31            ;the log of zero is extremely tiny
    .non_zero:
        test    eax,eax
        jns     .positive_log
        mov     ecx,eax
        neg     ecx
        cmp     ecx,edx
        jbe     .store_precision_zeros
        mov     ecx,edx
    .store_precision_zeros:
        mov     [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],1
        jmp     .decimal_point_done
    .positive_log:
        ;set the decimal point position
        lea     ecx,[eax+1]
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
        ;check if a decimal point is printed
        test    edx,edx                 ;non-zero precisions always have a decimal point
        jnz     .decimal_point_done
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        jnz     .decimal_point_done
        xor     ecx,ecx
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
    .decimal_point_done:
        add     eax,edx                 ;compute most significant digit
        lea     ecx,[eax+1]
        jns     .store_expected_size
        xor     ecx,ecx
    .store_expected_size:
        mov     [expected_size],ecx
        test    ecx,ecx
        jnz     .compute_trailing_zeros
        mov     ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
        inc     ecx
        mov     [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
        cmp     ecx,1
        jnz     .scale
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        jnz     .scale
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],0
        jmp     .scale
    .compute_trailing_zeros:
        ;find the number of trailing zeros to print
        sub     eax,[significant_decimal_digits_m1]
        jle     .scale
        sub     edx,eax
        mov     [ebx+PRINTF_ELEMENTS.magnitude_zeros],eax
        sub     [expected_size],eax
    .scale:
        ;scale to fit
        stdcall PRINTF_triple_precision_scale,addr scaled_value,[value],edx
        stdcall PRINTF_decimalise_unsigned,addr ebx+PRINTF_ELEMENTS.number_string,addr scaled_value
        mov     [ebx+PRINTF_ELEMENTS.number_length],eax
        sub     eax,[expected_size]
        je      .conversion_okay
        add     [log10],eax             ;if the number was rounded up (e.g. 99.9 to 100) then adjust
        mov     ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
        sub     ecx,1
        jnc     .set_new_leading_zeros
        mov     ecx,[ebx+PRINTF_ELEMENTS.decimal_point]
        test    ecx,ecx
        jz      .conversion_okay
        inc     ecx
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
        jmp     .conversion_okay
    .set_new_leading_zeros:
        mov     [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
    .conversion_okay:
        mov     eax,[ebx+PRINTF_ELEMENTS.number_length]
        ret
endp

proc PRINTF_get_float_parameters uses ebx,elements,log10_flag
        ;return eax=log10(value), edx=conversion done flag (for NaN and infinity)
        ;upscales the argument
        ;sets the negative flag
        ;sets the precision
        ;converts infinity and NaN to standard formats
        mov     ebx,[elements]
        mov     edx,[ebx+PRINTF_ELEMENTS.flags]
        and     edx,PRINTF_ELEMENTS_FLAG.argument_size_mask
        lea     edx,[edx*sizeof.FLOAT_DESCRIPTION+float_description_table]
        lea     ecx,[ebx+PRINTF_ELEMENTS.arg_value]
        stdcall PRINTF_triple_precision_upscale,ecx,ecx,edx
        and     ecx,PRINTF_ELEMENTS_FLAG.negative
        or      [ebx+PRINTF_ELEMENTS.flags],ecx
        mov     [ebx+PRINTF_ELEMENTS.significant_bits],edx
        cmp     eax,PRINTF_CLASS_INFINITY
        je      .infinity
        cmp     eax,PRINTF_CLASS_SNAN
        je      .SNaN
        cmp     eax,PRINTF_CLASS_QNAN
        je      .QNaN
        cmp     eax,PRINTF_CLASS_ZERO
        je      .zero
        cmp     [log10_flag],0
        jz      .log10_okay
        stdcall PRINTF_triple_precision_floor_log10,addr ebx+PRINTF_ELEMENTS.arg_value
    .log10_okay:
        ;set the precision
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.precision_specified
        jnz     .precision_okay
        mov     [ebx+PRINTF_ELEMENTS.precision],DEFAULT_FLOAT_PRECISION
    .precision_okay:
        xor     edx,edx         ;indicate not yet converted
    .done:
        ret
    .zero:
        or      [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero
        jmp     .log10_okay
    .infinity:
        mov     eax,3
        mov     dword[ebx+PRINTF_ELEMENTS.number_string],'inf'
        jmp     .converted
    .SNaN:
        mov     ecx,'SNaN'
        jmp     .NaN
    .QNaN:
        mov     ecx,'QNaN'
    .NaN:
        mov     dword[ebx+PRINTF_ELEMENTS.number_string+0],ecx
        mov     dword[ebx+PRINTF_ELEMENTS.number_string+4],'0x0'
        stdcall PRINTF_base2_radix_unsigned,addr ebx+PRINTF_ELEMENTS.number_string+6,addr ebx+PRINTF_ELEMENTS.arg_value,16
        cmp     eax,1
        adc     eax,6
    .converted:
        mov     [ebx+PRINTF_ELEMENTS.number_length],eax
        or      edx,-1          ;indicate that conversion is complete
        jmp     .done
endp

proc_leaf PRINTF_triple_precision_upscale uses ebp edi esi ebx,dest,source,description
        ;returns eax=number class, ecx=negative flag, edx=number of significant bits
        ;convert incoming float to triple precision format
        locals
                negative_flag   dd ?
                number_class    dd ?
        endl
        mov     esi,[description]
        mov     ebp,[source]
        mov     edx,sizeof.TRIPLE_PRECISION.mantissa            ;edx = number of significant bits
        mov     [number_class],PRINTF_CLASS_NORMAL
        movzx   ecx,[esi+FLOAT_DESCRIPTION.sign_bit_position]
        mov     eax,[ebp+0]
        mov     ebx,[ebp+4]
        mov     ebp,[ebp+8]
        ;shift from 16 to 88 bits left
        sub     ecx,32
        jb      .shift_in_sign
    .find_sign:
        sub     edx,32
        mov     ebp,ebx
        mov     ebx,eax
        xor     eax,eax
        sub     ecx,32
        jae     .find_sign
    .shift_in_sign:
        and     ecx,0x1f
        sub     edx,ecx
        shld    ebp,ebx,cl
        shld    ebx,eax,cl
        shl     eax,cl
        ;extract the sign
        dec     edx
        add     eax,eax
        adc     ebx,ebx
        adc     ebp,ebp
        sbb     ecx,ecx
        mov     [negative_flag],ecx
        ;extract the exponent
        movzx   ecx,[esi+FLOAT_DESCRIPTION.exponent_width]
        sub     edx,ecx
        xor     edi,edi
        shld    edi,ebp,cl
        ;check for infinity and NaN
        inc     edi
        shr     edi,cl
        jnz     .infinity_NaN
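        ;worked example of the test above, for illustration, with an 8-bit exponent field (cl = 8):
        ;an all-ones exponent 0xFF becomes 0x100 after the inc and 1 after the shr,
        ;so the jump is taken. any smaller exponent stays below 0x100 and shifts
        ;down to zero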
        shld    edi,ebp,cl
        shld    ebp,ebx,cl
        shld    ebx,eax,cl
        shl     eax,cl
        test    edi,edi
        jnz     .normal
        ;check for zero
        mov     ecx,ebp
        or      ecx,ebx
        or      ecx,eax
        jz      .zero
        ;adjust denormal exponent
        cmp     [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
        setz    cl
        movzx   ecx,cl
        add     edi,ecx         ;restore exponent extra
        jmp     .implied_bit_okay
    .normal:
        ;add the implied bit
        cmp     [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
        jz      .implied_bit_okay
        inc     edx
        shrd    eax,ebx,1
        shrd    ebx,ebp,1
        stc
        rcr     ebp,1
    .implied_bit_okay:
        sub     edi,[esi+FLOAT_DESCRIPTION.exponent_bias]
        ;normalise
        test    ebp,ebp
        js      .normalised
        bsr     ecx,ebp
        jnz     .shift_in_denormal
        ;check for unnormal zero
        mov     ecx,ebp
        or      ecx,ebx
        or      ecx,eax
        jz      .unnormal_zero
    .macro_shift_denormal:
        sub     edi,32
        sub     edx,32
        mov     ebp,ebx
        mov     ebx,eax
        xor     eax,eax
        bsr     ecx,ebp
        jz      .macro_shift_denormal
    .shift_in_denormal:
        not     ecx
        and     ecx,0x1f
        sub     edi,ecx
        sub     edx,ecx
        shld    ebp,ebx,cl
        shld    ebx,eax,cl
        shl     eax,cl
    .normalised:
        mov     ecx,[dest]
        mov     [ecx+TRIPLE_PRECISION.exponent],edi
        mov     [ecx+TRIPLE_PRECISION.mantissa_high],ebp
        mov     [ecx+TRIPLE_PRECISION.mantissa_mid],ebx
        mov     [ecx+TRIPLE_PRECISION.mantissa_low],eax
        mov     ecx,[negative_flag]
        mov     eax,[number_class]
        ret
    .unnormal_zero:
        xor     edi,edi
    .zero:
        xor     edx,edx
        mov     [number_class],PRINTF_CLASS_ZERO
        jmp     .normalised
    .infinity_NaN:
        xor     edi,edi
        mov     [number_class],PRINTF_CLASS_INFINITY
        shld    ebp,ebx,cl
        shld    ebx,eax,cl
        shl     eax,cl
        ;if there is an explicit bit then we shift it out and ignore it
        ;this means all pseudo NaNs and pseudo infinity become normal NaNs and normal infinity
        cmp     [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
        jnz     .explicit_bit_okay
        shld    ebp,ebx,1
        shld    ebx,eax,1
        shl     eax,1
        dec     edx
    .explicit_bit_okay:
        ;check for infinity
        mov     ecx,ebp
        or      ecx,ebx
        or      ecx,eax
        xchg    edx,ecx
        jz      .normalised
        dec     ecx
        mov     edx,ecx
        ;get the Q/S bit
        btr     ebp,31
        mov     esi,PRINTF_CLASS_QNAN
        jc      .NaN_class_okay
        mov     esi,PRINTF_CLASS_SNAN
    .NaN_class_okay:
        mov     [number_class],esi
        ;shift back to the LSb
        sub     ecx,sizeof.TRIPLE_PRECISION.mantissa
        not     ecx
        sub     ecx,32
        jb      .final_shift_back_mantissa
    .shift_back_mantissa:
        mov     eax,ebx
        mov     ebx,ebp
        xor     ebp,ebp
        sub     ecx,32
        jae     .shift_back_mantissa
    .final_shift_back_mantissa:
        shrd    eax,ebx,cl
        shrd    ebx,ebp,cl
        shr     ebp,cl
        jmp     .normalised
endp

proc PRINTF_triple_precision_floor_log10 uses esi ebx,source
        ;compute eax=floor(log10(source))
        locals
                temp_power TRIPLE_PRECISION
        endl
        mov     esi,[source]
        mov     eax,0x4D104D43                          ;ceiling(2^32 * log10(2))
        imul    dword[esi+TRIPLE_PRECISION.exponent]    ;edx=approximate log10, always equal to, or one higher than, the required value
        mov     ebx,edx
        stdcall PRINTF_get_triple_precision_10_power_y,addr temp_power,0,edx
        assert  sizeof.TRIPLE_PRECISION = 16
        mov     eax,[esi+4*0]
        mov     ecx,[esi+4*1]
        mov     edx,[esi+4*2]
        mov     esi,[esi+4*3]
        sub     eax,[temp_power+4*0]
        sbb     ecx,[temp_power+4*1]
        sbb     edx,[temp_power+4*2]
        sbb     esi,[temp_power+4*3]
        setl    cl                                      ;adjust log to the correct value
        movzx   ecx,cl
        neg     ecx
        lea     eax,[ebx+ecx]
        ret
endp

proc PRINTF_triple_precision_scale uses ebx esi,dest,source,scale
        ;compute dest=round(source*10^scale)
        locals
                scaled_value TRIPLE_PRECISION
        endl
        stdcall PRINTF_get_triple_precision_10_power_y,addr scaled_value,[source],[scale]
    ;integerise
        mov     ecx,[scaled_value.exponent]
        xor     edx,edx
        xor     ebx,ebx
        xor     eax,eax
        test    ecx,ecx
        js      .store
        mov     edx,[scaled_value.mantissa_high]
        mov     ebx,[scaled_value.mantissa_mid]
        mov     eax,[scaled_value.mantissa_low]
        sub     ecx,sizeof.TRIPLE_PRECISION.mantissa
        ;jg     .overflow
        jge     .store
        xor     esi,esi
        neg     ecx
        sub     ecx,32
        jb      .final_shift
    .shifter_loop:
        mov     esi,eax
        mov     eax,ebx
        mov     ebx,edx
        xor     edx,edx
        sub     ecx,32
        jae     .shifter_loop
    .final_shift:
        shrd    esi,eax,cl
        shrd    eax,ebx,cl
        shrd    ebx,edx,cl
        shr     edx,cl
        ;round to nearest: add in the most significant discarded bit
        add     esi,esi
        adc     eax,0
        adc     ebx,0
        adc     edx,0
    .store:
        mov     esi,[dest]
        mov     [esi+TRIPLE_PRECISION.mantissa_low],eax
        mov     [esi+TRIPLE_PRECISION.mantissa_mid],ebx
        mov     [esi+TRIPLE_PRECISION.mantissa_high],edx
        ret
endp

TRIPLE_PRECISION_EXPONENT_TABLE_PASSES  = (TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT+TRIPLE_PRECISION_EXPONENT_TABLE_SCALE)/\
                                          TRIPLE_PRECISION_EXPONENT_TABLE_SCALE
TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS= 1 + TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT - \
                                          TRIPLE_PRECISION_EXPONENT_TABLE_SCALE * (TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1)
TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH  = ((1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1) * \
                                          (TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1) + \
                                          (1 shl TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS - 1))
TRIPLE_PRECISION_EXPONENT_TABLE_SIZE    = 1 + 2 * TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH
TRIPLE_PRECISION_EXPONENT_10_TO_M4096   = 0xffffcada    ;the last value written to the power table, to ensure this is thread safe

proc PRINTF_get_triple_precision_10_power_y uses edi esi ebx,dest,source,y
        ;compute dest=source*10^y
        mov     esi,PRINTF_triple_precision_power_table
        mov     ebx,[source]
        mov     edi,[dest]
        stdcall PRINTF_check_triple_precision_power_table,esi
        ;either start with 10^0 in esi, ...
        test    ebx,ebx
        jz      .copy
        ;... or start with the source value
        mov     esi,ebx
    .copy:
        mov     eax,[esi+TRIPLE_PRECISION.mantissa_low]
        mov     ecx,[esi+TRIPLE_PRECISION.mantissa_mid]
        mov     edx,[esi+TRIPLE_PRECISION.mantissa_high]
        mov     ebx,[esi+TRIPLE_PRECISION.exponent]
        mov     [edi+TRIPLE_PRECISION.mantissa_low],eax
        mov     [edi+TRIPLE_PRECISION.mantissa_mid],ecx
        mov     [edi+TRIPLE_PRECISION.mantissa_high],edx
        mov     [edi+TRIPLE_PRECISION.exponent],ebx
        mov     ebx,[y]
        mov     esi,PRINTF_triple_precision_power_table                                 ;10^0, so entry 1 is 10^1
        test    ebx,ebx
        jz      .done
        jns     .raise_loop
        add     esi,TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH * sizeof.TRIPLE_PRECISION    ;skip the positive powers, so entry 1 is 10^-1
        neg     ebx
    .raise_loop:
        mov     eax,ebx
        and     eax,1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1
        jz      .raise_next
        assert  sizeof.TRIPLE_PRECISION=16
        shl     eax,4
        stdcall PRINTF_variable_precision_mul,edi,edi,addr eax+esi,sizeof.TRIPLE_PRECISION.mantissa / 32
    .raise_next:
        add     esi,(1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1) * sizeof.TRIPLE_PRECISION
        shr     ebx,TRIPLE_PRECISION_EXPONENT_TABLE_SCALE
        jnz     .raise_loop
    .done:
        ret
endp

struct QUAD_PRECISION
        mantissa        rd sizeof.TRIPLE_PRECISION.mantissa / 32 + 1
        exponent        rd 1
ends
sizeof.QUAD_PRECISION.mantissa = sizeof.TRIPLE_PRECISION.mantissa + 32

proc_leaf PRINTF_check_triple_precision_power_table table
        mov     eax,[table]
        cmp     [eax+(TRIPLE_PRECISION_EXPONENT_TABLE_SIZE-1)*sizeof.TRIPLE_PRECISION+TRIPLE_PRECISION.exponent],TRIPLE_PRECISION_EXPONENT_10_TO_M4096
        jnz     PRINTF_make_triple_precision_power_table
        ret
endp

proc PRINTF_make_triple_precision_power_table uses edi esi ebx,table
        locals
                base_multiplier QUAD_PRECISION
                next_value      QUAD_PRECISION
        endl
        assert TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS = 1
        mov     ebx,[table]
        ;start with 10^0 at the beginning
        xor     ecx,ecx
        mov     [ebx+TRIPLE_PRECISION.mantissa_low],ecx
        mov     [ebx+TRIPLE_PRECISION.mantissa_mid],ecx
        mov     [ebx+TRIPLE_PRECISION.mantissa_high],1 shl 31
        mov     [ebx+TRIPLE_PRECISION.exponent],+1
        ;then the first block starts with 10^1
        repeat sizeof.QUAD_PRECISION.mantissa / 32 - 1
                mov     [next_value.mantissa + 4 * (%-1)],ecx
        end repeat
        mov     [next_value.mantissa + 4 * (sizeof.QUAD_PRECISION.mantissa / 32 - 1)],10 shl 28
        mov     [next_value.exponent],+4
        call    .build
        ;then the next block starts with 10^-1
        mov     ecx,0xcccccccd
        mov     [next_value.mantissa + 4 * 0],ecx
        dec     ecx
        repeat sizeof.QUAD_PRECISION.mantissa / 32 - 1
                mov     [next_value.mantissa + 4 * %],ecx
        end repeat
        mov     [next_value.exponent],-3
        call    .build
        ret

    .build:
        call    .round_and_copy_result
        mov     edi,TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1
    .build_scale_loop:
        ;transfer next value to base
        mov     edx,edi
        lea     esi,[next_value]
        lea     edi,[base_multiplier]
        mov     ecx,sizeof.QUAD_PRECISION / 4
        rep     movsd
        mov     edi,edx
        mov     esi,1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1
    .build_bit_loop:
        lea     ecx,[next_value]
        stdcall PRINTF_variable_precision_mul,ecx,ecx,addr base_multiplier,sizeof.QUAD_PRECISION.mantissa / 32
        call    .round_and_copy_result
        dec     esi
        jnz     .build_bit_loop
        dec     edi
        jnz     .build_scale_loop
        retn

    .round_and_copy_result:
        add     ebx,sizeof.TRIPLE_PRECISION
        bt      [next_value.exponent-4*4],31
        mov     eax,[next_value.exponent-4*3]
        mov     edx,[next_value.exponent-4*2]
        mov     ecx,[next_value.exponent-4*1]
        adc     eax,0
        adc     edx,0
        adc     ecx,0
        mov     [ebx+TRIPLE_PRECISION.mantissa_low],eax
        mov     eax,[next_value.exponent-4*0]
        mov     [ebx+TRIPLE_PRECISION.mantissa_mid],edx
        mov     [ebx+TRIPLE_PRECISION.mantissa_high],ecx
        mov     [ebx+TRIPLE_PRECISION.exponent],eax
        retn

endp

proc_leaf PRINTF_variable_precision_mul uses ebp esi edi ebx,dest,source1,source2,mantissa_length
        std
        mov     esi,[source1]
        mov     ebp,[source2]
        mov     ebx,[mantissa_length]
        lea     edi,[esp-4]
        lea     esi,[esi+ebx*4]
        lea     ebp,[ebp+ebx*4]
        lea     ecx,[ebx*2]
        xor     eax,eax
        rep     stosd
        neg     ebx
    .loop_multiplier:
        mov     ecx,[mantissa_length]
        neg     ecx
    .loop_multiplicand:
        lea     edi,[ebx+ecx]
        ;multiply esi+ebx * ebp+ecx ---> esp+edi
        mov     eax,[esi+ebx*4]
        mul     dword[ebp+ecx*4]
        add     [esp+edi*4+0],eax
        adc     [esp+edi*4+4],edx
        jnc     .next_multiplicand
        lea     eax,[edi+1]
    .add_in_carry:
        inc     eax
        adc     dword[esp+eax*4],0
        jc      .add_in_carry
    .next_multiplicand:
        inc     ecx
        jnz     .loop_multiplicand
        inc     ebx
        jnz     .loop_multiplier
        ;add the exponents into edx
        mov     edx,[esi]
        add     edx,[ebp]
        ;round to the destination length
        mov     ecx,[mantissa_length]
        not     ecx
        ;find the rounding bit
        bsr     ebp,[esp-4]                     ;will be either 31 or 30
        mov     ebx,ecx
        xor     esi,esi
        bts     esi,ebp                         ;carry is zeroed here
    .ripple_carry_through_result:
        adc     [esp+ebx*4],esi
        mov     esi,0
        inc     ebx
        jnz     .ripple_carry_through_result
        ;test the MSb and scale up if necessary
        test    byte[esp-1],-1                  ;carry is zeroed here
        js      .MSb_okay
        mov     ebx,ecx
    .normalise_result:
        mov     eax,[esp+ebx*4]
        adc     eax,eax
        mov     [esp+ebx*4],eax
        inc     ebx
        jnz     .normalise_result
        dec     edx                             ;adjust exponent
    .MSb_okay:
        ;store the result
        not     ecx
        mov     edi,[dest]
        lea     esi,[esp-4]
        mov     [edi+ecx*4],edx                 ;exponent
        lea     edi,[edi+ecx*4-4]
        rep     movsd
        cld
        ret
endp

proc PRINTF_elements_print_integer uses ebx,elements,base,output
        mov     ebx,[elements]
        stdcall PRINTF_asciify_unsigned,addr ebx+PRINTF_ELEMENTS.number_string,addr ebx+PRINTF_ELEMENTS.arg_value,[base]
        mov     [ebx+PRINTF_ELEMENTS.number_length],eax
        xor     ecx,ecx
        mov     edx,DEFAULT_INTEGER_PRECISION
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.precision_specified
        jz      .specified_precision_known
        mov     edx,[ebx+PRINTF_ELEMENTS.precision]
    .specified_precision_known:
        sub     eax,edx
        jae     .precision_known
        sub     ecx,eax
    .precision_known:
        mov     [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
        stdcall PRINTF_elements_print,ebx,[output]
        ret
endp

proc PRINTF_asciify_unsigned uses ebx,dest,value,base
        ;return eax=length
        mov     eax,[base]
        mov     ecx,[value]
        mov     edx,[dest]
        cmp     eax,10
        jnz     .not_base_10
        stdcall PRINTF_decimalise_unsigned,edx,ecx
        ret
    .not_base_10:
        lea     ebx,[eax-1]
        and     ebx,eax         ;base and (base-1) is zero only for a power of two
        jnz     .arbitrary_base
        stdcall PRINTF_base2_radix_unsigned,edx,ecx,eax
        ret
    .arbitrary_base:
        stdcall PRINTF_arbitrary_radix_unsigned,edx,ecx,eax
        ret
endp

proc_leaf PRINTF_decimalise_unsigned uses ebp esi edi ebx,dest,value
        ;return eax=length
        locals
                length          dd ?
                pos             dd ?
                mid             dd ?
                high            dd ?
        endl
        mov     ecx,[value]
        mov     ebx,[ecx+8]
        mov     edx,[ecx+4]
        mov     eax,[ecx]
        bsr     edi,ebx
        jnz     .reduce96
        bsr     edi,edx
        jnz     .reduce64
        bsr     edi,eax
        jz      .done                   ;a value of zero prints nothing
    .reduce32:
        PRINTF_ceiling_log2_10 edi,edi
        mov     ebp,[edi*4+PRINTF_integer_base_10_multiples_32]
        sub     ebp,eax
        adc     edi,0
        mov     [length],edi
        add     edi,[dest]
        jmp     .start32
    .reduce64:
        add     edi,32
        PRINTF_ceiling_log2_10 edi,edi
        mov     ebp,[(edi-10)*8+PRINTF_integer_base_10_multiples_64+0]
        mov     esi,[(edi-10)*8+PRINTF_integer_base_10_multiples_64+4]
        sub     ebp,eax
        sbb     esi,edx
        adc     edi,0
        mov     [length],edi
        add     edi,[dest]
        jmp     .start64
    .reduce96:
        add     edi,64
        PRINTF_ceiling_log2_10 edi,edi
        lea     ecx,[edi*4]
        mov     ebp,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+0]
        mov     esi,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+4]
        mov     ecx,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+8]
        sub     ebp,eax
        sbb     esi,edx
        sbb     ecx,ebx
        adc     edi,0
        mov     [length],edi
        add     edi,[dest]
        mov     esi,eax
        mov     eax,edx
        mov     edx,ebx
    .loop96:
        mov     [pos],edi
        mov     [high],edx
        mov     [mid],eax
        mov     ecx,esi         ;       [high]  [mid]   ecx
        mov     ebx,0xcccccccc  ;floor(2^35/10)
        mov     eax,esi
        mul     ebx
        mov     edi,eax
        mov     esi,edx         ;               esi     edi
        mov     eax,ebx
        mul     [mid]
        add     esi,eax
        adc     edx,0
        xchg    ebx,edx         ;       ebx     esi     edi
        mov     eax,[high]
        mul     edx
        add     eax,ebx
        adc     edx,0           ;edx    eax     esi     edi
        xor     ebx,ebx         ;                               [high]  [mid]   ecx     =num*0x000000000000000000000001
        mov     ebp,edi         ;                       edx     eax     esi     edi     =num*0x0000000000000000cccccccc
        add     ebp,ecx         ;               edx     eax     esi     edi             =num*0x00000000cccccccc00000000
        mov     ebp,edi         ;       edx     eax     esi     edi                     =num*0xcccccccc0000000000000000
        adc     ebp,esi
        adc     ebx,0
        add     ebp,[mid]
        adc     ebx,0
        xor     ebp,ebp
        add     edi,esi
        adc     ebp,0
        add     edi,eax
        adc     ebp,0
        add     edi,[high]
        adc     ebp,0
        add     edi,ebx
        mov     edi,[pos]
        adc     ebp,0
        xor     ebx,ebx
        add     esi,eax
        adc     eax,edx
        adc     edx,0
        add     esi,edx
        adc     ebx,0
        add     esi,ebp
        adc     eax,ebx
        adc     edx,0
        and     esi,-8
        sub     ecx,esi
        shrd    esi,eax,2
        shrd    eax,edx,2
        shr     edx,2
        sub     ecx,esi
        shrd    esi,eax,1
        shrd    eax,edx,1
        dec     edi
        add     ecx,'0'
        mov     [edi],cl
        shr     edx,1
        jnz     .loop96
        mov     edx,eax
        mov     eax,esi
    .start64:
        ;edx:eax/esi
    .loop64:
        mov     esi,0xcccccccc  ;floor(2^35/10)
        dec     edi
        mov     ecx,eax
        mov     ebp,edx         ;ebp:ecx=num
        mul     esi
        mov     ebx,eax
        xchg    esi,edx
        mov     eax,ebp
        mul     edx
        add     eax,esi
        adc     edx,0           ;edx:eax:ebx:000=num*0xcccccccc00000000
        mov     esi,ebx
        add     esi,eax
        adc     eax,edx
        adc     edx,0           ;edx:eax:esi:ebx=num*0xcccccccccccccccc
        add     ebx,ecx
        adc     esi,ebp
        adc     eax,0
        adc     edx,0           ;edx:eax:esi:ebx=num*0xcccccccccccccccd
        and     eax,-8
        sub     ecx,eax
        shrd    eax,edx,2
        shr     edx,2
        sub     ecx,eax         ;ecx=remainder
        shrd    eax,edx,1
        add     ecx,'0'
        mov     [edi],cl
        shr     edx,1           ;edx:eax=quotient=num/10
        jnz     .loop64
    .start32:
        ;eax = eax / 10
        mov     ebx,0xcccccccd  ;ceiling(2^35/10)
    .loop32:
        mov     ecx,eax
        dec     edi
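        mul     ebx             ;edx=floor(num*0xcccccccd/2^32)
        and     edx,-8          ;edx=8*quotient
        mov     eax,edx
        sub     ecx,edx         ;ecx=num-8*quotient
        shr     edx,2           ;edx=2*quotient
        sub     ecx,edx         ;ecx=num-10*quotient=remainder
        add     ecx,'0'
        mov     [edi],cl
        shr     eax,3           ;eax=quotient=num/10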
        jnz     .loop32
        mov     eax,[length]
    .done:
        ret
endp

proc_leaf PRINTF_base2_radix_unsigned uses ebp ebx edi esi,dest,value,base
        ;return eax=length
        mov     ecx,[base]
        mov     ebx,[value]
        mov     esi,[ebx+8]
        mov     edx,[ebx+4]
        mov     ebp,[ebx]
        xor     eax,eax
        bsr     ecx,ecx         ;ecx=bit shift count. either 1, 2, 3, 4 or 5
        bsr     edi,esi
        lea     edi,[edi+64]
        jnz     .MSb_known
        bsr     edi,edx
        lea     edi,[edi+32]
        jnz     .MSb_known
        bsr     edi,ebp
        jz      .done           ;a value of zero prints nothing
    .MSb_known:
        imul    edi,[(ecx-1)*4+PRINTF_integer_reciprocals]
        shr     edi,24
        lea     eax,[edi+1]
        add     edi,[dest]
    .reduce:
        xor     ebx,ebx
        shrd    ebx,ebp,cl
        shrd    ebp,edx,cl
        shrd    edx,esi,cl
        shr     esi,cl
        rol     ebx,cl
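        cmp     bl,10
        sbb     bh,bh           ;bh=-1 if digit<10, else 0
        add     bl,'a'-10       ;assume a letter digit
        and     bh,'0'-('a'-10) ;correction for a decimal digit
        add     bl,bh           ;bl='0'..'9' or 'a' onwards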
        mov     [edi],bl
        dec     edi
        mov     ebx,esi
        or      ebx,edx
        or      ebx,ebp
        jnz     .reduce
    .done:
        ret
endp

proc_leaf PRINTF_arbitrary_radix_unsigned uses ebx edi esi,dest,value,base
        ;return eax=length
        locals
                temp_output rb MAXIMUM_CONVERSION_LENGTH
        endl
        mov     edx,[value]
        mov     ebx,[base]
        mov     eax,[edx]
        mov     ecx,[edx+4]
        mov     esi,[edx+8]
        mov     edi,MAXIMUM_CONVERSION_LENGTH
        mov     edx,esi
        or      edx,ecx
        or      edx,eax
        jz      .done
    .reduce:
        ;esi:ecx:eax = esi:ecx:eax / ebx : remainder=edx
        xor     edx,edx
        xchg    eax,esi         ;eax:ecx:esi
        div     ebx             ;eax:ecx:esi
        xchg    eax,ecx         ;ecx:eax:esi
        div     ebx             ;ecx:eax:esi
        xchg    eax,esi         ;ecx:esi:eax
        div     ebx             ;ecx:esi:eax
        xchg    esi,ecx         ;esi:ecx:eax r=edx
        dec     edi
        cmp     dl,10
        sbb     dh,dh
        add     dl,'a'-10
        and     dh,'0'-('a'-10)
        add     dl,dh
        mov     [edi+temp_output],dl
        mov     edx,esi
        or      edx,ecx
        or      edx,eax
        jnz     .reduce
        mov     ecx,MAXIMUM_CONVERSION_LENGTH
        sub     ecx,edi
        lea     esi,[edi+temp_output]
        mov     edi,[dest]
        mov     eax,ecx
        rep     movsb
    .done:
        ret
endp

proc_leaf PRINTF_format_decimal_exponent uses edi esi ebx,identifier,dest,length,value
        ;return eax=length
        mov     eax,[value]
        mov     ecx,[identifier]
        mov     ebx,[dest]
        mov     edi,[length]
        mov     ch,'+'
        test    eax,eax
        jns     .sign_okay
        neg     eax
        mov     ch,'-'
    .sign_okay:
        mov     [ebx],cx
        bsr     ecx,eax
        jz      .length_okay
        PRINTF_ceiling_log2_10 edx,ecx
        mov     ecx,[edx*4+PRINTF_integer_base_10_multiples_32]
        sub     ecx,eax
        adc     edx,0
        cmp     edx,[length]
        jbe     .length_okay
        mov     [length],edx
        mov     edi,edx
    .length_okay:
        add     ebx,2
        mov     esi,0xcccccccd  ;ceiling(2^35/10)
    .loop32:
        mov     ecx,eax
        dec     edi
        mul     esi
        and     edx,-8
        mov     eax,edx
        sub     ecx,edx
        shr     edx,2
        sub     ecx,edx
        add     ecx,'0'
        mov     [ebx+edi],cl
        shr     eax,3
        test    edi,edi
        jnz     .loop32
        mov     eax,[length]
        add     eax,2
        ret
endp

proc PRINTF_elements_print uses ebx esi edi,elements,output
        mov     ebx,[elements]
        mov     esi,[output]
        ;format sign
        xor     ecx,ecx
        mov     al,'-'
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.negative
        jnz     .initialise_sign
        mov     al,'+'
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.show_sign
        jnz     .initialise_sign
        mov     al,' '
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.blank_sign
        jnz     .initialise_sign
        not     ecx
    .initialise_sign:
        inc     ecx
        mov     [ebx+PRINTF_ELEMENTS.sign_character],al
        mov     [ebx+PRINTF_ELEMENTS.sign_length],ecx
        xor     edi,edi                         ;edi = number of padding characters
        stdcall PRINTF_get_minimum_length,ebx
        sub     eax,[ebx+PRINTF_ELEMENTS.width]
        jae     .no_padding
        neg     eax
        mov     edi,eax
    .no_padding:
        ;leading spaces
        test    edi,edi
        jz      .leading_spaces_done
        ;if zero padding then do nothing
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
        jnz     .leading_spaces_done
        ;if left justified then do nothing
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.left_justify
        jnz     .leading_spaces_done
        stdcall PRINTF_elements_print_repeated_character,ebx,+' ',edi,esi
    .leading_spaces_done:
        ;sign
        mov     ecx,[ebx+PRINTF_ELEMENTS.sign_length]
        test    ecx,ecx
        jz      .sign_done
        movzx   eax,[ebx+PRINTF_ELEMENTS.sign_character]
        stdcall PRINTF_elements_print_character,ebx,eax,esi
    .sign_done:
        ;prefix
        mov     ecx,[ebx+PRINTF_ELEMENTS.prefix_length]
        test    ecx,ecx
        jz      .prefix_done
        stdcall PRINTF_elements_print_string,ebx,addr ebx+PRINTF_ELEMENTS.prefix_string,ecx,esi
    .prefix_done:
        test    edi,edi
        jz      .padding_zeros_done
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
        jz      .padding_zeros_done
        stdcall PRINTF_elements_print_padding_zeros,ebx,edi,esi
    .padding_zeros_done:
        ;precision zeros
        xor     eax,eax
        mov     ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
        test    ecx,ecx
        jz      .precision_zeros_done
        stdcall PRINTF_elements_print_decimal_repeated_character,ebx,+'0',ecx,eax,esi
    .precision_zeros_done:
        ;number
        mov     ecx,[ebx+PRINTF_ELEMENTS.number_length]
        test    ecx,ecx
        jz      .number_done
        stdcall PRINTF_elements_print_decimal_string,ebx,addr ebx+PRINTF_ELEMENTS.number_string,ecx,eax,esi
    .number_done:
        ;magnitude zeros
        mov     ecx,[ebx+PRINTF_ELEMENTS.magnitude_zeros]
        test    ecx,ecx
        jz      .magnitude_zeros_done
        stdcall PRINTF_elements_print_decimal_repeated_character,ebx,+'0',ecx,eax,esi
    .magnitude_zeros_done:
        ;exponent
        mov     ecx,[ebx+PRINTF_ELEMENTS.exponent_length]
        test    ecx,ecx
        jz      .exponent_done
        stdcall PRINTF_elements_print_string,ebx,addr ebx+PRINTF_ELEMENTS.exponent_string,ecx,esi
    .exponent_done:
        ;trailing spaces
        test    edi,edi
        jz      .trailing_spaces_done
        ;if zero padding then do nothing
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
        jnz     .trailing_spaces_done
        ;if right justified then do nothing
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.left_justify
        jz      .trailing_spaces_done
        stdcall PRINTF_elements_print_repeated_character,ebx,+' ',edi,esi
    .trailing_spaces_done:
        ret
endp

proc PRINTF_elements_print_padding_zeros uses ebx esi edi,elements,padding,output
        mov     ebx,[elements]
        mov     esi,[padding]
        mov     edi,[ebx+PRINTF_ELEMENTS.separator_modulus]
        test    edi,edi
        jz      .all_zeros
        stdcall PRINTF_get_digits_before_decimal_point,ebx
        push    eax
        stdcall PRINTF_count_separator_characters,ebx
        pop     ecx
        add     eax,ecx
        add     eax,esi
        xor     edx,edx
        inc     edi
        div     edi
        test    edx,edx
        jnz     .set_modulus
        mov     edx,[ebx+PRINTF_ELEMENTS.separator_modulus]
        inc     edx
    .set_modulus:
        mov     edi,edx
        mov     eax,'0'
        jmp     .separator_character_known              ;ensure we don't start with a separator character
    .padding_with_separators_loop:
        test    edi,edi
        mov     eax,'0'
        jnz     .separator_character_known
        mov     edi,[ebx+PRINTF_ELEMENTS.separator_modulus]
        movzx   eax,[ebx+PRINTF_ELEMENTS.separator_character]
        inc     edi
    .separator_character_known:
        stdcall PRINTF_elements_print_character,ebx,eax,[output]
        dec     edi
        dec     esi
        jnz     .padding_with_separators_loop
        ret
    .all_zeros:
        stdcall PRINTF_elements_print_repeated_character,ebx,+'0',esi,[output]
        ret
endp

proc PRINTF_get_minimum_length elements
        stdcall PRINTF_count_separator_characters,[elements]
        push    eax
        stdcall PRINTF_get_base_length,[elements]
        pop     ecx
        add     eax,ecx
        ret
endp

proc_leaf PRINTF_get_base_length elements
        mov     ecx,[elements]
        xor     edx,edx
        mov     eax,[ecx+PRINTF_ELEMENTS.sign_length]
        add     eax,[ecx+PRINTF_ELEMENTS.prefix_length]
        add     eax,[ecx+PRINTF_ELEMENTS.precision_zeros]
        add     eax,[ecx+PRINTF_ELEMENTS.number_length]
        add     eax,[ecx+PRINTF_ELEMENTS.magnitude_zeros]
        add     eax,[ecx+PRINTF_ELEMENTS.exponent_length]
        cmp     [ecx+PRINTF_ELEMENTS.decimal_point],0
        setnz   dl
        add     eax,edx
        ret
endp

proc PRINTF_count_separator_characters uses ebx,elements
        mov     ebx,[elements]
        mov     ecx,[ebx+PRINTF_ELEMENTS.separator_modulus]
        test    ecx,ecx
        jz      .no_separator_characters
        stdcall PRINTF_get_digits_before_decimal_point,ebx
        sub     eax,1
        jb      .no_separator_characters
        xor     edx,edx
        div     [ebx+PRINTF_ELEMENTS.separator_modulus]
        ret
    .no_separator_characters:
        xor     eax,eax
        ret
endp

proc PRINTF_elements_print_decimal_repeated_character uses ebx edi esi,elements,char,repeats,current_position,output
        ;return eax=next position
        mov     ebx,[elements]
        mov     edi,[current_position]
        mov     esi,[repeats]
        test    esi,esi
        jz      .done
    .loop:
        stdcall PRINTF_elements_print_character,ebx,[char],[output]
        inc     edi
        stdcall PRINTF_elements_print_separator,ebx,edi,[output]
        dec     esi
        cmp     edi,[ebx+PRINTF_ELEMENTS.decimal_point]
        jnz     .decimal_done
        stdcall PRINTF_elements_print_character,ebx,+'.',[output]
    .decimal_done:
        test    esi,esi
        jnz     .loop
    .done:
        mov     eax,edi
        ret
endp

proc PRINTF_elements_print_repeated_character uses ebx,elements,char,repeats,output
        mov     ebx,[repeats]
        test    ebx,ebx
        jz      .done
    .loop:
        stdcall PRINTF_elements_print_character,[elements],[char],[output]
        dec     ebx
        jnz     .loop
    .done:
        ret
endp

proc PRINTF_elements_print_decimal_string uses esi ebx edi,elements,string,length,current_position,output
        ;return eax=next position
        mov     edi,[current_position]
        mov     esi,[string]
        mov     ebx,[length]
        test    ebx,ebx
        jz      .done
    .loop:
        movzx   eax,byte[esi]
        stdcall PRINTF_elements_print_character,[elements],eax,[output]
        inc     edi
        stdcall PRINTF_elements_print_separator,[elements],edi,[output]
        inc     esi
        dec     ebx
        mov     eax,[elements]
        cmp     edi,[eax+PRINTF_ELEMENTS.decimal_point]
        jnz     .decimal_done
        stdcall PRINTF_elements_print_character,eax,+'.',[output]
    .decimal_done:
        test    ebx,ebx
        jnz     .loop
    .done:
        mov     eax,edi
        ret
endp

proc PRINTF_elements_print_separator uses ebx,elements,current_position,output
        mov     ebx,[elements]
        cmp     [ebx+PRINTF_ELEMENTS.separator_modulus],0
        jz      .separator_skip
        stdcall PRINTF_get_digits_before_decimal_point,ebx
        sub     eax,[current_position]
        jbe     .separator_skip
        xor     edx,edx
        div     [ebx+PRINTF_ELEMENTS.separator_modulus]
        test    edx,edx
        jnz     .separator_skip
        movzx   eax,[ebx+PRINTF_ELEMENTS.separator_character]
        stdcall PRINTF_elements_print_character,ebx,eax,[output]
    .separator_skip:
        ret
endp

proc_leaf PRINTF_get_digits_before_decimal_point elements
        mov     edx,[elements]
        mov     eax,[edx+PRINTF_ELEMENTS.decimal_point]
        test    eax,eax
        jnz     .done
        mov     eax,[edx+PRINTF_ELEMENTS.precision_zeros]
        add     eax,[edx+PRINTF_ELEMENTS.number_length]
        add     eax,[edx+PRINTF_ELEMENTS.magnitude_zeros]
    .done:
        ret
endp

proc PRINTF_elements_print_string uses esi ebx,elements,string,length,output
        mov     esi,[string]
        mov     ebx,[length]
        test    ebx,ebx
        jz      .done
    .loop:
        movzx   eax,byte[esi]
        test    eax,eax
        jz      .done
        stdcall PRINTF_elements_print_character,[elements],eax,[output]
        inc     esi
        dec     ebx
        jnz     .loop
    .done:
        ret
endp

proc PRINTF_elements_print_character uses ebx,elements,char,output
        ;applies uppercase flag
        mov     ebx,[output]
        mov     eax,[elements]
        mov     ecx,[ebx+PRINTF_OUTPUT.length]
        mov     edx,[ebx+PRINTF_OUTPUT.buffer]
        test    [eax+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.uppercase
        mov     eax,[char]
        jz      .case_change_done
        cmp     al,'a'
        jb      .case_change_done
        cmp     al,'z'
        ja      .case_change_done
        sub     al,'a'-'A'
    .case_change_done:
        test    ecx,ecx
        jne     .add
        cmp     [ebx+PRINTF_OUTPUT.handle],INVALID_HANDLE_VALUE
        je      .next
        push    eax
        stdcall PRINTF_flush_buffer,ebx
        pop     eax
        mov     edx,[ebx+PRINTF_OUTPUT.buffer]
        mov     ecx,[ebx+PRINTF_OUTPUT.length]
    .add:
        mov     [edx],al
        dec     ecx
        mov     [ebx+PRINTF_OUTPUT.length],ecx
    .next:
        inc     edx
        mov     [ebx+PRINTF_OUTPUT.buffer],edx
        ret
endp

proc PRINTF_flush_buffer output
                local bytes_written:DWORD
        mov     eax,[output]
        mov     ecx,PRINTF_BUFFER_SIZE
        sub     ecx,[eax+PRINTF_OUTPUT.length]
        jz      .done
        add     [eax+PRINTF_OUTPUT.sent],ecx
        sub     [eax+PRINTF_OUTPUT.buffer],ecx
        mov     [eax+PRINTF_OUTPUT.length],PRINTF_BUFFER_SIZE
        invoke  WriteFile,[eax+PRINTF_OUTPUT.handle],[eax+PRINTF_OUTPUT.buffer],ecx,addr bytes_written,0
        test    eax,eax
        jz      .error
    .done:
        ret
    .error:
        invoke  GetLastError
        mov     ecx,[output]
        mov     [ecx+PRINTF_OUTPUT.error],eax
        ret
endp

restore prologue@proc,epilogue@proc    
It is mostly a clone of the libc formatting options.

There are some extra bits and bobs like the ',' flag, 't' size and the 'r' and 'q' types.

To make it work with OSes other than Windows, only the very last function (PRINTF_flush_buffer) needs to be changed.

There are many better (IMO) formatting options than the "ugly" C method. Python has some nice formatting options using curly brackets {}. However, that also suffers from the terribleness of being position dependent for the formatting options. Instead, for example, with the following format specifiers it is possible to make everything position independent:
Code:
format field prefixes can be in any order
:[0-9]  argument position
[a-Z]   type
*[0-9]  field width
[0-9]   field width, use * to make unambiguous
<       align left
=       align centre
>       align right
,       digit separator ,
_       digit separator _
~       digit separator space
@[0-9]  digit separator width
" "     [without the quotes] pad to precision with spaces
#       pad to precision with zeros
-       reserved place for sign "-" or space
+       reserved place for sign "-" or "+"
.[0-9]  precision for numbers: minimum number of digits to output
.[0-9]  length for strings: maximum number of characters to output
/[0-9]  conversion base    
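To illustrate why such prefixes can appear in any order, here is a minimal Python sketch of a parser for the list above. It is only an illustration under assumptions: the name parse_field, the dictionary keys and the tokenising regex are made up for this example and are not part of any existing implementation.

```python
import re

# One token is either an optionally prefixed number, a type letter, or a flag.
TOKEN = re.compile(r"([:*@./]?)(\d+)|([A-Za-z])|([<=>,_~ #+\-])")

# Hypothetical key names for the numeric prefixes.
NUMERIC = {":": "position", "*": "width", "": "width",
           "@": "separator_width", ".": "precision", "/": "base"}

# Hypothetical key names for the single-character flags.
FLAGS = {"<": ("align", "left"), "=": ("align", "centre"), ">": ("align", "right"),
         ",": ("separator", ","), "_": ("separator", "_"), "~": ("separator", " "),
         " ": ("pad", " "), "#": ("pad", "0"),
         "-": ("sign", "-"), "+": ("sign", "+")}

def parse_field(spec):
    """Parse one format field; every prefix is self-describing, so order is irrelevant."""
    field = {}
    pos = 0
    while pos < len(spec):
        m = TOKEN.match(spec, pos)
        if m is None:
            raise ValueError("unexpected character at offset %d" % pos)
        prefix, number, letter, flag = m.groups()
        if number is not None:
            field[NUMERIC[prefix]] = int(number)
        elif letter is not None:
            field["type"] = letter
        else:
            key, value = FLAGS[flag]
            field[key] = value
        pos = m.end()
    return field

# The same field written in two different orders parses identically.
print(parse_field("x:2*8>") == parse_field(">*8x:2"))
```

Because each prefix names its own meaning, the parser never has to know which slot it is currently reading, which is the point of the scheme.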
sylware



Joined: 23 Oct 2020
Posts: 437
Location: Marseille/France
sylware 02 Nov 2023, 10:29
wow, this looks like great work.

There is a lot to cherry pick to compose a well featured printf function.

BTW, "modern" printfs now let you cherry-pick the arguments for their conversions and precisions with a '$' marker. With the ABI being a mess when mixing floats and integers, and variable arguments (va_list) being shabby, there is heavy work ahead. I guess a compromise would have to be made with the user code calling such a function.
AsmGuru62



Joined: 28 Jan 2004
Posts: 1619
Location: Toronto, Canada
AsmGuru62 02 Nov 2023, 16:19
Just curious -- why not use a standard?
Code:
; ---------------------------------------------------------------------------
section '.idata' import data readable writeable

    library kernel32,'KERNEL32.DLL',user32,'USER32.DLL',gdi32,'GDI32.DLL',msvcrt,'MSVCRT.DLL'

    include 'API\Kernel32.Inc'
    include 'API\User32.Inc'
    include 'API\Gdi32.Inc'

    import msvcrt, \
        atof,'atof', \
        atoi,'atoi',\
        printf,'printf',\
        wtoi,'_wtoi'
    

It is tested by time, and MSVCRT has been installed on Windows since the time of the Vikings!
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20303
Location: In your JS exploiting you and your system
revolution 02 Nov 2023, 19:16
The MS version of printf is inefficient, has limited format options, doesn't report when the buffer is overflowed, does "smart" float rounding (it changes the value!) and can't be used to make a float that can be reliably reconstructed from the text.

Just because it is there doesn't mean it is suitable.

If you need consistent behaviour across all implementations of printf then it isn't possible. You must have your own implementation to ensure it works the same everywhere.
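The round-trip requirement (number to text and back to the exact same number) can be checked with a few lines of Python. This is only a sketch of the property itself, using Python's own formatting rather than any particular C runtime; the helper name round_trips is made up for the example.

```python
def round_trips(value, fmt):
    """Return True if formatting value with fmt and parsing the text gives back the same double."""
    return float(fmt % value) == value

x = 0.1 + 0.2                      # a double that is not exactly 0.3

print(round_trips(x, "%.6f"))      # default-style precision loses information
print(round_trips(x, "%.15g"))     # 15 significant digits are still too few here
print(round_trips(x, "%.17g"))     # 17 significant digits are enough for any double
```

A formatter that alters the value while printing, or that cannot emit enough digits, fails this test for some inputs.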
Furs



Joined: 04 Mar 2016
Posts: 2493
Furs 03 Nov 2023, 17:19
revolution wrote:
The MS version of printf is inefficient
Here's an interesting writeup with tests: https://aras-p.info/blog/2022/02/25/Curious-lack-of-sprintf-scaling/

And yes, C++ standard library definitely has "zero cost" abstractions. Trust the standard library devs guys! They optimize to the max!
ProMiNick



Joined: 24 Mar 2012
Posts: 798
Location: Russian Federation, Sochi
ProMiNick 22 Nov 2023, 15:13
I tested printf on:
Code:
szEnv    db "%0.100tf",0
  valu     dt 1.0E-1    
I am interested only in the part that converts floats,

so I get a subcall tree something like:
Code:
PRINTF_decode_table
        PRINTF_get_float_parameters
                PRINTF_triple_precision_upscale
                PRINTF_triple_precision_floor_log10
                        PRINTF_get_triple_precision_10_power_y
                                PRINTF_check_triple_precision_power_table
                                        PRINTF_make_triple_precision_power_table
                                                PRINTF_variable_precision_mul
                                PRINTF_variable_precision_mul
        ; until here no conversion, the fp is just transferred to TRIPLE_PRECISION form with the exponent biased and the mantissa shifted to the high bits
        ; other stuff related to how exactly it will be output is filled in
        ; the table is built by PRINTF_make_triple_precision_power_table & PRINTF_variable_precision_mul above
        PRINTF_generate_f
                PRINTF_triple_precision_scale
                        PRINTF_get_triple_precision_10_power_y
                PRINTF_decimalise_unsigned
        ; after here all must be converted, just printing according to flags
        PRINTF_elements_print
                PRINTF_elements_print_character
                PRINTF_elements_print_string
                PRINTF_elements_print_decimal_repeated_character
                PRINTF_elements_print_decimal_string    


What goes on in PRINTF_get_float_parameters:
Code:
PRINTF_ELEMENTS:
.flags = PRINTF_ELEMENTS_FLAG.precision_specified or PRINTF_ELEMENTS_FLAG.size_specified or PRINTF_ELEMENTS_FLAG.zero_pad or PRINTF_ELEMENTS_SIZE.triple
.precision = 100
.arg_value = $3FFB:CCCCCCCCCCCCCCCD

FDT[4]:
.sign_bit_position      = 16 ;
.exponent_width         = 15 ;bit
.implicit_bit_flag      = 0
.exponent_bias          = $3FFE

call PRINTF_get_float_parameters

shift .arg_value left by .sign_bit_position
then add & adc the float parts (doubling them, a shl 1 analog), shifting the high bit into carry & storing it to [negative_flag]
then shld the exponent by .exponent_width, increase it, then shr it back; if nonzero - we get infinity_NaN
shift .arg_value by .exponent_width
if .arg_value.exponent >0 then jmp .normal
if .arg_value.mantissa_low or .arg_value.mantissa_mid or .arg_value.mantissa_high = 0 then jmp .zero
.normal:
if .implicit_bit_flag <> 0 then shrd .arg_value.mantissa by 1 where .mantissa_high is forced to shift in a 1: stc,rcr ?,1
sub .arg_value.exponent,.exponent_bias
if high bit(js) of .arg_value.mantissa_high then jmp .normalized
if denormalized, shld .arg_value.mantissa and increase .arg_value.exponent by the shift amount

elements.flags set ecx filtered with PRINTF_ELEMENTS_FLAG.negative (ecx=-1 mean negate from prev subcall)
elements.significant_bits = 64    

and then this table is built:
Code:
.datapf:00404240 stru_404240
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 80000000h,      1>; 0
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0A0000000h,      4>; 1
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0C8000000h,      7>; 2
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0FA000000h,    0Ah>; 3
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 9C400000h,    0Eh>; 4
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0C3500000h,    11h>; 5
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0F4240000h,    14h>; 6
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 98968000h,    18h>; 7
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0BEBC2000h,    1Bh>; 8
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0EE6B2800h,    1Eh>; 9
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 9502F900h,    22h>; 10
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0BA43B740h,    25h>; 11
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 0E8D4A510h,    28h>; 12
.datapf:00404240 TRIPLE_PRECISION <     0,      0, 9184E72Ah,    2Ch>; 13
.datapf:00404240 TRIPLE_PRECISION <     0, 80000000h, 0B5E620F4h,    2Fh>; 14
.datapf:00404240 TRIPLE_PRECISION <     0, 0A0000000h, 0E35FA931h,    32h>; 15
.datapf:00404240 TRIPLE_PRECISION <     0, 4000000h, 8E1BC9BFh,    36h>; 16
.datapf:00404240 TRIPLE_PRECISION <0F0200000h, 2B70B59Dh, 9DC5ADA8h,    6Bh>; 17
.datapf:00404240 TRIPLE_PRECISION <9670B12Bh, 0E4395D6h, 0AF298D05h,   0A0h>; 18
.datapf:00404240 TRIPLE_PRECISION <3CBF6B72h, 0FFCFA6D5h, 0C2781F49h,   0D5h>; 19
.datapf:00404240 TRIPLE_PRECISION <0DC33745Fh, 87DAF7FBh, 0D7E77A8Fh,   10Ah>; 20
.datapf:00404240 TRIPLE_PRECISION <0C5CFE94Fh, 0C59B14A2h, 0EFB3AB16h,   13Fh>; 21
.datapf:00404240 TRIPLE_PRECISION <3E2CF6Ch, 9923329Eh, 850FADC0h,   175h>; 22
.datapf:00404240 TRIPLE_PRECISION <0C66F336Ch, 80E98CDFh, 93BA47C9h,   1AAh>; 23
.datapf:00404240 TRIPLE_PRECISION <5F16206Dh, 0A8D3A6E7h, 0A402B9C5h,   1DFh>; 24
.datapf:00404240 TRIPLE_PRECISION <577B986Bh, 7FE617AAh, 0B616A12Bh,   214h>; 25
.datapf:00404240 TRIPLE_PRECISION <7D7B8F75h, 859BBF93h, 0CA28A291h,   249h>; 26
.datapf:00404240 TRIPLE_PRECISION <85BBE254h, 3927556Ah, 0E070F78Dh,   27Eh>; 27
.datapf:00404240 TRIPLE_PRECISION <0A7709A57h, 37826145h, 0F92E0C35h,   2B3h>; 28
.datapf:00404240 TRIPLE_PRECISION <82BD6B71h, 0E33CC92Fh, 8A5296FFh,   2E9h>; 29
.datapf:00404240 TRIPLE_PRECISION <0ACCA6DA2h, 0D6BF1765h, 9991A6F3h,   31Eh>; 30
.datapf:00404240 TRIPLE_PRECISION <0DDBB901Ch, 9DF9DE8Dh, 0AA7EEBFBh,   353h>; 31
.datapf:00404240 TRIPLE_PRECISION <0CC655C55h, 0A60E91C6h, 0E319A0AEh,   6A5h>; 32
.datapf:00404240 TRIPLE_PRECISION <6C8D3FCAh, 0CD00A68Ch, 973F9CA8h,   9F8h>; 33
.datapf:00404240 TRIPLE_PRECISION <650D3D29h, 81750C17h, 0C9767586h,  0D4Ah>; 34
.datapf:00404240 TRIPLE_PRECISION <85BCCD6h, 0EB856ECBh, 862C8C0Eh,  109Dh>; 35
.datapf:00404240 TRIPLE_PRECISION <4257AC3Bh, 3993A7E4h, 0B2B8353Bh,  13EFh>; 36
.datapf:00404240 TRIPLE_PRECISION <2D4070F3h, 924AB88Ch, 0EE0DDD84h,  1741h>; 37
.datapf:00404240 TRIPLE_PRECISION <0A74D28CEh, 0C53D5DE4h, 9E8B3B5Dh,  1A94h>; 38
.datapf:00404240 TRIPLE_PRECISION <3F50C802h, 41F4806Fh, 0D32E2032h,  1DE6h>; 39
.datapf:00404240 TRIPLE_PRECISION <5DFED099h, 20A1F0A6h, 8CA554C0h,  2139h>; 40
.datapf:00404240 TRIPLE_PRECISION <4C808754h, 9BD977CCh, 0BB570A9Ah,  248Bh>; 41
.datapf:00404240 TRIPLE_PRECISION <0FDD08C4Eh, 0D88B5A8Ah, 0F9895D25h,  27DDh>; 42
.datapf:00404240 TRIPLE_PRECISION <50E36602h, 5699FE45h, 0A630EF7Dh,  2B30h>; 43
.datapf:00404240 TRIPLE_PRECISION <95AA118Fh, 0BF27F3F7h, 0DD5DC8A2h,  2E82h>; 44
.datapf:00404240 TRIPLE_PRECISION <8C474BB6h, 7DC64F6Dh, 936E0773h,  31D5h>; 45
.datapf:00404240 TRIPLE_PRECISION <0C94C1540h, 8A20979Ah, 0C4605202h,  3527h>; 46
.datapf:00404240 TRIPLE_PRECISION <0CCCCCCCDh, 0CCCCCCCCh, 0CCCCCCCCh, 0FFFFFFFDh>; 47
.datapf:00404240 TRIPLE_PRECISION <3D70A3D7h, 70A3D70Ah, 0A3D70A3Dh, 0FFFFFFFAh>; 48
.datapf:00404240 TRIPLE_PRECISION <645A1CACh, 8D4FDF3Bh, 83126E97h, 0FFFFFFF7h>; 49
.datapf:00404240 TRIPLE_PRECISION <0D3C36113h, 0E219652Bh, 0D1B71758h, 0FFFFFFF3h>; 50
.datapf:00404240 TRIPLE_PRECISION <0FCF80DCh, 1B478423h, 0A7C5AC47h, 0FFFFFFF0h>; 51
.datapf:00404240 TRIPLE_PRECISION <0A63F9A4Ah, 0AF6C69B5h, 8637BD05h, 0FFFFFFEDh>; 52
.datapf:00404240 TRIPLE_PRECISION <3D329076h, 0E57A42BCh, 0D6BF94D5h, 0FFFFFFE9h>; 53
.datapf:00404240 TRIPLE_PRECISION <0FDC20D2Bh, 8461CEFCh, 0ABCC7711h, 0FFFFFFE6h>; 54
.datapf:00404240 TRIPLE_PRECISION <31680A89h, 36B4A597h, 89705F41h, 0FFFFFFE3h>; 55
.datapf:00404240 TRIPLE_PRECISION <0B573440Eh, 0BDEDD5BEh, 0DBE6FECEh, 0FFFFFFDFh>; 56
.datapf:00404240 TRIPLE_PRECISION <0F78F69A5h, 0CB24AAFEh, 0AFEBFF0Bh, 0FFFFFFDCh>; 57
.datapf:00404240 TRIPLE_PRECISION <0F93F87B7h, 6F5088CBh, 8CBCCC09h, 0FFFFFFD9h>; 58
.datapf:00404240 TRIPLE_PRECISION <2865A5F2h, 4BB40E13h, 0E12E1342h, 0FFFFFFD5h>; 59
.datapf:00404240 TRIPLE_PRECISION <538484C2h, 95CD80Fh, 0B424DC35h, 0FFFFFFD2h>; 60
.datapf:00404240 TRIPLE_PRECISION <0F9D3701h, 3AB0ACD9h, 901D7CF7h, 0FFFFFFCFh>; 61
.datapf:00404240 TRIPLE_PRECISION <4C2EBE68h, 0C44DE15Bh, 0E69594BEh, 0FFFFFFCBh>; 62
.datapf:00404240 TRIPLE_PRECISION <67DE18EEh, 453994BAh, 0CFB11EADh, 0FFFFFF96h>; 63
.datapf:00404240 TRIPLE_PRECISION <5560C018h, 0B17EC159h, 0BB127C53h, 0FFFFFF61h>; 64
.datapf:00404240 TRIPLE_PRECISION <3F2398D7h, 0A539E9A5h, 0A87FEA27h, 0FFFFFF2Ch>; 65
.datapf:00404240 TRIPLE_PRECISION <0DCCD87A0h, 6B0919A5h, 97C560BAh, 0FFFFFEF7h>; 66
.datapf:00404240 TRIPLE_PRECISION <11DBCB02h, 0FD75539Bh, 88B402F7h, 0FFFFFEC2h>; 67
.datapf:00404240 TRIPLE_PRECISION <4D4617B6h, 0F065D37Dh, 0F64335BCh, 0FFFFFE8Ch>; 68
.datapf:00404240 TRIPLE_PRECISION <0AC7CB3F7h, 64BCE4A0h, 0DDD0467Ch, 0FFFFFE57h>; 69
.datapf:00404240 TRIPLE_PRECISION <0FE64A52Fh, 7C5382C8h, 0C7CABA6Eh, 0FFFFFE22h>; 70
.datapf:00404240 TRIPLE_PRECISION <59ED2167h, 0DB73A093h, 0B3F4E093h, 0FFFFFDEDh>; 71
.datapf:00404240 TRIPLE_PRECISION <0B8ADA00Eh, 38CB002Fh, 0A21727DBh, 0FFFFFDB8h>; 72
.datapf:00404240 TRIPLE_PRECISION <7B6306A3h, 5423CC06h, 91FF8377h, 0FFFFFD83h>; 73
.datapf:00404240 TRIPLE_PRECISION <4247CB9Eh, 3DA4BC60h, 8380DEA9h, 0FFFFFD4Eh>; 74
.datapf:00404240 TRIPLE_PRECISION <0A4F8BF56h, 4A314EBDh, 0ECE53CECh, 0FFFFFD18h>; 75
.datapf:00404240 TRIPLE_PRECISION <0FB1E4A9Bh, 0CF32E1D6h, 0D5605FCDh, 0FFFFFCE3h>; 76
.datapf:00404240 TRIPLE_PRECISION <0FA911156h, 637A1939h, 0C0314325h, 0FFFFFCAEh>; 77
.datapf:00404240 TRIPLE_PRECISION <7132D333h, 0DB23D21Ch, 9049EE32h, 0FFFFF95Ch>; 78
.datapf:00404240 TRIPLE_PRECISION <5AE1B259h, 505DE96Bh, 0D8A66D4Ah, 0FFFFF609h>; 79
.datapf:00404240 TRIPLE_PRECISION <87A60158h, 0DA57C0BDh, 0A2A682A5h, 0FFFFF2B7h>; 80
.datapf:00404240 TRIPLE_PRECISION <1F4BF665h, 75EDBABEh, 0F4385D09h, 0FFFFEF64h>; 81
.datapf:00404240 TRIPLE_PRECISION <68E1EB75h, 52A711B2h, 0B759449Fh, 0FFFFEC12h>; 82
.datapf:00404240 TRIPLE_PRECISION <6C83AD12h, 0C497B50Eh, 89A63BA4h, 0FFFFE8C0h>; 83
.datapf:00404240 TRIPLE_PRECISION <492512D5h, 34362DE4h, 0CEAE534Fh, 0FFFFE56Dh>; 84
.datapf:00404240 TRIPLE_PRECISION <0E393A9C0h, 28A1638Fh, 9B2A840Fh, 0FFFFE21Bh>; 85
.datapf:00404240 TRIPLE_PRECISION <598EEC7Dh, 0DEC0A404h, 0E8FB7DC2h, 0FFFFDEC8h>; 86
.datapf:00404240 TRIPLE_PRECISION <0E3187C34h, 1228ABCAh, 0AEE97391h, 0FFFFDB76h>; 87
.datapf:00404240 TRIPLE_PRECISION <0E79E236Ch, 91575A87h, 8350BF3Ch, 0FFFFD824h>; 88
.datapf:00404240 TRIPLE_PRECISION <9E98CB98h, 0AEB15D92h, 0C52BA8A6h, 0FFFFD4D1h>; 89
.datapf:00404240 TRIPLE_PRECISION <4B4DE34Eh, 83FD6265h, 9406AF8Fh, 0FFFFD17Fh>; 90
.datapf:00404240 TRIPLE_PRECISION <1463EF49h, 37CAD87Fh, 0DE42FF8Dh, 0FFFFCE2Ch>; 91
.datapf:00404240 TRIPLE_PRECISION <2DE38124h, 0D2CE9FDEh, 0A6DD04C8h, 0FFFFCADAh>; 92
.datapf:00404240 TRIPLE_PRECISION  7 dup(<0>)    


Could anyone describe the logic from here on, in PRINTF_make_triple_precision_power_table: how is the table built, and what is it for? I can say that it is a table of powers of ten (1, 10, 100, etc. and 0.1, 0.01, 0.001, etc.) in TRIPLE_PRECISION form, but the question remains open: what for? Thanks.

_________________
I don`t like to refer by "you" to one person.
My soul requires acronim "thou" instead.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20303
Location: In your JS exploiting you and your system
revolution 22 Nov 2023, 21:53
It is a multiplication table to shift the mantissa into an integer in the range 0 to 2^96-1.
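As a rough illustration of what the entries hold, the following Python sketch computes a power of ten as a 96-bit mantissa (read as a fraction in the range 0.5 to 1) plus a binary exponent. The function names are made up for this example, and it uses exact rational arithmetic with a single final rounding, whereas the posted code builds the table by repeated multiplication, so the last bits of the larger entries may differ.

```python
from fractions import Fraction

MANTISSA_BITS = 96   # three dwords, as in TRIPLE_PRECISION

def power_of_ten_entry(n):
    """Return (mantissa, exponent) with 10**n ~= mantissa / 2**96 * 2**exponent."""
    value = Fraction(10) ** n
    # First guess for the exponent, then normalise so that 0.5 <= fraction < 1.
    exponent = value.numerator.bit_length() - value.denominator.bit_length()
    fraction = value / Fraction(2) ** exponent
    while fraction >= 1:
        fraction /= 2
        exponent += 1
    while fraction < Fraction(1, 2):
        fraction *= 2
        exponent -= 1
    mantissa = round(fraction * 2 ** MANTISSA_BITS)
    if mantissa == 1 << MANTISSA_BITS:      # rounding carried out of the top bit
        mantissa >>= 1
        exponent += 1
    return mantissa, exponent

def as_dwords(mantissa):
    """Split a 96-bit mantissa into (low, mid, high) dwords."""
    return tuple((mantissa >> shift) & 0xFFFFFFFF for shift in (0, 32, 64))

for n in (0, 1, 16, -1, -2):
    mantissa, exponent = power_of_ten_entry(n)
    low, mid, high = as_dwords(mantissa)
    print(f"10^{n}: <{low:X}h, {mid:X}h, {high:X}h, {exponent & 0xFFFFFFFF:X}h>")
```

For these five powers the printed dwords agree with entries 0, 1, 16, 47 and 48 of the dump above; for example 10^-1 comes out as 0.8 * 2^-3 with the low dword rounded up to CCCCCCCDh.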
ProMiNick



Joined: 24 Mar 2012
Posts: 798
Location: Russian Federation, Sochi
ProMiNick 23 Nov 2023, 00:06
I understand how to get the first 17 elements of the TRIPLE_PRECISION multipliers table at compile time:
Code:
macro tpc a{ local %'
match base ** exp,a\{
if exp<17 & exp>=0
%' = 1
if exp>0
repeat exp
        %'=%'*10
end repeat
end if
b = bsr (%')+1
c = (%') shl (64-b)
;display (b/100) mod 100 +'0',(b/10) mod 10 +'0',b mod 10+'0',13,10
dd 0, c and $FFFFFFFF,c shr 32,b
end if
\} }

;power of tens
tpc 10 **$0000 ;1
tpc 10 **$0001 ;10
tpc 10 **$0002 ;100
tpc 10 **$0003 ;...
tpc 10 **$0004
tpc 10 **$0005
tpc 10 **$0006
tpc 10 **$0007
tpc 10 **$0008
tpc 10 **$0009
tpc 10 **$000A
tpc 10 **$000B
tpc 10 **$000C
tpc 10 **$000D
tpc 10 **$000E
tpc 10 **$000F
tpc 10 **$0010
tpc 10 **$0020 ; from here calculation become more complex
tpc 10 **$0030
tpc 10 **$0040
tpc 10 **$0050
tpc 10 **$0060
tpc 10 **$0070
tpc 10 **$0080
tpc 10 **$0090
tpc 10 **$00A0
tpc 10 **$00B0
tpc 10 **$00C0
tpc 10 **$00D0
tpc 10 **$00E0
tpc 10 **$00F0
tpc 10 **$0100
tpc 10 **$0200
tpc 10 **$0300
tpc 10 **$0400
tpc 10 **$0500
tpc 10 **$0600
tpc 10 **$0700
tpc 10 **$0800
tpc 10 **$0900
tpc 10 **$0A00
tpc 10 **$0B00
tpc 10 **$0C00
tpc 10 **$0D00
tpc 10 **$0E00
tpc 10 **$0F00
tpc 10 **$1000

tpc 10 **-$0001
tpc 10 **-$0002
tpc 10 **-$0003
tpc 10 **-$0004
tpc 10 **-$0005
tpc 10 **-$0006
tpc 10 **-$0007
tpc 10 **-$0008
tpc 10 **-$0009
tpc 10 **-$000A
tpc 10 **-$000B
tpc 10 **-$000C
tpc 10 **-$000D
tpc 10 **-$000E
tpc 10 **-$000F
tpc 10 **-$0010
tpc 10 **-$0020
tpc 10 **-$0030
tpc 10 **-$0040
tpc 10 **-$0050
tpc 10 **-$0060
tpc 10 **-$0070
tpc 10 **-$0080
tpc 10 **-$0090
tpc 10 **-$00A0
tpc 10 **-$00B0
tpc 10 **-$00C0
tpc 10 **-$00D0
tpc 10 **-$00E0
tpc 10 **-$00F0
tpc 10 **-$0100
tpc 10 **-$0200
tpc 10 **-$0300
tpc 10 **-$0400
tpc 10 **-$0500
tpc 10 **-$0600
tpc 10 **-$0700
tpc 10 **-$0800
tpc 10 **-$0900
tpc 10 **-$0A00
tpc 10 **-$0B00
tpc 10 **-$0C00
tpc 10 **-$0D00
tpc 10 **-$0E00
tpc 10 **-$0F00
tpc 10 **-$1000
dd 0,0,0,0     

After the seventeenth element ("tpc 10 **16") the calculation formula is:
Code:
;hi1:mid1:lo1*hi2:mid2:lo2=hi1*hi2+adc : hi1*mid2+mid1*hi2+adc : hi1*lo2+mid1*mid2+lo1*hi2+adc : highbitorcoupleforrouding(mid1*lo2+lo1*mid2) : ignore(lo1*lo2)    
So the mantissas of the positive powers are calculable, and the mantissas of the negative ones too if the first element of the negatives is hardcoded. Resolving the exponent is more interesting.

revolution, can thou give an example of how some float is multiplied by a value from that table so that it becomes an integer?
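One way to picture the scaling being asked about, as a Python sketch: multiply the exact binary value by a power of ten so that every wanted decimal digit lands in the integer part, round once, and then print the integer. The helper name decimal_digits is made up, it assumes a non-negative value and places > 0, and it uses exact rationals on a 64-bit double instead of the 96-bit table arithmetic and 80-bit input of the posted code.

```python
from fractions import Fraction

def decimal_digits(value, places):
    """Format a non-negative float with `places` decimals by scaling it to an integer."""
    exact = Fraction(value)            # the exact binary value stored in the float
    scaled = exact * 10 ** places      # the "multiply by a table entry" step
    integer = round(scaled)            # one rounding; now all digits are in an integer
    digits = str(integer).rjust(places + 1, "0")
    return digits[:-places] + "." + digits[-places:]

print(decimal_digits(12.55, 20))
print(decimal_digits(0.5, 3))
```

Once the value is an integer, producing the text is plain integer-to-decimal conversion, which is where a routine such as PRINTF_decimalise_unsigned would come in.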
ProMiNick



Joined: 24 Mar 2012
Posts: 798
Location: Russian Federation, Sochi
ProMiNick 23 Nov 2023, 07:15
By the way, there is a mistake in the rounding algorithm:
Code:
    .round_and_copy_result:
        add     ebx,sizeof.TRIPLE_PRECISION
        bt      [next_value.exponent-4*4],31
        mov     eax,[next_value.exponent-4*3]
        mov     edx,[next_value.exponent-4*2]
        mov     ecx,[next_value.exponent-4*1]
        adc     eax,0
        adc     edx,0
        adc     ecx,0
        mov     [ebx+TRIPLE_PRECISION.mantissa_low],eax
        mov     eax,[next_value.exponent-4*0]
        mov     [ebx+TRIPLE_PRECISION.mantissa_mid],edx
        mov     [ebx+TRIPLE_PRECISION.mantissa_high],ecx
        mov     [ebx+TRIPLE_PRECISION.exponent],eax
        retn    

According to the IEEE standard it must be:
Code:
.round_and_copy_result:
        add     ebx,sizeof.TRIPLE_PRECISION
        bt      [next_value.exponent-4*4],31
        jnc     .skip_accuracy_test
        bt      [next_value.exponent-4*4],30
    .skip_accuracy_test:
        mov     eax,[next_value.exponent-4*3]
        mov     edx,[next_value.exponent-4*2]
        mov     ecx,[next_value.exponent-4*1]
        adc     eax,0
        adc     edx,0
        adc     ecx,0
        mov     [ebx+TRIPLE_PRECISION.mantissa_low],eax
        mov     eax,[next_value.exponent-4*0]
        mov     [ebx+TRIPLE_PRECISION.mantissa_mid],edx
        mov     [ebx+TRIPLE_PRECISION.mantissa_high],ecx
        mov     [ebx+TRIPLE_PRECISION.exponent],eax
        retn    

Without it the float 0.1 is not rounded accurately: the last significant digit is greater by 1, and the same goes for many, many other numbers.
The more I explore this code, the more I like it.
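The rounding rules in play can be compared with a small Python sketch. It models one reading of the two snippets above (the first adds the highest dropped bit, the second adds a carry only when the two highest dropped bits are both set) next to round-half-to-even, the default IEEE 754 rounding mode. The function names are made up, and the sketch says nothing about which rule keeps the round trip intact.

```python
def round_half_up(value, drop):
    """Drop `drop` low bits and add the highest dropped bit (bt ...,31 then adc)."""
    return (value >> drop) + ((value >> (drop - 1)) & 1)

def round_two_bits(value, drop):
    """Drop `drop` low bits; round up only if the two highest dropped bits are both set."""
    return (value >> drop) + int(((value >> (drop - 2)) & 3) == 3)

def round_half_even(value, drop):
    """Drop `drop` low bits with round-to-nearest, ties to even."""
    kept = value >> drop
    rest = value & ((1 << drop) - 1)
    half = 1 << (drop - 1)
    if rest > half or (rest == half and kept & 1):
        kept += 1
    return kept

# Kept part is 4; the dropped dword is 0.25, 0.5, 0.625 and 0.75 of a unit.
for tail in (0x40000000, 0x80000000, 0xA0000000, 0xC0000000):
    value = (4 << 32) | tail
    print(hex(tail), round_half_up(value, 32),
          round_two_bits(value, 32), round_half_even(value, 32))
```

The 0.625 case shows the difference most clearly: the two-bit rule leaves the value at 4 although 5 is nearer.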
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20303
Location: In your JS exploiting you and your system
revolution 23 Nov 2023, 09:23
If you change the rounding then is it still possible to complete a full round trip num->text->num and always get back the exact same value?

I validated the code as given to comply with that requirement, so changing the internal operations would need revalidating.
Roman



Joined: 21 Apr 2012
Posts: 1769
Roman 23 Nov 2023, 09:24
ProMiNick, and how does one use your code?

Eax= 12.55

Show an example.
ProMiNick



Joined: 24 Mar 2012
Posts: 798
Location: Russian Federation, Sochi
ProMiNick 23 Nov 2023, 11:32
Roman, my code is that part:
Code:
format PE GUI 4.0
entry start

include 'win32a.inc'

section '.text' code readable executable

  start: RTL_C ; cut off RTL_C for official fasm

        stdcall PRINTF,INVALID_HANDLE_VALUE,buffer,-1,szEnv,valu
        ;cinvoke sprintf,buffer,szEnv,[valu],[valu+4]
        invoke  MessageBox,0,buffer,esp,0
        invoke  ExitProcess,0
flush_locals ; cut off flush_locals for official fasm

section '.data' data readable writeable

  szEnv    db "%0.20tf",0 ;db "%0.20f",0

  valu     dt 12.55 ;dq 12.55
  buffer db "1                              ",0

section '.idata' import data readable writeable

  library kernel32,'KERNEL32.DLL',\
          user32,'USER32.DLL';,\
          ;msvcrt,'msvcrt.dll'

  include 'api\kernel32.inc'
  include 'api\user32.inc'

  ;import msvcrt,\
        ;sprintf,'sprintf'    


all the rest I reduced to float conversion needs only (but there is more to cut off)
Code:
DEFAULT_FLOAT_PRECISION                 = 6
DEFAULT_INTEGER_PRECISION               = 1
DEFAULT_CHARACTER_PRECISION             = 1
DEFAULT_STRING_PRECISION                = -1    ;print all characters
DEFAULT_FLOAT_SIZE                      = PRINTF_ELEMENTS_SIZE.dword
DEFAULT_INTEGER_SIZE                    = PRINTF_ELEMENTS_SIZE.word     ;word for 32-bit process
MAXIMUM_CONVERSION_LENGTH               = 96    ;enough space for triple sized binary at 96 bits
MAXIMUM_EXPONENT_LENGTH                 = 8     ;a and A formats produce the longest ('p+16383') at 7 characters
PRINTF_BUFFER_SIZE                      = 1 shl 12
PRINTF_CLASS_ZERO                       = 0
PRINTF_CLASS_NORMAL                     = 1
PRINTF_CLASS_INFINITY                   = 2
PRINTF_CLASS_SNAN                       = 3
PRINTF_CLASS_QNAN                       = 4
EXPONENT_LENGTH_FORMAT_A                = 5
EXPONENT_LENGTH_FORMAT_E                = 3
EXPONENT_LENGTH_FORMAT_G                = 1

TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT  = 12    ;10^4096 enough for 80-bit extended reals up to 10^+-4933
TRIPLE_PRECISION_EXPONENT_TABLE_SCALE   = 4     ;exponent reduction in bits per pass. supports using 1, 2, 3, 4, 6 or 12 only

struct TRIPLE_PRECISION
        mantissa_low            dd ?
        mantissa_mid            dd ?
        mantissa_high           dd ?
        exponent                dd ?
ends
sizeof.TRIPLE_PRECISION.mantissa = 32*3

struct FLOAT_DESCRIPTION
        sign_bit_position       db ?    ;number of leading bits to skip to find the sign bit
        exponent_width          db ?
        implicit_bit_flag       db ?
                                db ?    ;align
        exponent_bias           dd ?
ends

struct PRINTF_OUTPUT
        buffer                  dd ?
        length                  dd ?
        handle                  dd ?
        sent                    dd ?
        error                   dd ?
ends

;elements of a number
; [leading spaces] [sign] [prefix] [padding zeros] [precision zeros] [number] [magnitude zeros] [exponent] [trailing spaces]
; with a decimal point placed somewhere within [precision zeros], [number] or [magnitude zeros]

struct PRINTF_ELEMENTS
        flags                   dd ?    ;as specified in the format string
        width                   dd ?    ;as specified in the format string
        precision               dd ?    ;as specified in the format string
        sign_length             dd ?    ;0 or 1
        prefix_length           dd ?    ;0 to 2
        precision_zeros         dd ?    ;0 to many
        number_length           dd ?    ;0 to MAXIMUM_CONVERSION_LENGTH
        magnitude_zeros         dd ?    ;0 to many
        exponent_length         dd ?    ;0 to MAXIMUM_EXPONENT_LENGTH
        decimal_point           dd ?    ;0 to many. 0 means no decimal point, 1 means after the first digit
        significant_bits        dd ?    ;0 to 64. count of mantissa bits in a float
        written                 dd ?    ;running count of total characters in output
        separator_modulus       dd ?    ;0, 3 or 4. 0 means no separators
        arg_value               TRIPLE_PRECISION
        prefix_string           rb 2    ;'0' or '0x'
        sign_character          rb 1    ;'-', '+' or ' '
        separator_character     rb 1    ;',' or ' '
        number_string           rb MAXIMUM_CONVERSION_LENGTH
        exponent_string         rb MAXIMUM_EXPONENT_LENGTH
ends

PRINTF_ELEMENTS_FLAG.argument_size_mask = 7 shl 0       ;byte, hword, word, dword, triple
PRINTF_ELEMENTS_FLAG.left_justify       = 1 shl 3       ;justify left
PRINTF_ELEMENTS_FLAG.show_sign          = 1 shl 4       ;prefix option: show + or - sign
PRINTF_ELEMENTS_FLAG.blank_sign         = 1 shl 5       ;prefix option: show <space> or - sign
PRINTF_ELEMENTS_FLAG.hash_option        = 1 shl 6       ;prefix option: print 0x (hex), 0 (octal) or decimal point (float)
PRINTF_ELEMENTS_FLAG.zero_pad           = 1 shl 7       ;pad left with zeros
PRINTF_ELEMENTS_FLAG.uppercase          = 1 shl 8       ;push to upper case
PRINTF_ELEMENTS_FLAG.pointer            = 1 shl 9       ;pointer to the argument, not the argument itself
PRINTF_ELEMENTS_FLAG.separator          = 1 shl 10      ;print separator characters (space or comma) within the number
PRINTF_ELEMENTS_FLAG.size_specified     = 1 shl 11      ;don't use default argument size
PRINTF_ELEMENTS_FLAG.precision_specified= 1 shl 12      ;don't use default precision
PRINTF_ELEMENTS_FLAG.negative           = 1 shl 13      ;set if arg_value is negative
PRINTF_ELEMENTS_FLAG.zero               = 1 shl 14      ;set if arg_value is zero

PRINTF_ELEMENTS_SIZE.byte               = 0     ;8-bit
PRINTF_ELEMENTS_SIZE.hword              = 1     ;16-bit
PRINTF_ELEMENTS_SIZE.word               = 2     ;32-bit
PRINTF_ELEMENTS_SIZE.dword              = 3     ;64-bit
PRINTF_ELEMENTS_SIZE.triple             = 4     ;96/80-bit
PRINTF_ELEMENTS_SIZE.pointer            = -2
PRINTF_ELEMENTS_SIZE.null               = -1

struct PRINTF_FLAG_TABLE
        char                    db ?
        flag                    dd ?
ends

struct PRINTF_DECODE_TABLE
        function                dd ?
        default_size            dd ?
ends

section '.datapf' data readable writeable

        if used PRINTF_flag_table
                PRINTF_flag_table:
                        PRINTF_FLAG_TABLE       '-',PRINTF_ELEMENTS_FLAG.left_justify
                        PRINTF_FLAG_TABLE       '+',PRINTF_ELEMENTS_FLAG.show_sign
                        PRINTF_FLAG_TABLE       ' ',PRINTF_ELEMENTS_FLAG.blank_sign
                        PRINTF_FLAG_TABLE       '#',PRINTF_ELEMENTS_FLAG.hash_option
                        PRINTF_FLAG_TABLE       '0',PRINTF_ELEMENTS_FLAG.zero_pad
                        PRINTF_FLAG_TABLE       '^',PRINTF_ELEMENTS_FLAG.pointer
                        PRINTF_FLAG_TABLE       ',',PRINTF_ELEMENTS_FLAG.separator
                        PRINTF_FLAG_TABLE       0
        end if
        if used PRINTF_size_table
                PRINTF_size_table:
                        PRINTF_FLAG_TABLE       'b',PRINTF_ELEMENTS_SIZE.byte   + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       'h',PRINTF_ELEMENTS_SIZE.hword  + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       'w',PRINTF_ELEMENTS_SIZE.word   + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       'd',PRINTF_ELEMENTS_SIZE.dword  + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       't',PRINTF_ELEMENTS_SIZE.triple + PRINTF_ELEMENTS_FLAG.size_specified
                        PRINTF_FLAG_TABLE       0
        end if
        if used PRINTF_decode_table
                align 4
                PRINTF_decode_table:
                        PRINTF_DECODE_TABLE     PRINTF_decode_a,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_b,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_c,PRINTF_ELEMENTS_SIZE.byte
                        PRINTF_DECODE_TABLE     PRINTF_decode_d,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_e,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_f,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_g,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_h,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_i,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_j,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_k,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_l,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_m,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_n,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_o,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_p,PRINTF_ELEMENTS_SIZE.pointer
                        PRINTF_DECODE_TABLE     PRINTF_decode_q,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_r,DEFAULT_FLOAT_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_s,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_t,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_u,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_v,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_w,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_x,DEFAULT_INTEGER_SIZE
                        PRINTF_DECODE_TABLE     PRINTF_decode_y,PRINTF_ELEMENTS_SIZE.null
                        PRINTF_DECODE_TABLE     PRINTF_decode_z,PRINTF_ELEMENTS_SIZE.null
        end if
        if used float_description_table
                align 8
                float_description_table:
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa- 8, 4,-1,,1 shl  3 - 2 ;1. 4. 3 format
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa-16, 5,-1,,1 shl  4 - 2 ;1. 5.10 format
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa-32, 8,-1,,1 shl  7 - 2 ;1. 8.23 format
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa-64,11,-1,,1 shl 10 - 2 ;1.11.52 format
                        FLOAT_DESCRIPTION       sizeof.TRIPLE_PRECISION.mantissa-80,15, 0,,1 shl 14 - 2 ;1.15.63 format + explicit mantissa MSb
        end if
        if used PRINTF_integer_base_10_multiples_32
                align 4
                PRINTF_integer_base_10_multiples_32:
                        dd      1-1
                        dd      10-1
                        dd      100-1
                        dd      1000-1
                        dd      10000-1
                        dd      100000-1
                        dd      1000000-1
                        dd      10000000-1
                        dd      100000000-1
                        dd      1000000000-1
                        dd      -1
                PRINTF_integer_base_10_multiples_64:
                        dq      10000000000-1
                        dq      100000000000-1
                        dq      1000000000000-1
                        dq      10000000000000-1
                        dq      100000000000000-1
                        dq      1000000000000000-1
                        dq      10000000000000000-1
                        dq      100000000000000000-1
                        dq      1000000000000000000-1
                        dq      10000000000000000000-1
                        dq      -1
                PRINTF_integer_base_10_multiples_96:
                    .l = 10000000000000000000 and (1 shl 32 - 1)
                    .h = 10000000000000000000 shr 32
                        dd      (10         * .l) and (1 shl 32 - 1)-1, (10         * .l) shr 32 + (10         * .h) and (1 shl 32 - 1), (10         * .h) shr 32
                        dd      (100        * .l) and (1 shl 32 - 1)-1, (100        * .l) shr 32 + (100        * .h) and (1 shl 32 - 1), (100        * .h) shr 32
                        dd      (1000       * .l) and (1 shl 32 - 1)-1, (1000       * .l) shr 32 + (1000       * .h) and (1 shl 32 - 1), (1000       * .h) shr 32
                        dd      (10000      * .l) and (1 shl 32 - 1)-1, (10000      * .l) shr 32 + (10000      * .h) and (1 shl 32 - 1), (10000      * .h) shr 32
                        dd      (100000     * .l) and (1 shl 32 - 1)-1, (100000     * .l) shr 32 + (100000     * .h) and (1 shl 32 - 1), (100000     * .h) shr 32
                        dd      (1000000    * .l) and (1 shl 32 - 1)-1, (1000000    * .l) shr 32 + (1000000    * .h) and (1 shl 32 - 1), (1000000    * .h) shr 32
                        dd      (10000000   * .l) and (1 shl 32 - 1)-1, (10000000   * .l) shr 32 + (10000000   * .h) and (1 shl 32 - 1), (10000000   * .h) shr 32
                        dd      (100000000  * .l) and (1 shl 32 - 1)-1, (100000000  * .l) shr 32 + (100000000  * .h) and (1 shl 32 - 1), (100000000  * .h) shr 32
                        dd      (1000000000 * .l) and (1 shl 32 - 1)-1, (1000000000 * .l) shr 32 + (1000000000 * .h) and (1 shl 32 - 1), (1000000000 * .h) shr 32
                        dd      -1,-1,-1
        end if
        if used PRINTF_integer_reciprocals
                align 4
                PRINTF_integer_reciprocals:
                        dd      (1 shl 24 + 0)/1        ;ceiling(2^24/1) binary
                        dd      (1 shl 24 + 1)/2        ;ceiling(2^24/2) quaternary
                        dd      (1 shl 24 + 2)/3        ;ceiling(2^24/3) octal
                        dd      (1 shl 24 + 3)/4        ;ceiling(2^24/4) hexadecimal
                        dd      (1 shl 24 + 4)/5        ;ceiling(2^24/5) base-32
        end if


        if used PRINTF_triple_precision_power_table
                align 16
                PRINTF_triple_precision_power_table rb TRIPLE_PRECISION_EXPONENT_TABLE_SIZE*sizeof.TRIPLE_PRECISION
        end if

section '.codepf' code executable

PRINTF_PROC_FLAG_RESTORE_ESP = 1 shl 32
PRINTF_PROC_FLAG_RESTORE_EBP = 1 shl 33

macro proc_leaf [args] { common
        prologue@proc equ PRINTF_prologue_leaf
        proc args
        restore prologue@proc
}
macro PRINTF_prologue_leaf procname,flag,parambytes,localbytes,reglist {
        PRINTF_prologue procname,flag,parambytes,localbytes,reglist,1
}

macro PRINTF_prologue procname,flag,parambytes,localbytes,reglist,leaf {
        local   varsize,regsize
        varsize = (localbytes + 3) and (not 3)
        match x,leaf \{
                localbase@proc  equ esp-varsize
                parmbase@proc   equ esp+4+regsize
                rept 0 \{
        \} rept 1 \{
                localbase@proc  equ ebp-varsize
                parmbase@proc   equ ebp+4+regsize
        \}
        regsize = 0
        irps reg,reglist \{
                push    reg
                regsize = regsize + 4
        \}
        if (parambytes | localbytes) & ~ leaf+0
                regsize = regsize + 4
                flag = flag or PRINTF_PROC_FLAG_RESTORE_EBP
                push    ebp
                mov     ebp,esp
                if localbytes
                        flag = flag or PRINTF_PROC_FLAG_RESTORE_ESP
                        add     esp,-varsize
                end if
        end if
}

macro PRINTF_epilogue procname,flag,parambytes,localbytes,reglist {
        if flag and PRINTF_PROC_FLAG_RESTORE_ESP
                sub     esp,-((localbytes + 3) and (not 3))
        end if
        if flag and PRINTF_PROC_FLAG_RESTORE_EBP
                pop     ebp
        end if
        irps reg,reglist \{ reverse
                pop     reg
        \}
        if flag and 10000b
                retn                    ;c call
        else
                retn    parambytes      ;standard call
        end if
}

prologue@proc equ PRINTF_prologue
epilogue@proc equ PRINTF_epilogue
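
;Worked example of the resulting stack layout. This is only a sketch: it
;assumes the stock proc macro addresses parameters relative to parmbase@proc
;and locals relative to localbase@proc.
;
;"proc PRINTF uses ebx esi edi,handle,..." is a framed procedure. The prologue
;pushes ebx, esi, edi and then ebp, so regsize = 16 and:
;       [ebp+ 0] = saved ebp
;       [ebp+ 4] = saved edi
;       [ebp+ 8] = saved esi
;       [ebp+12] = saved ebx
;       [ebp+16] = return address
;       [ebp+20] = handle       (first parameter, parmbase@proc = ebp+4+regsize)
;
;"proc_leaf PRINTF_read_arg uses edi esi ebx,dest,..." has no ebp frame, so
;regsize = 12 and:
;       [esp+ 0] = saved ebx
;       [esp+ 4] = saved esi
;       [esp+ 8] = saved edi
;       [esp+12] = return address
;       [esp+16] = dest         (first parameter, parmbase@proc = esp+4+regsize)
;
;The leaf offsets are only valid while esp still has its post-prologue value.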

proc PRINTF uses ebx esi edi,handle,dest,size,format,arglist
        locals
                output          PRINTF_OUTPUT
                elements        PRINTF_ELEMENTS
        endl
        mov     edx,[dest]
        mov     ecx,[size]
        mov     eax,[handle]
        mov     esi,[format]
        xor     ebx,ebx
        mov     [output.buffer],edx
        mov     [output.length],ecx
        mov     [output.handle],eax
        mov     [output.sent],ebx
        mov     [output.error],ebx
    .next:
        xor     eax,eax
        lea     edi,[elements]
        mov     ecx,sizeof.PRINTF_ELEMENTS/4
        rep     stosd
    .next_char:
        movzx   eax,byte[esi]
        inc     esi
        test    eax,eax
        jz      .done
        cmp     al,'%'
        jz      .process_flags
    .print_char:
        stdcall PRINTF_elements_print_character,addr elements,eax,addr output
        jmp     .next_char
    .process_flags:
        movzx   eax,byte[esi]
        inc     esi
        test    eax,eax
        jz      .done
    .next_flag:
        mov     ebx,PRINTF_flag_table
        call    .convert_table
        test    eax,eax
        jz      .done
        cmp     [ebx+PRINTF_FLAG_TABLE.char],0
        jnz     .next_flag
    .process_width:
        cmp     al,'*'
        jz      .width_from_argument
        cmp     al,'1'
        jb      .process_precision
        cmp     al,'9'
        ja      .process_precision
    .width_from_argument:
        call    .convert_decimal
        mov     [elements.width],edi
        test    eax,eax
        jz      .done
    .process_precision:
        cmp     al,'.'
        jnz     .process_size
        movzx   eax,byte[esi]
        inc     esi
        test    eax,eax
        jz      .done
        xor     edi,edi
        cmp     al,'*'
        jz      .precision_from_argument
        cmp     al,'0'
        jb      .set_precision
        cmp     al,'9'
        ja      .set_precision
    .precision_from_argument:
        call    .convert_decimal
    .set_precision:
        mov     [elements.precision],edi
        or      [elements.flags],PRINTF_ELEMENTS_FLAG.precision_specified
        test    eax,eax
        jz      .done
    .process_size:
        mov     ebx,PRINTF_size_table
        call    .convert_table
        test    eax,eax
        jz      .done
    .process_decode:
        movzx   ebx,al
        cmp     bl,'A'
        jb      .check_type             ;'A' itself must also be folded to lower case
        cmp     bl,'Z'
        ja      .check_type
        add     bl,'a'-'A'
        or      [elements.flags],PRINTF_ELEMENTS_FLAG.uppercase
    .check_type:
        cmp     bl,'a'
        jb      .print_char
        cmp     bl,'z'
        ja      .print_char
        ;compute current output length
        mov     edi,[output.buffer]
        sub     edi,[dest]
        mov     [elements.written],edi
        ;get the argument value
        stdcall PRINTF_read_arg,addr elements.arg_value,[arglist],\
                        [PRINTF_decode_table+(ebx-'a')*sizeof.PRINTF_DECODE_TABLE+PRINTF_DECODE_TABLE.default_size],\
                        [elements.flags]
        or      [elements.flags],edx    ;set the size
        mov     [arglist],ecx
        ;call the decoder
        mov     eax,ebx                 ;we pass the character type to the function in eax (for printing the unknown type)
        stdcall [PRINTF_decode_table+(ebx-'a')*sizeof.PRINTF_DECODE_TABLE+PRINTF_DECODE_TABLE.function],addr elements,addr output
        jmp     .next
    .done:
        cmp     [output.handle],INVALID_HANDLE_VALUE
        jne     .flush
        stdcall PRINTF_elements_print_character,addr elements,0,addr output
        mov     edx,[output.buffer]
        cmp     [output.length],0
        jnz     .return_no_overflow
        mov     eax,edx
        sub     eax,[dest]
        cmp     eax,[size]
        jz      .return_no_overflow
        sub     eax,[size]
        neg     eax                     ;return the negated number of extra bytes required
        cmp     [size],0
        jz      .ret
        mov     byte[edx+eax-1],0       ;always return a null terminated string
    .ret:
        ret
    .return_no_overflow:
        lea     eax,[edx-1]             ;don't include the terminating null
        sub     eax,[dest]              ;return number of characters stored
        ret
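        ;Worked example of the overflow return value (a sketch, assuming the
        ;print routines keep advancing output.buffer after output.length has
        ;reached zero): snprintf into an 8 byte buffer with text that needs
        ;12 characters plus the terminating null.
        ;       output.buffer - dest = 13, size = 8
        ;       eax = -(13 - 8) = -5, and the byte at dest+7 is set to null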
    .flush:
        ;stdcall PRINTF_flush_buffer,addr output
        mov     eax,[output.sent]       ;return number of characters output
        mov     ecx,[output.error]      ;return non-zero if there was an error
        ret

    .convert_table:
        cmp     [ebx+PRINTF_FLAG_TABLE.char],0
        jz      .table_done
        cmp     [ebx+PRINTF_FLAG_TABLE.char],al
        jz      .set_flag
        add     ebx,sizeof.PRINTF_FLAG_TABLE
        jmp     .convert_table
    .set_flag:
        mov     eax,[ebx+PRINTF_FLAG_TABLE.flag]
        or      [elements.flags],eax
        movzx   eax,byte[esi]
        inc     esi
    .table_done:
        retn

    .convert_decimal:
        cmp     al,'*'
        jz      .decimal_from_argument
        lea     edi,[eax-'0']
    .decimal_next:
        movzx   eax,byte[esi]
        inc     esi
        test    eax,eax
        jz      .decimal_last
        sub     eax,'0'
        cmp     eax,9
        ja      .decimal_done
        lea     edi,[edi*5]
        lea     edi,[edi*2+eax]
        jmp     .decimal_next
    .decimal_from_argument:
        mov     eax,[arglist]
        mov     edi,[eax]
        add     eax,4
        mov     [arglist],eax
        movzx   eax,byte[esi]
        inc     esi
        jmp     .decimal_last
    .decimal_done:
        add     eax,'0'
    .decimal_last:
        retn

endp
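
;Worked example for .convert_decimal above: the two lea instructions compute
;edi = edi*10 + digit without using mul. For the width text "12" followed by
;the type character 'd':
;       '1'     edi = '1'-'0'           = 1
;       '2'     edi = (1*5)*2 + 2       = 12
;       'd'     'd'-'0' is above 9, so the loop stops with al='d' and edi=12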

proc_leaf PRINTF_read_arg uses edi esi ebx,dest,arg_pointer,default_size,flags
        ;return ecx=new arg pointer, edx=argument size
        mov     edx,[default_size]
        mov     ecx,[arg_pointer]
        cmp     edx,PRINTF_ELEMENTS_SIZE.null
        jz      .null
        mov     ebx,[flags]
        mov     edi,[dest]
        mov     esi,ecx
        add     ecx,1 shl DEFAULT_INTEGER_SIZE
        test    ebx,PRINTF_ELEMENTS_FLAG.pointer
        jz      .address_known
        mov     esi,[esi]
    .address_known:
        cmp     edx,PRINTF_ELEMENTS_SIZE.pointer
        jz      .pointer
        test    ebx,PRINTF_ELEMENTS_FLAG.size_specified
        jz      .size_known
        mov     edx,ebx
        and     edx,PRINTF_ELEMENTS_FLAG.argument_size_mask
    .size_known:
        push    edx
        cmp     edx,PRINTF_ELEMENTS_SIZE.byte
        jz      .read_byte
        cmp     edx,PRINTF_ELEMENTS_SIZE.hword
        jz      .read_hword
        cmp     edx,PRINTF_ELEMENTS_SIZE.word
        jz      .read_word
        cmp     edx,PRINTF_ELEMENTS_SIZE.dword
        jz      .read_dword
        cmp     edx,PRINTF_ELEMENTS_SIZE.triple
        jz      .read_triple
        int3
    .read_byte:
        movzx   eax,byte[esi]
        jmp     .store_word
    .read_hword:
        movzx   eax,word[esi]
        jmp     .store_word
    .read_word:
        mov     eax,[esi]
        jmp     .store_word
    .read_dword:
        mov     eax,[esi]
        mov     edx,[esi+4]
        jmp     .store_dword
    .read_triple:
        mov     eax,[esi]
        mov     edx,[esi+4]
        mov     ebx,[esi+8]
        mov     [edi+8],ebx
        add     esi,4
    .store_dword:
        mov     [edi+4],edx
        add     esi,4
    .store_word:
        mov     [edi+0],eax
        add     esi,4
        pop     edx
        test    [flags],PRINTF_ELEMENTS_FLAG.pointer
        jnz     .done
        mov     ecx,esi
    .done:
        ret
    .pointer:
        mov     [edi+0],esi
    .null:
        xor     edx,edx         ;return no size value
        ret
endp
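
;Summary of how PRINTF_read_arg advances the argument pointer, as read from
;the code above. The size names follow this file's convention:
;hword = 16 bits, word = 32 bits, dword = 64 bits, triple = 96 bits.
;       byte, hword, word       ecx = arg_pointer + 4
;       dword                   ecx = arg_pointer + 8
;       triple                  ecx = arg_pointer + 12
;       any size with '^' flag  ecx = arg_pointer + 4 (the value is read
;                               through the pointer stored in the list)
;       pointer default size    ecx = arg_pointer + 4, edx = 0
;       null default size       ecx = arg_pointer (nothing is consumed), edx = 0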

PRINTF_decode_a = PRINTF_decode_unknown
PRINTF_decode_b = PRINTF_decode_unknown
PRINTF_decode_c = PRINTF_decode_unknown
PRINTF_decode_d = PRINTF_decode_unknown
PRINTF_decode_e = PRINTF_decode_unknown

proc PRINTF_decode_f uses ebx,elements,output
        ;output a decimal floating point number "-dddd.dddddd"
        ;digits are in decimal
        ;the precision field specifies the number of digits after the decimal point
        mov     ebx,[elements]
        stdcall PRINTF_get_float_parameters,ebx,-1
        test    edx,edx
        jnz     .print
        stdcall PRINTF_generate_f,ebx,addr ebx+PRINTF_ELEMENTS.arg_value,eax
    .print:
        stdcall PRINTF_elements_print,ebx,[output]
        ret
endp


PRINTF_decode_g = PRINTF_decode_unknown

PRINTF_decode_h = PRINTF_decode_unknown
PRINTF_decode_i = PRINTF_decode_unknown
PRINTF_decode_j = PRINTF_decode_unknown
PRINTF_decode_k = PRINTF_decode_unknown
PRINTF_decode_l = PRINTF_decode_unknown
PRINTF_decode_m = PRINTF_decode_unknown
PRINTF_decode_n = PRINTF_decode_unknown
PRINTF_decode_o = PRINTF_decode_unknown
PRINTF_decode_p = PRINTF_decode_unknown
PRINTF_decode_q = PRINTF_decode_unknown
PRINTF_decode_r = PRINTF_decode_unknown
PRINTF_decode_s = PRINTF_decode_unknown
PRINTF_decode_t = PRINTF_decode_unknown
PRINTF_decode_u = PRINTF_decode_unknown
PRINTF_decode_v = PRINTF_decode_unknown
PRINTF_decode_w = PRINTF_decode_unknown
PRINTF_decode_x = PRINTF_decode_unknown
PRINTF_decode_y = PRINTF_decode_unknown
PRINTF_decode_z = PRINTF_decode_unknown

proc PRINTF_decode_unknown elements,output
        ;the unknown character type is in eax
        ;output the type character
        ;this also allows a single % symbol to be output by placing a pair (%%) in the format string
        stdcall PRINTF_elements_print_character,[elements],eax,[output]
        ret
endp

macro PRINTF_ceiling_log2_10 reg,bits,offset {
        ;computes: bits * log10(2) + roundup + offset
        ;result is valid for input values up to 2620 bits
        imul    reg,bits,631306                 ;ceiling(2^21 * log10(2))
        add     reg,(offset+1) shl 21 - 1       ;round up and add the offset
        shr     reg,21
}
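
;Worked examples for PRINTF_ceiling_log2_10 with offset=1:
;       bits=24: 24*631306 = 15151344, + (2 shl 21 - 1) = 19345647,
;                shr 21 = 9   which is ceiling(24*0.30103) + 1 = 8 + 1
;       bits=64: 64*631306 = 40403584, + (2 shl 21 - 1) = 44597887,
;                shr 21 = 21  which is ceiling(64*0.30103) + 1 = 20 + 1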

proc PRINTF_generate_f uses ebx,elements,value,log10
        ;return eax=length of printed number
        locals
                expected_size                   dd ?
                scaled_value                    TRIPLE_PRECISION
                significant_decimal_digits_m1   dd ?
        endl
        mov     ebx,[elements]
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.separator
        jz      .separator_okay
        mov     [ebx+PRINTF_ELEMENTS.separator_character],','
        mov     [ebx+PRINTF_ELEMENTS.separator_modulus],3
    .separator_okay:
        PRINTF_ceiling_log2_10 eax,[ebx+PRINTF_ELEMENTS.significant_bits],1     ;+1 for last digit distinction
        dec     eax
        mov     [significant_decimal_digits_m1],eax
        mov     eax,[log10]
        mov     edx,[ebx+PRINTF_ELEMENTS.precision]
        ;eax=log10
        ;edx=specified precision
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero
        jz      .non_zero
        mov     eax,1 shl 31            ;the log of zero is extremely tiny
    .non_zero:
        test    eax,eax
        jns     .positive_log
        mov     ecx,eax
        neg     ecx
        cmp     ecx,edx
        jbe     .store_precision_zeros
        mov     ecx,edx
    .store_precision_zeros:
        mov     [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],1
        jmp     .decimal_point_done
    .positive_log:
        ;set the decimal point position
        lea     ecx,[eax+1]
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
        ;check if a decimal is printed
        test    edx,edx                 ;non-zero precisions always have a decimal point
        jnz     .decimal_point_done
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        jnz     .decimal_point_done
        xor     ecx,ecx
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
    .decimal_point_done:
        add     eax,edx                 ;compute most significant digit
        lea     ecx,[eax+1]
        jns     .store_expected_size
        xor     ecx,ecx
    .store_expected_size:
        mov     [expected_size],ecx
        test    ecx,ecx
        jnz     .compute_trailing_zeros
        mov     ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
        inc     ecx
        mov     [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
        cmp     ecx,1
        jnz     .scale
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.hash_option
        jnz     .scale
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],0
        jmp     .scale
    .compute_trailing_zeros:
        ;find the number of trailing zeros to print
        sub     eax,[significant_decimal_digits_m1]
        jle     .scale
        sub     edx,eax
        mov     [ebx+PRINTF_ELEMENTS.magnitude_zeros],eax
        sub     [expected_size],eax
    .scale:
        ;scale to fit
        stdcall PRINTF_triple_precision_scale,addr scaled_value,[value],edx
        stdcall PRINTF_decimalise_unsigned,addr ebx+PRINTF_ELEMENTS.number_string,addr scaled_value
        mov     [ebx+PRINTF_ELEMENTS.number_length],eax
        sub     eax,[expected_size]
        je      .conversion_okay
        add     [log10],eax             ;if the number was rounded up (e.g. 99.9 to 100) then adjust
        mov     ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
        sub     ecx,1
        jnc     .set_new_leading_zeros
        mov     ecx,[ebx+PRINTF_ELEMENTS.decimal_point]
        test    ecx,ecx
        jz      .conversion_okay
        inc     ecx
        mov     [ebx+PRINTF_ELEMENTS.decimal_point],ecx
        jmp     .conversion_okay
    .set_new_leading_zeros:
        mov     [ebx+PRINTF_ELEMENTS.precision_zeros],ecx
    .conversion_okay:
        mov     eax,[ebx+PRINTF_ELEMENTS.number_length]
        ret
endp

proc PRINTF_get_float_parameters uses ebx,elements,log10_flag
        ;return eax=log10(value), edx=conversion done flag (for NaN and infinity)
        ;upscales the argument
        ;sets the negative flag
        ;sets the precision
        ;converts infinity and NaN to standard formats
        mov     ebx,[elements]
        mov     edx,[ebx+PRINTF_ELEMENTS.flags]
        and     edx,PRINTF_ELEMENTS_FLAG.argument_size_mask
        lea     edx,[edx*sizeof.FLOAT_DESCRIPTION+float_description_table]
        lea     ecx,[ebx+PRINTF_ELEMENTS.arg_value]
        stdcall PRINTF_triple_precision_upscale,ecx,ecx,edx
        and     ecx,PRINTF_ELEMENTS_FLAG.negative
        or      [ebx+PRINTF_ELEMENTS.flags],ecx
        mov     [ebx+PRINTF_ELEMENTS.significant_bits],edx
        cmp     eax,PRINTF_CLASS_INFINITY
        je      .infinity
        cmp     eax,PRINTF_CLASS_SNAN
        je      .SNaN
        cmp     eax,PRINTF_CLASS_QNAN
        je      .QNaN
        cmp     eax,PRINTF_CLASS_ZERO
        je      .zero
        cmp     [log10_flag],0
        jz      .log10_okay
        stdcall PRINTF_triple_precision_floor_log10,addr ebx+PRINTF_ELEMENTS.arg_value
    .log10_okay:
        ;set the precision
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.precision_specified
        jnz     .precision_okay
        mov     [ebx+PRINTF_ELEMENTS.precision],DEFAULT_FLOAT_PRECISION
    .precision_okay:
        xor     edx,edx         ;indicate not yet converted
    .done:
        ret
    .zero:
        or      [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero
        jmp     .log10_okay
    .infinity:
        mov     eax,3
        mov     dword[ebx+PRINTF_ELEMENTS.number_string],'inf'
        jmp     .converted
    .SNaN:
        mov     ecx,'SNaN'
        jmp     .NaN
    .QNaN:
        mov     ecx,'QNaN'
    .NaN:
        mov     dword[ebx+PRINTF_ELEMENTS.number_string+0],ecx
        mov     dword[ebx+PRINTF_ELEMENTS.number_string+4],'0x0'
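        ;stdcall PRINTF_base2_radix_unsigned,addr ebx+PRINTF_ELEMENTS.number_string+6,addr ebx+PRINTF_ELEMENTS.arg_value,16
        xor     eax,eax         ;no payload digits are generated while the call above is disabled
        cmp     eax,1
        adc     eax,6           ;length = 6 + max(digits,1), i.e. 7 for "SNaN0x0" or "QNaN0x0"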
    .converted:
        mov     [ebx+PRINTF_ELEMENTS.number_length],eax
        or      edx,-1          ;indicate that conversion is complete
        jmp     .done
endp

proc_leaf PRINTF_triple_precision_upscale uses ebp edi esi ebx,dest,source,description
        ;returns eax=number class, ecx=negative flag, edx=number of significant bits
        ;convert incoming float to triple precision format
        locals
                negative_flag   dd ?
                number_class    dd ?
        endl
        mov     esi,[description]
        mov     ebp,[source]
        mov     edx,sizeof.TRIPLE_PRECISION.mantissa            ;edx = number of significant bits
        mov     [number_class],PRINTF_CLASS_NORMAL
        movzx   ecx,[esi+FLOAT_DESCRIPTION.sign_bit_position]
        mov     eax,[ebp+0]
        mov     ebx,[ebp+4]
        mov     ebp,[ebp+8]
        ;shift from 16 to 88 bits left
        sub     ecx,32
        jb      .shift_in_sign
    .find_sign:
        sub     edx,32
        mov     ebp,ebx
        mov     ebx,eax
        xor     eax,eax
        sub     ecx,32
        jae     .find_sign
    .shift_in_sign:
        and     ecx,0x1f
        sub     edx,ecx
        shld    ebp,ebx,cl
        shld    ebx,eax,cl
        shl     eax,cl
        ;extract the sign
        dec     edx
        add     eax,eax
        adc     ebx,ebx
        adc     ebp,ebp
        sbb     ecx,ecx
        mov     [negative_flag],ecx
        ;extract the exponent
        movzx   ecx,[esi+FLOAT_DESCRIPTION.exponent_width]
        sub     edx,ecx
        xor     edi,edi
        shld    edi,ebp,cl
        ;check for infinity and NaN
        inc     edi
        shr     edi,cl
        jnz     .infinity_NaN
        shld    edi,ebp,cl
        shld    ebp,ebx,cl
        shld    ebx,eax,cl
        shl     eax,cl
        test    edi,edi
        jnz     .normal
        ;check for zero
        mov     ecx,ebp
        or      ecx,ebx
        or      ecx,eax
        jz      .zero
        ;adjust denormal exponent
        cmp     [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
        setz    cl
        movzx   ecx,cl
        add     edi,ecx         ;restore exponent extra
        jmp     .implied_bit_okay
    .normal:
        ;add the implied bit
        cmp     [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
        jz      .implied_bit_okay
        inc     edx
        shrd    eax,ebx,1
        shrd    ebx,ebp,1
        stc
        rcr     ebp,1
    .implied_bit_okay:
        sub     edi,[esi+FLOAT_DESCRIPTION.exponent_bias]
        ;normalise
        test    ebp,ebp
        js      .normalised
        bsr     ecx,ebp
        jnz     .shift_in_denormal
        ;check for unnormal zero
        mov     ecx,ebp
        or      ecx,ebx
        or      ecx,eax
        jz      .unnormal_zero
    .macro_shift_denormal:
        sub     edi,32
        sub     edx,32
        mov     ebp,ebx
        mov     ebx,eax
        xor     eax,eax
        bsr     ecx,ebp
        jz      .macro_shift_denormal
    .shift_in_denormal:
        not     ecx
        and     ecx,0x1f
        sub     edi,ecx
        sub     edx,ecx
        shld    ebp,ebx,cl
        shld    ebx,eax,cl
        shl     eax,cl
    .normalised:
        mov     ecx,[dest]
        mov     [ecx+TRIPLE_PRECISION.exponent],edi
        mov     [ecx+TRIPLE_PRECISION.mantissa_high],ebp
        mov     [ecx+TRIPLE_PRECISION.mantissa_mid],ebx
        mov     [ecx+TRIPLE_PRECISION.mantissa_low],eax
        mov     ecx,[negative_flag]
        mov     eax,[number_class]
        ret
    .unnormal_zero:
        xor     edi,edi
    .zero:
        xor     edx,edx
        mov     [number_class],PRINTF_CLASS_ZERO
        jmp     .normalised
    .infinity_NaN:
        xor     edi,edi
        mov     [number_class],PRINTF_CLASS_INFINITY
        shld    ebp,ebx,cl
        shld    ebx,eax,cl
        shl     eax,cl
        ;if there is an explicit bit then we shift it out and ignore it
        ;this means all pseudo NaNs and pseudo infinity become normal NaNs and normal infinity
        cmp     [esi+FLOAT_DESCRIPTION.implicit_bit_flag],0
        jnz     .explicit_bit_okay
        shld    ebp,ebx,1
        shld    ebx,eax,1
        shl     eax,1
        dec     edx
    .explicit_bit_okay:
        ;check for infinity
        mov     ecx,ebp
        or      ecx,ebx
        or      ecx,eax
        xchg    edx,ecx
        jz      .normalised
        dec     ecx
        mov     edx,ecx
        ;get the Q/S bit
        btr     ebp,31
        mov     esi,PRINTF_CLASS_QNAN
        jc      .NaN_class_okay
        mov     esi,PRINTF_CLASS_SNAN
    .NaN_class_okay:
        mov     [number_class],esi
        ;shift back to the LSb
        sub     ecx,sizeof.TRIPLE_PRECISION.mantissa
        not     ecx
        sub     ecx,32
        jb      .final_shift_back_mantissa
    .shift_back_mantissa:
        mov     eax,ebx
        mov     ebx,ebp
        xor     ebp,ebp
        sub     ecx,32
        jae     .shift_back_mantissa
    .final_shift_back_mantissa:
        shrd    eax,ebx,cl
        shrd    ebx,ebp,cl
        shr     ebp,cl
        jmp     .normalised
endp

proc PRINTF_triple_precision_floor_log10 uses esi ebx,source
        ;compute eax=floor(log10(source))
        locals
                temp_power TRIPLE_PRECISION
        endl
        mov     esi,[source]
        mov     eax,0x4D104D43                          ;ceiling(2^32 * log10(2))
        imul    dword[esi+TRIPLE_PRECISION.exponent]    ;edx=approximate log. always equal to, or one higher than, the required value
        mov     ebx,edx
        stdcall PRINTF_get_triple_precision_10_power_y,addr temp_power,0,edx
        assert  sizeof.TRIPLE_PRECISION = 16
        mov     eax,[esi+4*0]
        mov     ecx,[esi+4*1]
        mov     edx,[esi+4*2]
        mov     esi,[esi+4*3]
        sub     eax,[temp_power+4*0]
        sbb     ecx,[temp_power+4*1]
        sbb     edx,[temp_power+4*2]
        sbb     esi,[temp_power+4*3]
        setl    cl                                      ;adjust log to the correct value
        movzx   ecx,cl
        neg     ecx
        lea     eax,[ebx+ecx]
        ret
endp
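
;Worked example for the estimate and correction above. This is a sketch that
;assumes value = m * 2^exponent with 0.5 <= m < 1, which the "- 2" in the
;float_description_table biases suggests:
;       12 = 0.75 * 2^4         estimate = floor(4 * 0.30103) = 1
;                               12 >= 10^1, so the result stays 1
;        8 = 0.5  * 2^4         estimate = floor(4 * 0.30103) = 1
;                               8 < 10^1, so the result is 1 - 1 = 0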

proc PRINTF_triple_precision_scale uses ebx esi,dest,source,scale
        ;compute dest=round(source*10^scale)
        locals
                scaled_value TRIPLE_PRECISION
        endl
        stdcall PRINTF_get_triple_precision_10_power_y,addr scaled_value,[source],[scale]
    ;integerise
        mov     ecx,[scaled_value.exponent]
        xor     edx,edx
        xor     ebx,ebx
        xor     eax,eax
        test    ecx,ecx
        js      .store
        mov     edx,[scaled_value.mantissa_high]
        mov     ebx,[scaled_value.mantissa_mid]
        mov     eax,[scaled_value.mantissa_low]
        sub     ecx,sizeof.TRIPLE_PRECISION.mantissa
        ;jg     .overflow
        jge     .store
        xor     esi,esi
        neg     ecx
        sub     ecx,32
        jb      .final_shift
    .shifter_loop:
        mov     esi,eax
        mov     eax,ebx
        mov     ebx,edx
        xor     edx,edx
        sub     ecx,32
        jae     .shifter_loop
    .final_shift:
        shrd    esi,eax,cl
        shrd    eax,ebx,cl
        shrd    ebx,edx,cl
        shr     edx,cl
        ;round up
        add     esi,esi
        adc     eax,0
        adc     ebx,0
        adc     edx,0
    .store:
        mov     esi,[dest]
        mov     [esi+TRIPLE_PRECISION.mantissa_low],eax
        mov     [esi+TRIPLE_PRECISION.mantissa_mid],ebx
        mov     [esi+TRIPLE_PRECISION.mantissa_high],edx
        ret
endp
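
;Worked example of the rounding in PRINTF_triple_precision_scale: after the
;right shift esi holds the most significant discarded bits, and "add esi,esi"
;moves the 0.5 bit into the carry flag, so the magnitude is rounded half up.
;       source*10^scale = 2.25  discarded bits 01...b, carry=0, dest = 2
;       source*10^scale = 2.5   discarded bits 10...b, carry=1, dest = 3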

TRIPLE_PRECISION_EXPONENT_TABLE_PASSES  = (TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT+TRIPLE_PRECISION_EXPONENT_TABLE_SCALE)/\
                                          TRIPLE_PRECISION_EXPONENT_TABLE_SCALE
TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS= 1 + TRIPLE_PRECISION_LOG2_MAXIMUM_EXPONENT - \
                                          TRIPLE_PRECISION_EXPONENT_TABLE_SCALE * (TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1)
TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH  = ((1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1) * \
                                          (TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1) + \
                                          (1 shl TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS - 1))
TRIPLE_PRECISION_EXPONENT_TABLE_SIZE    = 1 + 2 * TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH
TRIPLE_PRECISION_EXPONENT_10_TO_M4096   = 0xffffcada    ;the last value written to the power table, to ensure this is thread safe

proc PRINTF_get_triple_precision_10_power_y uses edi esi ebx,dest,source,y
        ;compute dest=source*10^y
        mov     esi,PRINTF_triple_precision_power_table
        mov     ebx,[source]
        mov     edi,[dest]
        stdcall PRINTF_check_triple_precision_power_table,esi
        ;either start with 10^0 in esi, ...
        test    ebx,ebx
        jz      .copy
        ;... or start with the source value
        mov     esi,ebx
    .copy:
        mov     eax,[esi+TRIPLE_PRECISION.mantissa_low]
        mov     ecx,[esi+TRIPLE_PRECISION.mantissa_mid]
        mov     edx,[esi+TRIPLE_PRECISION.mantissa_high]
        mov     ebx,[esi+TRIPLE_PRECISION.exponent]
        mov     [edi+TRIPLE_PRECISION.mantissa_low],eax
        mov     [edi+TRIPLE_PRECISION.mantissa_mid],ecx
        mov     [edi+TRIPLE_PRECISION.mantissa_high],edx
        mov     [edi+TRIPLE_PRECISION.exponent],ebx
        mov     ebx,[y]
        mov     esi,PRINTF_triple_precision_power_table                                 ;10^1
        test    ebx,ebx
        jz      .done
        jns     .raise_loop
        add     esi,TRIPLE_PRECISION_EXPONENT_TABLE_LENGTH * sizeof.TRIPLE_PRECISION    ;10^-1
        neg     ebx
    .raise_loop:
        mov     eax,ebx
        and     eax,1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1
        jz      .raise_next
        assert  sizeof.TRIPLE_PRECISION=16
        shl     eax,4
        stdcall PRINTF_variable_precision_mul,edi,edi,addr eax+esi,sizeof.TRIPLE_PRECISION.mantissa / 32
    .raise_next:
        add     esi,(1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1) * sizeof.TRIPLE_PRECISION
        shr     ebx,TRIPLE_PRECISION_EXPONENT_TABLE_SCALE
        jnz     .raise_loop
    .done:
        ret
endp

struct QUAD_PRECISION
        mantissa        rd sizeof.TRIPLE_PRECISION.mantissa / 32 + 1
        exponent        rd 1
ends
sizeof.QUAD_PRECISION.mantissa = sizeof.TRIPLE_PRECISION.mantissa + 32

proc_leaf PRINTF_check_triple_precision_power_table table
        mov     eax,[table]
        cmp     [eax+(TRIPLE_PRECISION_EXPONENT_TABLE_SIZE-1)*sizeof.TRIPLE_PRECISION+TRIPLE_PRECISION.exponent],TRIPLE_PRECISION_EXPONENT_10_TO_M4096
        jnz     PRINTF_make_triple_precision_power_table
        ret
endp

proc PRINTF_make_triple_precision_power_table uses edi esi ebx,table
        locals
                base_multiplier QUAD_PRECISION
                next_value      QUAD_PRECISION
        endl
        assert TRIPLE_PRECISION_EXPONENT_LAST_PASS_BITS = 1
        mov     ebx,[table]
        ;start with 10^0 at the beginning
        xor     ecx,ecx
        mov     [ebx+TRIPLE_PRECISION.mantissa_low],ecx
        mov     [ebx+TRIPLE_PRECISION.mantissa_mid],ecx
        mov     [ebx+TRIPLE_PRECISION.mantissa_high],1 shl 31
        mov     [ebx+TRIPLE_PRECISION.exponent],+1
        ;then the first block starts with 10^1
        repeat sizeof.QUAD_PRECISION.mantissa / 32 - 1
                mov     [next_value.mantissa + 4 * (%-1)],ecx
        end repeat
        mov     [next_value.mantissa + 4 * (sizeof.QUAD_PRECISION.mantissa / 32 - 1)],10 shl 28
        mov     [next_value.exponent],+4
        call    .build
        ;then the next block starts with 10^-1
        mov     ecx,0xcccccccd
        mov     [next_value.mantissa + 4 * 0],ecx
        dec     ecx
        repeat sizeof.QUAD_PRECISION.mantissa / 32 - 1
                mov     [next_value.mantissa + 4 * %],ecx
        end repeat
        mov     [next_value.exponent],-3
        call    .build
        ret

    .build:
        call    .round_and_copy_result
        mov     edi,TRIPLE_PRECISION_EXPONENT_TABLE_PASSES - 1
    .build_scale_loop:
        ;transfer next value to base
        mov     edx,edi
        lea     esi,[next_value]
        lea     edi,[base_multiplier]
        mov     ecx,sizeof.QUAD_PRECISION / 4
        rep     movsd
        mov     edi,edx
        mov     esi,1 shl TRIPLE_PRECISION_EXPONENT_TABLE_SCALE - 1
    .build_bit_loop:
        lea     ecx,[next_value]
        stdcall PRINTF_variable_precision_mul,ecx,ecx,addr base_multiplier,sizeof.QUAD_PRECISION.mantissa / 32
        call    .round_and_copy_result
        dec     esi
        jnz     .build_bit_loop
        dec     edi
        jnz     .build_scale_loop
        retn

    .round_and_copy_result:
        add     ebx,sizeof.TRIPLE_PRECISION
        bt      [next_value.exponent-4*4],31
        jnc     .skip_accuracy_test
        bt      [next_value.exponent-4*4],30
    .skip_accuracy_test:
        mov     eax,[next_value.exponent-4*3]
        mov     edx,[next_value.exponent-4*2]
        mov     ecx,[next_value.exponent-4*1]
        adc     eax,0
        adc     edx,0
        adc     ecx,0
        mov     [ebx+TRIPLE_PRECISION.mantissa_low],eax
        mov     eax,[next_value.exponent-4*0]
        mov     [ebx+TRIPLE_PRECISION.mantissa_mid],edx
        mov     [ebx+TRIPLE_PRECISION.mantissa_high],ecx
        mov     [ebx+TRIPLE_PRECISION.exponent],eax
        retn

endp

proc_leaf PRINTF_variable_precision_mul uses ebp esi edi ebx,dest,source1,source2,mantissa_length
        std
        mov     esi,[source1]
        mov     ebp,[source2]
        mov     ebx,[mantissa_length]
        lea     edi,[esp-4]
        lea     esi,[esi+ebx*4]
        lea     ebp,[ebp+ebx*4]
        lea     ecx,[ebx*2]
        xor     eax,eax
        rep     stosd
        neg     ebx
    .loop_multiplier:
        mov     ecx,[mantissa_length]
        neg     ecx
    .loop_multiplicand:
        lea     edi,[ebx+ecx]
        ;multiply esi+ebx * ebp+ecx ---> esp+edi
        mov     eax,[esi+ebx*4]
        mul     dword[ebp+ecx*4]
        add     [esp+edi*4+0],eax
        adc     [esp+edi*4+4],edx
        jnc     .next_multiplicand
        lea     eax,[edi+1]
    .add_in_carry:
        inc     eax
        adc     dword[esp+eax*4],0
        jc      .add_in_carry
    .next_multiplicand:
        inc     ecx
        jnz     .loop_multiplicand
        inc     ebx
        jnz     .loop_multiplier
        ;add the exponents into edx
        mov     edx,[esi]
        add     edx,[ebp]
        ;round to the destination length
        mov     ecx,[mantissa_length]
        not     ecx
        ;find the rounding bit
        bsr     ebp,[esp-4]                     ;will be either 31 or 30
        mov     ebx,ecx
        xor     esi,esi
        bts     esi,ebp                         ;carry is zeroed here
    .ripple_carry_through_result:
        adc     [esp+ebx*4],esi
        mov     esi,0
        inc     ebx
        jnz     .ripple_carry_through_result
        ;test the MSb and scale up if necessary
        test    byte[esp-1],-1                  ;carry is zeroed here
        js      .MSb_okay
        mov     ebx,ecx
    .normalise_result:
        mov     eax,[esp+ebx*4]
        adc     eax,eax
        mov     [esp+ebx*4],eax
        inc     ebx
        jnz     .normalise_result
        dec     edx                             ;adjust exponent
    .MSb_okay:
        ;store the result
        not     ecx
        mov     edi,[dest]
        lea     esi,[esp-4]
        mov     [edi+ecx*4],edx                 ;exponent
        lea     edi,[edi+ecx*4-4]
        rep     movsd
        cld
        ret
endp

proc_leaf PRINTF_decimalise_unsigned uses ebp esi edi ebx,dest,value
        ;return eax=length
        locals
                length          dd ?
                pos             dd ?
                mid             dd ?
                high            dd ?
        endl
        mov     ecx,[value]
        mov     ebx,[ecx+8]
        mov     edx,[ecx+4]
        mov     eax,[ecx]
        bsr     edi,ebx
        jnz     .reduce96
        bsr     edi,edx
        jnz     .reduce64
        bsr     edi,eax
        jz      .done                   ;a value of zero prints nothing
    .reduce32:
        PRINTF_ceiling_log2_10 edi,edi
        mov     ebp,[edi*4+PRINTF_integer_base_10_multiples_32]
        sub     ebp,eax
        adc     edi,0
        mov     [length],edi
        add     edi,[dest]
        jmp     .start32
    .reduce64:
        add     edi,32
        PRINTF_ceiling_log2_10 edi,edi
        mov     ebp,[(edi-10)*8+PRINTF_integer_base_10_multiples_64+0]
        mov     esi,[(edi-10)*8+PRINTF_integer_base_10_multiples_64+4]
        sub     ebp,eax
        sbb     esi,edx
        adc     edi,0
        mov     [length],edi
        add     edi,[dest]
        jmp     .start64
    .reduce96:
        add     edi,64
        PRINTF_ceiling_log2_10 edi,edi
        lea     ecx,[edi*4]
        mov     ebp,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+0]
        mov     esi,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+4]
        mov     ecx,[(ecx-20*4)*3+PRINTF_integer_base_10_multiples_96+8]
        sub     ebp,eax
        sbb     esi,edx
        sbb     ecx,ebx
        adc     edi,0
        mov     [length],edi
        add     edi,[dest]
        mov     esi,eax
        mov     eax,edx
        mov     edx,ebx
    .loop96:
        mov     [pos],edi
        mov     [high],edx
        mov     [mid],eax
        mov     ecx,esi         ;       [high]  [mid]   ecx
        mov     ebx,0xcccccccc  ;floor(2^35/10)
        mov     eax,esi
        mul     ebx
        mov     edi,eax
        mov     esi,edx         ;               esi     edi
        mov     eax,ebx
        mul     [mid]
        add     esi,eax
        adc     edx,0
        xchg    ebx,edx         ;       ebx     esi     edi
        mov     eax,[high]
        mul     edx
        add     eax,ebx
        adc     edx,0           ;edx    eax     esi     edi
        xor     ebx,ebx         ;                               [high]  [mid]   ecx     =num*0x000000000000000000000001
        mov     ebp,edi         ;                       edx     eax     esi     edi     =num*0x0000000000000000cccccccc
        add     ebp,ecx         ;               edx     eax     esi     edi             =num*0x00000000cccccccc00000000
        mov     ebp,edi         ;       edx     eax     esi     edi                     =num*0xcccccccc0000000000000000
        adc     ebp,esi
        adc     ebx,0
        add     ebp,[mid]
        adc     ebx,0
        xor     ebp,ebp
        add     edi,esi
        adc     ebp,0
        add     edi,eax
        adc     ebp,0
        add     edi,[high]
        adc     ebp,0
        add     edi,ebx
        mov     edi,[pos]
        adc     ebp,0
        xor     ebx,ebx
        add     esi,eax
        adc     eax,edx
        adc     edx,0
        add     esi,edx
        adc     ebx,0
        add     esi,ebp
        adc     eax,ebx
        adc     edx,0
        and     esi,-8
        sub     ecx,esi
        shrd    esi,eax,2
        shrd    eax,edx,2
        shr     edx,2
        sub     ecx,esi
        shrd    esi,eax,1
        shrd    eax,edx,1
        dec     edi
        add     ecx,'0'
        mov     [edi],cl
        shr     edx,1
        jnz     .loop96
        mov     edx,eax
        mov     eax,esi
    .start64:
        ;edx:eax/esi
    .loop64:
        mov     esi,0xcccccccc  ;floor(2^35/10)
        dec     edi
        mov     ecx,eax
        mov     ebp,edx         ;ebp:ecx=num
        mul     esi
        mov     ebx,eax
        xchg    esi,edx
        mov     eax,ebp
        mul     edx
        add     eax,esi
        adc     edx,0           ;edx:eax:ebx:000=num*0xcccccccc00000000
        mov     esi,ebx
        add     esi,eax
        adc     eax,edx
        adc     edx,0           ;edx:eax:esi:ebx=num*0xcccccccccccccccc
        add     ebx,ecx
        adc     esi,ebp
        adc     eax,0
        adc     edx,0           ;edx:eax:esi:ebx=num*0xcccccccccccccccd
        and     eax,-8
        sub     ecx,eax
        shrd    eax,edx,2
        shr     edx,2
        sub     ecx,eax         ;ecx=remainder
        shrd    eax,edx,1
        add     ecx,'0'
        mov     [edi],cl
        shr     edx,1           ;edx:eax=quotient=num/10
        jnz     .loop64
    .start32:
        ;eax = eax / 10
        mov     ebx,0xcccccccd  ;ceiling(2^35/10)
    .loop32:
        mov     ecx,eax
        dec     edi
        mul     ebx
        and     edx,-8
        mov     eax,edx
        sub     ecx,edx
        shr     edx,2
        sub     ecx,edx
        add     ecx,'0'
        mov     [edi],cl
        shr     eax,3
        jnz     .loop32
        mov     eax,[length]
    .done:
        ret
endp

proc PRINTF_elements_print uses ebx esi edi,elements,output
        mov     ebx,[elements]
        mov     esi,[output]
        ;format sign
        xor     ecx,ecx
        mov     al,'-'
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.negative
        jnz     .initialise_sign
        mov     al,'+'
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.show_sign
        jnz     .initialise_sign
        mov     al,' '
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.blank_sign
        jnz     .initialise_sign
        not     ecx
    .initialise_sign:
        inc     ecx
        mov     [ebx+PRINTF_ELEMENTS.sign_character],al
        mov     [ebx+PRINTF_ELEMENTS.sign_length],ecx
        xor     edi,edi                         ;edi = number of padding characters
        ;stdcall PRINTF_get_minimum_length,ebx
        sub     eax,[ebx+PRINTF_ELEMENTS.width]
        jae     .no_padding
        neg     eax
        mov     edi,eax
    .no_padding:
        ;leading spaces
        test    edi,edi
        jz      .leading_spaces_done
        ;if zero padding then do nothing
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
        jnz     .leading_spaces_done
        ;if left justified then do nothing
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.left_justify
        jnz     .leading_spaces_done
        ;stdcall PRINTF_elements_print_repeated_character,ebx,+' ',edi,esi
    .leading_spaces_done:
        ;sign
        mov     ecx,[ebx+PRINTF_ELEMENTS.sign_length]
        test    ecx,ecx
        jz      .sign_done
        movzx   eax,[ebx+PRINTF_ELEMENTS.sign_character]
        stdcall PRINTF_elements_print_character,ebx,eax,esi
    .sign_done:
        ;prefix
        mov     ecx,[ebx+PRINTF_ELEMENTS.prefix_length]
        test    ecx,ecx
        jz      .prefix_done
        stdcall PRINTF_elements_print_string,ebx,addr ebx+PRINTF_ELEMENTS.prefix_string,ecx,esi
    .prefix_done:
        test    edi,edi
        jz      .padding_zeros_done
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
        jz      .padding_zeros_done
        ;stdcall PRINTF_elements_print_padding_zeros,ebx,edi,esi
    .padding_zeros_done:
        ;precision zeros
        xor     eax,eax
        mov     ecx,[ebx+PRINTF_ELEMENTS.precision_zeros]
        test    ecx,ecx
        jz      .precision_zeros_done
        stdcall PRINTF_elements_print_decimal_repeated_character,ebx,+'0',ecx,eax,esi
    .precision_zeros_done:
        ;number
        mov     ecx,[ebx+PRINTF_ELEMENTS.number_length]
        test    ecx,ecx
        jz      .number_done
        stdcall PRINTF_elements_print_decimal_string,ebx,addr ebx+PRINTF_ELEMENTS.number_string,ecx,eax,esi
    .number_done:
        ;magnitude zeros
        mov     ecx,[ebx+PRINTF_ELEMENTS.magnitude_zeros]
        test    ecx,ecx
        jz      .magnitude_zeros_done
        stdcall PRINTF_elements_print_decimal_repeated_character,ebx,+'0',ecx,eax,esi
    .magnitude_zeros_done:
        ;exponent
        mov     ecx,[ebx+PRINTF_ELEMENTS.exponent_length]
        test    ecx,ecx
        jz      .exponent_done
        stdcall PRINTF_elements_print_string,ebx,addr ebx+PRINTF_ELEMENTS.exponent_string,ecx,esi
    .exponent_done:
        ;trailing spaces
        test    edi,edi
        jz      .trailing_spaces_done
        ;if zero padding then do nothing
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.zero_pad
        jnz     .trailing_spaces_done
        ;if right justified then do nothing
        test    [ebx+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.left_justify
        jz      .trailing_spaces_done
        ;stdcall PRINTF_elements_print_repeated_character,ebx,+' ',edi,esi
    .trailing_spaces_done:
        ret
endp

proc PRINTF_elements_print_decimal_repeated_character uses ebx edi esi,elements,char,repeats,current_position,output
        ;return eax=next position
        mov     ebx,[elements]
        mov     edi,[current_position]
        mov     esi,[repeats]
        test    esi,esi
        jz      .done
    .loop:
        stdcall PRINTF_elements_print_character,ebx,[char],[output]
        inc     edi
        stdcall PRINTF_elements_print_separator,ebx,edi,[output]
        dec     esi
        cmp     edi,[ebx+PRINTF_ELEMENTS.decimal_point]
        jnz     .decimal_done
        stdcall PRINTF_elements_print_character,ebx,+'.',[output]
    .decimal_done:
        test    esi,esi
        jnz     .loop
    .done:
        mov     eax,edi
        ret
endp

;V
proc PRINTF_elements_print_decimal_string uses esi ebx edi,elements,string,length,current_position,output
        ;return eax=next position
        mov     edi,[current_position]
        mov     esi,[string]
        mov     ebx,[length]
        test    ebx,ebx
        jz      .done
    .loop:
        movzx   eax,byte[esi]
        stdcall PRINTF_elements_print_character,[elements],eax,[output]
        inc     edi
        stdcall PRINTF_elements_print_separator,[elements],edi,[output]
        inc     esi
        dec     ebx
        mov     eax,[elements]
        cmp     edi,[eax+PRINTF_ELEMENTS.decimal_point]
        jnz     .decimal_done
        stdcall PRINTF_elements_print_character,eax,+'.',[output]
    .decimal_done:
        test    ebx,ebx
        jnz     .loop
    .done:
        mov     eax,edi
        ret
endp

proc PRINTF_elements_print_separator uses ebx,elements,current_position,output
        mov     ebx,[elements]
        cmp     [ebx+PRINTF_ELEMENTS.separator_modulus],0
        jz      .separator_skip
        stdcall PRINTF_get_digits_before_decimal_point,ebx
        sub     eax,[current_position]
        jbe     .separator_skip
        xor     edx,edx
        div     [ebx+PRINTF_ELEMENTS.separator_modulus]
        test    edx,edx
        jnz     .separator_skip
        movzx   eax,[ebx+PRINTF_ELEMENTS.separator_character]
        stdcall PRINTF_elements_print_character,ebx,eax,[output]
    .separator_skip:
        ret
endp

proc_leaf PRINTF_get_digits_before_decimal_point elements
        mov     edx,[elements]
        mov     eax,[edx+PRINTF_ELEMENTS.decimal_point]
        test    eax,eax
        jnz     .done
        mov     eax,[edx+PRINTF_ELEMENTS.precision_zeros]
        add     eax,[edx+PRINTF_ELEMENTS.number_length]
        add     eax,[edx+PRINTF_ELEMENTS.magnitude_zeros]
    .done:
        ret
endp

proc PRINTF_elements_print_string uses esi ebx,elements,string,length,output
        mov     esi,[string]
        mov     ebx,[length]
        test    ebx,ebx
        jz      .done
    .loop:
        movzx   eax,byte[esi]
        test    eax,eax
        jz      .done
        stdcall PRINTF_elements_print_character,[elements],eax,[output]
        inc     esi
        dec     ebx
        jnz     .loop
    .done:
        ret
endp

proc PRINTF_elements_print_character uses ebx,elements,char,output
        ;applies uppercase flag
        mov     ebx,[output]
        mov     eax,[elements]
        mov     ecx,[ebx+PRINTF_OUTPUT.length]
        mov     edx,[ebx+PRINTF_OUTPUT.buffer]
        test    [eax+PRINTF_ELEMENTS.flags],PRINTF_ELEMENTS_FLAG.uppercase
        mov     eax,[char]
        jz      .case_change_done
        cmp     al,'a'
        jb      .case_change_done
        cmp     al,'z'
        ja      .case_change_done
        sub     al,'a'-'A'
    .case_change_done:
        test    ecx,ecx
        jne     .add
        cmp     [ebx+PRINTF_OUTPUT.handle],INVALID_HANDLE_VALUE
        je      .next
        push    eax
        ;stdcall PRINTF_flush_buffer,ebx
        pop     eax
        mov     edx,[ebx+PRINTF_OUTPUT.buffer]
        mov     ecx,[ebx+PRINTF_OUTPUT.length]
    .add:
        mov     [edx],al
        dec     ecx
        mov     [ebx+PRINTF_OUTPUT.length],ecx
    .next:
        inc     edx
        mov     [ebx+PRINTF_OUTPUT.buffer],edx
        ret
endp

restore prologue@proc,epilogue@proc    
Both pieces are parts of one file, or one could be included in the other.
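For anyone following the .loop32 part of PRINTF_decimalise_unsigned, here is a minimal C sketch of the same divide-by-10 trick as I read it. Multiplying by 0xcccccccd (the ceiling of 2^35/10) and shifting right by 35 gives the exact quotient for every 32-bit value, so no div instruction is needed. The name u32_to_decimal is made up for illustration. Unlike the assembly, where a value of zero prints nothing, this sketch emits a single '0'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the posted source.
   Writes the decimal digits of n into buf (no terminator) and
   returns the number of digits written. buf needs room for 10 chars. */
size_t u32_to_decimal(uint32_t n, char *buf)
{
    assert(buf != NULL);
    char tmp[10];                 /* a 32-bit value has at most 10 digits */
    size_t len = 0;
    do {
        /* q = floor(n / 10), computed with a multiply and a shift */
        uint32_t q = (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
        uint32_t r = n - q * 10u; /* remainder, always 0..9 */
        tmp[len++] = (char)('0' + r);
        n = q;
    } while (n != 0);
    /* digits were produced least significant first, so reverse them */
    for (size_t i = 0; i < len; i++)
        buf[i] = tmp[len - 1 - i];
    return len;
}
```

The assembly gets the remainder without a second multiply: it masks the high half of the product to 8q, then subtracts 8q and 2q from the original value.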

_________________
I don't like to refer by "you" to one person.
My soul requires acronym "thou" instead.
Post 23 Nov 2023, 11:32
ProMiNick



Joined: 24 Mar 2012
Posts: 798
Location: Russian Federation, Sochi
ProMiNick 24 Nov 2023, 21:03
I tried to test a multiplication algorithm for extracting decimal digits (not the current one, but the same one as far as I understand it).
Test data: any dword. The multiplier is $1000000000000000000000000/10000000000, which equals $6DF37F675EF6EADF.
What I got: the extracted decimal value differs from the actual value by ±0..2.
Code:
format PE GUI 4.0
entry start

include 'win32a.inc'

section '.text' code readable executable

  start: RTL_C ; cut off RTL_C for official fasm
        push    ebx
        push    ebp
        push    esi
        push    edi
        mov     esi, 10
        mov     edi, buffer
        mov     ecx, [a]
        mov     eax, $6DF37F675EF6EADF shr 32  ;$6DF37F675EF6EADE=1/$100000000
        mul     ecx
        mov     ebp, edx
        mov     ebx, 0;eax
        mov     eax, $6DF37F675EF6EADF and $FFFFFFFF
        mul     ecx
        add     ebx, edx
        adc     ebp, 0
        mov     ecx, 10
     .loop:
        mul     ecx
        xchg    eax, edx
        mov     ebx, edx
        xchg    eax, ebp
        mul     ecx
        add     ebp, eax
        mov     al,  dl
        adc     al,  '0'
        stosb
        mov     eax, ebx
        dec     esi
        jnz     .loop

        pop     edi
        pop     esi
        pop     ebp
        pop     ebx
        invoke  MessageBox,0,buffer,esp,0
        invoke  ExitProcess,0
flush_locals ; cut off flush_locals for official fasm

section '.data' data readable writeable

 a   dd 9

  buffer db 11 dup 0

section '.idata' import data readable writeable

  library kernel32,'KERNEL32.DLL',\
          user32,'USER32.DLL'

  include 'api\kernel32.inc'
  include 'api\user32.inc'     
Could the result differ for a multiplier that is one dword longer (with the value $5AB9A207 as the least significant dword)?
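A minimal C sketch of this reciprocal-multiply digit extraction, under one assumption that is mine and not from the test program: the reciprocal is rounded up, i.e. ceil(2^96/10^10) = $6DF37F675EF6EAE0, instead of the truncated $6DF37F675EF6EADF. With a truncated reciprocal the 96-bit fraction lands just below n/10^10, so the last digit comes out one too low (9 would give ...08). Rounded up, the error stays below 2^-64 for any dword, which is too small to disturb ten digits. All three limbs must also be carried through each multiply by 10. The function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(2^96 / 10^10) split into two dwords (assumption: rounded up,
   the post uses the truncated value ending in ...EADF) */
#define RECIP_HI 0x6DF37F67u
#define RECIP_LO 0x5EF6EAE0u

/* Hypothetical helper: writes exactly 10 zero-padded decimal digits
   of n, followed by a terminating zero byte. */
void u32_to_10_digits(uint32_t n, char out[11])
{
    assert(out != 0);
    /* frac = n * RECIP, a 96-bit fixed-point fraction ~ n / 10^10 */
    uint64_t lo = (uint64_t)n * RECIP_LO;
    uint64_t hi = (uint64_t)n * RECIP_HI + (lo >> 32);
    uint32_t f0 = (uint32_t)lo;          /* least significant limb */
    uint32_t f1 = (uint32_t)hi;          /* middle limb */
    uint32_t f2 = (uint32_t)(hi >> 32);  /* most significant limb */

    for (int i = 0; i < 10; i++) {
        /* frac *= 10 across all three limbs; the carry out of the
           top limb is the next decimal digit */
        uint64_t t = (uint64_t)f0 * 10u;
        f0 = (uint32_t)t;
        t = (uint64_t)f1 * 10u + (t >> 32);
        f1 = (uint32_t)t;
        t = (uint64_t)f2 * 10u + (t >> 32);
        f2 = (uint32_t)t;
        out[i] = (char)('0' + (uint32_t)(t >> 32));
    }
    out[10] = '\0';
}
```

If this reasoning holds, a longer multiplier alone would not help: truncation still underestimates, only by less. The rounding direction is what matters for dword input.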

sylware 08 Dec 2023, 19:46
I am working on my own version (64-bit, though).

I wonder how different it will be in the end (mine will be incomplete and quite limited).

I am finishing the conversion directive decoding stage (numbered argument mode and auto-incremented argument mode).
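To illustrate the two modes, here is a rough C sketch of telling a POSIX-style numbered directive such as %2$s apart from an auto-incremented one such as %s or %08x. The helper name and its interface are hypothetical and say nothing about how the assembly version actually decodes directives.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: p points just past the '%'.
   Numbered mode ("N$..."): returns the 1-based argument number and
   sets *consumed to the length of "N$".
   Auto-incremented mode: returns 0 and sets *consumed to 0.
   Note: very long digit runs wrap around in n; a real decoder
   would reject them. */
unsigned parse_arg_number(const char *p, size_t *consumed)
{
    assert(p != NULL && consumed != NULL);
    unsigned n = 0;
    size_t i = 0;
    while (p[i] >= '0' && p[i] <= '9') {
        n = n * 10u + (unsigned)(p[i] - '0');
        i++;
    }
    if (i > 0 && n > 0 && p[i] == '$') {
        *consumed = i + 1;
        return n;
    }
    *consumed = 0;  /* any digits belong to the flags or width field */
    return 0;
}
```

A decoder built on this would either index the argument list with the returned number or, when it gets 0, take the next argument in sequence.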

Since I work on that piece of code on and off, I could test something: I had a significant break in the middle of its development, and I wanted to see how long it would take me to recall everything and move forward again, compared to C.

Well, it felt easier and simpler with assembly than with C. Weird.
sylware 05 Jan 2024, 16:12
All right, here is my take. It is a partial implementation (string and hexadecimal conversions only, but the existing conversions can serve as code templates for the others).

Conclusion: the printf family of functions should be avoided like hell. A bare put_string combined with string conversion functions and dynamic stack allocation should be used instead, even if it means a bit more work on our side. I now understand why some busybox people are/were so hostile to printf functions.
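A rough C sketch of that style, for illustration only. The names u64_to_hex and put_string are hypothetical, and a fixed-size caller-owned buffer stands in here for dynamic stack allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical conversion function: writes n as lowercase hexadecimal
   plus a terminating zero byte into buf (at least 17 bytes) and
   returns the number of digits. */
size_t u64_to_hex(uint64_t n, char *buf)
{
    assert(buf != NULL);
    size_t len = 1;                       /* zero still prints one digit */
    for (uint64_t t = n >> 4; t != 0; t >>= 4)
        len++;
    buf[len] = '\0';
    for (size_t i = len; i-- > 0; n >>= 4)  /* fill from the right */
        buf[i] = "0123456789abcdef"[n & 0xF];
    return len;
}

/* Hypothetical put_string: emits a terminated string, no format parsing. */
void put_string(const char *s)
{
    fwrite(s, 1, strlen(s), stdout);
}
```

The caller would declare char hex[17] on its stack, call u64_to_hex(value, hex), and then chain put_string("value=0x"), put_string(hex) and put_string("\n"). Each piece stays trivial and there is no format string to decode.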

It assembles with fasmg (duh!), binutils gas, and nasm (and so probably yasm).

A vim syntax coloring file is provided.

It is here:

https://www.rocketgit.com/user/sylware/nyanvsnprintf

(I recently modified it for a better fatal error code path.)