libc/docs/dev/printf_behavior.rst

.. _printf_behavior:

====================================
Printf Behavior Under All Conditions
====================================

Introduction:
=============
On the "defining undefined behavior" page, I said you should write down your
decisions regarding undefined behavior in your functions. This is that document
for my printf implementation.

Unless otherwise specified, the functionality described is aligned with the ISO
C standard and POSIX standard. If any behavior is not mentioned here, it should
be assumed to follow the behavior described in those standards.

The LLVM-libc codebase is under active development, and may change. This
document was last updated [January 8, 2024] by [michaelrj] and may
not be accurate after this point.

The behavior of LLVM-libc's printf is heavily influenced by compile-time flags.
Make sure to check what flags are defined before filing a bug report. This
document is also not relevant to any other libc implementation of printf, which
may or may not share the same behavior.

This document assumes familiarity with the definition of the printf function and
is intended as a reference, not a replacement for the original standards.

--------------
General Flags:
--------------
These compile-time flags will change the behavior of LLVM-libc's printf when it
is compiled. Combinations of flags that are incompatible will be marked.

LIBC_COPT_STDIO_USE_SYSTEM_FILE
-------------------------------
When set, this flag changes fprintf and printf to use the FILE API from the
system's libc, instead of LLVM-libc's internal FILE API. This is set by default
when LLVM-libc is built in overlay mode.

LIBC_COPT_PRINTF_DISABLE_INDEX_MODE
-----------------------------------
When set, this flag disables support for the POSIX "%n$" format, hereafter
referred to as "index mode"; conversions using the index mode format will be
treated as invalid. This reduces code size.

LIBC_COPT_PRINTF_INDEX_ARR_LEN
------------------------------
This flag takes a positive integer value, defaulting to 128. It determines the
number of entries in the parser's type descriptor array. This array is used in
index mode to avoid re-parsing the format string to determine types when an
index lower than the previously specified one is requested. This flag has no
effect when index mode is disabled.
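
As a rough illustration of index mode, the sketch below formats two strings in
swapped order using POSIX "%n$" positions. The helper name ``format_swapped`` is
hypothetical, and the call only behaves this way on a libc that supports index
mode (for LLVM-libc, one built without LIBC_COPT_PRINTF_DISABLE_INDEX_MODE).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<second> <first>" into buf using POSIX index
 * mode. "%2$s" selects the second variadic argument and "%1$s" the first,
 * regardless of the order in which the conversions appear. */
int format_swapped(char *buf, size_t size, const char *first,
                   const char *second) {
  assert(buf != NULL && size > 0);
  return snprintf(buf, size, "%2$s %1$s", first, second);
}
```

Because "%2$s" appears before "%1$s", the parser has to know the type of
argument 1 before it can reach argument 2, which is the kind of lookup the type
descriptor array is described as caching.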

LIBC_COPT_PRINTF_DISABLE_WRITE_INT
----------------------------------
When set, this flag disables support for the C Standard "%n" conversion; any
"%n" conversion will be treated as invalid. This is set by default to improve
security.

LIBC_COPT_PRINTF_DISABLE_FLOAT
------------------------------
When set, this flag disables support for floating point numbers and all their
conversions (%a, %f, %e, %g); any floating point number conversion will be
treated as invalid. This reduces code size.

LIBC_COPT_PRINTF_DISABLE_FIXED_POINT
------------------------------------
When set, this flag disables support for fixed point numbers and all their
conversions (%r, %k); any fixed point number conversion will be treated as
invalid. This reduces code size. This has no effect if the current compiler does
not support fixed point numbers.

LIBC_COPT_PRINTF_NO_NULLPTR_CHECKS
----------------------------------
When set, this flag disables the nullptr checks in %n and %s.

LIBC_COPT_PRINTF_CONV_ATLAS
---------------------------
When set, this flag changes the include path for the "converter atlas", which is
a header that includes all the files containing the conversion functions.
Setting this flag is not recommended without careful consideration.

LIBC_COPT_PRINTF_HEX_LONG_DOUBLE
--------------------------------
When set, this flag replaces all decimal long double conversions (%Lf, %Le, %Lg)
with hexadecimal long double conversions (%La). This will improve performance
significantly, but may cause some tests to fail. This has no effect when float
conversions are disabled.
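
Hexadecimal output maps directly onto the binary representation of the value, so
no decimal digit generation is needed and no information is lost. The sketch
below is only an illustration of that property, not part of LLVM-libc: the
hypothetical helper ``roundtrip_hex_ld`` prints a value with "%La" and parses
the text back with strtold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: prints value with "%La" and parses the text back.
 * Hexadecimal floating point text represents the binary value exactly, so the
 * parsed result is expected to equal the input for finite values. */
long double roundtrip_hex_ld(long double value) {
  char buf[64];
  int len = snprintf(buf, sizeof buf, "%La", value);
  assert(len > 0 && (size_t)len < sizeof buf);
  return strtold(buf, NULL);
}
```

The exact digits produced by "%La" may differ between libc implementations, so
tests that compare against fixed decimal strings are the ones most likely to
fail when this flag is set.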

--------------------------------
Float Conversion Internal Flags:
--------------------------------
The following floating point conversion flags are provided for reference, but
are not recommended to be adjusted except by persons familiar with the Printf
Ryu Algorithm. Additionally they have no effect when float conversions are
disabled.

LIBC_COPT_FLOAT_TO_STR_NO_SPECIALIZE_LD
---------------------------------------
This flag disables the separate long double conversion implementation. That
implementation is not based on the Ryu algorithm; instead it generates the
digits by multiplying/dividing the written-out number by 10^9 to get blocks. It
is significantly faster than INT_CALC, only about 10x slower than MEGA_TABLE,
and small in binary size. Its downside is that it always calculates all of the
digits above the decimal point, making it slightly inefficient for %e calls with
large exponents. The specialization is enabled by default and overrides other
flags, so this flag must be set for other flags to affect the long double
behavior.
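
The idea of producing digits in blocks by dividing by 10^9 can be sketched as
follows. This is a simplified illustration that handles only a non-negative
integer stored as 32-bit limbs; it is not the LLVM-libc implementation, and the
names ``divide_by_billion``, ``is_zero``, ``wide_to_decimal`` and ``MAX_BLOCKS``
are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { MAX_BLOCKS = 16, DIGITS_PER_BLOCK = 9 };

/* Divides the multi-limb integer (least significant limb first) by 10^9 in
 * place and returns the remainder, which is one block of decimal digits. */
static uint32_t divide_by_billion(uint32_t *limbs, size_t count) {
  uint64_t remainder = 0;
  for (size_t i = count; i-- > 0;) {
    uint64_t current = (remainder << 32) | limbs[i];
    limbs[i] = (uint32_t)(current / 1000000000u);
    remainder = current % 1000000000u;
  }
  return (uint32_t)remainder;
}

static int is_zero(const uint32_t *limbs, size_t count) {
  for (size_t i = 0; i < count; ++i)
    if (limbs[i] != 0)
      return 0;
  return 1;
}

/* Hypothetical helper: writes the decimal form of the integer in limbs to out
 * and returns the number of characters written. The limbs are consumed, and
 * values needing more than MAX_BLOCKS blocks are rejected by an assert. */
int wide_to_decimal(uint32_t *limbs, size_t count, char *out, size_t out_size) {
  uint32_t blocks[MAX_BLOCKS];
  size_t num_blocks = 0;
  assert(out_size >= (size_t)(MAX_BLOCKS * DIGITS_PER_BLOCK + 1));

  /* Peel off blocks from least significant to most significant. */
  do {
    assert(num_blocks < MAX_BLOCKS);
    blocks[num_blocks++] = divide_by_billion(limbs, count);
  } while (!is_zero(limbs, count));

  /* The top block is printed without padding; the rest are zero-padded. */
  int written = snprintf(out, out_size, "%" PRIu32, blocks[num_blocks - 1]);
  for (size_t i = num_blocks - 1; i-- > 0;)
    written += snprintf(out + written, out_size - (size_t)written,
                        "%09" PRIu32, blocks[i]);
  return written;
}
```

Every digit of the integer part is generated before any of them can be printed,
which matches the downside noted above for %e conversions with large exponents.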

LIBC_COPT_FLOAT_TO_STR_USE_MEGA_LONG_DOUBLE_TABLE
-------------------------------------------------
When set, the float to string decimal conversion algorithm will use a larger
table to accelerate long double conversions. This larger table is around 5 MB in
size when compiled.

LIBC_COPT_FLOAT_TO_STR_USE_DYADIC_FLOAT
---------------------------------------
When set, the float to string decimal conversion algorithm will use dyadic
floats instead of a table when performing floating point conversions. This
results in ~50 digits of accuracy in the result, then zeroes for the remaining
values. This may improve performance but may also cause some tests to fail. The
variant of this flag ending in _LD behaves the same, but applies only to long
double decimal conversions.

LIBC_COPT_FLOAT_TO_STR_USE_INT_CALC
-----------------------------------
When set, the float to string decimal conversion algorithm will use wide
integers instead of a table when performing floating point conversions. This
gives the same results as the table, but is very slow at the extreme ends of
the long double range.

LIBC_COPT_FLOAT_TO_STR_NO_TABLE
-------------------------------
When set, the float to string decimal conversion algorithm will not use either
the mega table or the normal table for any conversions. Instead it will set
algorithmic constants to improve performance when using calculation algorithms.
If this flag is set without any calculation algorithm flag set, an error will
occur.

--------
Parsing:
--------

When printf encounters an invalid conversion specification, the entire
conversion specification will be passed literally to the output string.
As an example, printf("%Z") would display "%Z".

If an index mode conversion is requested for index "n" and there exists a number
in [1,n) that does not have a conversion specified in the format string, then
the conversion for index "n" is considered invalid.
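
The gap rule can be sketched as a simple check. The helper below is
hypothetical and only models the rule as stated above: it assumes the caller
has already recorded which indexes the format string specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: specified[i] is true when the format string contains a
 * conversion for index i + 1. Returns true when every index in [1, n) has
 * been specified, which is the condition for index n to be usable. */
bool index_is_valid(const bool *specified, size_t len, size_t n) {
  assert(specified != NULL || len == 0);
  if (n == 0)
    return false; /* indexes start at 1 */
  for (size_t i = 1; i < n; ++i) {
    if (i > len || !specified[i - 1])
      return false; /* a gap below n makes index n invalid */
  }
  return true;
}
```

For a format string such as "%1$d %3$d", index 2 is never specified, so the
check fails for index 3. Variadic arguments can only be stepped over when their
types are known, which is presumably why a gap makes the later index unusable.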

If a non-index mode (also referred to as sequential mode) conversion is
specified after an index mode conversion, the next argument will be read but the
current index will not be incremented. From this point on, the arguments
selected by each conversion may or may not be correct. This is considered
dangerously undefined and may change without warning.

If a conversion specification is provided an invalid type modifier, that type
modifier will be ignored, and the default type for that conversion will be used.
In the case of the length modifier "L" and integer conversions, it will be
treated as if it were "ll" (lowercase LL). For this purpose the list of integer
conversions is d, i, u, o, x, X, b, B, n.

If a conversion specification ending in % has any options that consume arguments
(e.g. "%*.*%") those arguments will be consumed as normal, but their values will
be ignored.

If a conversion specification ends in a null byte ('\0') then it shall be
treated as an invalid conversion followed by a null byte.

If a number passed as a field width or precision value is out of range for an
int, then it will be treated as the largest value in the int range
(e.g. "%-999999999999.999999999999s" is the same as "%-2147483647.2147483647s"
when int is 32 bits wide).

If the field width is set to INT_MIN by using the '*' form,
e.g. printf("%*d", INT_MIN, 1), it will be treated as INT_MAX, since -INT_MIN is
not representable as an int.

If a number passed as a bit width is less than or equal to zero, the conversion
is considered invalid. If the provided bit width is larger than the width of
uintmax_t, it will be clamped to the width of uintmax_t.
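
The three numeric rules above can be sketched as small helpers. These are
hypothetical functions written for illustration, not the LLVM-libc parser; in
particular, returning 0 to signal an invalid bit width is a choice made for
this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: parses a run of decimal digits starting at *str,
 * saturating at INT_MAX instead of overflowing, and advances *str past the
 * digits. */
int parse_clamped_int(const char **str) {
  assert(str != NULL && *str != NULL);
  int result = 0;
  while (isdigit((unsigned char)**str)) {
    int digit = **str - '0';
    if (result > (INT_MAX - digit) / 10)
      result = INT_MAX; /* the next step would overflow, so clamp */
    else
      result = result * 10 + digit;
    ++*str;
  }
  return result;
}

/* Hypothetical helper: normalizes a width passed through '*'. In standard C a
 * negative argument means left justification with the absolute value as the
 * width. INT_MIN cannot be negated, so it maps to INT_MAX. The flag is only
 * ever set here, never cleared. */
int normalize_star_width(int arg, bool *left_justify) {
  assert(left_justify != NULL);
  if (arg >= 0)
    return arg;
  *left_justify = true;
  return (arg == INT_MIN) ? INT_MAX : -arg;
}

/* Hypothetical helper: returns 0 for an invalid bit width (zero or negative),
 * otherwise the width clamped to the number of bits in uintmax_t. */
int normalize_bit_width(int requested) {
  const int max_width = (int)(sizeof(uintmax_t) * CHAR_BIT);
  if (requested <= 0)
    return 0;
  return requested > max_width ? max_width : requested;
}
```

Checking ``result > (INT_MAX - digit) / 10`` before multiplying keeps the
arithmetic inside the int range, so the clamp itself never relies on signed
overflow.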

----------
Conversion
----------
Any conversion specification that contains a flag or option that it does not
have defined behavior for will ignore that flag or option (e.g. %.5c is the same
as %c).

If a conversion specification ends in %, then it will be treated as if it is
"%%", ignoring all options.

If a null pointer is passed to a %s conversion specification and null pointer
checks are enabled, it will be treated as if the provided string is "null".

If a null pointer is passed to a %n conversion specification and null pointer
checks are enabled, the conversion will fail and printf will return a negative
value.

If a null pointer is passed to a %p conversion specification, the string
"(nullptr)" will be output instead of an integer value.

The %p conversion will display any non-null pointer as if it was a uintptr_t
value passed to a "%#tx" conversion, with all other options remaining the same
as the original conversion.

The %p conversion will display a null pointer as if it was the string
"(nullptr)" passed to a "%s" conversion, with all other options remaining the
same as the original conversion.
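
The %p rules can be modeled with standard conversions only. The sketch below is
a hypothetical helper, not LLVM-libc code; it carries over just one option (the
field width), and it uses the PRIxPTR macro rather than "%#tx" so that the
argument type matches uintptr_t exactly.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper modeling the documented %p behavior: a null pointer is
 * formatted as the string "(nullptr)", and any other pointer as an alt form
 * hexadecimal integer. The width applies to both cases. */
int format_pointer(char *buf, size_t size, const void *ptr, int width) {
  assert(buf != NULL && size > 0);
  if (ptr == NULL)
    return snprintf(buf, size, "%*s", width, "(nullptr)");
  return snprintf(buf, size, "%#*" PRIxPTR, width, (uintptr_t)ptr);
}
```

With a width of 12, a null pointer is padded on the left with three spaces,
exactly as a "%12s" conversion of "(nullptr)" would be.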

The %r, %R, %k, and %K fixed point number format specifiers are accepted as
defined in ISO/IEC TR 18037 (the fixed point number extension). These are
available when the compiler is detected as having support for fixed point
numbers and the LIBC_COPT_PRINTF_DISABLE_FIXED_POINT flag is not set.

The %m conversion will behave as specified by POSIX for syslog: it takes no
arguments, and outputs the result of strerror(errno). Additionally, to match
existing printf behaviors, it will behave as if it is a %s string conversion for
the purpose of all options, except for the alt form flag. If the alt form flag
is specified, %m will instead output a string matching the macro name of the
value of errno (e.g. "ERANGE" for errno = ERANGE), again treating it as a string
conversion. If there is no corresponding macro, then alt form %m will print the
value of errno as an integer with the %d format, including all options. If
errno = 0 and alt form is specified, the conversion will be a string conversion
on "0" for simplicity of implementation. This matches what other libcs
implementing this feature have done.
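
The decision order for %m can be sketched without relying on libc support for
the conversion itself. The helpers ``errno_name`` and ``format_errno`` are
hypothetical, the name table is deliberately tiny, and options such as width
and precision are left out of this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name table with a handful of entries; a real one would cover
 * every errno macro the platform defines. */
static const char *errno_name(int err) {
  switch (err) {
  case EDOM:
    return "EDOM";
  case EINVAL:
    return "EINVAL";
  case ENOENT:
    return "ENOENT";
  case ERANGE:
    return "ERANGE";
  default:
    return NULL;
  }
}

/* Hypothetical helper modeling "%m" (alt_form false) and "%#m" (alt_form
 * true) for the given errno value. */
int format_errno(char *buf, size_t size, int err, bool alt_form) {
  assert(buf != NULL && size > 0);
  if (!alt_form)
    return snprintf(buf, size, "%s", strerror(err));
  if (err == 0)
    return snprintf(buf, size, "%s", "0"); /* string conversion on "0" */
  const char *name = errno_name(err);
  if (name != NULL)
    return snprintf(buf, size, "%s", name); /* macro name as a string */
  return snprintf(buf, size, "%d", err); /* no macro: integer fallback */
}
```

Passing the errno value as a parameter, instead of reading errno inside the
helper, keeps the sketch easy to test; the real conversion takes no arguments
and reads errno directly.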