.. clang/docs/HLSL/ExpectedDifferences.rst

===================================
Expected Differences vs DXC and FXC
===================================

.. contents::
   :local:

Introduction
============

HLSL currently has two reference compilers, the `DirectX Shader Compiler (DXC)
<https://github.com/microsoft/DirectXShaderCompiler/>`_ and the
`Effect-Compiler (FXC) <https://learn.microsoft.com/en-us/windows/win32/direct3dtools/fxc>`_.
The two reference compilers do not fully agree. Some known disagreements in the
references are tracked on
`DXC's GitHub
<https://github.com/microsoft/DirectXShaderCompiler/issues?q=is%3Aopen+is%3Aissue+label%3Afxc-disagrees>`_,
but many more are known to exist.

HLSL as implemented by Clang will also not fully match either of the reference
implementations; it is instead being written to match the `draft language
specification <https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf>`_.

This document is a non-exhaustive collection of the known differences between
Clang's implementation of HLSL and the existing reference compilers.

General Principles
------------------

Most of the intended differences between Clang and the earlier reference
compilers are focused on increased consistency and correctness. Neither
reference compiler applies language rules the same way in all contexts.

Clang also deviates from the reference compilers by providing different
diagnostics, both in terms of the textual messages and the contexts in which
diagnostics are produced. While striving for a high level of source
compatibility with conforming HLSL code, Clang may produce earlier and more
robust diagnostics for incorrect code or reject code that a reference compiler
incorrectly accepted.

Language Version
================

Clang targets language compatibility for HLSL 2021 as implemented by DXC.
Language features that were removed in earlier versions of HLSL may be added on
a case-by-case basis, but are not planned for the initial implementation.

Overload Resolution
===================

Clang's HLSL implementation adopts C++ overload resolution rules as proposed for
HLSL 202x based on proposals
`0007 <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0007-const-instance-methods.md>`_
and
`0008 <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0008-non-member-operator-overloading.md>`_.

The largest difference between Clang's and DXC's overload resolution is the
algorithm used for identifying best-match overloads. There are more details
about the algorithmic differences in the :ref:`multi_argument_overloads` section
below. There are three high-level differences that should be highlighted:

* **There should be no cases** where DXC and Clang both successfully
  resolve an overload where the resolved overload is different between the two.
* There are cases where Clang will successfully resolve an overload that DXC
  wouldn't because we've trimmed the overload set in Clang to remove ambiguity.
* There are cases where DXC will successfully resolve an overload that Clang
  will not for two reasons: (1) DXC only generates partial overload sets for
  builtin functions and (2) DXC resolves cases that probably should be ambiguous.

Clang's implementation extends standard overload resolution rules to HLSL
library functionality. This causes subtle changes in overload resolution
behavior between Clang and DXC. Some examples include:

.. code-block:: c++

  void halfOrInt16(half H);
  void halfOrInt16(uint16_t U);
  void halfOrInt16(int16_t I);

  void takesDoubles(double, double, double);

  cbuffer CB {
    bool B;
    uint U;
    int I;
    float X, Y, Z;
    double3 R, G;
  }

  void takesSingleDouble(double);
  void takesSingleDouble(vector<double, 1>);

  void scalarOrVector(double);
  void scalarOrVector(vector<double, 2>);

  export void call() {
    half H;
    halfOrInt16(I); // All: Resolves to halfOrInt16(int16_t).

  #ifndef IGNORE_ERRORS
    halfOrInt16(U); // All: Fails with call ambiguous between int16_t and uint16_t
                    // overloads

    // asfloat16 is a builtin with overloads for half, int16_t, and uint16_t.
    H = asfloat16(I); // DXC: Fails to resolve overload for int.
                      // Clang: Resolves to asfloat16(int16_t).
    H = asfloat16(U); // DXC: Fails to resolve overload for int.
                      // Clang: Resolves to asfloat16(uint16_t).
  #endif
    H = asfloat16(0x01); // DXC: Resolves to asfloat16(half).
                         // Clang: Resolves to asfloat16(uint16_t).

    takesDoubles(X, Y, Z); // Works on all compilers
  #ifndef IGNORE_ERRORS
    fma(X, Y, Z); // DXC: Fails to resolve no known conversion from float to
                  //   double.
                  // Clang: Resolves to fma(double,double,double).

    double D = dot(R, G); // DXC: Resolves to dot(double3, double3), fails DXIL Validation.
                          // FXC: Expands to compute double dot product with fmul/fadd
                          // Clang: Fails to resolve as ambiguous against
                          //   dot(half, half) or dot(float, float)
  #endif

  #ifndef IGNORE_ERRORS
    tan(B); // DXC: resolves to tan(float).
            // Clang: Fails to resolve, ambiguous between integer types.

  #endif

    double D;
    takesSingleDouble(D); // All: Fails to resolve ambiguous conversions.
    takesSingleDouble(R); // All: Fails to resolve ambiguous conversions.

    scalarOrVector(D); // All: Resolves to scalarOrVector(double).
    scalarOrVector(R); // All: Fails to resolve ambiguous conversions.
  }

.. note::

  In Clang, a conscious decision was made to exclude the ``dot(vector<double,N>, vector<double,N>)``
  overload and allow overload resolution to resolve to the
  ``vector<float,N>`` overload. This approach provides a ``-Wconversion``
  diagnostic notifying the user of the conversion rather than silently altering
  precision relative to the other overloads (as FXC does) or generating code
  that will fail validation (as DXC does).

.. _multi_argument_overloads:

Multi-Argument Overloads
------------------------

In addition to the differences in single-element conversions, Clang and DXC
differ dramatically in multi-argument overload resolution. C++ multi-argument
overload resolution behavior (or something very similar) is required to
implement
`non-member operator overloading <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0008-non-member-operator-overloading.md>`_.

Clang adopts the C++-inspired language from the
`draft HLSL specification <https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf>`_,
where an overload ``f1`` is a better candidate than ``f2`` if, for every
argument, the conversion sequence for ``f1`` is not worse than the corresponding
conversion sequence for ``f2``, and for at least one argument it is better.

.. code-block:: c++

  cbuffer CB {
    int I;
    float X;
    float4 V;
  }

  void twoParams(int, int);
  void twoParams(float, float);
  void threeParams(float, float, float);
  void threeParams(float4, float4, float4);

  export void call() {
    twoParams(I, X); // DXC: resolves twoParams(int, int).
                     // Clang: Fails to resolve ambiguous conversions.

    threeParams(X, V, V); // DXC: resolves threeParams(float4, float4, float4).
                          // Clang: Fails to resolve ambiguous conversions.
  }

In the ``twoParams`` example, the call with mixed arguments produces the
implicit conversion sequences { ExactMatch, FloatingIntegral } for
``twoParams(int, int)`` and { FloatingIntegral, ExactMatch } for
``twoParams(float, float)``. Each sequence contains an argument whose conversion
is worse than in the other sequence, so the call is ambiguous.

In the ``threeParams`` example the sequences are { ExactMatch, VectorTruncation,
VectorTruncation } for ``threeParams(float, float, float)`` and { VectorSplat,
ExactMatch, ExactMatch } for ``threeParams(float4, float4, float4)``. Again, each
sequence contains at least one argument with a worse conversion than in the
other sequence, so the call is ambiguous.
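
The comparison rule can be sketched in a few lines of C++. This is only an
illustrative model, not Clang's implementation: ``Rank``, ``isBetter`` and
``isAmbiguous`` are hypothetical names, and the relative order given to the
non-exact ranks is an assumption. The two examples depend only on
``ExactMatch`` being the best rank.

.. code-block:: c++

  #include <array>
  #include <cstddef>

  // Hypothetical ranks, lower is better. Only "ExactMatch is best" matters
  // for the examples; the order of the remaining ranks is assumed.
  enum class Rank {
    ExactMatch,
    VectorSplat,
    FloatingIntegral,
    VectorTruncation
  };

  // True if candidate A beats candidate B: no argument of A converts worse
  // than the same argument of B, and at least one converts strictly better.
  template <std::size_t N>
  constexpr bool isBetter(const std::array<Rank, N> &A,
                          const std::array<Rank, N> &B) {
    bool StrictlyBetter = false;
    for (std::size_t I = 0; I < N; ++I) {
      if (A[I] > B[I])
        return false;
      if (A[I] < B[I])
        StrictlyBetter = true;
    }
    return StrictlyBetter;
  }

  // A call is ambiguous when neither candidate beats the other.
  template <std::size_t N>
  constexpr bool isAmbiguous(const std::array<Rank, N> &A,
                             const std::array<Rank, N> &B) {
    return !isBetter(A, B) && !isBetter(B, A);
  }

  // twoParams(I, X): sequences for the (int, int) and (float, float) overloads.
  constexpr std::array<Rank, 2> TwoInt{Rank::ExactMatch,
                                       Rank::FloatingIntegral};
  constexpr std::array<Rank, 2> TwoFloat{Rank::FloatingIntegral,
                                         Rank::ExactMatch};
  static_assert(isAmbiguous(TwoInt, TwoFloat), "twoParams(I, X)");

  // threeParams(X, V, V): sequences for the float and float4 overloads.
  constexpr std::array<Rank, 3> ThreeFloat{
      Rank::ExactMatch, Rank::VectorTruncation, Rank::VectorTruncation};
  constexpr std::array<Rank, 3> ThreeFloat4{
      Rank::VectorSplat, Rank::ExactMatch, Rank::ExactMatch};
  static_assert(isAmbiguous(ThreeFloat, ThreeFloat4), "threeParams(X, V, V)");

Both ``static_assert`` checks hold, mirroring the ambiguity Clang reports for
the two calls in the example.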

.. note::

  DXC's behavior described below is not officially documented; it is gleaned
  from observation and a bit of reading the source.

DXC's approach for determining the best overload produces an integer score value
for each implicit conversion sequence for each argument expression. Scores for
casts are based on a bitmask construction that is complicated to reverse
engineer. It seems that:

* Exact match is 0
* Dimension increase is 1
* Promotion is 2
* Integral -> Float conversion is 4
* Float -> Integral conversion is 8
* Cast is 16

The masks are combined with bitwise OR to produce a score for the cast.

The scores of each conversion sequence are then summed to generate a score for
the overload candidate. The overload candidate with the lowest score is the best
candidate. If more than one overload shares the lowest score, the call is
ambiguous.
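
As a rough model of the scoring described above, the following C++ sketch sums
per-argument scores and picks the unique lowest total. It models only the
description in this section, not DXC's source: ``Score``, ``candidateScore``
and ``pickBest`` are hypothetical names, and the two candidates are made-up
conversion sequences chosen for illustration.

.. code-block:: c++

  #include <array>
  #include <cstddef>

  // Score bits as listed above; inferred from observation, not authoritative.
  enum Score : unsigned {
    ExactMatch = 0,
    DimensionIncrease = 1,
    Promotion = 2,
    IntToFloat = 4,
    FloatToInt = 8,
    Cast = 16
  };

  // Sums the per-argument scores of one candidate.
  template <std::size_t N>
  constexpr unsigned candidateScore(const std::array<unsigned, N> &Args) {
    unsigned Total = 0;
    for (std::size_t I = 0; I < N; ++I)
      Total += Args[I];
    return Total;
  }

  // Returns the index of the unique lowest total, or -1 if the lowest total
  // is shared by several candidates (an ambiguous call).
  template <std::size_t M>
  constexpr int pickBest(const std::array<unsigned, M> &Totals) {
    int Best = -1;
    unsigned BestTotal = 0;
    bool Tie = false;
    for (std::size_t I = 0; I < M; ++I) {
      if (Best < 0 || Totals[I] < BestTotal) {
        Best = static_cast<int>(I);
        BestTotal = Totals[I];
        Tie = false;
      } else if (Totals[I] == BestTotal) {
        Tie = true;
      }
    }
    return Tie ? -1 : Best;
  }

  // Hypothetical candidate 0: exact match, then a promotion.
  constexpr std::array<unsigned, 2> Cand0{ExactMatch, Promotion};
  // Hypothetical candidate 1: a dimension increase combined with an
  // integral-to-float conversion (masks OR'd together), then an exact match.
  constexpr std::array<unsigned, 2> Cand1{DimensionIncrease | IntToFloat,
                                          ExactMatch};

  static_assert(candidateScore(Cand0) == 2, "0 + 2");
  static_assert(candidateScore(Cand1) == 5, "(1 | 4) + 0");

  constexpr std::array<unsigned, 2> CallTotals{candidateScore(Cand0),
                                               candidateScore(Cand1)};
  static_assert(pickBest(CallTotals) == 0, "candidate 0 has the lowest total");

  constexpr std::array<unsigned, 2> Tied{2, 2};
  static_assert(pickBest(Tied) == -1, "equal totals are ambiguous");

Under the C++-style rule used by Clang, the same two candidates would be
ambiguous, because each one has an argument that converts better than in the
other. The summed score hides that trade-off and selects candidate 0.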