=========================
AMDGPU Instruction Syntax
=========================

.. contents::
   :local:

.. _amdgpu_syn_instructions:

Instructions
============

Syntax
~~~~~~

Syntax of Regular Instructions
------------------------------

An instruction has the following syntax:

  | ``<``\ *opcode mnemonic*\ ``>    <``\ *operand0*\ ``>,
      <``\ *operand1*\ ``>,...    <``\ *modifier0*\ ``> <``\ *modifier1*\ ``>...``

:doc:`Operands<AMDGPUOperandSyntax>` are normally comma-separated, while
:doc:`modifiers<AMDGPUModifierSyntax>` are space-separated.

The order of *operands* and *modifiers* is fixed.
Most *modifiers* are optional and may be omitted.

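The layout above can be illustrated with a minimal Python sketch. It is not part of the assembler; the function name ``split_instruction`` is hypothetical. It assumes a simplified input in which operands contain no whitespace, modifiers contain no commas, and at least one operand is present whenever modifiers are present. Real operands and modifiers can violate these assumptions.

```python
def split_instruction(text):
    """Split one instruction into (mnemonic, operands, modifiers).

    Simplifying assumptions (not guaranteed by the real syntax):
    operands contain no whitespace and modifiers contain no commas.
    """
    tokens = text.split(None, 1)
    mnemonic = tokens[0]
    rest = tokens[1].strip() if len(tokens) > 1 else ""
    if not rest:
        # An instruction without operands or modifiers.
        return mnemonic, [], []
    # Operands are comma-separated.
    parts = [part.strip() for part in rest.split(",")]
    # Whatever follows the last operand, separated by spaces, is a modifier.
    last_operand, *modifiers = parts[-1].split()
    return mnemonic, parts[:-1] + [last_operand], modifiers
```

For instance, under these assumptions the string ``v_add_f32 v1, v2, v3 clamp`` is split into the mnemonic ``v_add_f32``, the operands ``v1``, ``v2``, ``v3`` and the single modifier ``clamp``.
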
Syntax of VOPD Instructions
---------------------------

*VOPDX* and *VOPDY* instructions must be concatenated with the ``::`` operator
to form a single *VOPD* instruction:

    ``<``\ *VOPDX instruction*\ ``>  ::  <``\ *VOPDY instruction*\ ``>``

An example:

.. parsed-literal::

    v_dual_add_f32 v255, v255, v2 :: v_dual_fmaak_f32 v6, v2, v3, 1.0

Note that *VOPDX* and *VOPDY* instructions cannot be used as separate opcodes.

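As an illustration of the pairing rule, the following Python sketch splits a *VOPD* instruction into its two halves. The helper name ``split_vopd`` is hypothetical, and the sketch only checks for the presence of a single ``::`` separator; it does not validate that the halves are legal *VOPDX* and *VOPDY* instructions.

```python
def split_vopd(text):
    """Return the (VOPDX, VOPDY) halves of a VOPD instruction string."""
    x_part, separator, y_part = text.partition("::")
    if not separator or "::" in y_part:
        raise ValueError("expected exactly one '::' separator")
    x_part, y_part = x_part.strip(), y_part.strip()
    if not x_part or not y_part:
        # Neither half may be used on its own.
        raise ValueError("both VOPDX and VOPDY parts are required")
    return x_part, y_part
```

Applied to the example above, the sketch returns ``v_dual_add_f32 v255, v255, v2`` as the first half and ``v_dual_fmaak_f32 v6, v2, v3, 1.0`` as the second.
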
.. _amdgpu_syn_instruction_mnemo:

Opcode Mnemonic
~~~~~~~~~~~~~~~

The opcode mnemonic describes the opcode semantics
and may include one or more suffixes in this order:

* :ref:`Packing suffix<amdgpu_syn_instruction_pk>`.
* :ref:`Destination operand type suffix<amdgpu_syn_instruction_type>`.
* :ref:`Source operand type suffix<amdgpu_syn_instruction_type>`.
* :ref:`Encoding suffix<amdgpu_syn_instruction_enc>`.

.. _amdgpu_syn_instruction_pk:

Packing Suffix
~~~~~~~~~~~~~~

Most instructions which operate on packed data have a *_pk* suffix.
Unless otherwise :ref:`noted<amdgpu_syn_instruction_operand_tags>`,
these instructions operate on and produce packed data composed of
two values. The type of the values is indicated by
:ref:`type suffixes<amdgpu_syn_instruction_type>`.

For example, the following instruction adds two pairs of f16 values
and produces a pair of f16 values:

.. parsed-literal::

    v_pk_add_f16 v1, v2, v3     // Each operand has f16x2 type

.. _amdgpu_syn_instruction_type:

Type and Size Suffixes
~~~~~~~~~~~~~~~~~~~~~~

Instructions which operate on data have an implied type for their *data* operands.
This data type is specified as a suffix of the instruction mnemonic.

Some instructions have two type suffixes:
the first is the data type of the destination operand,
the second is the data type of the source *data* operand(s).

Note that the data type specified by an instruction does not apply
to other kinds of operands such as *addresses*, *offsets* and so on.

The following table enumerates the most frequently used type suffixes.

    ============================================ ======================= ============================
    Type Suffixes                                Packed instruction?     Data Type
    ============================================ ======================= ============================
    _b512, _b256, _b128, _b64, _b32, _b16, _b8   No                      Bits.
    _u64, _u32, _u16, _u8                        No                      Unsigned integer.
    _i64, _i32, _i16, _i8                        No                      Signed integer.
    _f64, _f32, _f16                             No                      Floating-point.
    _b16, _u16, _i16, _f16                       Yes                     Packed (b16x2, u16x2, etc).
    ============================================ ======================= ============================

Instructions which have no type suffixes are assumed to operate on typeless data.
The size of typeless data is specified by size suffixes:

    ================= =================== =====================================
    Size Suffix       Implied data type   Required register size in dwords
    ================= =================== =====================================
    \-                b32                 1
    x2                b64                 2
    x3                b96                 3
    x4                b128                4
    x8                b256                8
    x16               b512                16
    x                 b32                 1
    xy                b64                 2
    xyz               b96                 3
    xyzw              b128                4
    d16_x             b16                 1
    d16_xy            b16x2               2 for GFX8.0, 1 for GFX8.1 and GFX9+
    d16_xyz           b16x3               3 for GFX8.0, 2 for GFX8.1 and GFX9+
    d16_xyzw          b16x4               4 for GFX8.0, 2 for GFX8.1 and GFX9+
    d16_format_x      b16                 1
    d16_format_xy     b16x2               1
    d16_format_xyz    b16x3               2
    d16_format_xyzw   b16x4               2
    ================= =================== =====================================

.. WARNING::
    There are exceptions to the rules described above.
    Operands which have a type different from the type specified by the opcode are
    :ref:`tagged<amdgpu_syn_instruction_operand_tags>` in the description.

Examples of instructions with different types of source and destination operands:

.. parsed-literal::

    s_bcnt0_i32_b64
    v_cvt_f32_u32

Examples of instructions with one data type:

.. parsed-literal::

    v_max3_f32
    v_max3_i16

Examples of instructions which operate on packed data:

.. parsed-literal::

    v_pk_add_u16
    v_pk_add_i16
    v_pk_add_f16

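The type suffix rules can be sketched in Python as follows. This is an illustration only, not the assembler's logic: the function name ``data_types`` is hypothetical, the mnemonic is assumed to carry no encoding suffix, and the exceptions mentioned in the warning above are ignored. A mnemonic containing ``pk`` is treated as operating on pairs of values.

```python
import re

# A type suffix token such as "f32", "u16" or "b64" (without the underscore).
TYPE_TOKEN = re.compile(r"[buif]\d+")

def data_types(mnemonic):
    """Return (destination type, source type), or None for typeless mnemonics."""
    tokens = mnemonic.split("_")
    types = []
    # Collect at most two trailing type tokens, keeping their original order.
    while tokens and len(types) < 2 and TYPE_TOKEN.fullmatch(tokens[-1]):
        types.insert(0, tokens.pop())
    if not types:
        return None
    if "pk" in tokens:
        # Packed instructions operate on two values of the given type.
        types = [t + "x2" for t in types]
    # With a single suffix, destination and source share the same type.
    return types[0], types[-1]
```

With this sketch, ``s_bcnt0_i32_b64`` yields ``i32`` for the destination and ``b64`` for the source, ``v_max3_f32`` yields ``f32`` for both, and ``v_pk_add_f16`` yields ``f16x2`` for both.
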
Examples of typeless instructions which operate on b128 data:

.. parsed-literal::

    buffer_store_dwordx4
    flat_load_dwordx4

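The size suffix table can also be expressed as a simple lookup. The Python sketch below copies the rows of the table; the names ``SIZE_SUFFIXES``, ``required_dwords`` and the ``is_gfx80`` flag are hypothetical, and the suffix is passed in directly because extracting it from a full mnemonic is outside the scope of this illustration. An empty string stands for a missing size suffix.

```python
# size suffix -> (implied data type, dwords on GFX8.0, dwords on GFX8.1 and GFX9+)
# Rows with a single value in the table use that value for both columns.
SIZE_SUFFIXES = {
    "": ("b32", 1, 1),
    "x2": ("b64", 2, 2),
    "x3": ("b96", 3, 3),
    "x4": ("b128", 4, 4),
    "x8": ("b256", 8, 8),
    "x16": ("b512", 16, 16),
    "x": ("b32", 1, 1),
    "xy": ("b64", 2, 2),
    "xyz": ("b96", 3, 3),
    "xyzw": ("b128", 4, 4),
    "d16_x": ("b16", 1, 1),
    "d16_xy": ("b16x2", 2, 1),
    "d16_xyz": ("b16x3", 3, 2),
    "d16_xyzw": ("b16x4", 4, 2),
    "d16_format_x": ("b16", 1, 1),
    "d16_format_xy": ("b16x2", 1, 1),
    "d16_format_xyz": ("b16x3", 2, 2),
    "d16_format_xyzw": ("b16x4", 2, 2),
}

def required_dwords(size_suffix, is_gfx80=False):
    """Return the required register size in dwords for a size suffix."""
    _data_type, dwords_gfx80, dwords_later = SIZE_SUFFIXES[size_suffix]
    return dwords_gfx80 if is_gfx80 else dwords_later
```

For example, the ``x4`` suffix of ``buffer_store_dwordx4`` maps to 4 dwords, while ``d16_xyz`` maps to 3 dwords when ``is_gfx80`` is set and 2 dwords otherwise.
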
.. _amdgpu_syn_instruction_enc:

Encoding Suffixes
~~~~~~~~~~~~~~~~~

Most *VOP1*, *VOP2* and *VOPC* instructions have several variants:
they may also be encoded in *VOP3*, *DPP* and *SDWA* formats.

The assembler selects an optimal encoding automatically
based on instruction operands and modifiers,
unless a specific encoding is explicitly requested.
To force a specific encoding, add a suffix to the opcode of the instruction:

    =================================================== =================
    Encoding                                            Encoding Suffix
    =================================================== =================
    *VOP1*, *VOP2* and *VOPC* (32-bit) encoding         _e32
    *VOP3* (64-bit) encoding                            _e64
    *DPP* encoding                                      _dpp
    *SDWA* encoding                                     _sdwa
    *VOP3 DPP* encoding                                 _e64_dpp
    =================================================== =================

This reference uses encoding suffixes to specify which encoding is implied.
When no suffix is specified, the native instruction encoding is assumed.

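A minimal Python sketch of recognizing an encoding suffix is shown below. The names ``ENCODING_SUFFIXES`` and ``split_encoding`` are hypothetical, and the sketch only mirrors the table above; it does not reproduce how the assembler selects an encoding. The ``_e64_dpp`` suffix is checked first because it also ends in ``_dpp``.

```python
# (encoding suffix, forced encoding), longest suffix first.
ENCODING_SUFFIXES = [
    ("_e64_dpp", "VOP3 DPP"),
    ("_e32", "VOP1, VOP2 or VOPC (32-bit)"),
    ("_e64", "VOP3 (64-bit)"),
    ("_dpp", "DPP"),
    ("_sdwa", "SDWA"),
]

def split_encoding(mnemonic):
    """Return (mnemonic without encoding suffix, forced encoding or None)."""
    for suffix, encoding in ENCODING_SUFFIXES:
        if mnemonic.endswith(suffix):
            return mnemonic[: -len(suffix)], encoding
    # No suffix: no specific encoding is requested.
    return mnemonic, None
```

A result of ``None`` in the second position corresponds to the case in which no encoding is forced.
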
Operands
========

Syntax
~~~~~~

The syntax of generic operands is described :doc:`in this document<AMDGPUOperandSyntax>`.

For detailed information about operands, follow *operand links* in GPU-specific documents.

Modifiers
=========

Syntax
~~~~~~

The syntax of modifiers is described :doc:`in this document<AMDGPUModifierSyntax>`.

Information about modifiers supported for individual instructions
may be found in GPU-specific documents.
 207 Information about modifiers supported for individual instructions
 208 may be found in GPU-specific documents.