
   1 =================================================
   2 Syntax of AMDGPU Assembler Operands and Modifiers
   3 =================================================
   4
   5 .. contents::
   6    :local:
   7
   8 Conventions
   9 ===========
  10
  11 The following conventions are used in syntax descriptions:
  12
  13     =================== =============================================================
  14     Notation            Description
  15     =================== =============================================================
  16     {0..N}              Any integer value in the range from 0 to N (inclusive).
  17                         Unless stated otherwise, this value may be specified as
  18                         either a literal or an LLVM expression.
  19     <x>                 Syntax and meaning of *<x>* are explained elsewhere.
  20     =================== =============================================================
  21
  22 .. _amdgpu_syn_operands:
  23
  24 Operands
  25 ========
  26
  27 TBD
  28
  29 .. _amdgpu_syn_modifiers:
  30
  31 Modifiers
  32 =========
  33
  34 DS Modifiers
  35 ------------
  36
  37 .. _amdgpu_synid_ds_offset8:
  38
  39 ds_offset8
  40 ~~~~~~~~~~
  41
  42 Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.
  43
  44 Used with DS instructions which have 2 addresses.
  45
  46     ======================================== ================================================
  47     Syntax                                   Description
  48     ======================================== ================================================
  49     offset:{0..0xFF}                         Specifies an 8-bit offset.
  50     ======================================== ================================================
  51
  52 .. _amdgpu_synid_ds_offset16:
  53
  54 ds_offset16
  55 ~~~~~~~~~~~
  56
  57 Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.
  58
  59 Used with DS instructions which have 1 address.
  60
  61     ======================================== ================================================
  62     Syntax                                   Description
  63     ======================================== ================================================
  64     offset:{0..0xFFFF}                       Specifies a 16-bit offset.
  65     ======================================== ================================================
  66
  67 .. _amdgpu_synid_sw_offset16:
  68
  69 sw_offset16
  70 ~~~~~~~~~~~
  71
  72 This is a special modifier which may be used with the *ds_swizzle_b32* instruction only.
  73 Specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
  74
  75 See AMD documentation for more information.
  76
  77     ======================================================= ===================================================
  78     Syntax                                                  Description
  79     ======================================================= ===================================================
  80     offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern
  81                                                             in a numeric form.
  82     offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern; each
  83                                                             number is a lane id.
  84     offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern
  85                                                             which converts a 5-bit lane id to another
  86                                                             lane id with which the lane interacts.
  87
  88                                                             <mask> is a 5 character sequence which
  89                                                             specifies how to transform the bits of the
  90                                                             lane id. The following characters are allowed:
  91
  92                                                               * "0" - set bit to 0.
  93
  94                                                               * "1" - set bit to 1.
  95
  96                                                               * "p" - preserve bit.
  97
  98                                                               * "i" - invert bit.
  99
 100     offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.
 101                                                             Broadcasts the value of any particular lane to
 102                                                             all lanes in its group.
 103
 104                                                             The first numeric parameter is a group
 105                                                             size and must be equal to 2, 4, 8, 16 or 32.
 106
 107                                                             The second numeric parameter is the index of the
 108                                                             lane being broadcast. The index must be less than
 109                                                             the group size.
 110     offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.
 111                                                             Swaps the neighboring groups of
 112                                                             1, 2, 4, 8 or 16 lanes.
 113     offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode. Reverses
 114                                                             the lanes for groups of 2, 4, 8, 16 or 32 lanes.
 115     ======================================================= ===================================================
 116
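The symbolic swizzle modes above can be read as simple per-lane mappings.
The following is a minimal Python sketch of those mappings, not the assembler's
implementation. All function names are hypothetical. The sketch assumes that the
first character of *<mask>* applies to the most significant bit of the lane id and
that groups consist of consecutive lane ids; neither assumption is stated in the
table above.

```python
def apply_bitmask_perm(lane_id: int, mask: str) -> int:
    """Transform a 5-bit lane id using a BITMASK_PERM mask string."""
    if len(mask) != 5:
        raise ValueError("mask must be exactly 5 characters long")
    if not 0 <= lane_id < 32:
        raise ValueError("lane id must fit in 5 bits")
    result = 0
    for position, action in enumerate(mask):
        bit_index = 4 - position  # assumption: leftmost character is bit 4
        bit = (lane_id >> bit_index) & 1
        if action == "0":
            bit = 0
        elif action == "1":
            bit = 1
        elif action == "i":
            bit ^= 1
        elif action != "p":  # "p" leaves the bit unchanged
            raise ValueError(f"unexpected mask character {action!r}")
        result |= bit << bit_index
    return result


def broadcast_source(lane_id: int, group_size: int, lane_index: int) -> int:
    """Lane whose value is broadcast to every lane of lane_id's group."""
    if group_size not in (2, 4, 8, 16, 32):
        raise ValueError("group size must be 2, 4, 8, 16 or 32")
    if not 0 <= lane_index < group_size:
        raise ValueError("lane index must be less than the group size")
    return (lane_id // group_size) * group_size + lane_index


def swap_source(lane_id: int, group_size: int) -> int:
    """Lane that lane_id exchanges data with when neighboring groups swap."""
    if group_size not in (1, 2, 4, 8, 16):
        raise ValueError("group size must be 1, 2, 4, 8 or 16")
    return lane_id ^ group_size


def reverse_source(lane_id: int, group_size: int) -> int:
    """Lane that supplies data to lane_id when each group is reversed."""
    if group_size not in (2, 4, 8, 16, 32):
        raise ValueError("group size must be 2, 4, 8, 16 or 32")
    return lane_id ^ (group_size - 1)
```

Under these assumptions, ``apply_bitmask_perm(5, "ppppi")`` and ``swap_source(5, 1)``
both select lane 4, which illustrates how the symbolic modes relate to each other.
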
 117 .. _amdgpu_synid_gds:
 118
 119 gds
 120 ~~~
 121
 122 Specifies whether to use GDS or LDS memory (LDS is the default).
 123
 124     ======================================== ================================================
 125     Syntax                                   Description
 126     ======================================== ================================================
 127     gds                                      Use GDS memory.
 128     ======================================== ================================================
 129
 130
 131 EXP Modifiers
 132 -------------
 133
 134 .. _amdgpu_synid_done:
 135
 136 done
 137 ~~~~
 138
 139 Specifies if this is the last export from the shader to the target. By default, the current
 140 instruction does not finish an export sequence.
 141
 142     ======================================== ================================================
 143     Syntax                                   Description
 144     ======================================== ================================================
 145     done                                     Indicates the last export operation.
 146     ======================================== ================================================
 147
 148 .. _amdgpu_synid_compr:
 149
 150 compr
 151 ~~~~~
 152
 153 Indicates if the data are compressed (not compressed by default).
 154
 155     ======================================== ================================================
 156     Syntax                                   Description
 157     ======================================== ================================================
 158     compr                                    Data are compressed.
 159     ======================================== ================================================
 160
 161 .. _amdgpu_synid_vm:
 162
 163 vm
 164 ~~
 165
 166 Specifies the state of the valid mask flag (off by default).
 167
 168     ======================================== ================================================
 169     Syntax                                   Description
 170     ======================================== ================================================
 171     vm                                       Set valid mask flag.
 172     ======================================== ================================================
 173
 174 FLAT Modifiers
 175 --------------
 176
 177 .. _amdgpu_synid_flat_offset12:
 178
 179 flat_offset12
 180 ~~~~~~~~~~~~~
 181
 182 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
 183
 184 Cannot be used with *global/scratch* opcodes. GFX9 only.
 185
 186     ======================================== ================================================
 187     Syntax                                   Description
 188     ======================================== ================================================
 189     offset:{0..4095}                         Specifies a 12-bit unsigned offset.
 190     ======================================== ================================================
 191
 192 .. _amdgpu_synid_flat_offset13:
 193
 194 flat_offset13
 195 ~~~~~~~~~~~~~
 196
 197 Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
 198
 199 Can be used with *global/scratch* opcodes only. GFX9 only.
 200
 201     ======================================== ================================================
 202     Syntax                                   Description
 203     ======================================== ================================================
 204     offset:{-4096..+4095}                    Specifies a 13-bit signed offset.
 205     ======================================== ================================================
 206
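The ranges listed for the offset modifiers follow directly from the field width
and signedness. A minimal Python sketch of that arithmetic is shown here; the
helper names are hypothetical and are not part of the assembler.

```python
from typing import Tuple


def offset_range(bits: int, signed: bool) -> Tuple[int, int]:
    """Return the inclusive (min, max) range of an immediate offset field."""
    if bits <= 0:
        raise ValueError("field width must be positive")
    if signed:
        # Two's complement: one bit is used for the sign.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def is_valid_offset(value: int, bits: int, signed: bool) -> bool:
    """Check whether value fits in the given offset field."""
    low, high = offset_range(bits, signed)
    return low <= value <= high
```

For example, ``offset_range(12, False)`` gives ``(0, 4095)`` and
``offset_range(13, True)`` gives ``(-4096, 4095)``, matching the two FLAT tables above.
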
 207 glc
 208 ~~~
 209
 210 See a description :ref:`here<amdgpu_synid_glc>`.
 211
 212 slc
 213 ~~~
 214
 215 See a description :ref:`here<amdgpu_synid_slc>`.
 216
 217 tfe
 218 ~~~
 219
 220 See a description :ref:`here<amdgpu_synid_tfe>`.
 221
 222 nv
 223 ~~
 224
 225 See a description :ref:`here<amdgpu_synid_nv>`.
 226
 227 MIMG Modifiers
 228 --------------
 229
 230 .. _amdgpu_synid_dmask:
 231
 232 dmask
 233 ~~~~~
 234
 235 Specifies which channels (image components) are used by the operation. By default, no channels
 236 are used.
 237
 238     ======================================== ================================================
 239     Syntax                                   Description
 240     ======================================== ================================================
 241     dmask:{0..15}                            Each bit corresponds to one of 4 image
 242                                              components (RGBA). If the specified bit value
 243                                              is 0, the component is not used; a value of 1
 244                                              means that the component is used.
 245     ======================================== ================================================
 246
 247 This modifier has some limitations depending on instruction kind:
 248
 249     ======================================== ================================================
 250     Instruction Kind                         Valid dmask Values
 251     ======================================== ================================================
 252     32-bit atomic cmpswap                    0x3
 253     other 32-bit atomic instructions         0x1
 254     64-bit atomic cmpswap                    0xF
 255     other 64-bit atomic instructions         0x3
 256     GATHER4                                  0x1, 0x2, 0x4, 0x8
 257     Other instructions                       any value
 258     ======================================== ================================================
 259
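The two dmask tables can be combined into a small check. The following Python
sketch is illustrative only. It assumes that bit 0 corresponds to R, bit 1 to G,
bit 2 to B and bit 3 to A (the tables do not state the bit order), and the
instruction kind keys are hypothetical names chosen for this example.

```python
from typing import List

# Valid dmask values per instruction kind, copied from the table above.
# The dictionary keys are hypothetical labels, not assembler mnemonics.
VALID_DMASK = {
    "atomic_cmpswap_32": {0x3},
    "atomic_32": {0x1},
    "atomic_cmpswap_64": {0xF},
    "atomic_64": {0x3},
    "gather4": {0x1, 0x2, 0x4, 0x8},
}


def dmask_components(dmask: int) -> List[str]:
    """List the image components enabled by a dmask value."""
    if not 0 <= dmask <= 15:
        raise ValueError("dmask must be in the range 0..15")
    # Assumption: bit 0 is R, bit 1 is G, bit 2 is B, bit 3 is A.
    return [name for bit, name in enumerate("RGBA") if dmask & (1 << bit)]


def is_valid_dmask(kind: str, dmask: int) -> bool:
    """Check a dmask value against the per-kind limitations."""
    if not 0 <= dmask <= 15:
        return False
    allowed = VALID_DMASK.get(kind)
    # Kinds that are not listed accept any value in range.
    return True if allowed is None else dmask in allowed
```
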
 260 .. _amdgpu_synid_unorm:
 261
 262 unorm
 263 ~~~~~
 264
 265 Specifies whether the address is normalized or not (normalized by default).
 266
 267     ======================================== ================================================
 268     Syntax                                   Description
 269     ======================================== ================================================
 270     unorm                                    Force address to be un-normalized.
 271     ======================================== ================================================
 272
 273 glc
 274 ~~~
 275
 276 See a description :ref:`here<amdgpu_synid_glc>`.
 277
 278 slc
 279 ~~~
 280
 281 See a description :ref:`here<amdgpu_synid_slc>`.
 282
 283 .. _amdgpu_synid_r128:
 284
 285 r128
 286 ~~~~
 287
 288 Specifies texture resource size. The default size is 256 bits.
 289
 290 GFX7 and GFX8 only.
 291
 292     ======================================== ================================================
 293     Syntax                                   Description
 294     ======================================== ================================================
 295     r128                                     Specifies a 128-bit texture resource size.
 296     ======================================== ================================================
 297
 298 tfe
 299 ~~~
 300
 301 See a description :ref:`here<amdgpu_synid_tfe>`.
 302
 303 .. _amdgpu_synid_lwe:
 304
 305 lwe
 306 ~~~
 307
 308 Specifies LOD warning status (LOD warning is disabled by default).
 309
 310     ======================================== ================================================
 311     Syntax                                   Description
 312     ======================================== ================================================
 313     lwe                                      Enables LOD warning.
 314     ======================================== ================================================
 315
 316 .. _amdgpu_synid_da:
 317
 318 da
 319 ~~
 320
 321 Specifies if an array index must be sent to TA. By default, the array index is not sent.
 322
 323     ======================================== ================================================
 324     Syntax                                   Description
 325     ======================================== ================================================
 326     da                                       Send an array-index to TA.
 327     ======================================== ================================================
 328
 329 .. _amdgpu_synid_d16:
 330
 331 d16
 332 ~~~
 333
 334 Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
 335
 336     ======================================== ================================================
 337     Syntax                                   Description
 338     ======================================== ================================================
 339     d16                                      Enables 16-bit data mode.
 340
 341                                              On loads, convert data in memory to 16-bit
 342                                              format before storing it in VGPRs.
 343
 344                                              For stores, convert 16-bit data in VGPRs to
 345                                              32 bits before going to memory.
 346
 347                                              Note that 16-bit data are stored in VGPRs
 348                                              unpacked in GFX8.0. In GFX8.1 and GFX9 16-bit
 349                                              data are packed.
 350     ======================================== ================================================
 351
 352 .. _amdgpu_synid_a16:
 353
 354 a16
 355 ~~~
 356
 357 Specifies the size of image address components: 16 or 32 bits (32 bits by default). GFX9 only.
 358
 359     ======================================== ================================================
 360     Syntax                                   Description
 361     ======================================== ================================================
 362     a16                                      Enables 16-bit image address components.
 363     ======================================== ================================================
 364
 365 Miscellaneous Modifiers
 366 -----------------------
 367
 368 .. _amdgpu_synid_glc:
 369
 370 glc
 371 ~~~
 372
 373 This modifier has a different meaning for loads, stores, and atomic operations.
 374 The default value is off (0).
 375
 376 See AMD documentation for details.
 377
 378     ======================================== ================================================
 379     Syntax                                   Description
 380     ======================================== ================================================
 381     glc                                      Set glc bit to 1.
 382     ======================================== ================================================
 383
 384 .. _amdgpu_synid_slc:
 385
 386 slc
 387 ~~~
 388
 389 Specifies cache policy. The default value is off (0).
 390
 391 See AMD documentation for details.
 392
 393     ======================================== ================================================
 394     Syntax                                   Description
 395     ======================================== ================================================
 396     slc                                      Set slc bit to 1.
 397     ======================================== ================================================
 398
 399 .. _amdgpu_synid_tfe:
 400
 401 tfe
 402 ~~~
 403
 404 Controls access to partially resident textures. The default value is off (0).
 405
 406 See AMD documentation for details.
 407
 408     ======================================== ================================================
 409     Syntax                                   Description
 410     ======================================== ================================================
 411     tfe                                      Set tfe bit to 1.
 412     ======================================== ================================================
 413
 414 .. _amdgpu_synid_nv:
 415
 416 nv
 417 ~~
 418
 419 Specifies whether the instruction operates on non-volatile memory. By default, memory is volatile.
 420
 421 GFX9 only.
 422
 423     ======================================== ================================================
 424     Syntax                                   Description
 425     ======================================== ================================================
 426     nv                                       Indicates that instruction operates on
 427                                              non-volatile memory.
 428     ======================================== ================================================
 429
 430 MUBUF/MTBUF Modifiers
 431 ---------------------
 432
 433 .. _amdgpu_synid_idxen:
 434
 435 idxen
 436 ~~~~~
 437
 438 Specifies whether address components include an index. By default, no components are used.
 439
 440 Can be used together with :ref:`offen<amdgpu_synid_offen>`.
 441
 442 Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 443
 444     ======================================== ================================================
 445     Syntax                                   Description
 446     ======================================== ================================================
 447     idxen                                    Address components include an index.
 448     ======================================== ================================================
 449
 450 .. _amdgpu_synid_offen:
 451
 452 offen
 453 ~~~~~
 454
 455 Specifies whether address components include an offset. By default, no components are used.
 456
 457 Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.
 458
 459 Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 460
 461     ======================================== ================================================
 462     Syntax                                   Description
 463     ======================================== ================================================
 464     offen                                    Address components include an offset.
 465     ======================================== ================================================
 466
 467 .. _amdgpu_synid_addr64:
 468
 469 addr64
 470 ~~~~~~
 471
 472 Specifies whether a 64-bit address is used. By default, no address is used.
 473
 474 GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
 475 :ref:`idxen<amdgpu_synid_idxen>` modifiers.
 476
 477     ======================================== ================================================
 478     Syntax                                   Description
 479     ======================================== ================================================
 480     addr64                                   A 64-bit address is used.
 481     ======================================== ================================================
 482
 483 .. _amdgpu_synid_buf_offset12:
 484
 485 buf_offset12
 486 ~~~~~~~~~~~~
 487
 488 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
 489
 490     ======================================== ================================================
 491     Syntax                                   Description
 492     ======================================== ================================================
 493     offset:{0..0xFFF}                        Specifies a 12-bit unsigned offset.
 494     ======================================== ================================================
 495
 496 glc
 497 ~~~
 498
 499 See a description :ref:`here<amdgpu_synid_glc>`.
 500
 501 slc
 502 ~~~
 503
 504 See a description :ref:`here<amdgpu_synid_slc>`.
 505
 506 .. _amdgpu_synid_lds:
 507
 508 lds
 509 ~~~
 510
 511 Specifies where to store the result: VGPRs or LDS (VGPRs by default).
 512
 513     ======================================== ================================================
 514     Syntax                                   Description
 515     ======================================== ================================================
 516     lds                                      Store result in LDS.
 517     ======================================== ================================================
 518
 519 tfe
 520 ~~~
 521
 522 See a description :ref:`here<amdgpu_synid_tfe>`.
 523
 524 .. _amdgpu_synid_dfmt:
 525
 526 dfmt
 527 ~~~~
 528
 529 TBD
 530
 531 .. _amdgpu_synid_nfmt:
 532
 533 nfmt
 534 ~~~~
 535
 536 TBD
 537
 538 SMRD/SMEM Modifiers
 539 -------------------
 540
 541 glc
 542 ~~~
 543
 544 See a description :ref:`here<amdgpu_synid_glc>`.
 545
 546 nv
 547 ~~
 548
 549 See a description :ref:`here<amdgpu_synid_nv>`.
 550
 551 VINTRP Modifiers
 552 ----------------
 553
 554 .. _amdgpu_synid_high:
 555
 556 high
 557 ~~~~
 558
 559 Specifies which half of the LDS word to use. The low half of the LDS word is used by default.
 560 GFX9 only.
 561
 562     ======================================== ================================================
 563     Syntax                                   Description
 564     ======================================== ================================================
 565     high                                     Use high half of LDS word.
 566     ======================================== ================================================
 567
 568 VOP1/VOP2 DPP Modifiers
 569 -----------------------
 570
 571 GFX8 and GFX9 only.
 572
 573 .. _amdgpu_synid_dpp_ctrl:
 574
 575 dpp_ctrl
 576 ~~~~~~~~
 577
 578 Specifies how data are shared between threads. This is a mandatory modifier.
 579 There is no default value.
 580
 581 Note. The lanes of a wavefront are organized in four banks and four rows.
 582
 583     ======================================== ================================================
 584     Syntax                                   Description
 585     ======================================== ================================================
 586     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
 587     row_mirror                               Mirror threads within row.
 588     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
 589     row_bcast:15                             Broadcast 15th thread of each row to next row.
 590     row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
 591     wave_shl:1                               Wavefront left shift by 1 thread.
 592     wave_rol:1                               Wavefront left rotate by 1 thread.
 593     wave_shr:1                               Wavefront right shift by 1 thread.
 594     wave_ror:1                               Wavefront right rotate by 1 thread.
 595     row_shl:{1..15}                          Row shift left by 1-15 threads.
 596     row_shr:{1..15}                          Row shift right by 1-15 threads.
 597     row_ror:{1..15}                          Row rotate right by 1-15 threads.
 598     ======================================== ================================================
 599
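Some of the dpp_ctrl patterns above can be expressed as a mapping from a
destination lane to the lane that supplies its data. The following Python sketch
covers *quad_perm*, *row_mirror* and *row_half_mirror* only. It assumes that a
row is 16 consecutive lanes, that a quad is 4 consecutive lanes, and that the
n-th quad_perm value selects the source for the n-th lane of each quad. The
function names are hypothetical.

```python
from typing import Sequence


def quad_perm_source(lane: int, perm: Sequence[int]) -> int:
    """Source lane for quad_perm:[a,b,c,d]."""
    if len(perm) != 4 or any(not 0 <= p <= 3 for p in perm):
        raise ValueError("quad_perm takes four values in the range 0..3")
    quad_base = lane - (lane % 4)  # first lane of the quad
    return quad_base + perm[lane % 4]


def row_mirror_source(lane: int) -> int:
    """Source lane for row_mirror (assumes 16 lanes per row)."""
    row_base = lane - (lane % 16)
    return row_base + (15 - lane % 16)


def row_half_mirror_source(lane: int) -> int:
    """Source lane for row_half_mirror (8 lanes per half row)."""
    half_base = lane - (lane % 8)
    return half_base + (7 - lane % 8)
```

With these assumptions, ``quad_perm_source(5, [3, 2, 1, 0])`` selects lane 6,
that is, each quad is reversed.
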
 600 .. _amdgpu_synid_row_mask:
 601
 602 row_mask
 603 ~~~~~~~~
 604
 605 Controls which rows are enabled for data sharing. By default, all rows are enabled.
 606
 607 Note. The lanes of a wavefront are organized in four banks and four rows.
 608
 609     ======================================== ================================================
 610     Syntax                                   Description
 611     ======================================== ================================================
 612     row_mask:{0..15}                         Each of 4 bits in the mask controls one
 613                                              row (0 - disabled, 1 - enabled).
 614     ======================================== ================================================
 615
 616 .. _amdgpu_synid_bank_mask:
 617
 618 bank_mask
 619 ~~~~~~~~~
 620
 621 Controls which banks are enabled for data sharing. By default, all banks are enabled.
 622
 623 Note. The lanes of a wavefront are organized in four banks and four rows.
 624
 625     ======================================== ================================================
 626     Syntax                                   Description
 627     ======================================== ================================================
 628     bank_mask:{0..15}                        Each of 4 bits in the mask controls one
 629                                              bank (0 - disabled, 1 - enabled).
 630     ======================================== ================================================
 631
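The combined effect of row_mask and bank_mask can be sketched as follows. This
Python example is an illustration under stated assumptions, not a description of
the hardware: it assumes a 64-lane wavefront in which row *r* covers lanes
16*r* to 16*r*+15, bank *b* covers lanes 4*b* to 4*b*+3 within each row, and bit
*n* of each mask controls row or bank *n*. The function name is hypothetical.

```python
from typing import List


def enabled_lanes(row_mask: int, bank_mask: int, wave_size: int = 64) -> List[int]:
    """List the lanes enabled by the given row_mask and bank_mask."""
    if not 0 <= row_mask <= 15 or not 0 <= bank_mask <= 15:
        raise ValueError("masks must be in the range 0..15")
    lanes = []
    for lane in range(wave_size):
        row = lane // 16          # assumption: 16 lanes per row
        bank = (lane % 16) // 4   # assumption: 4 lanes per bank
        if (row_mask >> row) & 1 and (bank_mask >> bank) & 1:
            lanes.append(lane)
    return lanes
```

For example, ``enabled_lanes(0x2, 0x8)`` returns lanes 28 to 31 under these
assumptions, while the default masks (0xF and 0xF) enable all 64 lanes.
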
 632 .. _amdgpu_synid_bound_ctrl:
 633
 634 bound_ctrl
 635 ~~~~~~~~~~
 636
 637 Controls data sharing when accessing an invalid lane. By default, data sharing with
 638 invalid lanes is disabled.
 639
 640     ======================================== ================================================
 641     Syntax                                   Description
 642     ======================================== ================================================
 643     bound_ctrl:0                             Enables data sharing with invalid lanes.
 644                                              Accessing data from an invalid lane will
 645                                              return zero.
 646     ======================================== ================================================
 647
 648 VOP1/VOP2/VOPC SDWA Modifiers
 649 -----------------------------
 650
 651 GFX8 and GFX9 only.
 652
 653 clamp
 654 ~~~~~
 655
 656 See a description :ref:`here<amdgpu_synid_clamp>`.
 657
 658 omod
 659 ~~~~
 660
 661 See a description :ref:`here<amdgpu_synid_omod>`.
 662
 663 GFX9 only.
 664
 665 .. _amdgpu_synid_dst_sel:
 666
 667 dst_sel
 668 ~~~~~~~
 669
 670 Selects which bits in the destination are affected. By default, all bits are affected.
 671
 672     ======================================== ================================================
 673     Syntax                                   Description
 674     ======================================== ================================================
 675     dst_sel:DWORD                            Use bits 31:0.
 676     dst_sel:BYTE_0                           Use bits 7:0.
 677     dst_sel:BYTE_1                           Use bits 15:8.
 678     dst_sel:BYTE_2                           Use bits 23:16.
 679     dst_sel:BYTE_3                           Use bits 31:24.
 680     dst_sel:WORD_0                           Use bits 15:0.
 681     dst_sel:WORD_1                           Use bits 31:16.
 682     ======================================== ================================================
 683
 684
 685 .. _amdgpu_synid_dst_unused:
 686
 687 dst_unused
 688 ~~~~~~~~~~
 689
 690 Controls what to do with the bits in the destination which are not selected
 691 by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
 692 By default, unused bits are preserved.
 693
 694     ======================================== ================================================
 695     Syntax                                   Description
 696     ======================================== ================================================
 697     dst_unused:UNUSED_PAD                    Pad with zeros.
 698     dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits.
 699     dst_unused:UNUSED_PRESERVE               Preserve bits.
 700     ======================================== ================================================
 701
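The interaction of dst_sel and dst_unused can be illustrated with a short Python
sketch. It models a 32-bit destination register as an integer and follows the
descriptions in the two tables above; the reading of UNUSED_SEXT (upper bits
filled with the sign of the selected field, lower bits zeroed) is an
interpretation of the table text. The names ``SEL_FIELDS`` and
``write_destination`` are hypothetical.

```python
# Selector name -> (lowest bit, width in bits), taken from the dst_sel table.
SEL_FIELDS = {
    "DWORD": (0, 32),
    "BYTE_0": (0, 8),
    "BYTE_1": (8, 8),
    "BYTE_2": (16, 8),
    "BYTE_3": (24, 8),
    "WORD_0": (0, 16),
    "WORD_1": (16, 16),
}


def write_destination(old: int, result: int,
                      dst_sel: str = "DWORD",
                      dst_unused: str = "UNUSED_PRESERVE") -> int:
    """Return the new 32-bit destination value after an SDWA write."""
    low, width = SEL_FIELDS[dst_sel]
    field_mask = ((1 << width) - 1) << low
    placed = (result << low) & field_mask  # result truncated to the field
    if dst_unused == "UNUSED_PRESERVE":
        rest = old & ~field_mask & 0xFFFFFFFF
    elif dst_unused == "UNUSED_PAD":
        rest = 0
    elif dst_unused == "UNUSED_SEXT":
        sign = (placed >> (low + width - 1)) & 1
        upper_mask = 0xFFFFFFFF & ~((1 << (low + width)) - 1)
        rest = upper_mask if sign else 0  # lower bits stay zero
    else:
        raise ValueError(f"unknown dst_unused value {dst_unused!r}")
    return (placed | rest) & 0xFFFFFFFF
```
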
 702 .. _amdgpu_synid_src0_sel:
 703
 704 src0_sel
 705 ~~~~~~~~
 706
 707 Controls which bits in the src0 are used. By default, all bits are used.
 708
 709     ======================================== ================================================
 710     Syntax                                   Description
 711     ======================================== ================================================
 712     src0_sel:DWORD                           Use bits 31:0.
 713     src0_sel:BYTE_0                          Use bits 7:0.
 714     src0_sel:BYTE_1                          Use bits 15:8.
 715     src0_sel:BYTE_2                          Use bits 23:16.
 716     src0_sel:BYTE_3                          Use bits 31:24.
 717     src0_sel:WORD_0                          Use bits 15:0.
 718     src0_sel:WORD_1                          Use bits 31:16.
 719     ======================================== ================================================
 720
 721 .. _amdgpu_synid_src1_sel:
 722
 723 src1_sel
 724 ~~~~~~~~
 725
 726 Controls which bits in the src1 are used. By default, all bits are used.
 727
 728     ======================================== ================================================
 729     Syntax                                   Description
 730     ======================================== ================================================
 731     src1_sel:DWORD                           Use bits 31:0.
 732     src1_sel:BYTE_0                          Use bits 7:0.
 733     src1_sel:BYTE_1                          Use bits 15:8.
 734     src1_sel:BYTE_2                          Use bits 23:16.
 735     src1_sel:BYTE_3                          Use bits 31:24.
 736     src1_sel:WORD_0                          Use bits 15:0.
 737     src1_sel:WORD_1                          Use bits 31:16.
 738     ======================================== ================================================
 739
 740 VOP1/VOP2/VOPC SDWA Operand Modifiers
 741 -------------------------------------
 742
 743 Operand modifiers are not used separately. They are applied to source operands.
 744
 745 GFX8 and GFX9 only.
 746
 747 abs
 748 ~~~
 749
 750 See the description :ref:`here<amdgpu_synid_abs>`.
 751
 752 neg
 753 ~~~
 754
 755 See the description :ref:`here<amdgpu_synid_neg>`.
 756
 757 .. _amdgpu_synid_sext:
 758
 759 sext
 760 ~~~~
 761
 762 Sign-extends the value of a (sub-dword) operand to fill all 32 bits.
 763 Has no effect on 32-bit operands.
 764
 765 Valid for integer operands only.
 766
 767     ======================================== ================================================
 768     Syntax                                   Description
 769     ======================================== ================================================
 770     sext(<operand>)                          Sign-extend operand value.
 771     ======================================== ================================================
 772
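The effect of sext on a sub-dword value can be sketched in Python as follows. This is an
illustrative model under the assumption that the operand has already been narrowed to
``width`` bits; ``sign_extend`` is a hypothetical helper, not an assembler feature.

```python
def sign_extend(value, width):
    """Sign-extend the low `width` bits of value to a 32-bit pattern."""
    value &= (1 << width) - 1          # keep only the sub-dword bits
    sign_bit = 1 << (width - 1)        # position of the sign bit
    # Flip and subtract the sign bit, then wrap to an unsigned 32-bit pattern.
    return ((value ^ sign_bit) - sign_bit) & 0xFFFFFFFF
```

With ``width`` set to 32 the function returns its input unchanged, matching the statement
that sext has no effect on 32-bit operands.
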
 773 VOP3 Modifiers
 774 --------------
 775
 776 .. _amdgpu_synid_vop3_op_sel:
 777
 778 vop3_op_sel
 779 ~~~~~~~~~~~
 780
 781 Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
 782 By default, low bits are used for all operands.
 783
 784 The number of values specified with the op_sel modifier must match the number of instruction
 785 operands (both source and destination). The first value controls src0, the second controls src1,
 786 and so on, except that the last value controls the destination.
 787 The value 0 selects the low bits, while 1 selects the high bits.
 788
 789 Note. The op_sel modifier affects 16-bit operands only. For 32-bit operands, the value specified
 790 by op_sel must be 0.
 791
 792 GFX9 only.
 793
 794     ======================================== ============================================================
 795     Syntax                                   Description
 796     ======================================== ============================================================
 797     op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
 798     op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
 799     op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
 800     ======================================== ============================================================
 801
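The ordering rule for VOP3 op_sel values (sources first, destination last) can be
sketched in Python. The function name ``decode_vop3_op_sel`` and the returned dictionary
layout are assumptions made for this illustration only.

```python
def decode_vop3_op_sel(values):
    """Map VOP3 op_sel values to operand names; the last value is the destination."""
    if not 2 <= len(values) <= 4:
        raise ValueError("expected 1 to 3 source values plus 1 destination value")
    if any(v not in (0, 1) for v in values):
        raise ValueError("each op_sel value must be 0 or 1")
    # All values but the last control src0, src1, ...; the last controls dst.
    names = [f"src{i}" for i in range(len(values) - 1)] + ["dst"]
    return {name: "high" if v else "low" for name, v in zip(names, values)}
```

For an instruction with two source operands, ``decode_vop3_op_sel([1, 0, 1])`` reports
that src0 and the destination use the high bits while src1 uses the low bits.
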
 802 .. _amdgpu_synid_clamp:
 803
 804 clamp
 805 ~~~~~
 806
 807 The meaning of clamp depends on the instruction.
 808
 809 For *v_cmp* instructions, the clamp modifier indicates that the compare signals
 810 if a floating point exception occurs. By default, signaling is disabled.
 811 Not supported by GFX7.
 812
 813 For integer operations, the clamp modifier indicates that the result must be clamped
 814 to the range between the smallest and largest representable values. By default, there is no clamping.
 815 Integer clamping is not supported by GFX7.
 816
 817 For floating point operations, the clamp modifier indicates that the result must be clamped
 818 to the range [0.0, 1.0]. By default, there is no clamping.
 819
 820 Note. The clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
 821
 822     ======================================== ================================================
 823     Syntax                                   Description
 824     ======================================== ================================================
 825     clamp                                    Enables clamping (or signaling).
 826     ======================================== ================================================
 827
 828 .. _amdgpu_synid_omod:
 829
 830 omod
 831 ~~~~
 832
 833 Specifies whether an output modifier must be applied to the result.
 834 By default, no output modifiers are applied.
 835
 836 Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
 837
 838 Output modifiers are valid for f32 and f64 floating point results only.
 839 They must not be used with f16.
 840
 841 Note. *v_cvt_f16_f32* is an exception. This instruction produces an f16 result
 842 but accepts output modifiers.
 843
 844     ======================================== ================================================
 845     Syntax                                   Description
 846     ======================================== ================================================
 847     mul:2                                    Multiply the result by 2.
 848     mul:4                                    Multiply the result by 4.
 849     div:2                                    Multiply the result by 0.5.
 850     ======================================== ================================================
 851
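The order in which output modifiers and floating point clamping are applied can be
sketched in Python. This is a simplified model that uses Python floats and ignores NaN
and denormal handling; the function name and its parameters are hypothetical.

```python
# Scale factors for the omod syntax variants listed in the table above.
OMOD_FACTORS = {None: 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}


def apply_output_modifiers(result, omod=None, clamp=False):
    """Apply omod first, then clamp a floating point result to [0.0, 1.0]."""
    result *= OMOD_FACTORS[omod]            # output modifier is applied first
    if clamp:
        result = min(max(result, 0.0), 1.0)  # clamping is applied afterwards
    return result
```

Because the scaling happens first, ``apply_output_modifiers(0.75, "mul:2", clamp=True)``
gives ``1.0`` rather than ``1.5``.
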
 852 VOP3 Operand Modifiers
 853 ----------------------
 854
 855 Operand modifiers cannot be used on their own; they are applied to source operands.
 856
 857 .. _amdgpu_synid_abs:
 858
 859 abs
 860 ~~~
 861
 862 Computes the absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any).
 863 Valid for floating point operands only.
 864
 865     ======================================== ================================================
 866     Syntax                                   Description
 867     ======================================== ================================================
 868     abs(<operand>)                           Get absolute value of operand.
 869     \|<operand>|                             The same as above.
 870     ======================================== ================================================
 871
 872 .. _amdgpu_synid_neg:
 873
 874 neg
 875 ~~~
 876
 877 Computes the negation of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any).
 878 Valid for floating point operands only.
 879
 880     ======================================== ================================================
 881     Syntax                                   Description
 882     ======================================== ================================================
 883     neg(<operand>)                           Get negative value of operand.
 884     -<operand>                               The same as above.
 885     ======================================== ================================================
 886
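The ordering of abs and neg matters when both are applied to the same operand. A minimal
Python sketch of that ordering, using a hypothetical helper and Python floats in place of
hardware floating point values:

```python
def apply_operand_modifiers(value, use_abs=False, use_neg=False):
    """Apply abs first and neg second, as described for VOP3 operand modifiers."""
    if use_abs:
        value = abs(value)   # abs is applied before neg
    if use_neg:
        value = -value       # neg is applied after abs
    return value
```

With both modifiers enabled the result is never positive:
``apply_operand_modifiers(3.0, use_abs=True, use_neg=True)`` returns ``-3.0``, and so does
the same call with ``-3.0`` as input.
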
 887 VOP3P Modifiers
 888 ---------------
 889
 890 This section describes modifiers of regular VOP3P instructions.
 891 *v_mad_mix* modifiers are described :ref:`in a separate section<amdgpu_synid_mad_mix>`.
 892
 893 GFX9 only.
 894
 895 .. _amdgpu_synid_op_sel:
 896
 897 op_sel
 898 ~~~~~~
 899
 900 Selects the low [15:0] or high [31:16] operand bits as input to the operation
 901 which results in the lower-half of the destination.
 902 By default, low bits are used for all operands.
 903
 904 The number of values specified with the op_sel modifier must match the number of source
 905 operands. The first value controls src0, the second value controls src1, and so on.
 906 The value 0 selects the low bits, while 1 selects the high bits.
 907
 908     ======================================== =============================================================
 909     Syntax                                   Description
 910     ======================================== =============================================================
 911     op_sel:[{0..1}]                          Select operand bits for instructions with 1 source operand.
 912     op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 2 source operands.
 913     op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 3 source operands.
 914     ======================================== =============================================================
 915
 916 .. _amdgpu_synid_op_sel_hi:
 917
 918 op_sel_hi
 919 ~~~~~~~~~
 920
 921 Selects the low [15:0] or high [31:16] operand bits as input to the operation
 922 which results in the upper-half of the destination.
 923 By default, high bits are used for all operands.
 924
 925 The number of values specified with the op_sel_hi modifier must match the number of source
 926 operands. The first value controls src0, the second value controls src1, and so on.
 927 The value 0 selects the low bits, while 1 selects the high bits.
 928
 929     ======================================== =============================================================
 930     Syntax                                   Description
 931     ======================================== =============================================================
 932     op_sel_hi:[{0..1}]                       Select operand bits for instructions with 1 source operand.
 933     op_sel_hi:[{0..1},{0..1}]                Select operand bits for instructions with 2 source operands.
 934     op_sel_hi:[{0..1},{0..1},{0..1}]         Select operand bits for instructions with 3 source operands.
 935     ======================================== =============================================================
 936
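How op_sel and op_sel_hi pick the inputs for the two halves of a packed result can be
sketched in Python. The sketch assumes a two-source operation on unsigned 16-bit integer
halves with wrap-around; ``half`` and ``packed_binary_op`` are hypothetical names.

```python
def half(value, sel):
    """Return the high (sel=1) or low (sel=0) 16 bits of a 32-bit value."""
    return (value >> 16) & 0xFFFF if sel else value & 0xFFFF


def packed_binary_op(op, src0, src1, op_sel=(0, 0), op_sel_hi=(1, 1)):
    """Model a packed 16-bit operation with two source operands."""
    # op_sel chooses the inputs that produce the lower half of the destination.
    lo = op(half(src0, op_sel[0]), half(src1, op_sel[1])) & 0xFFFF
    # op_sel_hi chooses the inputs that produce the upper half of the destination.
    hi = op(half(src0, op_sel_hi[0]), half(src1, op_sel_hi[1])) & 0xFFFF
    return (hi << 16) | lo
```

The default arguments reproduce the documented defaults: low bits for op_sel and high
bits for op_sel_hi, so each half of the result is computed from the matching halves of
the sources.
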
 937 .. _amdgpu_synid_neg_lo:
 938
 939 neg_lo
 940 ~~~~~~
 941
 942 Specifies whether to change sign of operand values selected by
 943 :ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
 944 as input to the operation which results in the lower-half of the destination.
 945
 946 The number of values specified with this modifier must match the number of source
 947 operands. The first value controls src0, the second value controls src1, and so on.
 948
 949 The value 0 indicates that the corresponding operand value is used unmodified;
 950 the value 1 indicates that the negated value of the operand is used.
 951
 952 By default, operand values are used unmodified.
 953
 954 This modifier is valid for floating point operands only.
 955
 956     ======================================== ==================================================================
 957     Syntax                                   Description
 958     ======================================== ==================================================================
 959     neg_lo:[{0..1}]                          Select affected operands for instructions with 1 source operand.
 960     neg_lo:[{0..1},{0..1}]                   Select affected operands for instructions with 2 source operands.
 961     neg_lo:[{0..1},{0..1},{0..1}]            Select affected operands for instructions with 3 source operands.
 962     ======================================== ==================================================================
 963
 964 .. _amdgpu_synid_neg_hi:
 965
 966 neg_hi
 967 ~~~~~~
 968
 969 Specifies whether to change sign of operand values selected by
 970 :ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
 971 as input to the operation which results in the upper-half of the destination.
 972
 973 The number of values specified with this modifier must match the number of source
 974 operands. The first value controls src0, the second value controls src1, and so on.
 975
 976 The value 0 indicates that the corresponding operand value is used unmodified;
 977 the value 1 indicates that the negated value of the operand is used.
 978
 979 By default, operand values are used unmodified.
 980
 981 This modifier is valid for floating point operands only.
 982
 983     ======================================== ==================================================================
 984     Syntax                                   Description
 985     ======================================== ==================================================================
 986     neg_hi:[{0..1}]                          Select affected operands for instructions with 1 source operand.
 987     neg_hi:[{0..1},{0..1}]                   Select affected operands for instructions with 2 source operands.
 988     neg_hi:[{0..1},{0..1},{0..1}]            Select affected operands for instructions with 3 source operands.
 989     ======================================== ==================================================================
 990
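A rough Python model of neg_lo and neg_hi follows. It assumes the caller has already
picked the per-operand values selected by op_sel (``lo_inputs``) and by op_sel_hi
(``hi_inputs``); the function name and parameters are hypothetical, and Python floats
stand in for 16-bit floating point values.

```python
def apply_packed_neg(lo_inputs, hi_inputs, neg_lo=None, neg_hi=None):
    """Negate selected operand values; one flag (0 or 1) per source operand."""
    count = len(lo_inputs)
    # By default, operand values are used unmodified.
    neg_lo = [0] * count if neg_lo is None else list(neg_lo)
    neg_hi = [0] * count if neg_hi is None else list(neg_hi)
    if not (len(hi_inputs) == len(neg_lo) == len(neg_hi) == count):
        raise ValueError("one value per source operand is required")
    # neg_lo acts on values selected by op_sel, neg_hi on those selected by op_sel_hi.
    lo = [-x if flag else x for x, flag in zip(lo_inputs, neg_lo)]
    hi = [-x if flag else x for x, flag in zip(hi_inputs, neg_hi)]
    return lo, hi
```

For two source operands, passing ``neg_lo=[1, 0]`` and ``neg_hi=[0, 1]`` negates src0 in
the op_sel-selected inputs and src1 in the op_sel_hi-selected inputs.
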
 991 clamp
 992 ~~~~~
 993
 994 See the description :ref:`here<amdgpu_synid_clamp>`.
 995
 996 .. _amdgpu_synid_mad_mix:
 997
 998 VOP3P V_MAD_MIX Modifiers
 999 -------------------------
1000
1001 *v_mad_mix* instructions use the VOP3P format but have different modifiers.
1002
1003 GFX9 only.
1004
1005 .. _amdgpu_synid_mad_mix_op_sel:
1006
1007 mad_mix_op_sel
1008 ~~~~~~~~~~~~~~
1009
1010 This modifier is meaningful only for 16-bit source operands, as indicated by
1011 :ref:`mad_mix_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
1012 It selects either the low [15:0] or high [31:16] operand bits
1013 as input to the operation.
1014
1015 The value 0 selects the low bits, while 1 selects the high bits.
1016 By default, low bits are used for all operands.
1017
1018     ======================================== ================================================
1019     Syntax                                   Description
1020     ======================================== ================================================
1021     op_sel:[{0..1},{0..1},{0..1}]            Select location of each 16-bit source operand.
1022     ======================================== ================================================
1023
1024 .. _amdgpu_synid_mad_mix_op_sel_hi:
1025
1026 mad_mix_op_sel_hi
1027 ~~~~~~~~~~~~~~~~~
1028
1029 Selects the size of source operands: either 32 bits or 16 bits.
1030 By default, 32 bits are used for all source operands.
1031
1032 The value 0 indicates 32 bits; the value 1 indicates 16 bits.
1033 The location of the 16 bits within the operand may be specified by
1034 :ref:`mad_mix_op_sel<amdgpu_synid_mad_mix_op_sel>`.
1035
1036     ======================================== ================================================
1037     Syntax                                   Description
1038     ======================================== ================================================
1039     op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
1040     ======================================== ================================================
1041
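For *v_mad_mix*, the two modifiers combine to describe each of the three source operands.
The following Python sketch shows one way to read them together; the function name and
the textual output format are assumptions made for illustration.

```python
def describe_mad_mix_sources(op_sel=(0, 0, 0), op_sel_hi=(0, 0, 0)):
    """Describe the size and location of each v_mad_mix source operand."""
    descriptions = []
    for index, (sel, sel_hi) in enumerate(zip(op_sel, op_sel_hi)):
        if sel_hi == 0:
            # op_sel_hi value 0: the operand is 32 bits wide; op_sel has no meaning.
            descriptions.append(f"src{index}: 32 bits")
        else:
            # op_sel_hi value 1: the operand is 16 bits wide; op_sel picks its location.
            bits = "[31:16]" if sel else "[15:0]"
            descriptions.append(f"src{index}: 16 bits {bits}")
    return descriptions
```

Calling the function with no arguments reflects the defaults: every source operand is
treated as a 32-bit value.
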
1042 abs
1043 ~~~
1044
1045 See the description :ref:`here<amdgpu_synid_abs>`.
1046
1047 neg
1048 ~~~
1049
1050 See the description :ref:`here<amdgpu_synid_neg>`.
1051
1052 clamp
1053 ~~~~~
1054
1055 See the description :ref:`here<amdgpu_synid_clamp>`.