llvm/docs/AMDGPUModifierSyntax.rst

   1 ======================================
   2 Syntax of AMDGPU Instruction Modifiers
   3 ======================================
   4
   5 .. contents::
   6    :local:
   7
   8 Conventions
   9 ===========
  10
  11 The following notation is used throughout this document:
  12
  13     =================== =============================================================
  14     Notation            Description
  15     =================== =============================================================
  16     {0..N}              Any integer value in the range from 0 to N (inclusive).
  17     <x>                 Syntax and meaning of *x* are explained elsewhere.
  18     =================== =============================================================
  19
  20 .. _amdgpu_syn_modifiers:
  21
  22 Modifiers
  23 =========
  24
  25 DS Modifiers
  26 ------------
  27
  28 .. _amdgpu_synid_ds_offset80:
  29
  30 offset0
  31 ~~~~~~~
  32
  33 Specifies the first 8-bit offset, in bytes. The default value is 0.
  34
  35 Used with DS instructions that expect two addresses.
  36
  37     =================== ====================================================================
  38     Syntax              Description
  39     =================== ====================================================================
  40     offset0:{0..0xFF}   Specifies an unsigned 8-bit offset as a non-negative
  41                         :ref:`integer number <amdgpu_synid_integer_number>`
  42                         or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
  43     =================== ====================================================================
  44
  45 Examples:
  46
  47 .. parsed-literal::
  48
  49   offset0:0xff
  50   offset0:2-x
  51   offset0:-x-y
  52
  53 .. _amdgpu_synid_ds_offset81:
  54
  55 offset1
  56 ~~~~~~~
  57
  58 Specifies the second 8-bit offset, in bytes. The default value is 0.
  59
  60 Used with DS instructions that expect two addresses.
  61
  62     =================== ====================================================================
  63     Syntax              Description
  64     =================== ====================================================================
  65     offset1:{0..0xFF}   Specifies an unsigned 8-bit offset as a non-negative
  66                         :ref:`integer number <amdgpu_synid_integer_number>`
  67                         or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
  68     =================== ====================================================================
  69
  70 Examples:
  71
  72 .. parsed-literal::
  73
  74   offset1:0xff
  75   offset1:2-x
  76   offset1:-x-y
  77
  78 .. _amdgpu_synid_ds_offset16:
  79
  80 offset
  81 ~~~~~~
  82
  83 Specifies a 16-bit offset, in bytes. The default value is 0.
  84
  85 Used with DS instructions that expect a single address.
  86
  87     ==================== ====================================================================
  88     Syntax               Description
  89     ==================== ====================================================================
  90     offset:{0..0xFFFF}   Specifies an unsigned 16-bit offset as a non-negative
  91                          :ref:`integer number <amdgpu_synid_integer_number>`
  92                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
  93     ==================== ====================================================================
  94
  95 Examples:
  96
  97 .. parsed-literal::
  98
  99   offset:65535
 100   offset:0xffff
 101   offset:-x-y
 102
 103 .. _amdgpu_synid_sw_offset16:
 104
 105 swizzle pattern
 106 ~~~~~~~~~~~~~~~
 107
 108 This is a special modifier that may be used with the *ds_swizzle_b32* instruction only.
 109 It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
 110
 111     ======================================================= ===========================================================
 112     Syntax                                                  Description
 113     ======================================================= ===========================================================
 114     offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern.
 115     offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern.
 116
 117                                                             Each number is a lane *id*.
 118     offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern.
 119
 120                                                             The pattern converts a 5-bit lane *id* to another
 121                                                             lane *id* with which the lane interacts.
 122
 123                                                             The *mask* is a 5-character sequence which
 124                                                             specifies how to transform the bits of the
 125                                                             lane *id*.
 126
 127                                                             The following characters are allowed:
 128
 129                                                             * "0" - set bit to 0.
 130
 131                                                             * "1" - set bit to 1.
 132
 133                                                             * "p" - preserve bit.
 134
 135                                                             * "i" - invert bit.
 136
 137     offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.
 138
 139                                                             Broadcasts the value of any particular lane to
 140                                                             all lanes in its group.
 141
 142                                                             The first numeric parameter is a group
 143                                                             size and must be equal to 2, 4, 8, 16 or 32.
 144
 145                                                             The second numeric parameter is an index of the
 146                                                             lane being broadcast.
 147
 148                                                             The index must be less than the group size.
 149     offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.
 150
 151                                                             Swaps the neighboring groups of
 152                                                             1, 2, 4, 8 or 16 lanes.
 153     offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode.
 154
 155                                                             Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
 156     ======================================================= ===========================================================
 157
 158 Note: numeric values may be specified as either
 159 :ref:`integer numbers<amdgpu_synid_integer_number>` or
 160 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
 161
 162 Examples:
 163
 164 .. parsed-literal::
 165
 166   offset:255
 167   offset:0xffff
 168   offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
 169   offset:swizzle(BITMASK_PERM, "01pi0")
 170   offset:swizzle(BROADCAST, 2, 0)
 171   offset:swizzle(SWAP, 8)
 172   offset:swizzle(REVERSE, 30 + 2)
 173
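The symbolic modes in the table above can be modeled as simple functions on
lane *id*\ s. The following Python sketch is illustrative only and is not taken
from the assembler. The function names are hypothetical, and the sketch assumes
that the first character of the BITMASK_PERM mask applies to the most
significant of the five lane *id* bits, which this section does not state.
Each function returns the lane *id* with which the given lane interacts.

```python
def bitmask_perm(lane, mask):
    """Transform a 5-bit lane id using a 5-character mask of 0, 1, p, i."""
    assert 0 <= lane < 32 and len(mask) == 5
    result = 0
    for pos, ch in enumerate(mask):
        bit = 4 - pos  # assumption: first character is the most significant bit
        src = (lane >> bit) & 1
        if ch == "0":
            out = 0          # set bit to 0
        elif ch == "1":
            out = 1          # set bit to 1
        elif ch == "p":
            out = src        # preserve bit
        elif ch == "i":
            out = src ^ 1    # invert bit
        else:
            raise ValueError(f"invalid mask character: {ch!r}")
        result |= out << bit
    return result


def broadcast(lane, group_size, index):
    """Every lane interacts with lane `index` of its own group."""
    assert group_size in (2, 4, 8, 16, 32) and 0 <= index < group_size
    return (lane // group_size) * group_size + index


def swap(lane, group_size):
    """Neighboring groups of `group_size` lanes exchange places."""
    assert group_size in (1, 2, 4, 8, 16)
    return lane ^ group_size


def reverse(lane, group_size):
    """Lane order is reversed within each group of `group_size` lanes."""
    assert group_size in (2, 4, 8, 16, 32)
    return lane ^ (group_size - 1)
```

Under these assumptions, ``bitmask_perm(0b10110, "01pi0")`` yields ``0b01100``,
and ``reverse(5, 4)`` yields 6.
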
 174 .. _amdgpu_synid_gds:
 175
 176 gds
 177 ~~~
 178
 179 Specifies whether to use GDS or LDS memory (LDS is the default).
 180
 181     ======================================== ================================================
 182     Syntax                                   Description
 183     ======================================== ================================================
 184     gds                                      Use GDS memory.
 185     ======================================== ================================================
 186
 187
 188 EXP Modifiers
 189 -------------
 190
 191 .. _amdgpu_synid_done:
 192
 193 done
 194 ~~~~
 195
 196 Specifies if this is the last export from the shader to the target. By default,
 197 an *export* instruction does not finish an export sequence.
 198
 199     ======================================== ================================================
 200     Syntax                                   Description
 201     ======================================== ================================================
 202     done                                     Indicates the last export operation.
 203     ======================================== ================================================
 204
 205 .. _amdgpu_synid_compr:
 206
 207 compr
 208 ~~~~~
 209
 210 Indicates if the data is compressed (data is not compressed by default).
 211
 212     ======================================== ================================================
 213     Syntax                                   Description
 214     ======================================== ================================================
 215     compr                                    Data is compressed.
 216     ======================================== ================================================
 217
 218 .. _amdgpu_synid_vm:
 219
 220 vm
 221 ~~
 222
 223 Specifies if the :ref:`exec<amdgpu_synid_exec>` mask is valid for this *export* instruction
 224 (the mask is not valid by default).
 225
 226     ======================================== ================================================
 227     Syntax                                   Description
 228     ======================================== ================================================
 229     vm                                       Set the flag indicating a valid
 230                                              :ref:`exec<amdgpu_synid_exec>` mask.
 231     ======================================== ================================================
 232
 233 .. _amdgpu_synid_row_en:
 234
 235 row_en
 236 ~~~~~~
 237
 238 Specifies whether to export one row or multiple rows of data.
 239
 240     ======================================== ================================================
 241     Syntax                                   Description
 242     ======================================== ================================================
 243     row_en                                   Export multiple rows using row index from M0.
 244     ======================================== ================================================
 245
 246 FLAT Modifiers
 247 --------------
 248
 249 .. _amdgpu_synid_flat_offset12:
 250
 251 offset12
 252 ~~~~~~~~
 253
 254 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
 255
 256     ================= ====================================================================
 257     Syntax            Description
 258     ================= ====================================================================
 259     offset:{0..4095}  Specifies a 12-bit unsigned offset as a non-negative
 260                       :ref:`integer number <amdgpu_synid_integer_number>`
 261                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 262     ================= ====================================================================
 263
 264 Examples:
 265
 266 .. parsed-literal::
 267
 268   offset:4095
 269   offset:x-0xff
 270
 271 .. _amdgpu_synid_flat_offset13s:
 272
 273 offset13s
 274 ~~~~~~~~~
 275
 276 Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
 277
 278     ===================== ====================================================================
 279     Syntax                Description
 280     ===================== ====================================================================
 281     offset:{-4096..4095}  Specifies a 13-bit signed offset as an
 282                           :ref:`integer number <amdgpu_synid_integer_number>`
 283                           or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 284     ===================== ====================================================================
 285
 286 Examples:
 287
 288 .. parsed-literal::
 289
 290   offset:-4000
 291   offset:0x10
 292   offset:-x
 293
 294 .. _amdgpu_synid_flat_offset12s:
 295
 296 offset12s
 297 ~~~~~~~~~
 298
 299 Specifies an immediate signed 12-bit offset, in bytes. The default value is 0.
 300
 301     ===================== ====================================================================
 302     Syntax                Description
 303     ===================== ====================================================================
 304     offset:{-2048..2047}  Specifies a 12-bit signed offset as an
 305                           :ref:`integer number <amdgpu_synid_integer_number>`
 306                           or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 307     ===================== ====================================================================
 308
 309 Examples:
 310
 311 .. parsed-literal::
 312
 313   offset:-2000
 314   offset:0x10
 315   offset:-x+y
 316
 317 .. _amdgpu_synid_flat_offset11:
 318
 319 offset11
 320 ~~~~~~~~
 321
 322 Specifies an immediate unsigned 11-bit offset, in bytes. The default value is 0.
 323
 324     ================= ====================================================================
 325     Syntax            Description
 326     ================= ====================================================================
 327     offset:{0..2047}  Specifies an 11-bit unsigned offset as a non-negative
 328                       :ref:`integer number <amdgpu_synid_integer_number>`
 329                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 330     ================= ====================================================================
 331
 332 Examples:
 333
 334 .. parsed-literal::
 335
 336   offset:2047
 337   offset:x+0xff
 338
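The ranges listed for the four FLAT offset variants follow directly from the
field width and signedness. A minimal Python sketch that reproduces them is
shown here. The helper names are hypothetical and are not part of the
assembler.

```python
def offset_range(bits, signed):
    """Return the inclusive (low, high) range of an immediate offset field."""
    if signed:
        # Two's complement: one bit is spent on the sign.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def check_offset(value, bits, signed):
    """Return True if `value` fits an offset field of the given width."""
    low, high = offset_range(bits, signed)
    return low <= value <= high
```

For example, ``offset_range(13, True)`` gives ``(-4096, 4095)``, which matches
the range documented for offset13s.
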
 339 dlc
 340 ~~~
 341
 342 See a description :ref:`here<amdgpu_synid_dlc>`.
 343
 344 glc
 345 ~~~
 346
 347 See a description :ref:`here<amdgpu_synid_glc>`.
 348
 349 lds
 350 ~~~
 351
 352 See a description :ref:`here<amdgpu_synid_lds>`.
 353
 354 slc
 355 ~~~
 356
 357 See a description :ref:`here<amdgpu_synid_slc>`.
 358
 359 tfe
 360 ~~~
 361
 362 See a description :ref:`here<amdgpu_synid_tfe>`.
 363
 364 nv
 365 ~~
 366
 367 See a description :ref:`here<amdgpu_synid_nv>`.
 368
 369 sc0
 370 ~~~
 371
 372 See a description :ref:`here<amdgpu_synid_sc0>`.
 373
 374 sc1
 375 ~~~
 376
 377 See a description :ref:`here<amdgpu_synid_sc1>`.
 378
 379 nt
 380 ~~
 381
 382 See a description :ref:`here<amdgpu_synid_nt>`.
 383
 384 MIMG Modifiers
 385 --------------
 386
 387 .. _amdgpu_synid_dmask:
 388
 389 dmask
 390 ~~~~~
 391
 392 Specifies which channels (image components) are used by the operation.
 393 By default, no channels are used.
 394
 395     =============== ====================================================================
 396     Syntax          Description
 397     =============== ====================================================================
 398     dmask:{0..15}   Specifies image channels as a non-negative
 399                     :ref:`integer number <amdgpu_synid_integer_number>`
 400                     or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 401
 402                     Each bit corresponds to one of 4 image components (RGBA).
 403
 404                     If the specified bit value is 0, the image component is not used,
 405                     while value 1 means that the component is used.
 406     =============== ====================================================================
 407
 408 This modifier has some limitations depending on the instruction kind:
 409
 410     =================================================== ========================
 411     Instruction Kind                                    Valid dmask Values
 412     =================================================== ========================
 413     32-bit atomic *cmpswap*                             0x3
 414     32-bit atomic instructions except for *cmpswap*     0x1
 415     64-bit atomic *cmpswap*                             0xF
 416     64-bit atomic instructions except for *cmpswap*     0x3
 417     *gather4*                                           0x1, 0x2, 0x4, 0x8
 418     GFX11+ *msaa_load*                                  0x1, 0x2, 0x4, 0x8
 419     Other instructions                                  any value
 420     =================================================== ========================
 421
 422 Examples:
 423
 424 .. parsed-literal::
 425
 426   dmask:0xf
 427   dmask:0b1111
 428   dmask:x|y|z
 429
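The bit-to-channel mapping and the per-instruction restrictions above can be
checked mechanically. The Python sketch below is a rough illustration, not the
assembler's implementation. It assumes that bit 0 selects R and bit 3 selects
A, and the instruction kind labels used as dictionary keys are hypothetical.

```python
CHANNELS = "RGBA"  # assumption: bit 0 selects R, bit 3 selects A

# Valid dmask values per instruction kind, copied from the table above.
# The keys are hypothetical labels chosen for this sketch.
VALID_DMASK = {
    "atomic32_cmpswap": {0x3},
    "atomic32": {0x1},
    "atomic64_cmpswap": {0xF},
    "atomic64": {0x3},
    "gather4": {0x1, 0x2, 0x4, 0x8},
    "msaa_load": {0x1, 0x2, 0x4, 0x8},
}


def dmask_channels(dmask):
    """Return the names of the channels enabled by a 4-bit dmask."""
    assert 0 <= dmask <= 15
    return "".join(c for i, c in enumerate(CHANNELS) if dmask & (1 << i))


def dmask_is_valid(kind, dmask):
    """Check a dmask value against the restrictions for `kind`."""
    if not 0 <= dmask <= 15:
        return False
    allowed = VALID_DMASK.get(kind)  # None means any value is accepted
    return allowed is None or dmask in allowed
```

With these definitions, ``dmask_channels(0b0101)`` returns ``"RB"``, and
``dmask_is_valid("gather4", 0x3)`` returns ``False`` because *gather4* accepts
only a single channel.
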
 430 .. _amdgpu_synid_unorm:
 431
 432 unorm
 433 ~~~~~
 434
 435 Specifies whether the address is normalized or not (the address is normalized by default).
 436
 437     ======================== ========================================
 438     Syntax                   Description
 439     ======================== ========================================
 440     unorm                    Force the address to be not normalized.
 441     ======================== ========================================
 442
 443 glc
 444 ~~~
 445
 446 See a description :ref:`here<amdgpu_synid_glc>`.
 447
 448 slc
 449 ~~~
 450
 451 See a description :ref:`here<amdgpu_synid_slc>`.
 452
 453 .. _amdgpu_synid_r128:
 454
 455 r128
 456 ~~~~
 457
 458 Specifies texture resource size. The default size is 256 bits.
 459
 460     =================== ================================================
 461     Syntax              Description
 462     =================== ================================================
 463     r128                Specifies a 128-bit texture resource size.
 464     =================== ================================================
 465
 466 .. WARNING:: Using this modifier should decrease the *rsrc* operand size from 8 to 4 dwords,
 467              but the assembler does not currently support this feature.
 468
 469 tfe
 470 ~~~
 471
 472 See a description :ref:`here<amdgpu_synid_tfe>`.
 473
 474 .. _amdgpu_synid_lwe:
 475
 476 lwe
 477 ~~~
 478
 479 Specifies LOD warning status (LOD warning is disabled by default).
 480
 481     ======================================== ================================================
 482     Syntax                                   Description
 483     ======================================== ================================================
 484     lwe                                      Enables LOD warning.
 485     ======================================== ================================================
 486
 487 .. _amdgpu_synid_da:
 488
 489 da
 490 ~~
 491
 492 Specifies if an array index must be sent to TA. By default, the array index is not sent.
 493
 494     ======================================== ================================================
 495     Syntax                                   Description
 496     ======================================== ================================================
 497     da                                       Send an array index to TA.
 498     ======================================== ================================================
 499
 500 .. _amdgpu_synid_d16:
 501
 502 d16
 503 ~~~
 504
 505 Specifies data size: 16 or 32 bits (32 bits by default).
 506
 507     ======================================== ================================================
 508     Syntax                                   Description
 509     ======================================== ================================================
 510     d16                                      Enables 16-bit data mode.
 511
 512                                              On loads, convert data in memory to 16-bit
 513                                              format before storing it in VGPRs.
 514
 515                                              For stores, convert 16-bit data in VGPRs to
 516                                              32 bits before writing the values to memory.
 517
 518                                              Note that GFX8.0 does not support data packing.
 519                                              Each 16-bit data element occupies 1 VGPR.
 520
 521                                              GFX8.1 and GFX9+ support data packing.
 522                                              Each pair of 16-bit data elements
 523                                              occupies 1 VGPR.
 524     ======================================== ================================================
 525
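The difference between packed and unpacked *d16* data can be expressed as a
VGPR count. The following Python sketch is only an illustration. The function
name is hypothetical, and it assumes that an odd trailing element still
occupies a whole VGPR in packed mode.

```python
def d16_data_vgprs(num_elements, packed):
    """Return the number of VGPRs holding `num_elements` 16-bit elements."""
    if packed:
        # Each pair of elements shares one VGPR; an odd trailing element
        # is assumed to occupy a VGPR of its own.
        return (num_elements + 1) // 2
    # Without packing, each element occupies one VGPR.
    return num_elements
```

For four data elements this gives 4 VGPRs without packing and 2 VGPRs with
packing.
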
 526 .. _amdgpu_synid_a16:
 527
 528 a16
 529 ~~~
 530
 531 Specifies the size of image address components: 16 or 32 bits (32 bits by default).
 532
 533     ======================================== ================================================
 534     Syntax                                   Description
 535     ======================================== ================================================
 536     a16                                      Enables 16-bit image address components.
 537     ======================================== ================================================
 538
 539 .. _amdgpu_synid_dim:
 540
 541 dim
 542 ~~~
 543
 544 Specifies surface dimension. This is a mandatory modifier. There is no default value.
 545
 546     =============================== =========================================================
 547     Syntax                          Description
 548     =============================== =========================================================
 549     dim:1D                          One-dimensional image.
 550     dim:2D                          Two-dimensional image.
 551     dim:3D                          Three-dimensional image.
 552     dim:CUBE                        Cubemap array.
 553     dim:1D_ARRAY                    One-dimensional image array.
 554     dim:2D_ARRAY                    Two-dimensional image array.
 555     dim:2D_MSAA                     Two-dimensional multi-sample anti-aliasing image.
 556     dim:2D_MSAA_ARRAY               Two-dimensional multi-sample anti-aliasing image array.
 557     =============================== =========================================================
 558
 559 The following table defines an alternative syntax which is supported
 560 for compatibility with SP3 assembler:
 561
 562     =============================== =========================================================
 563     Syntax                          Description
 564     =============================== =========================================================
 565     dim:SQ_RSRC_IMG_1D              One-dimensional image.
 566     dim:SQ_RSRC_IMG_2D              Two-dimensional image.
 567     dim:SQ_RSRC_IMG_3D              Three-dimensional image.
 568     dim:SQ_RSRC_IMG_CUBE            Cubemap array.
 569     dim:SQ_RSRC_IMG_1D_ARRAY        One-dimensional image array.
 570     dim:SQ_RSRC_IMG_2D_ARRAY        Two-dimensional image array.
 571     dim:SQ_RSRC_IMG_2D_MSAA         Two-dimensional multi-sample anti-aliasing image.
 572     dim:SQ_RSRC_IMG_2D_MSAA_ARRAY   Two-dimensional multi-sample anti-aliasing image array.
 573     =============================== =========================================================
 574
 575 dlc
 576 ~~~
 577
 578 See a description :ref:`here<amdgpu_synid_dlc>`.
 579
 580 Miscellaneous Modifiers
 581 -----------------------
 582
 583 .. _amdgpu_synid_dlc:
 584
 585 dlc
 586 ~~~
 587
 588 Controls device level cache policy for memory operations. Used for synchronization.
 589 When specified, forces the operation to bypass the device level cache, making it device
 590 level coherent. By default, instructions use the device level cache.
 591
 592     ======================================== ================================================
 593     Syntax                                   Description
 594     ======================================== ================================================
 595     dlc                                      Bypass device level cache.
 596     ======================================== ================================================
 597
 598 .. _amdgpu_synid_glc:
 599
 600 glc
 601 ~~~
 602
 603 For atomic opcodes, this modifier indicates that the instruction returns the value from memory
 604 before the operation. For other opcodes, it is used together with :ref:`slc<amdgpu_synid_slc>`
 605 to specify cache policy.
 606
 607 The default value is off (0).
 608
 609     ======================================== ================================================
 610     Syntax                                   Description
 611     ======================================== ================================================
 612     glc                                      Set glc bit to 1.
 613     ======================================== ================================================
 614
 615 .. _amdgpu_synid_lds:
 616
 617 lds
 618 ~~~
 619
 620 Specifies where to store the result: VGPRs or LDS (VGPRs by default).
 621
 622     ======================================== ===========================
 623     Syntax                                   Description
 624     ======================================== ===========================
 625     lds                                      Store the result in LDS.
 626     ======================================== ===========================
 627
 628 .. _amdgpu_synid_nv:
 629
 630 nv
 631 ~~
 632
 633 Specifies if the instruction is operating on non-volatile memory.
 634 By default, memory is volatile.
 635
 636     ======================================== ================================================
 637     Syntax                                   Description
 638     ======================================== ================================================
 639     nv                                       Indicates that the instruction operates on
 640                                              non-volatile memory.
 641     ======================================== ================================================
 642
 643 .. _amdgpu_synid_slc:
 644
 645 slc
 646 ~~~
 647
 648 Controls behavior of L2 cache. The default value is off (0).
 649
 650     ======================================== ================================================
 651     Syntax                                   Description
 652     ======================================== ================================================
 653     slc                                      Set slc bit to 1.
 654     ======================================== ================================================
 655
 656 .. _amdgpu_synid_tfe:
 657
 658 tfe
 659 ~~~
 660
 661 Controls access to partially resident textures. The default value is off (0).
 662
 663     ======================================== ================================================
 664     Syntax                                   Description
 665     ======================================== ================================================
 666     tfe                                      Set tfe bit to 1.
 667     ======================================== ================================================
 668
 669 .. _amdgpu_synid_sc0:
 670
 671 sc0
 672 ~~~
 673
 674 For atomic opcodes, this modifier indicates that the instruction returns the value from memory
 675 before the operation. For other opcodes, it is used together with :ref:`sc1<amdgpu_synid_sc1>`
 676 to specify cache policy.
 677
 678     ======================================== ================================================
 679     Syntax                                   Description
 680     ======================================== ================================================
 681     sc0                                      Set sc0 bit to 1.
 682     ======================================== ================================================
 683
 684 .. _amdgpu_synid_sc1:
 685
 686 sc1
 687 ~~~
 688
 689 This modifier is used together with :ref:`sc0<amdgpu_synid_sc0>` to specify cache
 690 policy.
 691
 692     ======================================== ================================================
 693     Syntax                                   Description
 694     ======================================== ================================================
 695     sc1                                      Set sc1 bit to 1.
 696     ======================================== ================================================
 697
 698 .. _amdgpu_synid_nt:
 699
 700 nt
 701 ~~
 702
 703 Indicates an operation with non-temporal data.
 704
 705     ======================================== ================================================
 706     Syntax                                   Description
 707     ======================================== ================================================
 708     nt                                       Set nt bit to 1.
 709     ======================================== ================================================
 710
 711 MUBUF/MTBUF Modifiers
 712 ---------------------
 713
 714 .. _amdgpu_synid_idxen:
 715
 716 idxen
 717 ~~~~~
 718
 719 Specifies whether address components include an index. By default, the index is not used.
 720
 721 May be used together with :ref:`offen<amdgpu_synid_offen>`.
 722
 723 Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 724
 725     ======================================== ================================================
 726     Syntax                                   Description
 727     ======================================== ================================================
 728     idxen                                    Address components include an index.
 729     ======================================== ================================================
 730
 731 .. _amdgpu_synid_offen:
 732
 733 offen
 734 ~~~~~
 735
 736 Specifies whether address components include an offset. By default, the offset is not used.
 737
 738 May be used together with :ref:`idxen<amdgpu_synid_idxen>`.
 739
 740 Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 741
 742     ======================================== ================================================
 743     Syntax                                   Description
 744     ======================================== ================================================
 745     offen                                    Address components include an offset.
 746     ======================================== ================================================
 747
 748 .. _amdgpu_synid_addr64:
 749
 750 addr64
 751 ~~~~~~
 752
 753 Specifies whether a 64-bit address is used. By default, no address is used.
 754
 755 Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
 756 :ref:`idxen<amdgpu_synid_idxen>` modifiers.
 757
 758     ======================================== ================================================
 759     Syntax                                   Description
 760     ======================================== ================================================
 761     addr64                                   A 64-bit address is used.
 762     ======================================== ================================================
 763
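The compatibility rules for *idxen*, *offen* and *addr64* stated above amount
to a small consistency check. The Python sketch below illustrates it. The
function name and the error strings are hypothetical and do not reflect the
assembler's actual diagnostics.

```python
def check_mubuf_addressing(modifiers):
    """Return a list of conflicts among idxen, offen and addr64."""
    mods = set(modifiers)
    errors = []
    if "addr64" in mods:
        # addr64 excludes both idxen and offen; idxen and offen may coexist.
        for other in ("idxen", "offen"):
            if other in mods:
                errors.append(f"addr64 cannot be used with {other}")
    return errors
```

An empty list means that the combination is allowed, as is the case for
``["idxen", "offen"]``.
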
 764 .. _amdgpu_synid_buf_offset12:
 765
 766 offset12
 767 ~~~~~~~~
 768
 769 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
 770
 771     ================== ====================================================================
 772     Syntax             Description
 773     ================== ====================================================================
 774     offset:{0..0xFFF}  Specifies a 12-bit unsigned offset as a non-negative
 775                        :ref:`integer number <amdgpu_synid_integer_number>`
 776                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 777     ================== ====================================================================
 778
 779 Examples:
 780
 781 .. parsed-literal::
 782
 783   offset:x+y
 784   offset:0x10
 785
 786 glc
 787 ~~~
 788
 789 See a description :ref:`here<amdgpu_synid_glc>`.
 790
 791 slc
 792 ~~~
 793
 794 See a description :ref:`here<amdgpu_synid_slc>`.
 795
 796 lds
 797 ~~~
 798
 799 See a description :ref:`here<amdgpu_synid_lds>`.
 800
 801 dlc
 802 ~~~
 803
 804 See a description :ref:`here<amdgpu_synid_dlc>`.
 805
 806 tfe
 807 ~~~
 808
 809 See a description :ref:`here<amdgpu_synid_tfe>`.
 810
 811 .. _amdgpu_synid_fmt:
 812
 813 fmt
 814 ~~~
 815
 816 Specifies data and numeric formats used by the operation.
 817 The default numeric format is BUF_NUM_FORMAT_UNORM.
 818 The default data format is BUF_DATA_FORMAT_8.
 819
 820     ========================================= ===============================================================
 821     Syntax                                    Description
 822     ========================================= ===============================================================
 823     format:{0..127}                           Use a format specified as either an
 824                                               :ref:`integer number<amdgpu_synid_integer_number>` or an
 825                                               :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 826     format:[<data format>]                    Use the specified data format and
 827                                               default numeric format.
 828     format:[<numeric format>]                 Use the specified numeric format and
 829                                               default data format.
 830     format:[<data format>,<numeric format>]   Use the specified data and numeric formats.
 831     format:[<numeric format>,<data format>]   Use the specified data and numeric formats.
 832     ========================================= ===============================================================
 833
 834 .. _amdgpu_synid_format_data:
 835
 836 Supported data formats are defined in the following table:
 837
 838     ========================================= ===============================
 839     Syntax                                    Note
 840     ========================================= ===============================
 841     BUF_DATA_FORMAT_INVALID
 842     BUF_DATA_FORMAT_8                         The default value.
 843     BUF_DATA_FORMAT_16
 844     BUF_DATA_FORMAT_8_8
 845     BUF_DATA_FORMAT_32
 846     BUF_DATA_FORMAT_16_16
 847     BUF_DATA_FORMAT_10_11_11
 848     BUF_DATA_FORMAT_11_11_10
 849     BUF_DATA_FORMAT_10_10_10_2
 850     BUF_DATA_FORMAT_2_10_10_10
 851     BUF_DATA_FORMAT_8_8_8_8
 852     BUF_DATA_FORMAT_32_32
 853     BUF_DATA_FORMAT_16_16_16_16
 854     BUF_DATA_FORMAT_32_32_32
 855     BUF_DATA_FORMAT_32_32_32_32
 856     BUF_DATA_FORMAT_RESERVED_15
 857     ========================================= ===============================
 858
 859 .. _amdgpu_synid_format_num:
 860
 861 Supported numeric formats are defined below:
 862
 863     ========================================= ===============================
 864     Syntax                                    Note
 865     ========================================= ===============================
 866     BUF_NUM_FORMAT_UNORM                      The default value.
 867     BUF_NUM_FORMAT_SNORM
 868     BUF_NUM_FORMAT_USCALED
 869     BUF_NUM_FORMAT_SSCALED
 870     BUF_NUM_FORMAT_UINT
 871     BUF_NUM_FORMAT_SINT
 872     BUF_NUM_FORMAT_SNORM_OGL                  GFX7 only.
 873     BUF_NUM_FORMAT_RESERVED_6                 GFX8 and GFX9 only.
 874     BUF_NUM_FORMAT_FLOAT
 875     ========================================= ===============================
 876
 877 Examples:
 878
 879 .. parsed-literal::
 880
 881   format:0
 882   format:127
 883   format:[BUF_DATA_FORMAT_16]
 884   format:[BUF_DATA_FORMAT_16,BUF_NUM_FORMAT_SSCALED]
 885   format:[BUF_NUM_FORMAT_FLOAT]
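
The bracketed forms can be resolved mechanically: each name is classified by its
prefix, and whichever format is omitted falls back to the default stated above.
The following Python sketch illustrates this. ``resolve_format`` is a hypothetical
helper written for this example, not part of the assembler, and it does not check
names against the tables of supported formats.

```python
DEFAULT_DATA = "BUF_DATA_FORMAT_8"
DEFAULT_NUM = "BUF_NUM_FORMAT_UNORM"


def resolve_format(modifier):
    """Return (data_format, numeric_format) for a 'format:[...]' modifier.

    Hypothetical helper: it only classifies names by prefix and fills in
    the documented defaults for whichever format is missing.
    """
    prefix = "format:["
    if not (modifier.startswith(prefix) and modifier.endswith("]")):
        raise ValueError("expected format:[...]")
    names = [n.strip() for n in modifier[len(prefix):-1].split(",")]
    if not 1 <= len(names) <= 2:
        raise ValueError("expected one or two format names")
    data, num = None, None
    for name in names:
        if name.startswith("BUF_DATA_FORMAT_") and data is None:
            data = name
        elif name.startswith("BUF_NUM_FORMAT_") and num is None:
            num = name
        else:
            raise ValueError(f"unexpected or repeated format name: {name!r}")
    # Either order is accepted; a missing format takes its default.
    return (data or DEFAULT_DATA, num or DEFAULT_NUM)
```

With this sketch, ``format:[BUF_NUM_FORMAT_FLOAT]`` resolves to the pair
``BUF_DATA_FORMAT_8`` and ``BUF_NUM_FORMAT_FLOAT``.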
 886
 887 .. _amdgpu_synid_ufmt:
 888
 889 ufmt
 890 ~~~~
 891
 892 Specifies a unified format used by the operation.
 893 The default format is BUF_FMT_8_UNORM.
 894
 895     ========================================= ===============================================================
 896     Syntax                                    Description
 897     ========================================= ===============================================================
 898     format:{0..127}                           Use a unified format specified as either an
 899                                               :ref:`integer number<amdgpu_synid_integer_number>` or an
 900                                               :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 901                                               Note that unified format numbers are incompatible with
 902                                               format numbers used for pre-GFX10 ISA.
 903     format:[<unified format>]                 Use the specified unified format.
 904     ========================================= ===============================================================
 905
 906 Unified format is a replacement for :ref:`data<amdgpu_synid_format_data>`
 907 and :ref:`numeric<amdgpu_synid_format_num>` formats. For compatibility with older ISA,
 908 :ref:`the syntax with data and numeric formats<amdgpu_synid_fmt>` is still accepted
 909 provided that the combination of formats can be mapped to a unified format.
 910
 911 Supported unified formats and equivalent combinations of data and numeric formats
 912 are defined below:
 913
 914     ============================== ============================== ============================= ============
 915     Unified Format Syntax          Equivalent Data Format         Equivalent Numeric Format     Note
 916     ============================== ============================== ============================= ============
 917     BUF_FMT_INVALID                BUF_DATA_FORMAT_INVALID        BUF_NUM_FORMAT_UNORM
 918
 919     BUF_FMT_8_UNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UNORM
 920     BUF_FMT_8_SNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SNORM
 921     BUF_FMT_8_USCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_USCALED
 922     BUF_FMT_8_SSCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SSCALED
 923     BUF_FMT_8_UINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UINT
 924     BUF_FMT_8_SINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SINT
 925
 926     BUF_FMT_16_UNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UNORM
 927     BUF_FMT_16_SNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SNORM
 928     BUF_FMT_16_USCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_USCALED
 929     BUF_FMT_16_SSCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SSCALED
 930     BUF_FMT_16_UINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UINT
 931     BUF_FMT_16_SINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SINT
 932     BUF_FMT_16_FLOAT               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_FLOAT
 933
 934     BUF_FMT_8_8_UNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UNORM
 935     BUF_FMT_8_8_SNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SNORM
 936     BUF_FMT_8_8_USCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_USCALED
 937     BUF_FMT_8_8_SSCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SSCALED
 938     BUF_FMT_8_8_UINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UINT
 939     BUF_FMT_8_8_SINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SINT
 940
 941     BUF_FMT_32_UINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_UINT
 942     BUF_FMT_32_SINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_SINT
 943     BUF_FMT_32_FLOAT               BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_FLOAT
 944
 945     BUF_FMT_16_16_UNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UNORM
 946     BUF_FMT_16_16_SNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SNORM
 947     BUF_FMT_16_16_USCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_USCALED
 948     BUF_FMT_16_16_SSCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SSCALED
 949     BUF_FMT_16_16_UINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UINT
 950     BUF_FMT_16_16_SINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SINT
 951     BUF_FMT_16_16_FLOAT            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_FLOAT
 952
 953     BUF_FMT_10_11_11_UNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UNORM          GFX10 only
 954     BUF_FMT_10_11_11_SNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SNORM          GFX10 only
 955     BUF_FMT_10_11_11_USCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_USCALED        GFX10 only
 956     BUF_FMT_10_11_11_SSCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SSCALED        GFX10 only
 957     BUF_FMT_10_11_11_UINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UINT           GFX10 only
 958     BUF_FMT_10_11_11_SINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SINT           GFX10 only
 959     BUF_FMT_10_11_11_FLOAT         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_FLOAT
 960
 961     BUF_FMT_11_11_10_UNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UNORM          GFX10 only
 962     BUF_FMT_11_11_10_SNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SNORM          GFX10 only
 963     BUF_FMT_11_11_10_USCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_USCALED        GFX10 only
 964     BUF_FMT_11_11_10_SSCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SSCALED        GFX10 only
 965     BUF_FMT_11_11_10_UINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UINT           GFX10 only
 966     BUF_FMT_11_11_10_SINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SINT           GFX10 only
 967     BUF_FMT_11_11_10_FLOAT         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_FLOAT
 968
 969     BUF_FMT_10_10_10_2_UNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UNORM
 970     BUF_FMT_10_10_10_2_SNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SNORM
 971     BUF_FMT_10_10_10_2_USCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_USCALED        GFX10 only
 972     BUF_FMT_10_10_10_2_SSCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SSCALED        GFX10 only
 973     BUF_FMT_10_10_10_2_UINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UINT
 974     BUF_FMT_10_10_10_2_SINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SINT
 975
 976     BUF_FMT_2_10_10_10_UNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UNORM
 977     BUF_FMT_2_10_10_10_SNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SNORM
 978     BUF_FMT_2_10_10_10_USCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_USCALED
 979     BUF_FMT_2_10_10_10_SSCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SSCALED
 980     BUF_FMT_2_10_10_10_UINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UINT
 981     BUF_FMT_2_10_10_10_SINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SINT
 982
 983     BUF_FMT_8_8_8_8_UNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UNORM
 984     BUF_FMT_8_8_8_8_SNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SNORM
 985     BUF_FMT_8_8_8_8_USCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_USCALED
 986     BUF_FMT_8_8_8_8_SSCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SSCALED
 987     BUF_FMT_8_8_8_8_UINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UINT
 988     BUF_FMT_8_8_8_8_SINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SINT
 989
 990     BUF_FMT_32_32_UINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_UINT
 991     BUF_FMT_32_32_SINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_SINT
 992     BUF_FMT_32_32_FLOAT            BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_FLOAT
 993
 994     BUF_FMT_16_16_16_16_UNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UNORM
 995     BUF_FMT_16_16_16_16_SNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SNORM
 996     BUF_FMT_16_16_16_16_USCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_USCALED
 997     BUF_FMT_16_16_16_16_SSCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SSCALED
 998     BUF_FMT_16_16_16_16_UINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UINT
 999     BUF_FMT_16_16_16_16_SINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SINT
1000     BUF_FMT_16_16_16_16_FLOAT      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_FLOAT
1001
1002     BUF_FMT_32_32_32_UINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_UINT
1003     BUF_FMT_32_32_32_SINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_SINT
1004     BUF_FMT_32_32_32_FLOAT         BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_FLOAT
1005     BUF_FMT_32_32_32_32_UINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_UINT
1006     BUF_FMT_32_32_32_32_SINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_SINT
1007     BUF_FMT_32_32_32_32_FLOAT      BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_FLOAT
1008     ============================== ============================== ============================= ============
1009
1010 Examples:
1011
1012 .. parsed-literal::
1013
1014   format:0
1015   format:[BUF_FMT_32_UINT]
1016
1017 SMRD/SMEM Modifiers
1018 -------------------
1019
1020 glc
1021 ~~~
1022
1023 See a description :ref:`here<amdgpu_synid_glc>`.
1024
1025 nv
1026 ~~
1027
1028 See a description :ref:`here<amdgpu_synid_nv>`.
1029
1030 dlc
1031 ~~~
1032
1033 See a description :ref:`here<amdgpu_synid_dlc>`.
1034
1035 .. _amdgpu_synid_smem_offset20u:
1036
1037 offset20u
1038 ~~~~~~~~~
1039
1040 Specifies an unsigned 20-bit offset, in bytes. The default value is 0.
1041
1042     ==================== ====================================================================
1043     Syntax               Description
1044     ==================== ====================================================================
1045     offset:{0..0xFFFFF}  Specifies an offset as a non-negative
1046                          :ref:`integer number <amdgpu_synid_integer_number>`
1047                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1048     ==================== ====================================================================
1049
1050 Examples:
1051
1052 .. parsed-literal::
1053
1054   offset:1
1055   offset:0xfffff
1056   offset:x-y
1057
1058 .. _amdgpu_synid_smem_offset21s:
1059
1060 offset21s
1061 ~~~~~~~~~
1062
1063 Specifies a signed 21-bit offset, in bytes. The default value is 0.
1064
1065     ============================= ====================================================================
1066     Syntax                        Description
1067     ============================= ====================================================================
1068     offset:{-0x100000..0xFFFFF}   Specifies an offset as an
1069                                   :ref:`integer number <amdgpu_synid_integer_number>`
1070                                   or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1071     ============================= ====================================================================
1072
1073 Examples:
1074
1075 .. parsed-literal::
1076
1077   offset:-1
1078   offset:0xfffff
1079   offset:-x
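
The two SMEM offset ranges differ only in signedness. The Python sketch below checks
a value against each range and shows how a signed offset could be packed into a
21-bit field. The helper names are hypothetical, and the two's-complement packing is
an assumption made for illustration; this document does not describe the encoding.

```python
def fits_offset20u(value):
    """True if value is a valid unsigned 20-bit offset, i.e. in {0..0xFFFFF}."""
    return 0 <= value <= 0xFFFFF


def fits_offset21s(value):
    """True if value is a valid signed 21-bit offset, i.e. in {-0x100000..0xFFFFF}."""
    return -0x100000 <= value <= 0xFFFFF


def pack_offset21s(value):
    """Pack a signed offset into 21 bits (assumed two's-complement layout)."""
    if not fits_offset21s(value):
        raise ValueError("offset does not fit in 21 signed bits")
    return value & 0x1FFFFF
```

Note that ``offset:0xfffff`` is valid for both modifiers, whereas ``offset:-1`` is
valid only for the signed form.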
1080
1081 VINTRP/VINTERP/LDSDIR Modifiers
1082 -------------------------------
1083
1084 .. _amdgpu_synid_high:
1085
1086 high
1087 ~~~~
1088
1089 Specifies which half of the LDS word to use. The low half of the LDS word is used by default.
1090
1091     ======================================== ====================================
1092     Syntax                                   Description
1093     ======================================== ====================================
1094     high                                     Use the high half of the LDS word.
1095     ======================================== ====================================
1096
1097 neg
1098 ~~~
1099
1100 See a description :ref:`here<amdgpu_synid_neg>`.
1101
1102 .. _amdgpu_synid_wait_exp:
1103
1104 wait_exp
1105 ~~~~~~~~
1106
1107 Specifies a wait on the EXP counter before issuing the current instruction.
1108 The counter must be less than or equal to this value before the instruction is issued.
1109 If set to 7, no wait is performed.
1110
1111 The default value is zero. This is a safe value, but it may be suboptimal.
1112
1113     ================ ======================================================
1114     Syntax           Description
1115     ================ ======================================================
1116     wait_exp:{0..7}  An additional wait on the EXP counter before
1117                      issuing this instruction.
1118     ================ ======================================================
1119
1120 .. _amdgpu_synid_wait_vdst:
1121
1122 wait_vdst
1123 ~~~~~~~~~
1124
1125 Specifies a wait on the VA_VDST counter before issuing the current instruction.
1126 The counter must be less than or equal to this value before the instruction is issued.
1127 If set to 15, no wait is performed.
1128
1129 The default value is zero. This is a safe value, but it may be suboptimal.
1130
1131     ================== ======================================================
1132     Syntax             Description
1133     ================== ======================================================
1134     wait_vdst:{0..15}  An additional wait on the VA_VDST counter before
1135                        issuing this instruction.
1136     ================== ======================================================
1137
1138 DPP8 Modifiers
1139 --------------
1140
1141 .. _amdgpu_synid_dpp8_sel:
1142
1143 dpp8_sel
1144 ~~~~~~~~
1145
1146 Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
1147 There is no default value.
1148
1149 The *dpp8_sel* modifier must specify exactly 8 values.
1150 The first value selects the lane that supplies data to lane 0,
1151 the second value selects the source for lane 1, and so on.
1152
1153 Each value may be specified as either
1154 an :ref:`integer number<amdgpu_synid_integer_number>` or
1155 an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1156
1157     =============================================================== ===========================
1158     Syntax                                                          Description
1159     =============================================================== ===========================
1160     dpp8:[{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7}]  Select lanes to read from.
1161     =============================================================== ===========================
1162
1163 Examples:
1164
1165 .. parsed-literal::
1166
1167   dpp8:[7,6,5,4,3,2,1,0]
1168   dpp8:[0,1,0,1,0,1,0,1]
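
The selection rule can be modelled in a few lines of Python. This is a sketch for
illustration only: ``apply_dpp8`` is a hypothetical function, the wavefront is
represented as a plain list with one element per lane, and the exec mask and
the *fi* modifier are not modelled.

```python
def apply_dpp8(sel, lanes):
    """Return the per-lane values produced by dpp8:[sel...] for the given lanes.

    sel   -- exactly 8 selectors, each in 0..7
    lanes -- per-lane source values; the length must be a multiple of 8
    """
    if len(sel) != 8 or any(not 0 <= s <= 7 for s in sel):
        raise ValueError("dpp8 needs exactly 8 selectors in the range 0..7")
    if len(lanes) % 8 != 0:
        raise ValueError("lane count must be a multiple of 8")
    out = []
    for base in range(0, len(lanes), 8):
        # Within each group of 8 lanes, lane i reads from lane sel[i].
        out.extend(lanes[base + s] for s in sel)
    return out
```

Under this model, ``dpp8:[7,6,5,4,3,2,1,0]`` reverses each group of 8 lanes, and
``dpp8:[0,1,0,1,0,1,0,1]`` replicates lanes 0 and 1 of each group across the group.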
1169
1170 .. _amdgpu_synid_fi8:
1171
1172 fi
1173 ~~
1174
1175 Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
1176
1177 Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
1178
1179     ==================================== =====================================================
1180     Syntax                               Description
1181     ==================================== =====================================================
1182     fi:0                                 Fetch zero when accessing data from inactive lanes.
1183     fi:1                                 Fetch pre-existing values from inactive lanes.
1184     ==================================== =====================================================
1185
1186 Note: numeric values may be specified as either
1187 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1188 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1189
1190 DPP Modifiers
1191 -------------
1192
1193 .. _amdgpu_synid_dpp_ctrl:
1194
1195 dpp_ctrl
1196 ~~~~~~~~
1197
1198 Specifies how data is shared between threads. This is a mandatory modifier.
1199 There is no default value.
1200
1201 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1202
1203     ======================================== ========================================================
1204     Syntax                                   Description
1205     ======================================== ========================================================
1206     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1207     row_mirror                               Mirror threads within row.
1208     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1209     row_bcast:15                             Broadcast the 15th thread of each row to the next row.
1210     row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
1211     wave_shl:1                               Wavefront left shift by 1 thread.
1212     wave_rol:1                               Wavefront left rotate by 1 thread.
1213     wave_shr:1                               Wavefront right shift by 1 thread.
1214     wave_ror:1                               Wavefront right rotate by 1 thread.
1215     row_shl:{1..15}                          Row shift left by 1-15 threads.
1216     row_shr:{1..15}                          Row shift right by 1-15 threads.
1217     row_ror:{1..15}                          Row rotate right by 1-15 threads.
1218     ======================================== ========================================================
1219
1220 Note: numeric values may be specified as either
1221 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1222 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1223
1224 Examples:
1225
1226 .. parsed-literal::
1227
1228   quad_perm:[0, 1, 2, 3]
1229   row_shl:3
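
As an illustration of the permute and mirror controls, the Python sketch below
computes which lane a given lane reads from. ``source_lane`` is a hypothetical
helper. It assumes that a row is 16 consecutive lanes (consistent with half a row
being 8 threads), that a quad is 4 consecutive lanes, and that element *i* of
*quad_perm* names the source for lane *i* of the quad; none of these details is
spelled out in the table above.

```python
def source_lane(lane, ctrl, arg=None):
    """Return the lane that `lane` reads from for a few dpp_ctrl values."""
    if ctrl == "quad_perm":
        if len(arg) != 4 or any(not 0 <= a <= 3 for a in arg):
            raise ValueError("quad_perm needs 4 selectors in the range 0..3")
        quad_base = lane - lane % 4          # first lane of this quad
        return quad_base + arg[lane % 4]
    if ctrl == "row_mirror":
        row_base = lane - lane % 16          # first lane of this row (assumed 16 wide)
        return row_base + 15 - lane % 16
    if ctrl == "row_half_mirror":
        half_base = lane - lane % 8          # first lane of this half row
        return half_base + 7 - lane % 8
    raise ValueError(f"control not modelled in this sketch: {ctrl}")
```

Under these assumptions, ``quad_perm:[0, 1, 2, 3]`` is the identity permutation:
every lane reads its own value.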
1230
1231 .. _amdgpu_synid_dpp16_ctrl:
1232
1233 dpp16_ctrl
1234 ~~~~~~~~~~
1235
1236 Specifies how data is shared between threads. This is a mandatory modifier.
1237 There is no default value.
1238
1239 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1240 (There are only two rows in *wave32* mode.)
1241
1242     ======================================== =======================================================
1243     Syntax                                   Description
1244     ======================================== =======================================================
1245     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1246     row_mirror                               Mirror threads within row.
1247     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1248     row_share:{0..15}                        Share the value from the specified lane with other
1249                                              lanes in the row.
1250     row_xmask:{0..15}                        Fetch from XOR(<current lane id>,<specified lane id>).
1251     row_shl:{1..15}                          Row shift left by 1-15 threads.
1252     row_shr:{1..15}                          Row shift right by 1-15 threads.
1253     row_ror:{1..15}                          Row rotate right by 1-15 threads.
1254     ======================================== =======================================================
1255
1256 Note: numeric values may be specified as either
1257 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1258 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1259
1260 Examples:
1261
1262 .. parsed-literal::
1263
1264   quad_perm:[0, 1, 2, 3]
1265   row_shl:3
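
The *row_share* and *row_xmask* controls can be sketched in Python as follows.
The function names are hypothetical, and the sketch assumes that a row consists of
16 consecutive lanes; the XOR rule itself is taken from the table above.

```python
def row_share_source(lane, sel):
    """Lane read under row_share:sel: lane `sel` of the reader's own row."""
    if not 0 <= sel <= 15:
        raise ValueError("row_share expects a value in the range 0..15")
    return (lane - lane % 16) + sel  # rows assumed to be 16 lanes wide


def row_xmask_source(lane, mask):
    """Lane read under row_xmask:mask: XOR of the current lane id and mask."""
    if not 0 <= mask <= 15:
        raise ValueError("row_xmask expects a value in the range 0..15")
    # The mask only touches the low 4 bits, so the source stays in the same row.
    return lane ^ mask
```

For instance, ``row_xmask:1`` makes each pair of neighbouring lanes exchange
values, which is a common building block for butterfly-style reductions.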
1266
1267 .. _amdgpu_synid_dpp32_ctrl:
1268
1269 dpp32_ctrl
1270 ~~~~~~~~~~
1271
1272 Specifies how data is shared between threads. This is a mandatory modifier.
1273 There is no default value.
1274
1275 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1276
1277     ======================================== =========================================================
1278     Syntax                                   Description
1279     ======================================== =========================================================
1280     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1281     row_mirror                               Mirror threads within row.
1282     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1283     row_bcast:15                             Broadcast the 15th thread of each row to the next row.
1284     row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
1285     wave_shl:1                               Wavefront left shift by 1 thread.
1286     wave_rol:1                               Wavefront left rotate by 1 thread.
1287     wave_shr:1                               Wavefront right shift by 1 thread.
1288     wave_ror:1                               Wavefront right rotate by 1 thread.
1289     row_shl:{1..15}                          Row shift left by 1-15 threads.
1290     row_shr:{1..15}                          Row shift right by 1-15 threads.
1291     row_ror:{1..15}                          Row rotate right by 1-15 threads.
1292     row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
1293     ======================================== =========================================================
1294
1295 Note: numeric values may be specified as either
1296 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1297 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1298
1299 Examples:
1300
1301 .. parsed-literal::
1302
1303   quad_perm:[0, 1, 2, 3]
1304   row_shl:3
1305
1306
1307 .. _amdgpu_synid_dpp64_ctrl:
1308
1309 dpp64_ctrl
1310 ~~~~~~~~~~
1311
1312 Specifies how data is shared between threads. This is a mandatory modifier.
1313 There is no default value.
1314
1315 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1316
1317     ======================================== ==================================================
1318     Syntax                                   Description
1319     ======================================== ==================================================
1320     row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
1321     ======================================== ==================================================
1322
1323 Note: numeric values may be specified as either
1324 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1325 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1326
1327 Examples:
1328
1329 .. parsed-literal::
1330
1331   row_newbcast:3
1332
1333
1334 .. _amdgpu_synid_row_mask:
1335
1336 row_mask
1337 ~~~~~~~~
1338
1339 Controls which rows are enabled for data sharing. By default, all rows are enabled.
1340
1341 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1342 (There are only two rows in *wave32* mode.)
1343
1344     ================= ====================================================================
1345     Syntax            Description
1346     ================= ====================================================================
1347     row_mask:{0..15}  Specifies a *row mask* as a non-negative
1348                       :ref:`integer number <amdgpu_synid_integer_number>`
1349                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1350
1351                       Each of the 4 bits in the mask controls one row
1352                       (0 - disabled, 1 - enabled).
1353
1354                       In *wave32* mode, the values shall be limited to {0..7}.
1355     ================= ====================================================================
1356
1357 Examples:
1358
1359 .. parsed-literal::
1360
1361   row_mask:0xf
1362   row_mask:0b1010
1363   row_mask:x|y
1364
1365 .. _amdgpu_synid_bank_mask:
1366
1367 bank_mask
1368 ~~~~~~~~~
1369
1370 Controls which banks are enabled for data sharing. By default, all banks are enabled.
1371
1372 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1373 (There are only two rows in *wave32* mode.)
1374
1375     ================== ====================================================================
1376     Syntax             Description
1377     ================== ====================================================================
1378     bank_mask:{0..15}  Specifies a *bank mask* as a non-negative
1379                        :ref:`integer number <amdgpu_synid_integer_number>`
1380                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1381
1382                        Each of the 4 bits in the mask controls one bank
1383                        (0 - disabled, 1 - enabled).
1384     ================== ====================================================================
1385
1386 Examples:
1387
1388 .. parsed-literal::
1389
1390   bank_mask:0x3
1391   bank_mask:0b0011
1392   bank_mask:x&y
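
The combined effect of *row_mask* and *bank_mask* on a single lane can be sketched
in Python. ``lane_enabled`` is a hypothetical helper, and the lane layout is an
assumption made for illustration: a row is taken to be 16 consecutive lanes, a bank
4 consecutive lanes within a row, and bit *i* of each mask is taken to control
row *i* or bank *i*.

```python
def lane_enabled(lane, row_mask=0xF, bank_mask=0xF):
    """True if `lane` takes part in data sharing under the given masks.

    Both masks default to 0xF, matching the documented default that all
    rows and all banks are enabled.
    """
    if not 0 <= row_mask <= 15 or not 0 <= bank_mask <= 15:
        raise ValueError("masks must be in the range 0..15")
    row = lane // 16            # assumed: 16 lanes per row
    bank = (lane % 16) // 4     # assumed: 4 lanes per bank within a row
    row_on = (row_mask >> row) & 1
    bank_on = (bank_mask >> bank) & 1
    return bool(row_on and bank_on)
```

Under these assumptions, ``row_mask:0b1010`` enables rows 1 and 3 only, and
``bank_mask:0x3`` enables the first two banks of every row.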
1393
1394 .. _amdgpu_synid_bound_ctrl:
1395
1396 bound_ctrl
1397 ~~~~~~~~~~
1398
1399 Controls data sharing when accessing an invalid lane. By default, data sharing with
1400 invalid lanes is disabled.
1401
1402     ======================================== ================================================
1403     Syntax                                   Description
1404     ======================================== ================================================
1405     bound_ctrl:1                             Enables data sharing with invalid lanes.
1406
1407                                              Accessing data from an invalid lane will
1408                                              return zero.
1409
1410     bound_ctrl:0 (GFX11+)                    Disables data sharing with invalid lanes.
1411     ======================================== ================================================
1412
1413 .. WARNING:: For historical reasons, *bound_ctrl:0* has the same meaning as *bound_ctrl:1* for older architectures.
1414
1415 .. _amdgpu_synid_fi16:
1416
1417 fi
1418 ~~
1419
1420 Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
1421
1422 Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
1423
1424     ======================================== ==================================================
1425     Syntax                                   Description
1426     ======================================== ==================================================
1427     fi:0                                     Interaction with inactive lanes is controlled by
1428                                              :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1429
1430     fi:1                                     Fetch pre-existing values from inactive lanes.
1431     ======================================== ==================================================
1432
1433 Note: numeric values may be specified as either
1434 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1435 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1436
1437 SDWA Modifiers
1438 --------------
1439
1440 clamp
1441 ~~~~~
1442
1443 See a description :ref:`here<amdgpu_synid_clamp>`.
1444
1445 omod
1446 ~~~~
1447
1448 See a description :ref:`here<amdgpu_synid_omod>`.
1449
1450 .. _amdgpu_synid_dst_sel:
1451
1452 dst_sel
1453 ~~~~~~~
1454
1455 Selects which bits in the destination are affected. By default, all bits are affected.
1456
1457     ======================================== ================================================
1458     Syntax                                   Description
1459     ======================================== ================================================
1460     dst_sel:DWORD                            Use bits 31:0.
1461     dst_sel:BYTE_0                           Use bits 7:0.
1462     dst_sel:BYTE_1                           Use bits 15:8.
1463     dst_sel:BYTE_2                           Use bits 23:16.
1464     dst_sel:BYTE_3                           Use bits 31:24.
1465     dst_sel:WORD_0                           Use bits 15:0.
1466     dst_sel:WORD_1                           Use bits 31:16.
1467     ======================================== ================================================
1468
1469 .. _amdgpu_synid_dst_unused:
1470
1471 dst_unused
1472 ~~~~~~~~~~
1473
1474 Controls what to do with the bits in the destination which are not selected
1475 by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
1476 By default, unused bits are preserved.
1477
1478     ======================================== ================================================
1479     Syntax                                   Description
1480     ======================================== ================================================
1481     dst_unused:UNUSED_PAD                    Pad with zeros.
1482     dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits.
1483     dst_unused:UNUSED_PRESERVE               Preserve bits.
1484     ======================================== ================================================
1485
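The interaction of :ref:`dst_sel<amdgpu_synid_dst_sel>` and *dst_unused* can be
modelled in a few lines of Python. This is only an illustrative sketch derived from
the two tables above, not a description of the hardware implementation. The names
``write_sdwa_dst``, ``FIELDS`` and ``MASK32`` are hypothetical.

```python
# Bit offset and width of each dst_sel value, copied from the dst_sel table.
FIELDS = {
    "DWORD": (0, 32),
    "BYTE_0": (0, 8),
    "BYTE_1": (8, 8),
    "BYTE_2": (16, 8),
    "BYTE_3": (24, 8),
    "WORD_0": (0, 16),
    "WORD_1": (16, 16),
}

MASK32 = 0xFFFFFFFF


def write_sdwa_dst(old_dst, result, dst_sel="DWORD", dst_unused="UNUSED_PRESERVE"):
    """Return the new 32-bit destination value (hypothetical model)."""
    shift, width = FIELDS[dst_sel]
    field_mask = ((1 << width) - 1) << shift
    # Place the low `width` bits of the result into the selected field.
    value = (result << shift) & field_mask
    if dst_unused == "UNUSED_PAD":
        # All bits outside the selected field become zero.
        rest = 0
    elif dst_unused == "UNUSED_SEXT":
        # Copy the sign bit of the field into all bits above it;
        # bits below the field stay zero.
        sign = (value >> (shift + width - 1)) & 1
        upper_mask = MASK32 & ~((1 << (shift + width)) - 1)
        rest = upper_mask if sign else 0
    elif dst_unused == "UNUSED_PRESERVE":
        # Keep the previous destination bits outside the selected field.
        rest = old_dst & ~field_mask & MASK32
    else:
        raise ValueError(f"unknown dst_unused value: {dst_unused}")
    return (value | rest) & MASK32
```

Under this model, writing the result 0x80 into BYTE_1 of a register that holds
0xAABBCCDD gives 0x00008000 with UNUSED_PAD, 0xFFFF8000 with UNUSED_SEXT and
0xAABB80DD with UNUSED_PRESERVE.
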
1486 .. _amdgpu_synid_src0_sel:
1487
1488 src0_sel
1489 ~~~~~~~~
1490
1491 Controls which bits of src0 are used. By default, all bits are used.
1492
1493     ======================================== ================================================
1494     Syntax                                   Description
1495     ======================================== ================================================
1496     src0_sel:DWORD                           Use bits 31:0.
1497     src0_sel:BYTE_0                          Use bits 7:0.
1498     src0_sel:BYTE_1                          Use bits 15:8.
1499     src0_sel:BYTE_2                          Use bits 23:16.
1500     src0_sel:BYTE_3                          Use bits 31:24.
1501     src0_sel:WORD_0                          Use bits 15:0.
1502     src0_sel:WORD_1                          Use bits 31:16.
1503     ======================================== ================================================
1504
1505 .. _amdgpu_synid_src1_sel:
1506
1507 src1_sel
1508 ~~~~~~~~
1509
1510 Controls which bits of src1 are used. By default, all bits are used.
1511
1512     ======================================== ================================================
1513     Syntax                                   Description
1514     ======================================== ================================================
1515     src1_sel:DWORD                           Use bits 31:0.
1516     src1_sel:BYTE_0                          Use bits 7:0.
1517     src1_sel:BYTE_1                          Use bits 15:8.
1518     src1_sel:BYTE_2                          Use bits 23:16.
1519     src1_sel:BYTE_3                          Use bits 31:24.
1520     src1_sel:WORD_0                          Use bits 15:0.
1521     src1_sel:WORD_1                          Use bits 31:16.
1522     ======================================== ================================================
1523
1524 .. _amdgpu_synid_sdwa_operand_modifiers:
1525
1526 SDWA Operand Modifiers
1527 ----------------------
1528
1529 Operand modifiers are not used separately. They are applied to source operands.
1530
1531 abs
1532 ~~~
1533
1534 See a description :ref:`here<amdgpu_synid_abs>`.
1535
1536 neg
1537 ~~~
1538
1539 See a description :ref:`here<amdgpu_synid_neg>`.
1540
1541 .. _amdgpu_synid_sext:
1542
1543 sext
1544 ~~~~
1545
1546 Sign-extends the value of a (sub-dword) integer operand to fill all 32 bits.
1547
1548 Valid for integer operands only.
1549
1550     ======================================== ================================================
1551     Syntax                                   Description
1552     ======================================== ================================================
1553     sext(<operand>)                          Sign-extend operand value.
1554     ======================================== ================================================
1555
1556 Examples:
1557
1558 .. parsed-literal::
1559
1560   sext(v4)
1561   sext(v255)
1562
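A minimal Python sketch of how a source selector
(:ref:`src0_sel<amdgpu_synid_src0_sel>` or :ref:`src1_sel<amdgpu_synid_src1_sel>`)
and *sext* combine. It assumes that a selected sub-dword field is zero-extended when
*sext* is not applied; this document does not state that, so treat it as an
assumption. ``read_sdwa_src`` and ``SRC_FIELDS`` are hypothetical names.

```python
# Bit offset and width of each source selector, copied from the src0_sel table.
SRC_FIELDS = {
    "DWORD": (0, 32),
    "BYTE_0": (0, 8),
    "BYTE_1": (8, 8),
    "BYTE_2": (16, 8),
    "BYTE_3": (24, 8),
    "WORD_0": (0, 16),
    "WORD_1": (16, 16),
}


def read_sdwa_src(reg, src_sel="DWORD", sext=False):
    """Return the 32-bit value seen by the operation (hypothetical model)."""
    shift, width = SRC_FIELDS[src_sel]
    field = (reg >> shift) & ((1 << width) - 1)
    # sext: replicate the top bit of the selected field into the upper bits.
    if sext and field & (1 << (width - 1)):
        field |= 0xFFFFFFFF & ~((1 << width) - 1)
    return field
```

With a register value of 0x128034FF, selecting BYTE_2 yields 0x80 in this model, and
0xFFFFFF80 when *sext* is applied.
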
1563 VOP3 Modifiers
1564 --------------
1565
1566 .. _amdgpu_synid_vop3_op_sel:
1567
1568 op_sel
1569 ~~~~~~
1570
1571 Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
1572 By default, low bits are used for all operands.
1573
1574 The number of values specified with the op_sel modifier must match the number of instruction
1575 operands (both source and destination). The first value controls src0, the second value controls src1
1576 and so on, except that the last value controls the destination.
1577 The value 0 selects the low bits, while 1 selects the high bits.
1578
1579 Note: the op_sel modifier affects 16-bit operands only. For 32-bit operands, the value specified
1580 by op_sel must be 0.
1581
1582     ======================================== ============================================================
1583     Syntax                                   Description
1584     ======================================== ============================================================
1585     op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
1586     op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
1587     op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
1588     ======================================== ============================================================
1589
1590 Note: numeric values may be specified as either
1591 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1592 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1593
1594 Examples:
1595
1596 .. parsed-literal::
1597
1598   op_sel:[0,0]
1599   op_sel:[0,1]
1600
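The mapping from *op_sel* values to operands described above can be sketched in
Python as follows. This is an illustration of the rule "one value per source operand,
last value for the destination", not assembler code; ``decode_vop3_op_sel`` is a
hypothetical helper.

```python
def decode_vop3_op_sel(values, num_src):
    """Map VOP3 op_sel values to the operand halves they select."""
    # One value per source operand plus one for the destination.
    if len(values) != num_src + 1:
        raise ValueError(
            f"expected {num_src + 1} op_sel values, got {len(values)}"
        )
    if any(v not in (0, 1) for v in values):
        raise ValueError("each op_sel value must be 0 or 1")
    # The last value controls the destination, the others src0, src1, ...
    names = [f"src{i}" for i in range(num_src)] + ["dst"]
    return {
        name: "high [31:16]" if v else "low [15:0]"
        for name, v in zip(names, values)
    }
```

For instance, ``decode_vop3_op_sel([0, 1, 1], num_src=2)`` reports that src0 uses the
low bits while src1 and the destination use the high bits.
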
1601 .. _amdgpu_synid_dpp_op_sel:
1602
1603 dpp_op_sel
1604 ~~~~~~~~~~
1605
1606 This is a special version of *op_sel* used for *permlane* opcodes to specify
1607 dpp-like mode bits: :ref:`fi<amdgpu_synid_fi16>` and
1608 :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1609
1610     ======================================== =================================================================
1611     Syntax                                   Description
1612     ======================================== =================================================================
1613     op_sel:[{0..1},{0..1}]                   The first bit specifies :ref:`fi<amdgpu_synid_fi16>`, the second
1614                                              bit specifies :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1615     ======================================== =================================================================
1616
1617 Note: numeric values may be specified as either
1618 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1619 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1620
1621 Examples:
1622
1623 .. parsed-literal::
1624
1625   op_sel:[0,0]
1626
1627 .. _amdgpu_synid_clamp:
1628
1629 clamp
1630 ~~~~~
1631
1632 The meaning of clamp depends on the instruction.
1633
1634 For *v_cmp* instructions, the clamp modifier indicates that the compare signals
1635 if a floating-point exception occurs. By default, signaling is disabled.
1636
1637 For integer operations, the clamp modifier indicates that the result must be clamped
1638 to the range between the smallest and largest representable values. By default, there is no clamping.
1639
1640 For floating-point operations, the clamp modifier indicates that the result must be clamped
1641 to the range [0.0, 1.0]. By default, there is no clamping.
1642
1643 Note: the clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
1644
1645     ======================================== ================================================
1646     Syntax                                   Description
1647     ======================================== ================================================
1648     clamp                                    Enables clamping (or signaling).
1649     ======================================== ================================================
1650
1651 .. _amdgpu_synid_omod:
1652
1653 omod
1654 ~~~~
1655
1656 Specifies whether an output modifier must be applied to the result.
1657 It is assumed that the result is a floating-point number.
1658
1659 By default, no output modifiers are applied.
1660
1661 Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
1662
1663     ======================================== ================================================
1664     Syntax                                   Description
1665     ======================================== ================================================
1666     mul:2                                    Multiply the result by 2.
1667     mul:4                                    Multiply the result by 4.
1668     div:2                                    Multiply the result by 0.5.
1669     ======================================== ================================================
1670
1671 Note: numeric values may be specified as either
1672 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1673 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1674
1675 Examples:
1676
1677 .. parsed-literal::
1678
1679   mul:2
1680   mul:x      // x must be equal to 2 or 4
1681
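The order in which output modifiers and :ref:`clamp<amdgpu_synid_clamp>` are applied
to a floating-point result can be illustrated with a short Python sketch. It models
only the ordering stated above (output modifier first, then clamping to [0.0, 1.0]);
``apply_output_modifiers`` and ``OMOD_FACTORS`` are hypothetical names.

```python
# Scale factor for each omod spelling from the table above; None means no omod.
OMOD_FACTORS = {None: 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}


def apply_output_modifiers(result, omod=None, clamp=False):
    """Apply omod first, then the floating-point clamp (hypothetical model)."""
    value = result * OMOD_FACTORS[omod]
    if clamp:
        # Floating-point clamp restricts the value to the range [0.0, 1.0].
        value = min(max(value, 0.0), 1.0)
    return value
```

With a result of 0.375, ``mul:4`` together with ``clamp`` yields 1.0 in this model,
because the scaled value 1.5 is clamped afterwards.
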
1682 .. _amdgpu_synid_vop3_operand_modifiers:
1683
1684 VOP3 Operand Modifiers
1685 ----------------------
1686
1687 Operand modifiers are not used separately. They are applied to source operands.
1688
1689 .. _amdgpu_synid_abs:
1690
1691 abs
1692 ~~~
1693
1694 Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
1695 (if any). Valid for floating-point operands only.
1696
1697     ======================================== ====================================================
1698     Syntax                                   Description
1699     ======================================== ====================================================
1700     abs(<operand>)                           Get the absolute value of a floating-point operand.
1701     \|<operand>|                             The same as above (an SP3 syntax).
1702     ======================================== ====================================================
1703
1704 Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|'
1705 may be misinterpreted. Such operands should be enclosed in additional parentheses, as shown
1706 in the examples below.
1707
1708 Examples:
1709
1710 .. parsed-literal::
1711
1712   abs(v36)
1713   \|v36|
1714   abs(x|y)     // ok
1715   \|(x|y)|      // additional parentheses are required
1716
1717 .. _amdgpu_synid_neg:
1718
1719 neg
1720 ~~~
1721
1722 Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
1723 (if any). Valid for floating-point operands only.
1724
1725     ================== ====================================================
1726     Syntax             Description
1727     ================== ====================================================
1728     neg(<operand>)     Get the negative value of a floating-point operand.
1729                        An optional :ref:`abs<amdgpu_synid_abs>` modifier
1730                        may be applied to the operand before negation.
1731     -<operand>         The same as above (an SP3 syntax).
1732     ================== ====================================================
1733
1734 Note: SP3 syntax is supported with limitations because of a potential ambiguity.
1735 Currently, it is allowed in the following cases:
1736
1737 * Before a register.
1738 * Before an :ref:`abs<amdgpu_synid_abs>` modifier.
1739 * Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.
1740
1741 In all other cases, "-" is handled as a part of an expression that follows the sign.
1742
1743 Examples:
1744
1745 .. parsed-literal::
1746
1747   // Operands with negate modifiers
1748   neg(v[0])
1749   neg(1.0)
1750   neg(abs(v0))
1751   -v5
1752   -abs(v5)
1753   -\|v5|
1754
1755   // Expressions where "-" has a different meaning
1756   -1
1757   -x+y
1758
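The required ordering of :ref:`abs<amdgpu_synid_abs>` and *neg* can be shown with a
tiny Python sketch. It only illustrates that the absolute value is taken before
negation; ``apply_src_modifiers`` and its parameters are hypothetical names.

```python
def apply_src_modifiers(value, use_abs=False, use_neg=False):
    """Apply abs first, then neg, to a floating-point operand value."""
    if use_abs:
        value = abs(value)
    if use_neg:
        value = -value
    return value
```

In this model, an operand written as ``-|v5|`` always yields a non-positive value,
whatever the sign of the register contents.
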
1759 VOP3P Modifiers
1760 ---------------
1761
1762 This section describes modifiers of *regular* VOP3P instructions.
1763
1764 *v_mad_mix\** and *v_fma_mix\**
1765 instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.
1766
1767 .. _amdgpu_synid_op_sel:
1768
1769 op_sel
1770 ~~~~~~
1771
1772 Selects the low [15:0] or high [31:16] operand bits as input to the operation
1773 that produces the lower half of the destination.
1774 By default, the low 16 bits are used for all operands.
1775
1776 The number of values specified by the *op_sel* modifier must match the number of source
1777 operands. The first value controls src0, the second value controls src1 and so on.
1778
1779 The value 0 selects the low bits, while 1 selects the high bits.
1780
1781     ================================= =============================================================
1782     Syntax                            Description
1783     ================================= =============================================================
1784     op_sel:[{0..1}]                   Select operand bits for instructions with 1 source operand.
1785     op_sel:[{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
1786     op_sel:[{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
1787     ================================= =============================================================
1788
1789 Note: numeric values may be specified as either
1790 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1791 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1792
1793 Examples:
1794
1795 .. parsed-literal::
1796
1797   op_sel:[0,0]
1798   op_sel:[0,1,0]
1799
1800 .. _amdgpu_synid_op_sel_hi:
1801
1802 op_sel_hi
1803 ~~~~~~~~~
1804
1805 Selects the low [15:0] or high [31:16] operand bits as input to the operation
1806 that produces the upper half of the destination.
1807 By default, the high 16 bits are used for all operands.
1808
1809 The number of values specified by the *op_sel_hi* modifier must match the number of source
1810 operands. The first value controls src0, the second value controls src1 and so on.
1811
1812 The value 0 selects the low bits, while 1 selects the high bits.
1813
1814     =================================== =============================================================
1815     Syntax                              Description
1816     =================================== =============================================================
1817     op_sel_hi:[{0..1}]                  Select operand bits for instructions with 1 source operand.
1818     op_sel_hi:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands.
1819     op_sel_hi:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands.
1820     =================================== =============================================================
1821
1822 Note: numeric values may be specified as either
1823 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1824 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1825
1826 Examples:
1827
1828 .. parsed-literal::
1829
1830   op_sel_hi:[0,0]
1831   op_sel_hi:[0,0,1]
1832
1833 .. _amdgpu_synid_neg_lo:
1834
1835 neg_lo
1836 ~~~~~~
1837
1838 Specifies whether to change the sign of operand values selected by
1839 :ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
1840 as input to the operation that produces the lower half of the destination.
1841
1842 The number of values specified by this modifier must match the number of source
1843 operands. The first value controls src0, the second value controls src1 and so on.
1844
1845 The value 0 indicates that the corresponding operand value is used unmodified,
1846 the value 1 indicates that the negative value of the operand must be used.
1847
1848 By default, operand values are used unmodified.
1849
1850 This modifier is valid for floating-point operands only.
1851
1852     ================================ ==================================================================
1853     Syntax                           Description
1854     ================================ ==================================================================
1855     neg_lo:[{0..1}]                  Select affected operands for instructions with 1 source operand.
1856     neg_lo:[{0..1},{0..1}]           Select affected operands for instructions with 2 source operands.
1857     neg_lo:[{0..1},{0..1},{0..1}]    Select affected operands for instructions with 3 source operands.
1858     ================================ ==================================================================
1859
1860 Note: numeric values may be specified as either
1861 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1862 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1863
1864 Examples:
1865
1866 .. parsed-literal::
1867
1868   neg_lo:[0]
1869   neg_lo:[0,1]
1870
1871 .. _amdgpu_synid_neg_hi:
1872
1873 neg_hi
1874 ~~~~~~
1875
1876 Specifies whether to change the sign of operand values selected by
1877 :ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
1878 as input to the operation that produces the upper half of the destination.
1879
1880 The number of values specified by this modifier must match the number of source
1881 operands. The first value controls src0, the second value controls src1 and so on.
1882
1883 The value 0 indicates that the corresponding operand value is used unmodified,
1884 the value 1 indicates that the negative value of the operand must be used.
1885
1886 By default, operand values are used unmodified.
1887
1888 This modifier is valid for floating-point operands only.
1889
1890     =============================== ==================================================================
1891     Syntax                          Description
1892     =============================== ==================================================================
1893     neg_hi:[{0..1}]                 Select affected operands for instructions with 1 source operand.
1894     neg_hi:[{0..1},{0..1}]          Select affected operands for instructions with 2 source operands.
1895     neg_hi:[{0..1},{0..1},{0..1}]   Select affected operands for instructions with 3 source operands.
1896     =============================== ==================================================================
1897
1898 Note: numeric values may be specified as either
1899 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1900 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1901
1902 Examples:
1903
1904 .. parsed-literal::
1905
1906   neg_hi:[1,0]
1907   neg_hi:[0,1,1]
1908
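How :ref:`op_sel<amdgpu_synid_op_sel>`, :ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`,
:ref:`neg_lo<amdgpu_synid_neg_lo>` and *neg_hi* work together for a packed operation
can be sketched in Python. The model below represents each source register as a
``(low, high)`` pair of floating-point values and assumes that *op_sel* feeds the
lower half of the destination and *op_sel_hi* the upper half, as described in their
sections. ``packed_op`` is a hypothetical helper, not part of any tool.

```python
import operator


def packed_op(op, srcs, op_sel=None, op_sel_hi=None, neg_lo=None, neg_hi=None):
    """Return (low_result, high_result) of a packed operation (hypothetical model).

    srcs is a list of (low, high) pairs, one pair per source operand.
    """
    n = len(srcs)
    # Defaults from the text: low halves for op_sel, high halves for op_sel_hi,
    # and no negation.
    op_sel = [0] * n if op_sel is None else op_sel
    op_sel_hi = [1] * n if op_sel_hi is None else op_sel_hi
    neg_lo = [0] * n if neg_lo is None else neg_lo
    neg_hi = [0] * n if neg_hi is None else neg_hi
    for name, values in (("op_sel", op_sel), ("op_sel_hi", op_sel_hi),
                         ("neg_lo", neg_lo), ("neg_hi", neg_hi)):
        if len(values) != n:
            raise ValueError(f"{name} must have {n} values")
    # Index 0 of a pair is the low half, index 1 the high half.
    lo_in = [-s[sel] if neg else s[sel]
             for s, sel, neg in zip(srcs, op_sel, neg_lo)]
    hi_in = [-s[sel] if neg else s[sel]
             for s, sel, neg in zip(srcs, op_sel_hi, neg_hi)]
    return op(*lo_in), op(*hi_in)
```

For two sources ``a = (1.0, 2.0)`` and ``b = (10.0, 20.0)``, a packed add with default
modifiers gives ``(11.0, 22.0)`` in this model, while ``neg_hi=[0, 1]`` changes the
upper result to ``-18.0``.
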
1909 clamp
1910 ~~~~~
1911
1912 See a description :ref:`here<amdgpu_synid_clamp>`.
1913
1914 .. _amdgpu_synid_mad_mix:
1915
1916 VOP3P MAD_MIX/FMA_MIX Modifiers
1917 -------------------------------
1918
1919 *v_mad_mix\** and *v_fma_mix\**
1920 instructions use *op_sel* and *op_sel_hi* modifiers
1921 in a manner different from *regular* VOP3P instructions.
1922
1923 See a description below.
1924
1925 .. _amdgpu_synid_mad_mix_op_sel:
1926
1927 m_op_sel
1928 ~~~~~~~~
1929
1930 This modifier is meaningful only for 16-bit source operands, as indicated by
1931 :ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
1932 It selects either the low [15:0] or high [31:16] operand bits
1933 as input to the operation.
1934
1935 The number of values specified by the *op_sel* modifier must match the number of source
1936 operands. The first value controls src0, the second value controls src1 and so on.
1937
1938 The value 0 selects the low 16 bits, while 1 selects the high 16 bits.
1939
1940 By default, low bits are used for all operands.
1941
1942     =============================== ===================================================
1943     Syntax                          Description
1944     =============================== ===================================================
1945     op_sel:[{0..1},{0..1},{0..1}]   Select the location of each 16-bit source operand.
1946     =============================== ===================================================
1947
1948 Note: numeric values may be specified as either
1949 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1950 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1951
1952 Examples:
1953
1954 .. parsed-literal::
1955
1956   op_sel:[0,1,0]
1957
1958 .. _amdgpu_synid_mad_mix_op_sel_hi:
1959
1960 m_op_sel_hi
1961 ~~~~~~~~~~~
1962
1963 Selects the size of source operands: either 32 bits or 16 bits.
1964 By default, 32 bits are used for all source operands.
1965
1966 The number of values specified by the *op_sel_hi* modifier must match the number of source
1967 operands. The first value controls src0, the second value controls src1 and so on.
1968
1969 The value 0 indicates 32 bits, the value 1 indicates 16 bits.
1970
1971 The location of 16 bits in the operand may be specified by
1972 :ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.
1973
1974     ======================================== ========================================
1975     Syntax                                   Description
1976     ======================================== ========================================
1977     op_sel_hi:[{0..1},{0..1},{0..1}]         Select the size of each source operand.
1978     ======================================== ========================================
1979
1980 Note: numeric values may be specified as either
1981 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1982 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1983
1984 Examples:
1985
1986 .. parsed-literal::
1987
1988   op_sel_hi:[1,1,1]
1989
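The combined effect of :ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>` and *m_op_sel_hi*
on each source operand can be summarized with a small Python sketch. It simply
restates the rules above (``op_sel_hi`` chooses the size, ``op_sel`` chooses the
16-bit location); ``describe_mix_operands`` is a hypothetical helper.

```python
def describe_mix_operands(op_sel=(0, 0, 0), op_sel_hi=(0, 0, 0)):
    """Describe how each of the three source operands is read."""
    if len(op_sel) != 3 or len(op_sel_hi) != 3:
        raise ValueError("op_sel and op_sel_hi need one value per source operand")
    result = []
    for sel, sel_hi in zip(op_sel, op_sel_hi):
        if sel_hi == 0:
            # op_sel_hi = 0: the operand is 32 bits wide; op_sel has no meaning.
            result.append("32-bit")
        elif sel == 0:
            result.append("16-bit low [15:0]")
        else:
            result.append("16-bit high [31:16]")
    return result
```

For example, ``op_sel:[0,1,0]`` with ``op_sel_hi:[1,1,0]`` reads src0 from the low
16 bits, src1 from the high 16 bits, and src2 as a full 32-bit operand.
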
1990 abs
1991 ~~~
1992
1993 See a description :ref:`here<amdgpu_synid_abs>`.
1994
1995 neg
1996 ~~~
1997
1998 See a description :ref:`here<amdgpu_synid_neg>`.
1999
2000 clamp
2001 ~~~~~
2002
2003 See a description :ref:`here<amdgpu_synid_clamp>`.
2004
2005 VOP3P MFMA Modifiers
2006 --------------------
2007
2008 .. _amdgpu_synid_cbsz:
2009
2010 cbsz
2011 ~~~~
2012
2013 Specifies a broadcast mode.
2014
2015     =============================== ==================================================================
2016     Syntax                          Description
2017     =============================== ==================================================================
2018     cbsz:[{0..7}]                   A broadcast mode.
2019     =============================== ==================================================================
2020
2021 Note: the numeric value may be specified as either
2022 an :ref:`integer number<amdgpu_synid_integer_number>` or
2023 an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
2024
2025 .. _amdgpu_synid_abid:
2026
2027 abid
2028 ~~~~
2029
2030 Specifies matrix A group select.
2031
2032     =============================== ==================================================================
2033     Syntax                          Description
2034     =============================== ==================================================================
2035     abid:[{0..15}]                  Matrix A group select id.
2036     =============================== ==================================================================
2037
2038 Note: the numeric value may be specified as either
2039 an :ref:`integer number<amdgpu_synid_integer_number>` or
2040 an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
2041
2042 .. _amdgpu_synid_blgp:
2043
2044 blgp
2045 ~~~~
2046
2047 Specifies matrix B lane group pattern.
2048
2049     =============================== ==================================================================
2050     Syntax                          Description
2051     =============================== ==================================================================
2052     blgp:[{0..7}]                   Matrix B lane group pattern.
2053     =============================== ==================================================================
2054
2055 Note: the numeric value may be specified as either
2056 an :ref:`integer number<amdgpu_synid_integer_number>` or
2057 an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
2058
2059 .. _amdgpu_synid_mfma_neg:
2060
2061 neg
2062 ~~~
2063
2064 Indicates operands that must be negated before the operation.
2065 The number of values specified by this modifier must match the number of source
2066 operands. The first value controls src0, the second value controls src1 and so on.
2067
2068 The value 0 indicates that the corresponding operand value is used unmodified,
2069 the value 1 indicates that the operand value must be negated before the operation.
2070
2071 By default, operand values are used unmodified.
2072
2073     =============================== ==================================================================
2074     Syntax                          Description
2075     =============================== ==================================================================
2076     neg:[{0..1},{0..1},{0..1}]      Select operands which must be negated before the operation.
2077     =============================== ==================================================================
2078
2079 Note: numeric values may be specified as either
2080 :ref:`integer numbers<amdgpu_synid_integer_number>` or
2081 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
2082
2083 Examples:
2084
2085 .. parsed-literal::
2086
2087   neg:[0,1,1]