llvm/docs/AMDGPUModifierSyntax.rst

   1 ======================================
   2 Syntax of AMDGPU Instruction Modifiers
   3 ======================================
   4
   5 .. contents::
   6    :local:
   7
   8 Conventions
   9 ===========
  10
  11 The following notation is used throughout this document:
  12
  13     =================== =============================================================
  14     Notation            Description
  15     =================== =============================================================
  16     {0..N}              Any integer value in the range from 0 to N (inclusive).
  17     <x>                 Syntax and meaning of *x* is explained elsewhere.
  18     =================== =============================================================
  19
  20 .. _amdgpu_syn_modifiers:
  21
  22 Modifiers
  23 =========
  24
  25 DS Modifiers
  26 ------------
  27
  28 .. _amdgpu_synid_ds_offset80:
  29
  30 offset0
  31 ~~~~~~~
  32
  33 Specifies the first 8-bit offset, in bytes. The default value is 0.
  34
  35 Used with DS instructions that expect two addresses.
  36
  37     =================== ====================================================================
  38     Syntax              Description
  39     =================== ====================================================================
  40     offset0:{0..0xFF}   Specifies an unsigned 8-bit offset as a positive
  41                         :ref:`integer number <amdgpu_synid_integer_number>`
  42                         or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
  43     =================== ====================================================================
  44
  45 Examples:
  46
  47 .. parsed-literal::
  48
  49   offset0:0xff
  50   offset0:2-x
  51   offset0:-x-y
  52
  53 .. _amdgpu_synid_ds_offset81:
  54
  55 offset1
  56 ~~~~~~~
  57
  58 Specifies the second 8-bit offset, in bytes. The default value is 0.
  59
  60 Used with DS instructions that expect two addresses.
  61
  62     =================== ====================================================================
  63     Syntax              Description
  64     =================== ====================================================================
  65     offset1:{0..0xFF}   Specifies an unsigned 8-bit offset as a positive
  66                         :ref:`integer number <amdgpu_synid_integer_number>`
  67                         or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
  68     =================== ====================================================================
  69
  70 Examples:
  71
  72 .. parsed-literal::
  73
  74   offset1:0xff
  75   offset1:2-x
  76   offset1:-x-y
  77
  78 .. _amdgpu_synid_ds_offset16:
  79
  80 offset
  81 ~~~~~~
  82
  83 Specifies a 16-bit offset, in bytes. The default value is 0.
  84
  85 Used with DS instructions that expect a single address.
  86
  87     ==================== ====================================================================
  88     Syntax               Description
  89     ==================== ====================================================================
  90     offset:{0..0xFFFF}   Specifies an unsigned 16-bit offset as a positive
  91                          :ref:`integer number <amdgpu_synid_integer_number>`
  92                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
  93     ==================== ====================================================================
  94
  95 Examples:
  96
  97 .. parsed-literal::
  98
  99   offset:65535
 100   offset:0xffff
 101   offset:-x-y
 102
 103 .. _amdgpu_synid_sw_offset16:
 104
 105 swizzle pattern
 106 ~~~~~~~~~~~~~~~
 107
 108 This is a special modifier which may be used with the *ds_swizzle_b32* instruction only.
 109 It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
 110
 111 See AMD documentation for more information.
 112
 113     ======================================================= ===========================================================
 114     Syntax                                                  Description
 115     ======================================================= ===========================================================
 116     offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern.
 117     offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern.
 118
 119                                                             Each number is a lane *id*.
 120     offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern.
 121
 122                                                             The pattern converts a 5-bit lane *id* to another
 123                                                             lane *id* with which the lane interacts.
 124
 125                                                             *mask* is a 5 character sequence which
 126                                                             specifies how to transform the bits of the
 127                                                             lane *id*.
 128
 129                                                             The following characters are allowed:
 130
 131                                                             * "0" - set bit to 0.
 132
 133                                                             * "1" - set bit to 1.
 134
 135                                                             * "p" - preserve bit.
 136
 137                                                             * "i" - invert bit.
 138
 139     offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.
 140
 141                                                             Broadcasts the value of any particular lane to
 142                                                             all lanes in its group.
 143
 144                                                             The first numeric parameter is a group
 145                                                             size and must be equal to 2, 4, 8, 16 or 32.
 146
 147                                                             The second numeric parameter is an index of the
 148                                                             lane being broadcast.
 149
 150                                                             The index must be less than the group size.
 151     offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.
 152
 153                                                             Swaps the neighboring groups of
 154                                                             1, 2, 4, 8 or 16 lanes.
 155     offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode.
 156
 157                                                             Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
 158     ======================================================= ===========================================================
 159
 160 Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
 161 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
 162
 163 Examples:
 164
 165 .. parsed-literal::
 166
 167   offset:255
 168   offset:0xffff
 169   offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
 170   offset:swizzle(BITMASK_PERM, "01pi0")
 171   offset:swizzle(BROADCAST, 2, 0)
 172   offset:swizzle(SWAP, 8)
 173   offset:swizzle(REVERSE, 30 + 2)
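
The bitmask permute description above can be illustrated with a short Python
sketch that applies a 5-character *mask* to a lane *id*. The function name
*apply_bitmask_perm* is hypothetical, and the sketch assumes that the leftmost
mask character controls the most significant bit of the lane *id*; see AMD
documentation for the authoritative definition.

```python
def apply_bitmask_perm(mask: str, lane_id: int) -> int:
    """Transform a 5-bit lane id according to a BITMASK_PERM mask string.

    Assumption (not stated in this document): the leftmost character of
    the mask controls bit 4, the most significant bit of the lane id.
    """
    if len(mask) != 5 or any(c not in "01pi" for c in mask):
        raise ValueError("mask must be 5 characters drawn from '0', '1', 'p', 'i'")
    if not 0 <= lane_id < 32:
        raise ValueError("lane id must fit in 5 bits")

    result = 0
    for pos, ch in enumerate(mask):
        bit_index = 4 - pos                  # leftmost character -> bit 4
        bit = (lane_id >> bit_index) & 1     # current value of that bit
        if ch == "0":
            new_bit = 0                      # set bit to 0
        elif ch == "1":
            new_bit = 1                      # set bit to 1
        elif ch == "p":
            new_bit = bit                    # preserve bit
        else:
            new_bit = bit ^ 1                # "i": invert bit
        result |= new_bit << bit_index
    return result
```

Under these assumptions, the mask "ppppp" leaves every lane *id* unchanged and
"iiiii" maps each lane *id* to its bitwise complement within 5 bits.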
 174
 175 .. _amdgpu_synid_gds:
 176
 177 gds
 178 ~~~
 179
 180 Specifies whether to use GDS or LDS memory (LDS is the default).
 181
 182     ======================================== ================================================
 183     Syntax                                   Description
 184     ======================================== ================================================
 185     gds                                      Use GDS memory.
 186     ======================================== ================================================
 187
 188
 189 EXP Modifiers
 190 -------------
 191
 192 .. _amdgpu_synid_done:
 193
 194 done
 195 ~~~~
 196
 197 Specifies if this is the last export from the shader to the target. By default,
 198 the *exp* instruction does not finish an export sequence.
 199
 200     ======================================== ================================================
 201     Syntax                                   Description
 202     ======================================== ================================================
 203     done                                     Indicates the last export operation.
 204     ======================================== ================================================
 205
 206 .. _amdgpu_synid_compr:
 207
 208 compr
 209 ~~~~~
 210
 211 Indicates if the data are compressed (data are not compressed by default).
 212
 213     ======================================== ================================================
 214     Syntax                                   Description
 215     ======================================== ================================================
 216     compr                                    Data are compressed.
 217     ======================================== ================================================
 218
 219 .. _amdgpu_synid_vm:
 220
 221 vm
 222 ~~
 223
 224 Specifies valid mask flag state (off by default).
 225
 226     ======================================== ================================================
 227     Syntax                                   Description
 228     ======================================== ================================================
 229     vm                                       Set valid mask flag.
 230     ======================================== ================================================
 231
 232 FLAT Modifiers
 233 --------------
 234
 235 .. _amdgpu_synid_flat_offset12:
 236
 237 offset12
 238 ~~~~~~~~
 239
 240 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
 241
 242 Cannot be used with *global/scratch* opcodes. GFX9 only.
 243
 244     ================= ====================================================================
 245     Syntax            Description
 246     ================= ====================================================================
 247     offset:{0..4095}  Specifies a 12-bit unsigned offset as a positive
 248                       :ref:`integer number <amdgpu_synid_integer_number>`
 249                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 250     ================= ====================================================================
 251
 252 Examples:
 253
 254 .. parsed-literal::
 255
 256   offset:4095
 257   offset:x-0xff
 258
 259 .. _amdgpu_synid_flat_offset13s:
 260
 261 offset13s
 262 ~~~~~~~~~
 263
 264 Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
 265
 266 Can be used with *global/scratch* opcodes only. GFX9 only.
 267
 268     ===================== ====================================================================
 269     Syntax                Description
 270     ===================== ====================================================================
 271     offset:{-4096..4095}  Specifies a 13-bit signed offset as an
 272                           :ref:`integer number <amdgpu_synid_integer_number>`
 273                           or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 274     ===================== ====================================================================
 275
 276 Examples:
 277
 278 .. parsed-literal::
 279
 280   offset:-4000
 281   offset:0x10
 282   offset:-x
 283
 284 .. _amdgpu_synid_flat_offset12s:
 285
 286 offset12s
 287 ~~~~~~~~~
 288
 289 Specifies an immediate signed 12-bit offset, in bytes. The default value is 0.
 290
 291 Can be used with *global/scratch* opcodes only.
 292
 293 GFX10 only.
 294
 295     ===================== ====================================================================
 296     Syntax                Description
 297     ===================== ====================================================================
 298     offset:{-2048..2047}  Specifies a 12-bit signed offset as an
 299                           :ref:`integer number <amdgpu_synid_integer_number>`
 300                           or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 301     ===================== ====================================================================
 302
 303 Examples:
 304
 305 .. parsed-literal::
 306
 307   offset:-2000
 308   offset:0x10
 309   offset:-x+y
 310
 311 .. _amdgpu_synid_flat_offset11:
 312
 313 offset11
 314 ~~~~~~~~
 315
 316 Specifies an immediate unsigned 11-bit offset, in bytes. The default value is 0.
 317
 318 Cannot be used with *global/scratch* opcodes.
 319
 320 GFX10 only.
 321
 322     ================= ====================================================================
 323     Syntax            Description
 324     ================= ====================================================================
 325     offset:{0..2047}  Specifies an 11-bit unsigned offset as a positive
 326                       :ref:`integer number <amdgpu_synid_integer_number>`
 327                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 328     ================= ====================================================================
 329
 330 Examples:
 331
 332 .. parsed-literal::
 333
 334   offset:2047
 335   offset:x+0xff
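
The ranges listed for the FLAT offset modifiers above follow directly from the
bit width and signedness of each field. The following Python sketch shows the
arithmetic; the helper names *offset_range* and *offset_fits* are hypothetical
and are not part of the assembler.

```python
from typing import Tuple


def offset_range(bits: int, signed: bool) -> Tuple[int, int]:
    """Return the inclusive (min, max) range of an immediate offset field."""
    if bits <= 0:
        raise ValueError("bit width must be positive")
    if signed:
        # Two's complement: one bit is used for the sign.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def offset_fits(value: int, bits: int, signed: bool) -> bool:
    """Check whether a value can be encoded in the given offset field."""
    lo, hi = offset_range(bits, signed)
    return lo <= value <= hi
```

For example, *offset_range(13, True)* gives the offset13s range and
*offset_range(11, False)* gives the offset11 range.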
 336
 337 dlc
 338 ~~~
 339
 340 See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
 341
 342 glc
 343 ~~~
 344
 345 See a description :ref:`here<amdgpu_synid_glc>`.
 346
 347 lds
 348 ~~~
 349
 350 See a description :ref:`here<amdgpu_synid_lds>`. GFX10 only.
 351
 352 slc
 353 ~~~
 354
 355 See a description :ref:`here<amdgpu_synid_slc>`.
 356
 357 tfe
 358 ~~~
 359
 360 See a description :ref:`here<amdgpu_synid_tfe>`.
 361
 362 nv
 363 ~~
 364
 365 See a description :ref:`here<amdgpu_synid_nv>`.
 366
 367 sc0
 368 ~~~
 369
 370 See a description :ref:`here<amdgpu_synid_sc0>`.
 371
 372 sc1
 373 ~~~
 374
 375 See a description :ref:`here<amdgpu_synid_sc1>`.
 376
 377 nt
 378 ~~
 379
 380 See a description :ref:`here<amdgpu_synid_nt>`.
 381
 382 MIMG Modifiers
 383 --------------
 384
 385 .. _amdgpu_synid_dmask:
 386
 387 dmask
 388 ~~~~~
 389
 390 Specifies which channels (image components) are used by the operation. By default, no channels
 391 are used.
 392
 393     =============== ====================================================================
 394     Syntax          Description
 395     =============== ====================================================================
 396     dmask:{0..15}   Specifies image channels as a positive
 397                     :ref:`integer number <amdgpu_synid_integer_number>`
 398                     or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 399
 400                     Each bit corresponds to one of 4 image components (RGBA).
 401
 402                     If a bit is 0, the component is not used;
 403                     if it is 1, the component is used.
 404     =============== ====================================================================
 405
 406 This modifier has some limitations depending on instruction kind:
 407
 408     =================================================== ========================
 409     Instruction Kind                                    Valid dmask Values
 410     =================================================== ========================
 411     32-bit atomic *cmpswap*                             0x3
 412     32-bit atomic instructions except for *cmpswap*     0x1
 413     64-bit atomic *cmpswap*                             0xF
 414     64-bit atomic instructions except for *cmpswap*     0x3
 415     *gather4*                                           0x1, 0x2, 0x4, 0x8
 416     Other instructions                                  any value
 417     =================================================== ========================
 418
 419 Examples:
 420
 421 .. parsed-literal::
 422
 423   dmask:0xf
 424   dmask:0b1111
 425   dmask:x|y|z
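
As a sketch of how a *dmask* value is composed, the Python below builds the
mask from channel names and checks the *gather4* restriction from the table
above. The helper names are hypothetical, and the bit order (bit 0 for R
through bit 3 for A) is an assumption; this document only states that each bit
corresponds to one of the RGBA components.

```python
# Assumed bit positions: R = bit 0, G = bit 1, B = bit 2, A = bit 3.
CHANNEL_BITS = {"r": 0, "g": 1, "b": 2, "a": 3}


def dmask_from_channels(channels: str) -> int:
    """Build a 4-bit dmask value from a string of channel letters."""
    mask = 0
    for ch in channels.lower():
        if ch not in CHANNEL_BITS:
            raise ValueError("unknown channel: " + repr(ch))
        mask |= 1 << CHANNEL_BITS[ch]
    return mask


def is_valid_gather4_dmask(dmask: int) -> bool:
    """gather4 accepts exactly one enabled channel: 0x1, 0x2, 0x4 or 0x8."""
    return dmask in (0x1, 0x2, 0x4, 0x8)
```

With these assumptions, *dmask_from_channels("rgba")* yields 0xf, which matches
the first two examples above.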
 426
 427 .. _amdgpu_synid_unorm:
 428
 429 unorm
 430 ~~~~~
 431
 432 Specifies whether the address is normalized or not (the address is normalized by default).
 433
 434     ======================== ========================================
 435     Syntax                   Description
 436     ======================== ========================================
 437     unorm                    Force the address to be unnormalized.
 438     ======================== ========================================
 439
 440 glc
 441 ~~~
 442
 443 See a description :ref:`here<amdgpu_synid_glc>`.
 444
 445 slc
 446 ~~~
 447
 448 See a description :ref:`here<amdgpu_synid_slc>`.
 449
 450 .. _amdgpu_synid_r128:
 451
 452 r128
 453 ~~~~
 454
 455 Specifies texture resource size. The default size is 256 bits.
 456
 457 GFX7, GFX8 and GFX10 only.
 458
 459     =================== ================================================
 460     Syntax              Description
 461     =================== ================================================
 462     r128                Specifies a 128-bit texture resource size.
 463     =================== ================================================
 464
 465 .. WARNING:: Using this modifier should decrease the *rsrc* operand size from 8 to 4 dwords, but the assembler does not currently support this feature.
 466
 467 tfe
 468 ~~~
 469
 470 See a description :ref:`here<amdgpu_synid_tfe>`.
 471
 472 .. _amdgpu_synid_lwe:
 473
 474 lwe
 475 ~~~
 476
 477 Specifies LOD warning status (LOD warning is disabled by default).
 478
 479     ======================================== ================================================
 480     Syntax                                   Description
 481     ======================================== ================================================
 482     lwe                                      Enables LOD warning.
 483     ======================================== ================================================
 484
 485 .. _amdgpu_synid_da:
 486
 487 da
 488 ~~
 489
 490 Specifies if an array index must be sent to TA. By default, the array index is not sent.
 491
 492     ======================================== ================================================
 493     Syntax                                   Description
 494     ======================================== ================================================
 495     da                                       Send an array-index to TA.
 496     ======================================== ================================================
 497
 498 .. _amdgpu_synid_d16:
 499
 500 d16
 501 ~~~
 502
 503 Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
 504
 505     ======================================== ================================================
 506     Syntax                                   Description
 507     ======================================== ================================================
 508     d16                                      Enables 16-bit data mode.
 509
 510                                              On loads, convert data in memory to 16-bit
 511                                              format before storing it in VGPRs.
 512
 513                                              For stores, convert 16-bit data in VGPRs to
 514                                              32 bits before going to memory.
 515
 516                                              Note that GFX8.0 does not support data packing.
 517                                              Each 16-bit data element occupies 1 VGPR.
 518
 519                                              GFX8.1, GFX9 and GFX10 support data packing.
 520                                              Each pair of 16-bit data elements
 521                                              occupies 1 VGPR.
 522     ======================================== ================================================
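
The effect of data packing on register usage can be sketched as follows. The
helper name *d16_vgpr_count* is hypothetical; it only restates the rule from
the table above (one 16-bit element per VGPR without packing, two per VGPR
with packing).

```python
def d16_vgpr_count(num_elements: int, packed: bool) -> int:
    """Number of VGPRs needed to hold 16-bit data elements in d16 mode."""
    if num_elements < 0:
        raise ValueError("element count must be non-negative")
    if packed:
        # Two 16-bit elements share one VGPR; an odd element needs its own.
        return (num_elements + 1) // 2
    # Without packing, each 16-bit element occupies a whole VGPR.
    return num_elements
```

For four 16-bit elements this gives 4 VGPRs without packing and 2 VGPRs with
packing.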
 523
 524 .. _amdgpu_synid_a16:
 525
 526 a16
 527 ~~~
 528
 529 Specifies size of image address components: 16 or 32 bits (32 bits by default).
 530 GFX9 and GFX10 only.
 531
 532     ======================================== ================================================
 533     Syntax                                   Description
 534     ======================================== ================================================
 535     a16                                      Enables 16-bit image address components.
 536     ======================================== ================================================
 537
 538 .. _amdgpu_synid_dim:
 539
 540 dim
 541 ~~~
 542
 543 Specifies surface dimension. This is a mandatory modifier. There is no default value.
 544
 545 GFX10 only.
 546
 547     =============================== =========================================================
 548     Syntax                          Description
 549     =============================== =========================================================
 550     dim:1D                          One-dimensional image.
 551     dim:2D                          Two-dimensional image.
 552     dim:3D                          Three-dimensional image.
 553     dim:CUBE                        Cubemap array.
 554     dim:1D_ARRAY                    One-dimensional image array.
 555     dim:2D_ARRAY                    Two-dimensional image array.
 556     dim:2D_MSAA                     Two-dimensional multi-sample anti-aliasing image.
 557     dim:2D_MSAA_ARRAY               Two-dimensional multi-sample anti-aliasing image array.
 558     =============================== =========================================================
 559
 560 The following table defines an alternative syntax which is supported
 561 for compatibility with SP3 assembler:
 562
 563     =============================== =========================================================
 564     Syntax                          Description
 565     =============================== =========================================================
 566     dim:SQ_RSRC_IMG_1D              One-dimensional image.
 567     dim:SQ_RSRC_IMG_2D              Two-dimensional image.
 568     dim:SQ_RSRC_IMG_3D              Three-dimensional image.
 569     dim:SQ_RSRC_IMG_CUBE            Cubemap array.
 570     dim:SQ_RSRC_IMG_1D_ARRAY        One-dimensional image array.
 571     dim:SQ_RSRC_IMG_2D_ARRAY        Two-dimensional image array.
 572     dim:SQ_RSRC_IMG_2D_MSAA         Two-dimensional multi-sample anti-aliasing image.
 573     dim:SQ_RSRC_IMG_2D_MSAA_ARRAY   Two-dimensional multi-sample anti-aliasing image array.
 574     =============================== =========================================================
 575
 576 dlc
 577 ~~~
 578
 579 See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
 580
 581 Miscellaneous Modifiers
 582 -----------------------
 583
 584 .. _amdgpu_synid_dlc:
 585
 586 dlc
 587 ~~~
 588
 589 Controls device level cache policy for memory operations. Used for synchronization.
 590 When specified, forces operation to bypass device level cache making the operation device
 591 level coherent. By default, instructions use device level cache.
 592
 593 GFX10 only.
 594
 595     ======================================== ================================================
 596     Syntax                                   Description
 597     ======================================== ================================================
 598     dlc                                      Bypass device level cache.
 599     ======================================== ================================================
 600
 601 .. _amdgpu_synid_glc:
 602
 603 glc
 604 ~~~
 605
 606 This modifier has a different meaning for loads, stores, and atomic operations.
 607 The default value is off (0).
 608
 609 See AMD documentation for details.
 610
 611     ======================================== ================================================
 612     Syntax                                   Description
 613     ======================================== ================================================
 614     glc                                      Set glc bit to 1.
 615     ======================================== ================================================
 616
 617 .. _amdgpu_synid_lds:
 618
 619 lds
 620 ~~~
 621
 622 Specifies where to store the result: VGPRs or LDS (VGPRs by default).
 623
 624     ======================================== ===========================
 625     Syntax                                   Description
 626     ======================================== ===========================
 627     lds                                      Store result in LDS.
 628     ======================================== ===========================
 629
 630 .. _amdgpu_synid_nv:
 631
 632 nv
 633 ~~
 634
 635 Specifies if the instruction operates on non-volatile memory. By default, memory is volatile.
 636
 637 GFX9 only.
 638
 639     ======================================== ================================================
 640     Syntax                                   Description
 641     ======================================== ================================================
 642     nv                                       Indicates that instruction operates on
 643                                              non-volatile memory.
 644     ======================================== ================================================
 645
 646 .. _amdgpu_synid_slc:
 647
 648 slc
 649 ~~~
 650
 651 Specifies cache policy. The default value is off (0).
 652
 653 See AMD documentation for details.
 654
 655     ======================================== ================================================
 656     Syntax                                   Description
 657     ======================================== ================================================
 658     slc                                      Set slc bit to 1.
 659     ======================================== ================================================
 660
 661 .. _amdgpu_synid_tfe:
 662
 663 tfe
 664 ~~~
 665
 666 Controls access to partially resident textures. The default value is off (0).
 667
 668 See AMD documentation for details.
 669
 670     ======================================== ================================================
 671     Syntax                                   Description
 672     ======================================== ================================================
 673     tfe                                      Set tfe bit to 1.
 674     ======================================== ================================================
 675
 676 .. _amdgpu_synid_sc0:
 677
 678 sc0
 679 ~~~
 680
 681 For atomics, sc0 indicates that the atomic operation returns a value.
 682 For other opcodes, it is used together with :ref:`sc1<amdgpu_synid_sc1>` to specify cache
 683 policy. See AMD documentation for details.
 684
 685     ======================================== ================================================
 686     Syntax                                   Description
 687     ======================================== ================================================
 688     sc0                                      Set sc0 bit to 1.
 689     ======================================== ================================================
 690
 691 .. _amdgpu_synid_sc1:
 692
 693 sc1
 694 ~~~
 695
 696 This modifier is used together with :ref:`sc0<amdgpu_synid_sc0>` to specify cache
 697 policy.
 698
 699     ======================================== ================================================
 700     Syntax                                   Description
 701     ======================================== ================================================
 702     sc1                                      Set sc1 bit to 1.
 703     ======================================== ================================================
 704
 705 .. _amdgpu_synid_nt:
 706
 707 nt
 708 ~~
 709
 710 Indicates an operation with non-temporal data.
 711
 712     ======================================== ================================================
 713     Syntax                                   Description
 714     ======================================== ================================================
 715     nt                                       Set nt bit to 1.
 716     ======================================== ================================================
 717
 718 MUBUF/MTBUF Modifiers
 719 ---------------------
 720
 721 .. _amdgpu_synid_idxen:
 722
 723 idxen
 724 ~~~~~
 725
 726 Specifies whether address components include an index. By default, no components are used.
 727
 728 Can be used together with :ref:`offen<amdgpu_synid_offen>`.
 729
 730 Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 731
 732     ======================================== ================================================
 733     Syntax                                   Description
 734     ======================================== ================================================
 735     idxen                                    Address components include an index.
 736     ======================================== ================================================
 737
 738 .. _amdgpu_synid_offen:
 739
 740 offen
 741 ~~~~~
 742
 743 Specifies whether address components include an offset. By default, no components are used.
 744
 745 Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.
 746
 747 Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
 748
 749     ======================================== ================================================
 750     Syntax                                   Description
 751     ======================================== ================================================
 752     offen                                    Address components include an offset.
 753     ======================================== ================================================
 754
 755 .. _amdgpu_synid_addr64:
 756
 757 addr64
 758 ~~~~~~
 759
 760 Specifies whether a 64-bit address is used. By default, no address is used.
 761
 762 GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
 763 :ref:`idxen<amdgpu_synid_idxen>` modifiers.
 764
 765     ======================================== ================================================
 766     Syntax                                   Description
 767     ======================================== ================================================
 768     addr64                                   A 64-bit address is used.
 769     ======================================== ================================================
 770
 771 .. _amdgpu_synid_buf_offset12:
 772
 773 offset12
 774 ~~~~~~~~
 775
 776 Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
 777
 778     ================== ====================================================================
 779     Syntax             Description
 780     ================== ====================================================================
 781     offset:{0..0xFFF}  Specifies a 12-bit unsigned offset as a positive
 782                        :ref:`integer number <amdgpu_synid_integer_number>`
 783                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 784     ================== ====================================================================
 785
 786 Examples:
 787
 788 .. parsed-literal::
 789
 790   offset:x+y
 791   offset:0x10
 792
 793 glc
 794 ~~~
 795
 796 See a description :ref:`here<amdgpu_synid_glc>`.
 797
 798 slc
 799 ~~~
 800
 801 See a description :ref:`here<amdgpu_synid_slc>`.
 802
 803 lds
 804 ~~~
 805
 806 See a description :ref:`here<amdgpu_synid_lds>`.
 807
 808 dlc
 809 ~~~
 810
 811 See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
 812
 813 tfe
 814 ~~~
 815
 816 See a description :ref:`here<amdgpu_synid_tfe>`.
 817
 818 .. _amdgpu_synid_fmt:
 819
 820 fmt
 821 ~~~
 822
 823 Specifies data and numeric formats used by the operation.
 824 The default numeric format is BUF_NUM_FORMAT_UNORM.
 825 The default data format is BUF_DATA_FORMAT_8.
 826
 827     ========================================= ===============================================================
 828     Syntax                                    Description
 829     ========================================= ===============================================================
 830     format:{0..127}                           Use format specified as either an
 831                                               :ref:`integer number<amdgpu_synid_integer_number>` or an
 832                                               :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 833     format:[<data format>]                    Use the specified data format and
 834                                               default numeric format.
 835     format:[<numeric format>]                 Use the specified numeric format and
 836                                               default data format.
 837     format:[<data format>, <numeric format>]  Use the specified data and numeric formats.
 838     format:[<numeric format>, <data format>]  Use the specified data and numeric formats.
 839     ========================================= ===============================================================
 840
 841 .. _amdgpu_synid_format_data:
 842
 843 Supported data formats are defined in the following table:
 844
 845     ========================================= ===============================
 846     Syntax                                    Note
 847     ========================================= ===============================
 848     BUF_DATA_FORMAT_INVALID
 849     BUF_DATA_FORMAT_8                         Default value.
 850     BUF_DATA_FORMAT_16
 851     BUF_DATA_FORMAT_8_8
 852     BUF_DATA_FORMAT_32
 853     BUF_DATA_FORMAT_16_16
 854     BUF_DATA_FORMAT_10_11_11
 855     BUF_DATA_FORMAT_11_11_10
 856     BUF_DATA_FORMAT_10_10_10_2
 857     BUF_DATA_FORMAT_2_10_10_10
 858     BUF_DATA_FORMAT_8_8_8_8
 859     BUF_DATA_FORMAT_32_32
 860     BUF_DATA_FORMAT_16_16_16_16
 861     BUF_DATA_FORMAT_32_32_32
 862     BUF_DATA_FORMAT_32_32_32_32
 863     BUF_DATA_FORMAT_RESERVED_15
 864     ========================================= ===============================
 865
 866 .. _amdgpu_synid_format_num:
 867
 868 Supported numeric formats are defined below:
 869
 870     ========================================= ===============================
 871     Syntax                                    Note
 872     ========================================= ===============================
 873     BUF_NUM_FORMAT_UNORM                      Default value.
 874     BUF_NUM_FORMAT_SNORM
 875     BUF_NUM_FORMAT_USCALED
 876     BUF_NUM_FORMAT_SSCALED
 877     BUF_NUM_FORMAT_UINT
 878     BUF_NUM_FORMAT_SINT
 879     BUF_NUM_FORMAT_SNORM_OGL                  GFX7 only.
 880     BUF_NUM_FORMAT_RESERVED_6                 GFX8 and GFX9 only.
 881     BUF_NUM_FORMAT_FLOAT
 882     ========================================= ===============================
 883
 884 Examples:
 885
 886 .. parsed-literal::
 887
 888   format:0
 889   format:127
 890   format:[BUF_DATA_FORMAT_16]
 891   format:[BUF_DATA_FORMAT_16,BUF_NUM_FORMAT_SSCALED]
 892   format:[BUF_NUM_FORMAT_FLOAT]
 893
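The numeric form *format:{0..127}* has room for exactly 16 data formats times 8 numeric formats.
The sketch below assumes, purely for illustration, that the data format occupies the low 4 bits,
that the numeric format occupies the next 3 bits, and that the tables above list the formats in
encoding order. None of these assumptions is stated in this document, and the helper names are hypothetical.

```python
DATA_FORMATS = [
    "INVALID", "8", "16", "8_8", "32", "16_16", "10_11_11", "11_11_10",
    "10_10_10_2", "2_10_10_10", "8_8_8_8", "32_32", "16_16_16_16",
    "32_32_32", "32_32_32_32", "RESERVED_15",
]
# Assumption: SNORM_OGL (GFX7) and RESERVED_6 (GFX8/GFX9) share slot 6.
NUM_FORMATS = ["UNORM", "SNORM", "USCALED", "SSCALED", "UINT", "SINT",
               "RESERVED_6", "FLOAT"]


def encode_format(data, num):
    """Pack a (data, numeric) pair into a 7-bit number (assumed layout)."""
    return DATA_FORMATS.index(data) | (NUM_FORMATS.index(num) << 4)


def decode_format(value):
    """Inverse of encode_format for values in 0..127."""
    if not 0 <= value <= 127:
        raise ValueError("format number out of range")
    return DATA_FORMATS[value & 0xF], NUM_FORMATS[value >> 4]


print(encode_format("16", "SSCALED"))  # 2 | (3 << 4) = 50
print(decode_format(127))              # ('RESERVED_15', 'FLOAT')
```
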
 894 .. _amdgpu_synid_ufmt:
 895
 896 ufmt
 897 ~~~~
 898
 899 Specifies a unified format used by the operation.
 900 The default format is BUF_FMT_8_UNORM.
 901 GFX10 only.
 902
 903     ========================================= ===============================================================
 904     Syntax                                    Description
 905     ========================================= ===============================================================
 906     format:{0..127}                           Use unified format specified as either an
 907                                               :ref:`integer number<amdgpu_synid_integer_number>` or an
 908                                               :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
 909                                               Note that unified format numbers are not compatible with
 910                                               format numbers used for pre-GFX10 ISA.
 911     format:[<unified format>]                 Use the specified unified format.
 912     ========================================= ===============================================================
 913
 914 Unified format is a replacement for :ref:`data<amdgpu_synid_format_data>`
 915 and :ref:`numeric<amdgpu_synid_format_num>` formats. For compatibility with older ISA,
 916 :ref:`syntax with data and numeric formats<amdgpu_synid_fmt>` is still accepted
 917 provided that the combination of formats can be mapped to a unified format.
 918
 919 Supported unified formats and equivalent combinations of data and numeric formats
 920 are defined below:
 921
 922     ============================== ============================== =============================
 923     Syntax                         Equivalent Data Format         Equivalent Numeric Format
 924     ============================== ============================== =============================
 925     BUF_FMT_INVALID                BUF_DATA_FORMAT_INVALID        BUF_NUM_FORMAT_UNORM
 926
 927     BUF_FMT_8_UNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UNORM
 928     BUF_FMT_8_SNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SNORM
 929     BUF_FMT_8_USCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_USCALED
 930     BUF_FMT_8_SSCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SSCALED
 931     BUF_FMT_8_UINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UINT
 932     BUF_FMT_8_SINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SINT
 933
 934     BUF_FMT_16_UNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UNORM
 935     BUF_FMT_16_SNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SNORM
 936     BUF_FMT_16_USCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_USCALED
 937     BUF_FMT_16_SSCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SSCALED
 938     BUF_FMT_16_UINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UINT
 939     BUF_FMT_16_SINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SINT
 940     BUF_FMT_16_FLOAT               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_FLOAT
 941
 942     BUF_FMT_8_8_UNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UNORM
 943     BUF_FMT_8_8_SNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SNORM
 944     BUF_FMT_8_8_USCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_USCALED
 945     BUF_FMT_8_8_SSCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SSCALED
 946     BUF_FMT_8_8_UINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UINT
 947     BUF_FMT_8_8_SINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SINT
 948
 949     BUF_FMT_32_UINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_UINT
 950     BUF_FMT_32_SINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_SINT
 951     BUF_FMT_32_FLOAT               BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_FLOAT
 952
 953     BUF_FMT_16_16_UNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UNORM
 954     BUF_FMT_16_16_SNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SNORM
 955     BUF_FMT_16_16_USCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_USCALED
 956     BUF_FMT_16_16_SSCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SSCALED
 957     BUF_FMT_16_16_UINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UINT
 958     BUF_FMT_16_16_SINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SINT
 959     BUF_FMT_16_16_FLOAT            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_FLOAT
 960
 961     BUF_FMT_10_11_11_UNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UNORM
 962     BUF_FMT_10_11_11_SNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SNORM
 963     BUF_FMT_10_11_11_USCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_USCALED
 964     BUF_FMT_10_11_11_SSCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SSCALED
 965     BUF_FMT_10_11_11_UINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UINT
 966     BUF_FMT_10_11_11_SINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SINT
 967     BUF_FMT_10_11_11_FLOAT         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_FLOAT
 968
 969     BUF_FMT_11_11_10_UNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UNORM
 970     BUF_FMT_11_11_10_SNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SNORM
 971     BUF_FMT_11_11_10_USCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_USCALED
 972     BUF_FMT_11_11_10_SSCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SSCALED
 973     BUF_FMT_11_11_10_UINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UINT
 974     BUF_FMT_11_11_10_SINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SINT
 975     BUF_FMT_11_11_10_FLOAT         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_FLOAT
 976
 977     BUF_FMT_10_10_10_2_UNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UNORM
 978     BUF_FMT_10_10_10_2_SNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SNORM
 979     BUF_FMT_10_10_10_2_USCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_USCALED
 980     BUF_FMT_10_10_10_2_SSCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SSCALED
 981     BUF_FMT_10_10_10_2_UINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UINT
 982     BUF_FMT_10_10_10_2_SINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SINT
 983
 984     BUF_FMT_2_10_10_10_UNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UNORM
 985     BUF_FMT_2_10_10_10_SNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SNORM
 986     BUF_FMT_2_10_10_10_USCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_USCALED
 987     BUF_FMT_2_10_10_10_SSCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SSCALED
 988     BUF_FMT_2_10_10_10_UINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UINT
 989     BUF_FMT_2_10_10_10_SINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SINT
 990
 991     BUF_FMT_8_8_8_8_UNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UNORM
 992     BUF_FMT_8_8_8_8_SNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SNORM
 993     BUF_FMT_8_8_8_8_USCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_USCALED
 994     BUF_FMT_8_8_8_8_SSCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SSCALED
 995     BUF_FMT_8_8_8_8_UINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UINT
 996     BUF_FMT_8_8_8_8_SINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SINT
 997
 998     BUF_FMT_32_32_UINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_UINT
 999     BUF_FMT_32_32_SINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_SINT
1000     BUF_FMT_32_32_FLOAT            BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_FLOAT
1001
1002     BUF_FMT_16_16_16_16_UNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UNORM
1003     BUF_FMT_16_16_16_16_SNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SNORM
1004     BUF_FMT_16_16_16_16_USCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_USCALED
1005     BUF_FMT_16_16_16_16_SSCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SSCALED
1006     BUF_FMT_16_16_16_16_UINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UINT
1007     BUF_FMT_16_16_16_16_SINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SINT
1008     BUF_FMT_16_16_16_16_FLOAT      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_FLOAT
1009
1010     BUF_FMT_32_32_32_UINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_UINT
1011     BUF_FMT_32_32_32_SINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_SINT
1012     BUF_FMT_32_32_32_FLOAT         BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_FLOAT
1013     BUF_FMT_32_32_32_32_UINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_UINT
1014     BUF_FMT_32_32_32_32_SINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_SINT
1015     BUF_FMT_32_32_32_32_FLOAT      BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_FLOAT
1016     ============================== ============================== =============================
1017
1018 Examples:
1019
1020 .. parsed-literal::
1021
1022   format:0
1023   format:[BUF_FMT_32_UINT]
1024
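In the table above, every unified format name other than BUF_FMT_INVALID consists of the data
format suffix followed by the numeric format suffix. A minimal sketch of that naming pattern
follows. The helper *split_unified_format* is hypothetical: it only decomposes names and does not
check that the resulting combination is actually listed in the table.

```python
def split_unified_format(name):
    """Split e.g. BUF_FMT_16_16_FLOAT into data and numeric format names."""
    prefix = "BUF_FMT_"
    if not name.startswith(prefix):
        raise ValueError(f"not a unified format name: {name}")
    body = name[len(prefix):]
    if body == "INVALID":
        # Special row of the table: the name carries no numeric suffix.
        return "BUF_DATA_FORMAT_INVALID", "BUF_NUM_FORMAT_UNORM"
    # The numeric format is the part after the last underscore.
    data, _, num = body.rpartition("_")
    if not data:
        raise ValueError(f"cannot split unified format name: {name}")
    return f"BUF_DATA_FORMAT_{data}", f"BUF_NUM_FORMAT_{num}"


print(split_unified_format("BUF_FMT_32_UINT"))
# ('BUF_DATA_FORMAT_32', 'BUF_NUM_FORMAT_UINT')
```
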
1025 SMRD/SMEM Modifiers
1026 -------------------
1027
1028 glc
1029 ~~~
1030
1031 See a description :ref:`here<amdgpu_synid_glc>`.
1032
1033 nv
1034 ~~
1035
1036 See a description :ref:`here<amdgpu_synid_nv>`. GFX9 only.
1037
1038 dlc
1039 ~~~
1040
1041 See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
1042
1043 .. _amdgpu_synid_smem_offset20u:
1044
1045 offset20u
1046 ~~~~~~~~~
1047
1048 Specifies an unsigned 20-bit offset, in bytes. The default value is 0.
1049
1050     ==================== ====================================================================
1051     Syntax               Description
1052     ==================== ====================================================================
1053     offset:{0..0xFFFFF}  Specifies an offset as a positive
1054                          :ref:`integer number <amdgpu_synid_integer_number>`
1055                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1056     ==================== ====================================================================
1057
1058 Examples:
1059
1060 .. parsed-literal::
1061
1062   offset:1
1063   offset:0xfffff
1064   offset:x-y
1065
1066 .. _amdgpu_synid_smem_offset21s:
1067
1068 offset21s
1069 ~~~~~~~~~
1070
1071 Specifies a signed 21-bit offset, in bytes. The default value is 0.
1072
1073     ============================= ====================================================================
1074     Syntax                        Description
1075     ============================= ====================================================================
1076     offset:{-0x100000..0xFFFFF}   Specifies an offset as an
1077                                   :ref:`integer number <amdgpu_synid_integer_number>`
1078                                   or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1079     ============================= ====================================================================
1080
1081 Examples:
1082
1083 .. parsed-literal::
1084
1085   offset:-1
1086   offset:0xfffff
1087   offset:-x
1088
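The bounds in the *offset21s* table follow from two's-complement arithmetic: a signed 21-bit field
covers -0x100000 (minus 2 to the power 20) through 0xFFFFF. The sketch below shows the range check
and the wrapped 21-bit image of a negative offset. The helper names are hypothetical, and the
encoding step is an assumption about how such a field would store a negative value.

```python
BITS = 21
LO = -(1 << (BITS - 1))      # -0x100000
HI = (1 << (BITS - 1)) - 1   # 0xFFFFF


def fits_offset21s(value):
    """True if value is representable as a signed 21-bit integer."""
    return LO <= value <= HI


def encode_offset21s(value):
    """Two's-complement image of the offset in a 21-bit field (assumed)."""
    if not fits_offset21s(value):
        raise ValueError("offset does not fit in 21 signed bits")
    return value & ((1 << BITS) - 1)


print(hex(LO), hex(HI))           # -0x100000 0xfffff
print(hex(encode_offset21s(-1)))  # 0x1fffff
```
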
1089 VINTRP Modifiers
1090 ----------------
1091
1092 .. _amdgpu_synid_high:
1093
1094 high
1095 ~~~~
1096
1097 Specifies which half of the LDS word to use. The low half of the LDS word is used by default.
1098 GFX9 and GFX10 only.
1099
1100     ======================================== ================================
1101     Syntax                                   Description
1102     ======================================== ================================
1103     high                                     Use high half of LDS word.
1104     ======================================== ================================
1105
1106 DPP8 Modifiers
1107 --------------
1108
1109 GFX10 only.
1110
1111 .. _amdgpu_synid_dpp8_sel:
1112
1113 dpp8_sel
1114 ~~~~~~~~
1115
1116 Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
1117 There is no default value.
1118
1119 GFX10 only.
1120
1121 The *dpp8_sel* modifier must specify exactly 8 values.
1122 The first value selects which lane to read from to supply data into lane 0.
1123 The second value controls lane 1, and so on.
1124
1125 Each value may be specified as either
1126 an :ref:`integer number<amdgpu_synid_integer_number>` or
1127 an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1128
1129     =============================================================== ===========================
1130     Syntax                                                          Description
1131     =============================================================== ===========================
1132     dpp8:[{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7}]  Select lanes to read from.
1133     =============================================================== ===========================
1134
1135 Examples:
1136
1137 .. parsed-literal::
1138
1139   dpp8:[7,6,5,4,3,2,1,0]
1140   dpp8:[0,1,0,1,0,1,0,1]
1141
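The selection rule described above can be modeled in a few lines of Python. This is a sketch of
the documented behavior only, not of the hardware: *dpp8_select* is a hypothetical helper, each
list element stands for one lane's value, and inactive lanes are ignored.

```python
def dpp8_select(src, sel):
    """Model dpp8:[...]: within each group of 8 lanes, lane i reads lane sel[i].

    src -- per-lane values; the length must be a multiple of 8
    sel -- the 8 selector values, each in 0..7
    """
    if len(sel) != 8 or any(not 0 <= s <= 7 for s in sel):
        raise ValueError("dpp8 needs exactly 8 values in 0..7")
    if len(src) % 8:
        raise ValueError("lane count must be a multiple of 8")
    out = []
    for lane in range(len(src)):
        group_base = lane - lane % 8   # first lane of this group of 8
        out.append(src[group_base + sel[lane % 8]])
    return out


print(dpp8_select(list(range(16)), [7, 6, 5, 4, 3, 2, 1, 0]))
# [7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8]
```
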
1142 .. _amdgpu_synid_fi8:
1143
1144 fi
1145 ~~
1146
1147 Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
1148
1149 Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
1150
1151 GFX10 only.
1152
1153     ==================================== =====================================================
1154     Syntax                               Description
1155     ==================================== =====================================================
1156     fi:0                                 Fetch zero when accessing data from inactive lanes.
1157     fi:1                                 Fetch pre-existing values from inactive lanes.
1158     ==================================== =====================================================
1159
1160 Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1161 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1162
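The two *fi* settings for *dpp8* can be read as the following per-lane rule. The sketch assumes
that a "pre-existing value" is whatever the inactive lane's source register currently holds.
*fetch_lane* is a hypothetical helper, and the exec mask is modeled as a plain integer bit mask.

```python
def fetch_lane(values, exec_mask, lane, fi):
    """Value seen when reading `lane` under the given fi setting."""
    active = (exec_mask >> lane) & 1
    if active or fi == 1:
        return values[lane]   # active lane, or fi:1 returns the existing value
    return 0                  # fi:0 reads zero from an inactive lane


values = [10, 11, 12, 13]
exec_mask = 0b0101            # lanes 0 and 2 are active

print(fetch_lane(values, exec_mask, 1, fi=0))  # 0
print(fetch_lane(values, exec_mask, 1, fi=1))  # 11
```
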
1163 DPP Modifiers
1164 -------------
1165
1166 GFX8, GFX9 and GFX10 only.
1167
1168 .. _amdgpu_synid_dpp_ctrl:
1169
1170 dpp_ctrl
1171 ~~~~~~~~
1172
1173 Specifies how data are shared between threads. This is a mandatory modifier.
1174 There is no default value.
1175
1176 GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
1177
1178 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1179
1180     ======================================== ================================================
1181     Syntax                                   Description
1182     ======================================== ================================================
1183     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1184     row_mirror                               Mirror threads within row.
1185     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1186     row_bcast:15                             Broadcast 15th thread of each row to next row.
1187     row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
1188     wave_shl:1                               Wavefront left shift by 1 thread.
1189     wave_rol:1                               Wavefront left rotate by 1 thread.
1190     wave_shr:1                               Wavefront right shift by 1 thread.
1191     wave_ror:1                               Wavefront right rotate by 1 thread.
1192     row_shl:{1..15}                          Row shift left by 1-15 threads.
1193     row_shr:{1..15}                          Row shift right by 1-15 threads.
1194     row_ror:{1..15}                          Row rotate right by 1-15 threads.
1195     ======================================== ================================================
1196
1197 Note: numeric values may be specified as either
1198 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1199 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1200
1201 Examples:
1202
1203 .. parsed-literal::
1204
1205   quad_perm:[0, 1, 2, 3]
1206   row_shl:3
1207
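As an illustration of *quad_perm*, the sketch below models the permutation on a list of per-lane
values. It assumes that entry *i* of the list names the lane, within the same group of 4, that
lane *i* reads from; under that reading *quad_perm:[0, 1, 2, 3]* leaves the data unchanged.
*quad_perm* as a Python function is a hypothetical helper, not part of any tool.

```python
def quad_perm(src, perm):
    """Model quad_perm:[a,b,c,d] on a list of per-lane values.

    Assumption: lane i of each quad reads from lane perm[i] of the same quad.
    """
    if len(perm) != 4 or any(not 0 <= p <= 3 for p in perm):
        raise ValueError("quad_perm needs exactly 4 values in 0..3")
    if len(src) % 4:
        raise ValueError("lane count must be a multiple of 4")
    return [src[lane - lane % 4 + perm[lane % 4]] for lane in range(len(src))]


print(quad_perm(list(range(8)), [0, 1, 2, 3]))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(quad_perm(list(range(8)), [1, 0, 3, 2]))  # [1, 0, 3, 2, 5, 4, 7, 6]
```
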
1208 .. _amdgpu_synid_dpp16_ctrl:
1209
1210 dpp16_ctrl
1211 ~~~~~~~~~~
1212
1213 Specifies how data are shared between threads. This is a mandatory modifier.
1214 There is no default value.
1215
1216 GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
1217
1218 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1219 (There are only two rows in *wave32* mode.)
1220
1221     ======================================== ====================================================
1222     Syntax                                   Description
1223     ======================================== ====================================================
1224     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1225     row_mirror                               Mirror threads within row.
1226     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1227     row_share:{0..15}                        Share the value from the specified lane with other
1228                                              lanes in the row.
1229     row_xmask:{0..15}                        Fetch from XOR(current lane id, specified lane id).
1230     row_shl:{1..15}                          Row shift left by 1-15 threads.
1231     row_shr:{1..15}                          Row shift right by 1-15 threads.
1232     row_ror:{1..15}                          Row rotate right by 1-15 threads.
1233     ======================================== ====================================================
1234
1235 Note: numeric values may be specified as either
1236 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1237 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1238
1239 Examples:
1240
1241 .. parsed-literal::
1242
1243   quad_perm:[0, 1, 2, 3]
1244   row_shl:3
1245
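The *row_share* and *row_xmask* entries of the table can be sketched as follows. The model takes a
row to be 16 lanes, which follows from a half row being 8 threads; both function names are
hypothetical, and masking and inactive lanes are not modeled.

```python
ROW = 16  # lanes per row (a half row is 8 threads)


def row_share(src, n):
    """Every lane reads the value of lane n of its own row."""
    if not 0 <= n < ROW or len(src) % ROW:
        raise ValueError("expected n in 0..15 and whole rows of 16 lanes")
    return [src[lane - lane % ROW + n] for lane in range(len(src))]


def row_xmask(src, n):
    """Every lane reads from XOR(current lane id, n); this stays within the row."""
    if not 0 <= n < ROW or len(src) % ROW:
        raise ValueError("expected n in 0..15 and whole rows of 16 lanes")
    return [src[lane ^ n] for lane in range(len(src))]


lanes = list(range(32))
print(row_share(lanes, 3)[:4], row_share(lanes, 3)[16:20])  # [3, 3, 3, 3] [19, 19, 19, 19]
print(row_xmask(lanes, 1)[:4])                              # [1, 0, 3, 2]
```
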
1246 .. _amdgpu_synid_dpp32_ctrl:
1247
1248 dpp32_ctrl
1249 ~~~~~~~~~~
1250
1251 Specifies how data are shared between threads. This is a mandatory modifier.
1252 There is no default value.
1253
1254 May be used only with GFX90A 32-bit instructions.
1255
1256 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1257
1258     ======================================== ==================================================
1259     Syntax                                   Description
1260     ======================================== ==================================================
1261     quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1262     row_mirror                               Mirror threads within row.
1263     row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1264     row_bcast:15                             Broadcast 15th thread of each row to next row.
1265     row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
1266     wave_shl:1                               Wavefront left shift by 1 thread.
1267     wave_rol:1                               Wavefront left rotate by 1 thread.
1268     wave_shr:1                               Wavefront right shift by 1 thread.
1269     wave_ror:1                               Wavefront right rotate by 1 thread.
1270     row_shl:{1..15}                          Row shift left by 1-15 threads.
1271     row_shr:{1..15}                          Row shift right by 1-15 threads.
1272     row_ror:{1..15}                          Row rotate right by 1-15 threads.
1273     row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
1274     ======================================== ==================================================
1275
1276 Note: numeric values may be specified as either
1277 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1278 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1279
1280 Examples:
1281
1282 .. parsed-literal::
1283
1284   quad_perm:[0, 1, 2, 3]
1285   row_shl:3
1286
1287
1288 .. _amdgpu_synid_dpp64_ctrl:
1289
1290 dpp64_ctrl
1291 ~~~~~~~~~~
1292
1293 Specifies how data are shared between threads. This is a mandatory modifier.
1294 There is no default value.
1295
1296 May be used only with GFX90A 64-bit instructions.
1297
1298 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1299
1300     ======================================== ==================================================
1301     Syntax                                   Description
1302     ======================================== ==================================================
1303     row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
1304     ======================================== ==================================================
1305
1306 Note: numeric values may be specified as either
1307 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1308 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1309
1310 Examples:
1311
1312 .. parsed-literal::
1313
1314   row_newbcast:3
1315
1316
1317 .. _amdgpu_synid_row_mask:
1318
1319 row_mask
1320 ~~~~~~~~
1321
1322 Controls which rows are enabled for data sharing. By default, all rows are enabled.
1323
1324 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1325 (There are only two rows in *wave32* mode.)
1326
1327     ================= ====================================================================
1328     Syntax            Description
1329     ================= ====================================================================
1330     row_mask:{0..15}  Specifies a *row mask* as a positive
1331                       :ref:`integer number <amdgpu_synid_integer_number>`
1332                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1333
1334                       Each of 4 bits in the mask controls one row
1335                       (0 - disabled, 1 - enabled).
1336
1337                       In *wave32* mode the values should be limited to 0..7.
1338     ================= ====================================================================
1339
1340 Examples:
1341
1342 .. parsed-literal::
1343
1344   row_mask:0xf
1345   row_mask:0b1010
1346   row_mask:x|y
1347
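The row mask can be decoded bit by bit. The sketch below assumes that bit *i* of the mask
controls row *i*, an ordering this section does not spell out; *enabled_rows* is a hypothetical helper.

```python
def enabled_rows(row_mask):
    """List the rows whose mask bit is 1 (assumption: bit i controls row i)."""
    if not 0 <= row_mask <= 15:
        raise ValueError("row_mask must be in 0..15")
    return [row for row in range(4) if (row_mask >> row) & 1]


print(enabled_rows(0xF))     # [0, 1, 2, 3]
print(enabled_rows(0b1010))  # [1, 3]
```
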
1348 .. _amdgpu_synid_bank_mask:
1349
1350 bank_mask
1351 ~~~~~~~~~
1352
1353 Controls which banks are enabled for data sharing. By default, all banks are enabled.
1354
1355 Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1356 (There are only two rows in *wave32* mode.)
1357
1358     ================== ====================================================================
1359     Syntax             Description
1360     ================== ====================================================================
1361     bank_mask:{0..15}  Specifies a *bank mask* as a positive
1362                        :ref:`integer number <amdgpu_synid_integer_number>`
1363                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1364
1365                        Each of 4 bits in the mask controls one bank
1366                        (0 - disabled, 1 - enabled).
1367     ================== ====================================================================
1368
1369 Examples:
1370
1371 .. parsed-literal::
1372
1373   bank_mask:0x3
1374   bank_mask:0b0011
1375   bank_mask:x&y
1376
1377 .. _amdgpu_synid_bound_ctrl:
1378
1379 bound_ctrl
1380 ~~~~~~~~~~
1381
1382 Controls data sharing when accessing an invalid lane. By default, data sharing with
1383 invalid lanes is disabled.
1384
1385     ======================================== ================================================
1386     Syntax                                   Description
1387     ======================================== ================================================
1388     bound_ctrl:1                             Enables data sharing with invalid lanes.
1389
1390                                              Accessing data from an invalid lane will
1391                                              return zero.
1392     ======================================== ================================================
1393
1394 .. _amdgpu_synid_fi16:
1395
1396 fi
1397 ~~
1398
1399 Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
1400
1401 Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
1402
1403 GFX10 only.
1404
1405     ======================================== ==================================================
1406     Syntax                                   Description
1407     ======================================== ==================================================
1408     fi:0                                     Interaction with inactive lanes is controlled by
1409                                              :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1410
1411     fi:1                                     Fetch pre-existing values from inactive lanes.
1412     ======================================== ==================================================
1413
1414 Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1415 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1416
1417 SDWA Modifiers
1418 --------------
1419
1420 GFX8, GFX9 and GFX10 only.
1421
1422 clamp
1423 ~~~~~
1424
1425 See a description :ref:`here<amdgpu_synid_clamp>`.
1426
1427 omod
1428 ~~~~
1429
1430 See a description :ref:`here<amdgpu_synid_omod>`.
1431
1432 GFX9 and GFX10 only.
1433
1434 .. _amdgpu_synid_dst_sel:
1435
1436 dst_sel
1437 ~~~~~~~
1438
1439 Selects which bits in the destination are affected. By default, all bits are affected.
1440
1441     ======================================== ================================================
1442     Syntax                                   Description
1443     ======================================== ================================================
1444     dst_sel:DWORD                            Use bits 31:0.
1445     dst_sel:BYTE_0                           Use bits 7:0.
1446     dst_sel:BYTE_1                           Use bits 15:8.
1447     dst_sel:BYTE_2                           Use bits 23:16.
1448     dst_sel:BYTE_3                           Use bits 31:24.
1449     dst_sel:WORD_0                           Use bits 15:0.
1450     dst_sel:WORD_1                           Use bits 31:16.
1451     ======================================== ================================================
1452
1453 .. _amdgpu_synid_dst_unused:
1454
1455 dst_unused
1456 ~~~~~~~~~~
1457
1458 Controls what to do with the bits in the destination which are not selected
1459 by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
1460 By default, unused bits are preserved.
1461
1462     ======================================== ================================================
1463     Syntax                                   Description
1464     ======================================== ================================================
1465     dst_unused:UNUSED_PAD                    Pad with zeros.
1466     dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits.
1467     dst_unused:UNUSED_PRESERVE               Preserve bits.
1468     ======================================== ================================================
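The interaction of *dst_sel* and *dst_unused* can be sketched in Python as follows.
This is only an illustrative model of the two tables above, not a description of the
hardware implementation. ``DST_SEL`` and ``write_dst`` are hypothetical names, and the
sketch assumes that the low bits of the result are written to the selected field.

.. code-block:: python

  # Illustrative model only; these names are not part of any LLVM or AMD API.
  # Each entry maps a selector to (bit offset, field width).
  DST_SEL = {
      "DWORD": (0, 32),
      "BYTE_0": (0, 8),
      "BYTE_1": (8, 8),
      "BYTE_2": (16, 8),
      "BYTE_3": (24, 8),
      "WORD_0": (0, 16),
      "WORD_1": (16, 16),
  }

  def write_dst(old_dst, result, dst_sel="DWORD", dst_unused="UNUSED_PRESERVE"):
      """Return the new 32-bit destination value."""
      shift, width = DST_SEL[dst_sel]
      field_mask = ((1 << width) - 1) << shift
      field = (result << shift) & field_mask
      if dst_unused == "UNUSED_PRESERVE":
          # Keep the old contents of all bits outside the selected field.
          rest = old_dst & ~field_mask
      elif dst_unused == "UNUSED_PAD":
          # Zero all bits outside the selected field.
          rest = 0
      elif dst_unused == "UNUSED_SEXT":
          # Copy the sign bit of the field into the bits above it; zero the bits below.
          sign = (field >> (shift + width - 1)) & 1
          upper_mask = 0xFFFFFFFF & ~((1 << (shift + width)) - 1)
          rest = upper_mask if sign else 0
      else:
          raise ValueError(dst_unused)
      return (field | rest) & 0xFFFFFFFF

Under these assumptions, writing ``0x80`` into ``BYTE_1`` of a register holding ``0xAABBCCDD``
gives ``0xAABB80DD`` with ``UNUSED_PRESERVE``, ``0x00008000`` with ``UNUSED_PAD`` and
``0xFFFF8000`` with ``UNUSED_SEXT``.
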
1469
1470 .. _amdgpu_synid_src0_sel:
1471
1472 src0_sel
1473 ~~~~~~~~
1474
1475 Controls which bits of src0 are used. By default, all bits are used.
1476
1477     ======================================== ================================================
1478     Syntax                                   Description
1479     ======================================== ================================================
1480     src0_sel:DWORD                           Use bits 31:0.
1481     src0_sel:BYTE_0                          Use bits 7:0.
1482     src0_sel:BYTE_1                          Use bits 15:8.
1483     src0_sel:BYTE_2                          Use bits 23:16.
1484     src0_sel:BYTE_3                          Use bits 31:24.
1485     src0_sel:WORD_0                          Use bits 15:0.
1486     src0_sel:WORD_1                          Use bits 31:16.
1487     ======================================== ================================================
1488
1489 .. _amdgpu_synid_src1_sel:
1490
1491 src1_sel
1492 ~~~~~~~~
1493
1494 Controls which bits of src1 are used. By default, all bits are used.
1495
1496     ======================================== ================================================
1497     Syntax                                   Description
1498     ======================================== ================================================
1499     src1_sel:DWORD                           Use bits 31:0.
1500     src1_sel:BYTE_0                          Use bits 7:0.
1501     src1_sel:BYTE_1                          Use bits 15:8.
1502     src1_sel:BYTE_2                          Use bits 23:16.
1503     src1_sel:BYTE_3                          Use bits 31:24.
1504     src1_sel:WORD_0                          Use bits 15:0.
1505     src1_sel:WORD_1                          Use bits 31:16.
1506     ======================================== ================================================
1507
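The bit ranges listed for *src0_sel* and *src1_sel* amount to a shift and a mask. A minimal
Python sketch, with the hypothetical names ``SRC_SEL`` and ``select_src``, is shown below. It
returns the raw selected bits and does not model how the instruction later interprets them.

.. code-block:: python

  # Illustrative model only; these names are not part of any LLVM or AMD API.
  # Each entry maps a selector to (bit offset, field width).
  SRC_SEL = {
      "DWORD": (0, 32),
      "BYTE_0": (0, 8),
      "BYTE_1": (8, 8),
      "BYTE_2": (16, 8),
      "BYTE_3": (24, 8),
      "WORD_0": (0, 16),
      "WORD_1": (16, 16),
  }

  def select_src(value, sel="DWORD"):
      """Extract the bits of a 32-bit source value named by a src*_sel selector."""
      shift, width = SRC_SEL[sel]
      return (value >> shift) & ((1 << width) - 1)

For a register holding ``0xAABBCCDD``, ``select_src(0xAABBCCDD, "BYTE_2")`` yields ``0xBB`` and
``select_src(0xAABBCCDD, "WORD_1")`` yields ``0xAABB``.
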
1508 .. _amdgpu_synid_sdwa_operand_modifiers:
1509
1510 SDWA Operand Modifiers
1511 ----------------------
1512
1513 Operand modifiers are not used separately. They are applied to source operands.
1514
1515 GFX8, GFX9 and GFX10 only.
1516
1517 abs
1518 ~~~
1519
1520 See a description :ref:`here<amdgpu_synid_abs>`.
1521
1522 neg
1523 ~~~
1524
1525 See a description :ref:`here<amdgpu_synid_neg>`.
1526
1527 .. _amdgpu_synid_sext:
1528
1529 sext
1530 ~~~~
1531
1532 Sign-extends the value of a (sub-dword) operand to fill all 32 bits.
1533 Has no effect on 32-bit operands.
1534
1535 Valid for integer operands only.
1536
1537     ======================================== ================================================
1538     Syntax                                   Description
1539     ======================================== ================================================
1540     sext(<operand>)                          Sign-extend operand value.
1541     ======================================== ================================================
1542
1543 Examples:
1544
1545 .. parsed-literal::
1546
1547   sext(v4)
1548   sext(v255)
1549
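The effect of *sext* on a sub-dword value can be sketched in Python. The function below is a
hypothetical helper written for illustration. It takes the selected field and its width (8 or
16 bits for a byte or word, 32 for a full dword) and returns the 32-bit sign-extended value.

.. code-block:: python

  def sext(field, width):
      """Sign-extend a `width`-bit field to 32 bits (illustrative helper)."""
      field &= (1 << width) - 1
      sign_bit = 1 << (width - 1)
      # Flipping and then subtracting the sign bit yields the signed value;
      # the final mask wraps it back into an unsigned 32-bit pattern.
      return ((field ^ sign_bit) - sign_bit) & 0xFFFFFFFF

With this helper, ``sext(0x80, 8)`` gives ``0xFFFFFF80``, ``sext(0x7F, 8)`` gives ``0x7F``, and a
32-bit value is returned unchanged.
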
1550 VOP3 Modifiers
1551 --------------
1552
1553 .. _amdgpu_synid_vop3_op_sel:
1554
1555 op_sel
1556 ~~~~~~
1557
1558 Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
1559 By default, low bits are used for all operands.
1560
1561 The number of values specified with the op_sel modifier must match the number of instruction
1562 operands (both source and destination). The first value controls src0, the second controls src1,
1563 and so on, except that the last value controls the destination.
1564 The value 0 selects the low bits, while 1 selects the high bits.
1565
1566 Note: op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
1567 by op_sel must be 0.
1568
1569 GFX9 and GFX10 only.
1570
1571     ======================================== ============================================================
1572     Syntax                                   Description
1573     ======================================== ============================================================
1574     op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
1575     op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
1576     op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
1577     ======================================== ============================================================
1578
1579 Note: numeric values may be specified as either
1580 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1581 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1582
1583 Examples:
1584
1585 .. parsed-literal::
1586
1587   op_sel:[0,0]
1588   op_sel:[0,1]
1589
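The rule that the last *op_sel* value controls the destination can be sketched in Python.
``split_op_sel`` is a hypothetical helper used only for illustration; it checks the value
count described above and reports which half each operand uses.

.. code-block:: python

  def split_op_sel(op_sel, num_src):
      """Split a VOP3 op_sel list into per-source halves and the destination half."""
      if len(op_sel) != num_src + 1:
          raise ValueError("op_sel needs one value per source plus one for the destination")
      if any(v not in (0, 1) for v in op_sel):
          raise ValueError("op_sel values must be 0 or 1")
      # All values but the last control sources; the last one controls the destination.
      *src_bits, dst_bit = op_sel
      half = ("low [15:0]", "high [31:16]")
      return [half[b] for b in src_bits], half[dst_bit]

For an instruction with one source operand, ``split_op_sel([0, 1], 1)`` reports that src0 uses
the low bits and the destination uses the high bits.
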
1590 .. _amdgpu_synid_dpp_op_sel:
1591
1592 dpp_op_sel
1593 ~~~~~~~~~~
1594
1595 A special version of *op_sel* used for *permlane* opcodes to specify
1596 dpp-like mode bits: :ref:`fi<amdgpu_synid_fi16>` and
1597 :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1598
1599 GFX10 only.
1600
1601     ======================================== ============================================================
1602     Syntax                                   Description
1603     ======================================== ============================================================
1604     op_sel:[{0..1},{0..1}]                   First bit specifies :ref:`fi<amdgpu_synid_fi16>`, second
1605                                              bit specifies :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1606     ======================================== ============================================================
1607
1608 Note: numeric values may be specified as either
1609 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1610 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1611
1612 Examples:
1613
1614 .. parsed-literal::
1615
1616   op_sel:[0,0]
1617
1618 .. _amdgpu_synid_clamp:
1619
1620 clamp
1621 ~~~~~
1622
1623 The meaning of clamp depends on the instruction.
1624
1625 For *v_cmp* instructions, the clamp modifier indicates that the comparison signals
1626 when a floating-point exception occurs. By default, signaling is disabled.
1627 Not supported by GFX7.
1628
1629 For integer operations, the clamp modifier indicates that the result must be clamped
1630 to the range between the smallest and largest representable values. By default, there is no clamping.
1631 Integer clamping is not supported by GFX7.
1632
1633 For floating-point operations, the clamp modifier indicates that the result must be clamped
1634 to the range [0.0, 1.0]. By default, there is no clamping.
1635
1636 Note: clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
1637
1638     ======================================== ================================================
1639     Syntax                                   Description
1640     ======================================== ================================================
1641     clamp                                    Enables clamping (or signaling).
1642     ======================================== ================================================
1643
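The integer and floating-point clamping rules can be sketched in Python. Both functions are
hypothetical helpers written for illustration; the bit width and signedness of an integer
result depend on the instruction and are passed in explicitly here. NaN handling is not
modeled.

.. code-block:: python

  def clamp_int(value, bits, signed=True):
      """Clamp an integer result to the range representable in `bits` bits."""
      if signed:
          lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
      else:
          lo, hi = 0, (1 << bits) - 1
      return max(lo, min(hi, value))

  def clamp_float(value):
      """Clamp a floating-point result to the range [0.0, 1.0]."""
      return max(0.0, min(1.0, value))

For example, a signed 16-bit result of 40000 clamps to 32767, and a floating-point result of
1.5 clamps to 1.0.
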
1644 .. _amdgpu_synid_omod:
1645
1646 omod
1647 ~~~~
1648
1649 Specifies whether an output modifier must be applied to the result.
1650 By default, no output modifiers are applied.
1651
1652 Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
1653
1654 Output modifiers are valid for f32 and f64 floating point results only.
1655 They must not be used with f16.
1656
1657 Note: *v_cvt_f16_f32* is an exception. This instruction produces f16 result
1658 but accepts output modifiers.
1659
1660     ======================================== ================================================
1661     Syntax                                   Description
1662     ======================================== ================================================
1663     mul:2                                    Multiply the result by 2.
1664     mul:4                                    Multiply the result by 4.
1665     div:2                                    Multiply the result by 0.5.
1666     ======================================== ================================================
1667
1668 Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1669 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1670
1671 Examples:
1672
1673 .. parsed-literal::
1674
1675   mul:2
1676   mul:x      // x must be equal to 2 or 4
1677
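The order of the two steps matters: the output modifier is applied first, and clamping
second. A minimal Python sketch of this ordering for a floating-point result is shown below;
``OMOD`` and ``finalize`` are hypothetical names, and NaN handling is not modeled.

.. code-block:: python

  # Scale factor for each output modifier; None stands for "no omod specified".
  OMOD = {None: 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}

  def finalize(result, omod=None, clamp=False):
      """Apply the output modifier, then clamp to [0.0, 1.0] if requested."""
      result *= OMOD[omod]
      if clamp:
          result = max(0.0, min(1.0, result))
      return result

With ``mul:2`` and clamp both enabled, a raw result of 0.75 is first doubled to 1.5 and then
clamped to 1.0. Clamping first and doubling afterwards would have produced 1.5 instead.
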
1678 .. _amdgpu_synid_vop3_operand_modifiers:
1679
1680 VOP3 Operand Modifiers
1681 ----------------------
1682
1683 Operand modifiers are not used separately. They are applied to source operands.
1684
1685 .. _amdgpu_synid_abs:
1686
1687 abs
1688 ~~~
1689
1690 Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
1691 (if any). Valid for floating point operands only.
1692
1693     ======================================== ====================================================
1694     Syntax                                   Description
1695     ======================================== ====================================================
1696     abs(<operand>)                           Get the absolute value of a floating-point operand.
1697     \|<operand>|                             The same as above (an SP3 syntax).
1698     ======================================== ====================================================
1699
1700 Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|'
1701 may be misinterpreted. Such operands should be enclosed in additional parentheses, as shown
1702 in the examples below.
1703
1704 Examples:
1705
1706 .. parsed-literal::
1707
1708   abs(v36)
1709   \|v36|
1710   abs(x|y)     // ok
1711   \|(x|y)|      // additional parentheses are required
1712
1713 .. _amdgpu_synid_neg:
1714
1715 neg
1716 ~~~
1717
1718 Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
1719 (if any). Valid for floating point operands only.
1720
1721     ================== ====================================================
1722     Syntax             Description
1723     ================== ====================================================
1724     neg(<operand>)     Get the negative value of a floating-point operand.
1725                        The operand may include an optional
1726                        :ref:`abs<amdgpu_synid_abs>` modifier.
1727     -<operand>         The same as above (an SP3 syntax).
1728     ================== ====================================================
1729
1730 Note: SP3 syntax is supported with limitations because of a potential ambiguity.
1731 Currently it is allowed in the following cases:
1732
1733 * Before a register.
1734 * Before an :ref:`abs<amdgpu_synid_abs>` modifier.
1735 * Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.
1736
1737 In all other cases "-" is handled as a part of an expression that follows the sign.
1738
1739 Examples:
1740
1741 .. parsed-literal::
1742
1743   // Operands with negate modifiers
1744   neg(v[0])
1745   neg(1.0)
1746   neg(abs(v0))
1747   -v5
1748   -abs(v5)
1749   -\|v5|
1750
1751   // Operands without negate modifiers
1752   -1
1753   -x+y
1754
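The ordering of *abs* and *neg* can be sketched in Python. ``apply_src_modifiers`` is a
hypothetical helper that models a floating-point source operand as a Python float; it takes
the absolute value first and negates second.

.. code-block:: python

  def apply_src_modifiers(value, abs_mod=False, neg_mod=False):
      """Apply abs (if any) and then neg (if any) to a floating-point operand."""
      if abs_mod:
          value = abs(value)
      if neg_mod:
          value = -value
      return value

Because *abs* is applied first, an operand written as ``neg(abs(v0))`` is never positive:
both 3.0 and -3.0 become -3.0 in this model.
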
1755 VOP3P Modifiers
1756 ---------------
1757
1758 This section describes modifiers of *regular* VOP3P instructions.
1759
1760 *v_mad_mix\** and *v_fma_mix\**
1761 instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.
1762
1763 GFX9 and GFX10 only.
1764
1765 .. _amdgpu_synid_op_sel:
1766
1767 op_sel
1768 ~~~~~~
1769
1770 Selects the low [15:0] or high [31:16] operand bits as input to the operation
1771 which results in the lower-half of the destination.
1772 By default, low bits are used for all operands.
1773
1774 The number of values specified by the *op_sel* modifier must match the number of source
1775 operands. First value controls src0, second value controls src1 and so on.
1776
1777 The value 0 selects the low bits, while 1 selects the high bits.
1778
1779     ================================= =============================================================
1780     Syntax                            Description
1781     ================================= =============================================================
1782     op_sel:[{0..1}]                   Select operand bits for instructions with 1 source operand.
1783     op_sel:[{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
1784     op_sel:[{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
1785     ================================= =============================================================
1786
1787 Note: numeric values may be specified as either
1788 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1789 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1790
1791 Examples:
1792
1793 .. parsed-literal::
1794
1795   op_sel:[0,0]
1796   op_sel:[0,1,0]
1797
1798 .. _amdgpu_synid_op_sel_hi:
1799
1800 op_sel_hi
1801 ~~~~~~~~~
1802
1803 Selects the low [15:0] or high [31:16] operand bits as input to the operation
1804 which results in the upper-half of the destination.
1805 By default, high bits are used for all operands.
1806
1807 The number of values specified by the *op_sel_hi* modifier must match the number of source
1808 operands. First value controls src0, second value controls src1 and so on.
1809
1810 The value 0 selects the low bits, while 1 selects the high bits.
1811
1812     =================================== =============================================================
1813     Syntax                              Description
1814     =================================== =============================================================
1815     op_sel_hi:[{0..1}]                  Select operand bits for instructions with 1 source operand.
1816     op_sel_hi:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands.
1817     op_sel_hi:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands.
1818     =================================== =============================================================
1819
1820 Note: numeric values may be specified as either
1821 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1822 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1823
1824 Examples:
1825
1826 .. parsed-literal::
1827
1828   op_sel_hi:[0,0]
1829   op_sel_hi:[0,0,1]
1830
1831 .. _amdgpu_synid_neg_lo:
1832
1833 neg_lo
1834 ~~~~~~
1835
1836 Specifies whether to change sign of operand values selected by
1837 :ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
1838 as input to the operation which results in the lower-half of the destination.
1839
1840 The number of values specified by this modifier must match the number of source
1841 operands. First value controls src0, second value controls src1 and so on.
1842
1843 The value 0 indicates that the corresponding operand value is used unmodified;
1844 the value 1 indicates that the negative value of the operand must be used.
1845
1846 By default, operand values are used unmodified.
1847
1848 This modifier is valid for floating point operands only.
1849
1850     ================================ ==================================================================
1851     Syntax                           Description
1852     ================================ ==================================================================
1853     neg_lo:[{0..1}]                  Select affected operands for instructions with 1 source operand.
1854     neg_lo:[{0..1},{0..1}]           Select affected operands for instructions with 2 source operands.
1855     neg_lo:[{0..1},{0..1},{0..1}]    Select affected operands for instructions with 3 source operands.
1856     ================================ ==================================================================
1857
1858 Note: numeric values may be specified as either
1859 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1860 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1861
1862 Examples:
1863
1864 .. parsed-literal::
1865
1866   neg_lo:[0]
1867   neg_lo:[0,1]
1868
1869 .. _amdgpu_synid_neg_hi:
1870
1871 neg_hi
1872 ~~~~~~
1873
1874 Specifies whether to change sign of operand values selected by
1875 :ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
1876 as input to the operation which results in the upper-half of the destination.
1877
1878 The number of values specified by this modifier must match the number of source
1879 operands. First value controls src0, second value controls src1 and so on.
1880
1881 The value 0 indicates that the corresponding operand value is used unmodified;
1882 the value 1 indicates that the negative value of the operand must be used.
1883
1884 By default, operand values are used unmodified.
1885
1886 This modifier is valid for floating point operands only.
1887
1888     =============================== ==================================================================
1889     Syntax                          Description
1890     =============================== ==================================================================
1891     neg_hi:[{0..1}]                 Select affected operands for instructions with 1 source operand.
1892     neg_hi:[{0..1},{0..1}]          Select affected operands for instructions with 2 source operands.
1893     neg_hi:[{0..1},{0..1},{0..1}]   Select affected operands for instructions with 3 source operands.
1894     =============================== ==================================================================
1895
1896 Note: numeric values may be specified as either
1897 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1898 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1899
1900 Examples:
1901
1902 .. parsed-literal::
1903
1904   neg_hi:[1,0]
1905   neg_hi:[0,1,1]
1906
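The way *op_sel*, *op_sel_hi*, *neg_lo* and *neg_hi* combine in a regular VOP3P instruction
can be sketched in Python. This is an illustrative model only: ``pk_op`` is a hypothetical
helper, each source is modeled as a ``(low, high)`` pair of Python floats rather than as packed
16-bit values, and *neg_lo* and *neg_hi* are applied to the inputs chosen by *op_sel* and
*op_sel_hi* respectively.

.. code-block:: python

  from operator import add

  def pk_op(op, srcs, op_sel=None, op_sel_hi=None, neg_lo=None, neg_hi=None):
      """Model a packed operation; each source is a (low, high) pair of halves."""
      n = len(srcs)
      # Defaults stated in the text: op_sel uses low halves, op_sel_hi uses
      # high halves, and no operand is negated.
      op_sel = [0] * n if op_sel is None else op_sel
      op_sel_hi = [1] * n if op_sel_hi is None else op_sel_hi
      neg_lo = [0] * n if neg_lo is None else neg_lo
      neg_hi = [0] * n if neg_hi is None else neg_hi
      for bits in (op_sel, op_sel_hi, neg_lo, neg_hi):
          if len(bits) != n:
              raise ValueError("each modifier needs one value per source operand")

      def inputs(sel, neg):
          # sel picks the half (0 = low, 1 = high); neg flips the sign of that value.
          return [-s[i] if flip else s[i] for s, i, flip in zip(srcs, sel, neg)]

      low_result = op(*inputs(op_sel, neg_lo))
      high_result = op(*inputs(op_sel_hi, neg_hi))
      return low_result, high_result

With ``a = (1.0, 2.0)`` and ``b = (10.0, 20.0)``, ``pk_op(add, [a, b])`` returns ``(11.0, 22.0)``,
while ``pk_op(add, [a, b], op_sel=[0, 1], op_sel_hi=[1, 0])`` swaps the halves of the second
source and returns ``(21.0, 12.0)``.
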
1907 clamp
1908 ~~~~~
1909
1910 See a description :ref:`here<amdgpu_synid_clamp>`.
1911
1912 .. _amdgpu_synid_mad_mix:
1913
1914 VOP3P MAD_MIX/FMA_MIX Modifiers
1915 -------------------------------
1916
1917 *v_mad_mix\** and *v_fma_mix\**
1918 instructions use *op_sel* and *op_sel_hi* modifiers
1919 in a manner different from *regular* VOP3P instructions.
1920
1921 See a description below.
1922
1923 GFX9 and GFX10 only.
1924
1925 .. _amdgpu_synid_mad_mix_op_sel:
1926
1927 m_op_sel
1928 ~~~~~~~~
1929
1930 This modifier has meaning only for 16-bit source operands, as indicated by
1931 :ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
1932 It selects either the low [15:0] or high [31:16] operand bits
1933 as input to the operation.
1934
1935 The number of values specified by the *op_sel* modifier must match the number of source
1936 operands. First value controls src0, second value controls src1 and so on.
1937
1938 The value 0 selects the low 16 bits, while 1 selects the high 16 bits.
1939
1940 By default, low bits are used for all operands.
1941
1942     =============================== ================================================
1943     Syntax                          Description
1944     =============================== ================================================
1945     op_sel:[{0..1},{0..1},{0..1}]   Select location of each 16-bit source operand.
1946     =============================== ================================================
1947
1948 Note: numeric values may be specified as either
1949 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1950 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1951
1952 Examples:
1953
1954 .. parsed-literal::
1955
1956   op_sel:[0,1,0]
1957
1958 .. _amdgpu_synid_mad_mix_op_sel_hi:
1959
1960 m_op_sel_hi
1961 ~~~~~~~~~~~
1962
1963 Selects the size of source operands: either 32 bits or 16 bits.
1964 By default, 32 bits are used for all source operands.
1965
1966 The number of values specified by the *op_sel_hi* modifier must match the number of source
1967 operands. First value controls src0, second value controls src1 and so on.
1968
1969 The value 0 indicates 32 bits, the value 1 indicates 16 bits.
1970
1971 The location of 16 bits in the operand may be specified by
1972 :ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.
1973
1974     ======================================== ====================================
1975     Syntax                                   Description
1976     ======================================== ====================================
1977     op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
1978     ======================================== ====================================
1979
1980 Note: numeric values may be specified as either
1981 :ref:`integer numbers<amdgpu_synid_integer_number>` or
1982 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1983
1984 Examples:
1985
1986 .. parsed-literal::
1987
1988   op_sel_hi:[1,1,1]
1989
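The combined effect of *op_sel* and *op_sel_hi* on one source of a *v_mad_mix\** or
*v_fma_mix\** instruction can be sketched in Python. ``mix_src`` is a hypothetical helper; it
assumes that a 32-bit operand holds an f32 value and a 16-bit operand holds an f16 value, and
it decodes the raw register bits with the standard ``struct`` module.

.. code-block:: python

  import struct

  def mix_src(reg, op_sel, op_sel_hi):
      """Decode a 32-bit register value as the float one *_mix source would read."""
      if op_sel_hi == 0:
          # 32-bit operand: the whole register is an f32; op_sel has no meaning here.
          return struct.unpack("<f", struct.pack("<I", reg))[0]
      # 16-bit operand: op_sel picks the low [15:0] or high [31:16] half as an f16.
      half = (reg >> 16) & 0xFFFF if op_sel else reg & 0xFFFF
      return struct.unpack("<e", struct.pack("<H", half))[0]

For a register holding ``0x3C004000`` (f16 1.0 in the high half, f16 2.0 in the low half),
``mix_src(0x3C004000, 0, 1)`` returns 2.0 and ``mix_src(0x3C004000, 1, 1)`` returns 1.0.
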
1990 abs
1991 ~~~
1992
1993 See a description :ref:`here<amdgpu_synid_abs>`.
1994
1995 neg
1996 ~~~
1997
1998 See a description :ref:`here<amdgpu_synid_neg>`.
1999
2000 clamp
2001 ~~~~~
2002
2003 See a description :ref:`here<amdgpu_synid_clamp>`.
2004
2005 VOP3P MFMA Modifiers
2006 --------------------
2007
2008 These modifiers may only be used with GFX908 and GFX90A.
2009
2010 .. _amdgpu_synid_cbsz:
2011
2012 cbsz
2013 ~~~~
2014
2015 Specifies a broadcast mode.
2016
2017     =============================== ==================================================================
2018     Syntax                          Description
2019     =============================== ==================================================================
2020     cbsz:[{0..7}]                   A broadcast mode.
2021     =============================== ==================================================================
2022
2023 Note: numeric value may be specified as either
2024 an :ref:`integer number<amdgpu_synid_integer_number>` or
2025 an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
2026
2027 .. _amdgpu_synid_abid:
2028
2029 abid
2030 ~~~~
2031
2032 Specifies matrix A group select.
2033
2034     =============================== ==================================================================
2035     Syntax                          Description
2036     =============================== ==================================================================
2037     abid:[{0..15}]                  Matrix A group select id.
2038     =============================== ==================================================================
2039
2040 Note: numeric value may be specified as either
2041 an :ref:`integer number<amdgpu_synid_integer_number>` or
2042 an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
2043
2044 .. _amdgpu_synid_blgp:
2045
2046 blgp
2047 ~~~~
2048
2049 Specifies matrix B lane group pattern.
2050
2051     =============================== ==================================================================
2052     Syntax                          Description
2053     =============================== ==================================================================
2054     blgp:[{0..7}]                   Matrix B lane group pattern.
2055     =============================== ==================================================================
2056
2057 Note: numeric value may be specified as either
2058 an :ref:`integer number<amdgpu_synid_integer_number>` or
2059 an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
2060
2061 .. _amdgpu_synid_mfma_neg:
2062
2063 neg
2064 ~~~
2065
2066 Indicates operands that must be negated before the operation.
2067 The number of values specified by this modifier must match the number of source
2068 operands. First value controls src0, second value controls src1 and so on.
2069
2070 The value 0 indicates that the corresponding operand value is used unmodified;
2071 the value 1 indicates that the operand value must be negated before the operation.
2072
2073 By default, operand values are used unmodified.
2074
2075 This modifier is valid for floating point operands only.
2076
2077     =============================== ==================================================================
2078     Syntax                          Description
2079     =============================== ==================================================================
2080     neg:[{0..1},{0..1},{0..1}]      Select operands which must be negated before the operation.
2081     =============================== ==================================================================
2082
2083 Note: numeric values may be specified as either
2084 :ref:`integer numbers<amdgpu_synid_integer_number>` or
2085 :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
2086
2087 Examples:
2088
2089 .. parsed-literal::
2090
2091   neg:[0,1,1]