mlir/docs/DeclarativeRewrites.md

   1 # Table-driven Declarative Rewrite Rule (DRR)
   2
   3 In addition to subclassing the `mlir::RewritePattern` C++ class, MLIR also
   4 supports defining rewrite rules in a declarative manner. Similar to
   5 [Op Definition Specification](DefiningDialects/Operations.md) (ODS), this is achieved via
   6 [TableGen][TableGen], which is a language to maintain records of domain-specific
   7 information. The rewrite rules are specified concisely in a TableGen record,
   8 which will be expanded into an equivalent `mlir::RewritePattern` subclass at
   9 compiler build time.
  10
  11 This manual explains in detail all of the available mechanisms for defining
  12 rewrite rules in such a declarative manner. It aims to be a specification
  13 instead of a tutorial. Please refer to
  14 [Quickstart tutorial to adding MLIR graph rewrite](Tutorials/QuickstartRewrites.md)
  15 for the latter.
  16
  17 Given that declarative rewrite rules depend on op definition specification, this
  18 manual assumes knowledge of the [ODS](DefiningDialects/Operations.md) doc.
  19
  20 [TOC]
  21
  22 ## Benefits
  23
  24 Compared to the hand-written C++ classes, this declarative approach has several
  25 benefits, including but not limited to:
  26
  27 *   **Being declarative**: The pattern creator just needs to state the rewrite
  28     pattern declaratively, without worrying about the concrete C++ methods to
  29     call.
  30 *   **Removing boilerplate and showing the very essence of the rewrite**:
  31     `mlir::RewritePattern` is already good at hiding boilerplate for defining a
  32     rewrite rule. But we still need to write the class and function structures
  33     required by the C++ programming language, inspect ops for matching, and call
  34     op `build()` methods for constructing. These statements are typically quite
  35     simple and similar, so they can be further condensed with auto-generation.
  36     Because we reduce the boilerplate to the bare minimum, the declarative
  37     rewrite rule will just contain the very essence of the rewrite. This makes
  38     it very easy to understand the pattern.
  39
  40 ## Strengths and Limitations
  41
  42 The declarative rewrite rule is **operation-based**: it describes a rule to
  43 match against a directed acyclic graph (DAG) of operations and generate DAGs of
  44 operations. This gives DRR both its strengths and limitations: it is good at
  45 expressing op to op conversions, but not that well suited for, say, converting
  46 an op into a loop nest.
  47
  48 Per the current implementation, DRR does not have good support for the following
  49 features:
  50
  51 *   Matching and generating ops with regions.
  52 *   Matching and generating ops with block arguments.
  53 *   Matching multi-result ops in nested patterns.
  54 *   Matching and generating variadic operand/result ops in nested patterns.
  55 *   Packing and unpacking variadic operands/results during generation.
  56 *   [`NativeCodeCall`](#nativecodecall-transforming-the-generated-op) returning
  57     more than one result.
  58
  59 ## Rule Definition
  60
  61 The core construct for defining a rewrite rule is defined in
  62 [`PatternBase.td`][PatternBase] as
  63
  64 ```tablegen
  65 class Pattern<
  66     dag sourcePattern, list<dag> resultPatterns,
  67     list<dag> additionalConstraints = [],
  68     list<dag> supplementalPatterns = [],
  69     dag benefitsAdded = (addBenefit 0)>;
  70 ```
  71
  72 A declarative rewrite rule contains two main components:
  73
  74 *   A *source pattern*, which is used for matching a DAG of operations.
  75 *   One or more *result patterns*, which are used for generating DAGs of
  76     operations to replace the matched DAG of operations.
  77
  78 We allow multiple result patterns to support
  79 [multi-result ops](#supporting-multi-result-ops) and
  80 [auxiliary ops](#supporting-auxiliary-ops), but frequently we just want to
  81 convert one DAG of operations to another DAG of operations. There is a handy
  82 wrapper of `Pattern`, `Pat`, which takes a single result pattern:
  83
  84 ```tablegen
  85 class Pat<
  86     dag sourcePattern, dag resultPattern, list<dag> additionalConstraints = [],
  87     list<dag> supplementalPatterns = [], dag benefitsAdded = (addBenefit 0)> :
  88   Pattern<sourcePattern, [resultPattern], additionalConstraints,
  89           supplementalPatterns, benefitsAdded>;
  90 ```
  91
  92 Each pattern is specified as a TableGen `dag` object with the syntax of
  93 `(operator arg0, arg1, ...)`.
  94
  95 `operator` is typically an MLIR op, but it can also be other
  96 [directives](#rewrite-directives). `argN` is for matching (if used in source
  97 pattern) or generating (if used in result pattern) the `N`-th argument for
  98 `operator`. If the `operator` is some MLIR operation, it means the `N`-th
  99 argument as specified in the `arguments` list of the op's definition. Therefore,
 100 we say op argument specification in pattern is **position-based**: the position
 101 where they appear matters.
 102
 103 `argN` can be a `dag` object itself, so we can nest `dag` objects into a tree
 104 that models the def-use relationship between ops.
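
As a small illustration of the two forms, the sketch below writes the same rule
once with `Pattern` and once with `Pat`. `FooOp` and `BarOp` are hypothetical
single-operand, single-result ops assumed to be defined in ODS elsewhere; they
are not part of MLIR.

```tablegen
// Hypothetical ops: FooOp and BarOp each take one operand and produce one
// result. Both records below describe the same rewrite of FooOp into BarOp.

// General form: the result patterns are given as a list.
def : Pattern<(FooOp $x), [(BarOp $x)]>;

// Shorthand for the common case of exactly one result pattern.
def : Pat<(FooOp $x), (BarOp $x)>;
```

In practice only one of the two records would be kept; they are shown side by
side only to compare the syntax.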
 105
 106 ### Source pattern
 107
 108 The source pattern is for matching a DAG of operations. Arguments in the `dag`
 109 object are intended to **capture** the op arguments. They can also be used to
 110 **further limit** the match criteria. The capturing is done by specifying a
 111 symbol starting with the `$` sign, while further constraints are introduced by
 112 specifying a `TypeConstraint` (for an operand) or an `AttrConstraint` (for an
 113 attribute).
 114
 115 #### Binding op arguments and limiting the match
 116
 117 For example,
 118
 119 ```tablegen
 120 def AOp : Op<"a_op"> {
 121     let arguments = (ins
 122       AnyType:$a_input,
 123       AnyAttr:$a_attr
 124     );
 125
 126     let results = (outs
 127       AnyType:$a_output
 128     );
 129 }
 130
 131 def : Pat<(AOp $input, F32Attr:$attr), ...>;
 132 ```
 133
 134 In the above, we are matching an `AOp` whose `$input` can be anything valid as
 135 defined by the op and whose `$attr` must be a float attribute. If the match
 136 succeeds, we bind the `$input` symbol to the op's only input (`$a_input`) and
 137 `$attr` to the only attribute (`$a_attr`); we can reference them using `$input`
 138 and `$attr` in result patterns and additional constraints.
 139
 140 The pattern is position-based: the symbol names used for capturing here do not
 141 need to match with the op definition as shown in the above example. As another
 142 example, the pattern can be written as `def : Pat<(AOp $a, F32Attr:$b), ...>;`
 143 and use `$a` and `$b` to refer to the captured input and attribute. But using
 144 the ODS name directly in the pattern is also allowed. Operands in the source
 145 pattern can have the same name. This binds one operand to the name while
 146 verifying that the rest are all equal to it.
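
A minimal sketch of the repeated-name rule, assuming a hypothetical `MinOp`
with two operands and one result (not an op defined in this document). It uses
the `replaceWithValue` directive described under
[Rewrite directives](#rewrite-directives).

```tablegen
// Hypothetical op: MinOp takes two operands and has one result.
// Using `$x` twice makes the pattern match only when both operands are the
// same value; the matched op is then replaced by that value.
def : Pat<(MinOp $x, $x), (replaceWithValue $x)>;
```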
 147
 148 Also note that we only need to add `TypeConstraint` or `AttrConstraint`
 149 when we need to further limit the match criteria. If all valid cases to the op
 150 are acceptable, then we can leave the constraint unspecified.
 151
 152 `$_` is a special symbol that means the argument is not captured. For example,
 153 `def : Pat<(AOp $_, $b), ...>` means only `$b` is interesting to capture and
 154 will be referenced later in result patterns. It's still possible to place
 155 additional constraints even if the symbol is not to be captured; in such cases,
 156 you can simply use just the `TypeConstraint` or `AttrConstraint` without a
 157 bound symbol, for example, `def : Pat<(AOp $a, F32Attr), ...>`.
 158
 159 #### Matching DAG of operations
 160
 161 To match a DAG of ops, use nested `dag` objects:
 162
 163 ```tablegen
 164
 165 def BOp : Op<"b_op"> {
 166     let arguments = (ins);
 167
 168     let results = (outs
 169       AnyType:$b_output
 170     );
 171 }
 172
 173
 174 def : Pat<(AOp (BOp), $attr), ...>;
 175 ```
 176
 177 The above pattern matches an `AOp` whose only operand is generated by a `BOp`,
 178 that is, the following MLIR code:
 179
 180 ```mlir
 181 %0 = "b_op"() : () -> (...)
 182 %1 = "a_op"(%0) {a_attr = ...} : (...) -> (...)
 183 ```
 184
 185 #### Binding op results
 186
 187 To bind a symbol to the results of a matched op for later reference, attach the
 188 symbol to the op itself:
 189
 190 ```tablegen
 191 def : Pat<(AOp (BOp:$b_result), $attr), ...>;
 192 ```
 193
 194 The above will bind `$b_result` to the matched `BOp`'s result. (There are more
 195 details regarding multi-result ops, which are covered
 196 [later](#supporting-multi-result-ops).)
 197
 198 ### Result pattern
 199
 200 The result pattern is for generating a DAG of operations. Arguments in the `dag`
 201 object are intended to **reference** values captured in the source pattern and
 202 potentially **apply transformations**.
 203
 204 #### Referencing bound symbols
 205
 206 For example,
 207
 208 ```tablegen
 209 def COp : Op<"c_op"> {
 210     let arguments = (ins
 211       AnyType:$c_input,
 212       AnyAttr:$c_attr
 213     );
 214
 215     let results = (outs
 216       AnyType:$c_output
 217     );
 218 }
 219
 220 def : Pat<(AOp $input, $attr), (COp $input, $attr)>;
 221 ```
 222
 223 In the above, `AOp`'s only operand and attribute are bound to `$input` and
 224 `$attr`, respectively. We then reference them in the result pattern for
 225 generating the `COp` by passing them in as arguments to `COp`'s `build()`
 226 method.
 227
 228 We can also reference symbols bound to matched op's results:
 229
 230 ```tablegen
 231 def : Pat<(AOp (BOp:$b_result), $attr), (COp $b_result, $attr)>;
 232 ```
 233
 234 In the above, we are using `BOp`'s result for building `COp`.
 235
 236 #### Building operations
 237
 238 Given that `COp` was specified with table-driven op definition, there will be
 239 several `build()` methods generated for it. One of them has aggregated
 240 parameters for result types, operands, and attributes in the signature: `void
 241 COp::build(..., TypeRange resultTypes, ValueRange operands,
 242 ArrayRef<NamedAttribute> attributes)`. The pattern above calls this `build()`
 243 method to construct the `COp`.
 244
 245 In general, arguments in the result pattern will be passed directly to the
 246 `build()` method. To leverage the auto-generated `build()` method, list them in
 247 the pattern in exactly the same order as the ODS `arguments` definition.
 248 Otherwise, a custom `build()` method that matches the argument list is required.
 249
 250 Right now all ODS-generated `build()` methods require specifying the result
 251 type(s), unless the op has known traits like `SameOperandsAndResultType` that we
 252 can use to auto-generate a `build()` method with result type deduction. When
 253 generating an op to replace the result of the matched root op, we can use the
 254 matched root op's result type when calling the ODS-generated builder. Otherwise
 255 (e.g., generating an [auxiliary op](#supporting-auxiliary-ops) or generating an
 256 op with a nested result pattern), DRR will not be able to deduce the result
 257 type(s). The pattern author will need to define a custom builder that has result
 258 type deduction ability via `OpBuilder` in ODS. For example, in the following
 259 pattern
 260
 261 ```tablegen
 262 def : Pat<(AOp $input, $attr), (COp (AOp $input, $attr), $attr)>;
 263 ```
 264
 265 `AOp` is generated via a nested result pattern; DRR won't be able to deduce the
 266 result type for it. A custom builder for `AOp` should be defined. The builder
 267 should take a separate parameter for each operand and attribute and deduce the
 268 result type internally.
 269 For example, for the above `AOp`, a possible builder is:
 270
 271 ```c++
 272
 273 void AOp::build(OpBuilder &builder, OperationState &state,
 274                 Value input, Attribute attr) {
 275   state.addOperands({input});
 276   state.addAttribute("a_attr", attr);
 277   Type type = ...; // Deduce result type here
 278   state.addTypes({type});
 279 }
 280 ```
 281
 282 Failing to define such a builder will result in an error at C++ compilation time
 283 saying the call to `AOp::build()` cannot be resolved because of a mismatch in
 284 the number of parameters.
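
On the ODS side, such a builder is declared through the op's `builders` field.
The following is a sketch only: it extends the abbreviated `AOp` definition
used earlier in this document, and the parameter names `input` and `attr` are
chosen to match the C++ definition above.

```tablegen
def AOp : Op<"a_op"> {
  let arguments = (ins AnyType:$a_input, AnyAttr:$a_attr);
  let results = (outs AnyType:$a_output);

  // Declares a `build()` overload taking `Value input` and `Attribute attr`
  // after the usual builder and state parameters. No body is given here, so
  // the definition is written by hand in C++.
  let builders = [
    OpBuilder<(ins "Value":$input, "Attribute":$attr)>
  ];
}
```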
 285
 286 #### Generating DAG of operations
 287
 288 `dag` objects can be nested to generate a DAG of operations:
 289
 290 ```tablegen
 291 def : Pat<(AOp $input, $attr), (COp (BOp), $attr)>;
 292 ```
 293
 294 In the above, we generate a `BOp`, and then use its result to generate the `COp`
 295 to replace the matched `AOp`.
 296
 297 #### Binding op results
 298
 299 In the result pattern, we can bind to the result(s) of a newly built op by
 300 attaching symbols to the op. (But we **cannot** bind to op arguments given that
 301 they are referencing previously bound symbols.) This is useful for reusing newly
 302 created results where suitable. For example,
 303
 304 ```tablegen
 305 def DOp : Op<"d_op"> {
 306     let arguments = (ins
 307       AnyType:$d_input1,
 308       AnyType:$d_input2
 309     );
 310
 311     let results = (outs
 312       AnyType:$d_output
 313     );
 314 }
 315
 316 def : Pat<(AOp $input, $ignored_attr), (DOp (BOp:$b_result), $b_result)>;
 317 ```
 318
 319 In this pattern, an `AOp` is matched and replaced with a `DOp` whose two
 320 operands are from the result of a single `BOp`. This is only possible by binding
 321 the result of the `BOp` to a name and reusing it for the second operand of the
 322 `DOp`.
 323
 324 #### `NativeCodeCall`: transforming the generated op
 325
 326 Sometimes the captured arguments are not exactly what we want so they cannot be
 327 directly fed in as arguments to build the new op. For such cases, we can apply
 328 transformations on the arguments by calling into C++ helper functions. This is
 329 achieved by `NativeCodeCall`.
 330
 331 For example, if we want to capture some op's attributes and group them as an
 332 array attribute to construct a new op:
 333
 334 ```tablegen
 335
 336 def TwoAttrOp : Op<"two_attr_op"> {
 337     let arguments = (ins
 338       AnyAttr:$op_attr1,
 339       AnyAttr:$op_attr2
 340     );
 341
 342     let results = (outs
 343       AnyType:$op_output
 344     );
 345 }
 346
 347 def OneAttrOp : Op<"one_attr_op"> {
 348     let arguments = (ins
 349       ArrayAttr:$op_attr
 350     );
 351
 352     let results = (outs
 353       AnyType:$op_output
 354     );
 355 }
 356 ```
 357
 358 We can write a C++ helper function:
 359
 360 ```c++
 361 ArrayAttr createArrayAttr(Builder &builder, Attribute a, Attribute b) {
 362   return builder.getArrayAttr({a, b});
 363 }
 364 ```
 365
 366 And then write the pattern as:
 367
 368 ```tablegen
 369 def createArrayAttr : NativeCodeCall<"createArrayAttr($_builder, $0, $1)">;
 370
 371 def : Pat<(TwoAttrOp $attr1, $attr2),
 372           (OneAttrOp (createArrayAttr $attr1, $attr2))>;
 373 ```
 374
 375 And make sure the generated C++ code from the above pattern has access to the
 376 definition of the C++ helper function.
 377
 378 In the above example, we are using a string to specialize the `NativeCodeCall`
 379 template. The string can be an arbitrary C++ expression that evaluates into some
 380 C++ object expected at the `NativeCodeCall` site (here it would be expecting an
 381 array attribute). Typically the string should be a function call.
 382
 383 ##### `NativeCodeCall` placeholders
 384
 385 In `NativeCodeCall`, we can use placeholders like `$_builder`, `$N` and `$N...`.
 386 `$_builder` is a *special placeholder*, while `$N` and `$N...` are a
 387 *positional placeholder* and a *positional range placeholder*, respectively.
 388
 389 `NativeCodeCall` right now only supports three special placeholders:
 390 `$_builder`, `$_loc`, and `$_self`:
 391
 392 *   `$_builder` will be replaced by the current `mlir::PatternRewriter`.
 393 *   `$_loc` will be replaced by the fused location or custom location (as
 394     determined by location directive).
 395 *   `$_self` will be replaced by the defining operation in a source pattern.
 396
 397 We have seen how `$_builder` can be used in the above; it allows us to pass a
 398 `mlir::Builder` (`mlir::PatternRewriter` is a subclass of `mlir::OpBuilder`,
 399 which is a subclass of `mlir::Builder`) to the C++ helper function to use the
 400 handy methods on `mlir::Builder`.
 401
 402 Here is an example of how to use `$_self` in a source pattern:
 403
 404 ```tablegen
 405
 406 def : Pat<(OneAttrOp (NativeCodeCall<"Foo($_self, &$0)"> I32Attr:$val)),
 407           (TwoAttrOp $val, $val)>;
 408 ```
 409
 410 In the above, `$_self` is substituted by the defining operation of the first
 411 operand of `OneAttrOp`. Note that we don't support binding a name to
 412 `NativeCodeCall` in the source pattern. To carry return values out of a
 413 helper function, put the names (constraints are optional) in the parameter list;
 414 they will be bound to variables of the corresponding type. These variables
 415 must be passed to the helper by reference or by pointer
 416 so that the matched value can be returned. In the same example, `$val` will be
 417 bound to a variable with `Attribute` type (as `I32Attr`) and the type of the
 418 second argument in `Foo()` could be `Attribute&` or `Attribute*`. Names with
 419 attribute constraints will be captured as `Attribute`s while everything else
 420 will be treated as `Value`s.
 421
 422 Positional placeholders will be substituted by the `dag` object parameters at
 423 the `NativeCodeCall` use site. For example, if we define `SomeCall :
 424 NativeCodeCall<"someFn($1, $2, $0)">` and use it like `(SomeCall $in0, $in1,
 425 $in2)`, then this will be translated into C++ call `someFn($in1, $in2, $in0)`.
 426
 427 Positional range placeholders will be substituted by multiple `dag` object
 428 parameters at the `NativeCodeCall` use site. For example, if we define
 429 `SomeCall : NativeCodeCall<"someFn($1...)">` and use it like `(SomeCall $in0,
 430 $in1, $in2)`, then this will be translated into C++ call `someFn($in1, $in2)`.
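
The sketch below combines special and positional placeholders in one
`NativeCodeCall`. The C++ helper `makeScaled` and its signature are assumptions
made for illustration; `AOp` is the op defined earlier in this document.

```tablegen
// Hypothetical C++ helper, assumed to be visible to the generated code:
//   Value makeScaled(OpBuilder &builder, Location loc, Value v, Attribute f);
def MakeScaled : NativeCodeCall<"makeScaled($_builder, $_loc, $0, $1)">;

// At this use site `$0` is replaced by `$input` and `$1` by `$attr`;
// `$_builder` and `$_loc` are filled in by the generated pattern.
def : Pat<(AOp $input, $attr), (MakeScaled $input, $attr)>;
```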
 431
 432 ##### `NativeCodeCall` binding multi-results
 433
 434 To bind multi-results and access the N-th result with `$<name>__N`, specify the
 435 number of return values in the template. Note that only `Value` type is
 436 supported for multiple results binding. For example,
 437
 438 ```tablegen
 439
 440 def PackAttrs : NativeCodeCall<"packAttrs($0, $1)", 2>;
 441 def : Pattern<(TwoResultOp $attr1, $attr2),
 442               [(OneResultOp (PackAttrs:$res__0 $attr1, $attr2)),
 443                (OneResultOp $res__1)]>;
 444
 445 ```
 446
 447 Use `NativeCodeCallVoid` for cases with no return value.
 448
 449 Specifying the correct number of return values in `NativeCodeCall` is important.
 450 It will be used to verify the consistency of the number of return values.
 451 Additionally, `mlir-tblgen` will try to capture the return values of
 452 `NativeCodeCall` in the generated code so that it will trigger a later
 453 compilation error if a `NativeCodeCall` that doesn't return any result isn't
 454 labeled with 0 returns.
 455
 456 ##### Customizing entire op building
 457
 458 `NativeCodeCall` is not only limited to transforming arguments for building an
 459 op; it can be also used to specify how to build an op entirely. An example:
 460
 461 If we have a C++ function for building an op:
 462
 463 ```c++
 464 Operation *createMyOp(OpBuilder &builder, Value input, Attribute attr);
 465 ```
 466
 467 We can wrap it up and invoke it like:
 468
 469 ```tablegen
 470 def createMyOp : NativeCodeCall<"createMyOp($_builder, $0, $1)">;
 471
 472 def : Pat<(... $input, $attr), (createMyOp $input, $attr)>;
 473 ```
 474
 475 ### Supporting auxiliary ops
 476
 477 A declarative rewrite rule supports multiple result patterns. One of the
 478 purposes is to allow generating *auxiliary ops*. Auxiliary ops are operations
 479 used for building the replacement ops; but they are not directly used for
 480 replacement themselves.
 481
 482 For the case of uni-result ops, if there are multiple result patterns, only the
 483 value generated from the last result pattern will be used to replace the matched
 484 root op's result; all other result patterns will be considered as generating
 485 auxiliary ops.
 486
 487 Normally we want to specify ops as nested `dag` objects if their def-use
 488 relationship can be expressed in the way that an op's result can feed as the
 489 argument to the consuming op. But that is not always possible. For example,
 490 suppose we want to allocate memory and store a result, rewriting (in pseudocode):
 491
 492 ```mlir
 493 %dst = arith.addi %lhs, %rhs
 494 ```
 495
 496 into
 497
 498 ```mlir
 499 %shape = shape %lhs
 500 %mem = memref.alloc %shape
 501 %sum = arith.addi %lhs, %rhs
 502 memref.store %mem, %sum
 503 %dst = memref.load %mem
 504 ```
 505
 506 We cannot fit in with just one result pattern given `store` does not return a
 507 value. Instead we can use multiple result patterns:
 508
 509 ```tablegen
 510 def : Pattern<(AddIOp $lhs, $rhs),
 511               [(StoreOp (AllocOp:$mem (ShapeOp $lhs)), (AddIOp $lhs, $rhs)),
 512                (LoadOp $mem)]>;
 513 ```
 514
 515 In the above we use the first result pattern to generate the first four ops, and
 516 use the last pattern to generate the last op, which is used to replace the
 517 matched op.
 518
 519 ### Supporting multi-result ops
 520
 521 Multi-result ops bring extra complexity to declarative rewrite rules. We use
 522 TableGen `dag` objects to represent ops in patterns; there is no native way to
 523 indicate that an op generates multiple results. The approach adopted is based on
 524 **naming convention**: a `__N` suffix is added to a symbol to indicate the
 525 `N`-th result.
 526
 527 #### `__N` suffix
 528
 529 The `__N` suffix is specifying the `N`-th result as a whole (which can be
 530 [variadic](#supporting-variadic-ops)). For example, we can bind a symbol to some
 531 multi-result op and reference a specific result later:
 532
 533 ```tablegen
 534 def ThreeResultOp : Op<"three_result_op"> {
 535     let arguments = (ins ...);
 536
 537     let results = (outs
 538       AnyTensor:$output1,
 539       AnyTensor:$output2,
 540       AnyTensor:$output3
 541     );
 542 }
 543
 544 def : Pattern<(ThreeResultOp:$results ...),
 545               [(... $results__0), ..., (... $results__2), ...]>;
 546 ```
 547
 548 In the above pattern we bind `$results` to all the results generated by
 549 `ThreeResultOp` and reference its `$output1` and `$output3` later in the result
 550 patterns.
 551
 552 We can also bind a symbol and reference one of its specific results at the same
 553 time, which is typically useful when generating multi-result ops:
 554
 555 ```tablegen
 556 // TwoResultOp has similar definition as ThreeResultOp, but only has two
 557 // results.
 558
 559 def : Pattern<(TwoResultOp ...),
 560               [(ThreeResultOp:$results__2 ...),
 561                (replaceWithValue $results__0)]>;
 562 ```
 563
 564 In the above, we create a `ThreeResultOp` and bind `$results` to its results,
 565 and use its last result (`$output3`) and first result (`$output1`) to replace
 566 the `TwoResultOp`'s two results, respectively.
 567
 568 #### Replacing multi-result ops
 569
 570 The above example also shows how to replace a matched multi-result op.
 571
 572 To replace an `N`-result op, the result patterns must generate at least `N`
 573 declared values (see [Declared vs. actual value](#declared-vs-actual-value) for
 574 definition). If there are more than `N` declared values generated, only the last
 575 `N` declared values will be used to replace the matched op. Note that because of
 576 the existence of multi-result op, one result pattern **may** generate multiple
 577 declared values. So it means we do not necessarily need `N` result patterns to
 578 replace an `N`-result op. For example, to replace an op with three results, you
 579 can have
 580
 581 ```tablegen
 582 // ThreeResultOp/TwoResultOp/OneResultOp generates three/two/one result(s),
 583 // respectively.
 584
 585 // Replace each result with a result generated from an individual op.
 586 def : Pattern<(ThreeResultOp ...),
 587               [(OneResultOp ...), (OneResultOp ...), (OneResultOp ...)]>;
 588
 589 // Replace the first two results with two results generated from the same op.
 590 def : Pattern<(ThreeResultOp ...),
 591               [(TwoResultOp ...), (OneResultOp ...)]>;
 592
 593 // Replace all three results with three results generated from the same op.
 594 def : Pat<(ThreeResultOp ...), (ThreeResultOp ...)>;
 595
 596 def : Pattern<(ThreeResultOp ...),
 597               [(AuxiliaryOp ...), (ThreeResultOp ...)]>;
 598 ```
 599
 600 But using a single op to serve as both auxiliary op and replacement op is
 601 forbidden, i.e., the following is not allowed because the first
 602 `TwoResultOp` generates two results but only the second result is used for
 603 replacing the matched op's result:
 604
 605 ```tablegen
 606 def : Pattern<(ThreeResultOp ...),
 607               [(TwoResultOp ...), (TwoResultOp ...)]>;
 608 ```
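
In contrast, generating more declared values than the matched op has results
is allowed as long as each generated op is either entirely auxiliary or
entirely used for replacement. A sketch in the same elided style as above,
reusing the placeholder ops from this section:

```tablegen
// Four declared values for a three-result root op: the first OneResultOp is
// purely auxiliary; the last three declared values (one from the second
// OneResultOp, two from TwoResultOp) replace the three matched results.
def : Pattern<(ThreeResultOp ...),
              [(OneResultOp ...), (OneResultOp ...), (TwoResultOp ...)]>;
```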
 609
 610 ### Supporting variadic ops
 611
 612 #### Declared vs. actual value
 613
 614 Before going into details on variadic op support, we need to define a few terms
 615 regarding an op's values.
 616
 617 *   *Value*: either an operand or a result
 618 *   *Declared operand/result/value*: an operand/result/value statically declared
 619     in ODS of the op
 620 *   *Actual operand/result/value*: an operand/result/value of an op instance at
 621     runtime
 622
 623 The above terms are needed because ops can have multiple results, and some of
 624 the results can also be variadic. For example,
 625
 626 ```tablegen
 627 def MultiVariadicOp : Op<"multi_variadic_op"> {
 628     let arguments = (ins
 629       AnyTensor:$input1,
 630       Variadic<AnyTensor>:$input2,
 631       AnyTensor:$input3
 632     );
 633
 634     let results = (outs
 635       AnyTensor:$output1,
 636       Variadic<AnyTensor>:$output2,
 637       AnyTensor:$output3
 638     );
 639 }
 640 ```
 641
 642 We say the above op has 3 declared operands and 3 declared results. But at
 643 runtime, an instance can have 3 values corresponding to `$input2` and 2 values
 644 corresponding to `$output2`; we say it has 5 actual operands and 4 actual results.
 645 A variadic operand/result is considered a declared value that can
 646 correspond to multiple actual values.
 647
 648 [TODO]
 649
 650 #### Match variadic operand
 651
 652 Use the `variadic` DAG node to match a variadic operand with a fixed number of
 653 actual sub-operands.
 654
 655 For example, assume that `ConcatenateOp` is an operation with a variadic
 656 operand:
 657
 658 ```tablegen
 659 def ConcatenateOp : TEST_Op<"concatenate"> {
 660   let arguments = (ins
 661     Variadic<AnyTensor>:$inputs,
 662     I32Attr:$axis
 663   );
 664
 665   let results = (outs
 666     AnyTensor:$output
 667   );
 668 }
 669 ```
 670
 671 We can match `ConcatenateOp` with exactly 2 actual operands with:
 672
 673 ```tablegen
 674 def : Pat<(ConcatenateOp (variadic $input0, $input1), $axis),
 675           ...>;
 676 ```
 677
 678 The variadic sub-operands can be sub-DAGs to be matched:
 679
 680 ```tablegen
 681 def : Pat<(ConcatenateOp (variadic (SomeOp $a), (AnotherOp $b, $c)), $axis),
 682           (OtherOp $a, $b, $c)>;
 683 ```
 684
 685 The variadic DAG can be bound to a symbol, which refers to the full
 686 `operand_range`:
 687
 688 ```tablegen
 689 def : Pat<(ConcatenateOp (variadic:$inputs $input0, $input1),
 690                          ConstantAttr<I32Attr, "0">),
 691           (VStackOp $inputs)>;
 692 ```
 693
 694 ### Supplying additional constraints
 695
 696 Constraints can be placed on op arguments when matching. But sometimes we need
 697 to also place constraints on the matched op's results or sometimes need to limit
 698 the matching with some constraints that cover both the arguments and the
 699 results. The third parameter to `Pattern` (and `Pat`) is for this purpose.
 700
 701 For example, we can write
 702
 703 ```tablegen
 704 def HasNoUseOf: Constraint<CPred<"$_self.use_empty()">, "has no use">;
 705
 706 def HasSameElementType : Constraint<
 707     CPred<"llvm::cast<ShapedType>($0.getType()).getElementType() == "
 708           "llvm::cast<ShapedType>($1.getType()).getElementType()">,
 709     "has same element type">;
 710
 711 def : Pattern<(TwoResultOp:$results $input),
 712               [(...), (...)],
 713               [(F32Tensor:$results__0), (HasNoUseOf:$results__1),
 714                (HasSameElementType $results__0, $input)]>;
 715 ```
 716
 717 You can
 718
 719 *   Use normal `TypeConstraint`s on previously bound symbols (the first result of
 720     `TwoResultOp` must be a float tensor);
 721 *   Define new `Constraint`s for previously bound symbols (the second result of
 722     `TwoResultOp` must have no uses);
 723 *   Apply constraints on multiple bound symbols (`$input` and `TwoResultOp`'s
 724     first result must have the same element type).
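
For a single-result rule the same mechanism applies through the third
parameter of `Pat`. The sketch below reuses `AOp` and `COp` from earlier in
this document; the constraint `IsNonZeroInt` and its C++ predicate are
hypothetical and shown only to illustrate the shape of such a rule.

```tablegen
// Hypothetical constraint: accepts only integer attributes whose value is
// non-zero. `$0` is replaced by the first argument at the use site.
def IsNonZeroInt : Constraint<
    CPred<"::llvm::cast<::mlir::IntegerAttr>($0).getInt() != 0">,
    "is a non-zero integer">;

// Rewrites AOp into COp only when the captured attribute is a 32-bit integer
// attribute (checked while matching) and is non-zero (checked afterwards by
// the additional constraint).
def : Pat<(AOp $input, I32Attr:$attr), (COp $input, $attr),
          [(IsNonZeroInt $attr)]>;
```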
 725
 726 ### Supplying additional result patterns
 727
 728 Sometimes we need to add additional code after the result patterns, e.g. copying
 729 the attributes of the source op to the result ops. These can be specified via
 730 the `supplementalPatterns` parameter. Similar to auxiliary patterns, they are not
 731 for replacing results in the source pattern.
 732
 733 For example, we can write
 734
 735 ```tablegen
 736 def GetOwner: NativeCodeCall<"$0.getOwner()">;
 737
 738 def CopyAttrFoo: NativeCodeCallVoid<
 739   "$1->setAttr($_builder.getStringAttr(\"foo\"), $0->getInherentAttr(\"foo\"))">;
 740
 741 def CopyAttrBar: NativeCodeCallVoid<
 742   "$1->setAttr($_builder.getStringAttr(\"bar\"), $0->getInherentAttr(\"bar\"))">;
 743
 744
 745 def : Pattern<
 746   (ThreeResultOp:$src ...),
 747   [(ZeroResultOp:$dest1 ...), (ThreeResultOp:$dest2 ...)],
 748   /*additionalConstraints=*/[], [(CopyAttrFoo (GetOwner $src), $dest1),
 749     (CopyAttrBar (GetOwner $src), (GetOwner $dest2))]>;
 750 ```
 751
 752 This will copy the attributes `foo` and `bar` of `ThreeResultOp` in the source
 753 pattern to `ZeroResultOp` and `ThreeResultOp` in the result patterns, respectively.
 754 The patterns are executed in the specified order.
 755
 756 ### Adjusting benefits
 757
 758 The benefit of a `Pattern` is an integer value indicating the benefit of
 759 matching the pattern. It determines the priorities of patterns inside the
 760 pattern rewrite driver. A pattern with a higher benefit is applied before one
 761 with a lower benefit.
 762
 763 In DRR, a rule is set to have a benefit of the number of ops in the source
 764 pattern. This is based on the heuristics and assumptions that:
 765
 766 *   Larger matches are more beneficial than smaller ones.
 767 *   If a smaller one is applied first the larger one may not apply anymore.
 768
 769 The last parameter to `Pattern` (and `Pat`), `benefitsAdded`, lets us manually
 770 adjust a pattern's benefit: supply `(addBenefit N)` to add `N` to the benefit.
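
A minimal sketch of the arithmetic, reusing `AOp` and `COp` from earlier in
this document and spelling out every `Pattern` parameter so the position of the
benefit is explicit. The value 10 is arbitrary.

```tablegen
// The source pattern below contains one op (AOp), so the rule starts with a
// benefit of 1; `(addBenefit 10)` raises it to 11.
def : Pattern<(AOp $input, $attr),
              [(COp $input, $attr)],
              /*additionalConstraints=*/[],
              /*supplementalPatterns=*/[],
              (addBenefit 10)>;
```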
 771
 772 ## Rewrite directives
 773
 774 ### `location`
 775
 776 By default the C++ pattern expanded from a DRR pattern uses the fused location
 777 of all source ops as the location for all generated ops. This is not always the
 778 best location mapping relationship. For such cases, DRR provides the `location`
 779 directive to provide finer control.
 780
 781 `location` is of the following syntax:
 782
 783 ```tablegen
 784 (location $symbol0, $symbol1, ...)
 785 ```
 786
 787 where all `$symbol` should be bound previously in the pattern and one optional
 788 string may be specified as an attribute. The following locations are created:
 789
 790 *   If only 1 symbol is specified then that symbol's location is used;
 791 *   If multiple are specified then a fused location is created;
 792 *   If no symbol is specified then a string must be specified and a `NameLoc` is
 793     created instead.
 794
 795 `location` must be used as a trailing argument to an op creation. For example,
 796
 797 ```tablegen
def : Pat<(LocSrc1Op:$src1 (LocSrc2Op:$src2 ...)),
 799           (LocDst1Op (LocDst2Op ..., (location $src2)), (location "outer"))>;
 800 ```
 801
In the above pattern, the generated `LocDst2Op` will use the matched location of
`LocSrc2Op`, while the root `LocDst1Op` node will use the named location
`outer`.
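
The fused case from the list above can be sketched in the same style. This
reuses the op names of the preceding pattern and assumes that both `$src1` and
`$src2` are bound in the source pattern:

```tablegen
// Sketch: `LocDst1Op` gets a fused location built from the locations of
// both matched source ops.
def : Pat<(LocSrc1Op:$src1 (LocSrc2Op:$src2 ...)),
          (LocDst1Op ..., (location $src1, $src2))>;
```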
 805
 806 ### `replaceWithValue`
 807
 808 The `replaceWithValue` directive is used to eliminate a matched op by replacing
 809 all of its uses with a captured value. It is of the following syntax:
 810
 811 ```tablegen
 812 (replaceWithValue $symbol)
 813 ```
 814
 815 where `$symbol` should be a symbol bound previously in the pattern.
 816
 817 For example,
 818
 819 ```tablegen
 820 def : Pat<(Foo $input), (replaceWithValue $input)>;
 821 ```
 822
The above pattern removes the `Foo` op and replaces all of its uses with
`$input`.
 825
 826 ### `returnType`
 827
 828 The `returnType` directive allows patterns to directly specify return types for
 829 replacement ops that lack return type inference with op traits or user-defined
 830 builders with return type deduction.
 831
 832 The `returnType` directive must be used as a trailing argument to a node
 833 describing a replacement op. The directive comes in three forms:
 834
 835 *   `(returnType $value)`: copy the type of the operand or result bound to
 836     `value`.
 837 *   `(returnType "$_builder.getI32Type()")`: a string literal embedding C++. The
 838     embedded snippet is expected to return a `Type` or a `TypeRange`.
*   `(returnType (NativeCodeCall<"myFunc($0)"> $value))`: a DAG node with a
    native code call that can be passed any bound variables as arguments.
 841
 842 Specify multiple return types with a mix of any of the above. Example:
 843
 844 ```tablegen
 845 def : Pat<(SourceOp $arg0, $arg1),
 846           (OpA $arg0, (TwoResultOp:$res__1 $arg1,
 847                          (returnType $arg1, "$_builder.getI64Type()")))>;
 848 ```
 849
 850 Explicitly-specified return types will take precedence over return types
 851 inferred from op traits or user-defined builders. The return types of values
 852 replacing root op results cannot be overridden.
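
The example above uses the first two forms. The following is a sketch of the
third form, assuming a hypothetical `OneResultOp` that lacks return type
inference and a hypothetical helper named `GetTypeOf`:

```tablegen
// Hypothetical helper: evaluates to the type of the value bound to `$0`.
def GetTypeOf : NativeCodeCall<"$0.getType()">;

// `SourceOp` and `OpA` are reused from the example above; `OneResultOp` is
// hypothetical. It is nested inside `OpA` because return types of values
// replacing root op results cannot be overridden.
def : Pat<(SourceOp $arg0, $arg1),
          (OpA $arg0, (OneResultOp $arg1,
                         (returnType (GetTypeOf $arg1))))>;
```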
 853
 854 ### `either`
 855
The `either` directive is used to specify that the operands may be matched in
either order.
 858
 859 ```tablegen
 860 def : Pat<(TwoArgOp (either $firstArg, (AnOp $secondArg))),
 861           (...)>;
 862 ```
 863
The above pattern will accept both `"test.TwoArgOp"(%I32Arg, %AnOpArg)` and
`"test.TwoArgOp"(%AnOpArg, %I32Arg)`.
 866
`either` supports only operands. Note that an operation having the
`Commutative` trait does not imply that it behaves the same as `either` during
pattern matching.
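
As a sketch of combining `either` with a positional operand, assume a
hypothetical `ThreeArgOp` whose first two operands may appear in either order
while the third is matched in place (`ResultOp` is hypothetical as well):

```tablegen
// `ThreeArgOp` and `ResultOp` are hypothetical. The first two operands may be
// swapped; `$third` is always matched against the third operand.
def : Pat<(ThreeArgOp (either $firstArg, (AnOp $secondArg)), $third),
          (ResultOp $firstArg, $secondArg, $third)>;
```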
 870
 871 ## Debugging Tips
 872
 873 ### Run `mlir-tblgen` to see the generated content
 874
 875 TableGen syntax sometimes can be obscure; reading the generated content can be a
 876 very helpful way to understand and debug issues. To build `mlir-tblgen`, run
 877 `cmake --build . --target mlir-tblgen` in your build directory and find the
 878 `mlir-tblgen` binary in the `bin/` subdirectory. All the supported generators
 879 can be found via `mlir-tblgen --help`.
 880
To see the generated code, invoke `mlir-tblgen` with a specific generator and
provide include paths via `-I`. For example,
 883
 884 ```sh
 885 # To see all the C++ pattern rewrite classes
 886 mlir-tblgen --gen-rewriters -I /path/to/mlir/include /path/to/input/td/file
 887 ```
 888
 889 ### Compilation error: no matching member function for call to 'build'
 890
This error occurs when DRR cannot find a `build()` method with result type
deduction ability to call. See [building operations](#building-operations) for
more details.
 894
 895 [TableGen]: https://llvm.org/docs/TableGen/index.html
 896 [OpBase]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/OpBase.td