mlir/docs/OpDefinitions.md

   1 # Operation Definition Specification (ODS)
   2
   3 In addition to specializing the `mlir::Op` C++ template, MLIR also supports
   4 defining operations and data types in a table-driven manner. This is achieved
   5 via [TableGen][TableGen], which is both a generic language and its tooling to
   6 maintain records of domain-specific information. Facts regarding an operation
   7 are specified concisely into a TableGen record, which will be expanded into an
   8 equivalent `mlir::Op` C++ template specialization at compiler build time.
   9
  10 This manual explains in detail all the available mechanisms for defining
  11 operations in such a table-driven manner. It aims to be a specification instead
  12 of a tutorial. Please refer to
  13 [Quickstart tutorial to adding MLIR graph rewrite](Tutorials/QuickstartRewrites.md)
  14 for the latter.
  15
  16 In addition to detailing each mechanism, this manual also tries to capture best
  17 practices. They are rendered as quoted bullet points.
  18
  19 [TOC]
  20
  21 ## Motivation
  22
  23 MLIR allows pluggable dialects, and dialects contain, among others, a list of
  24 operations. This open and extensible ecosystem leads to the "stringly" type IR
  25 problem, e.g., repetitive string comparisons during optimization and analysis
  26 passes, unintuitive accessor methods (e.g., generic/error prone `getOperand(3)`
  27 vs self-documenting `getStride()`) with more generic return types, verbose and
  28 generic constructors without default arguments, verbose textual IR dumps, and so
  29 on. Furthermore, operation verification is:
  30
  31 1.  best case: a central string-to-verification-function map,
  32 1.  middle case: duplication of verification across the code base, or
  33 1.  worst case: no verification functions.
  34
  35 The fix is to support defining ops in a table-driven manner. Then for each
  36 dialect, we can have a central place that contains everything you need to know
  37 about each op, including its constraints, custom assembly form, etc. This
  38 description is also used to generate helper functions and classes to allow
  39 building, verification, parsing, printing, analysis, and many more.
  40
  41 ## Benefits
  42
  43 Compared to the C++ template, this table-driven approach has several benefits
  44 including but not limited to:
  45
  46 *   **Single source of truth**: We strive to encode all facts regarding an
  47     operation into the record, so that readers don't need to jump among code
  48     snippets to fully understand an operation.
  49 *   **Removing boilerplate**: We can automatically generate
  50     operand/attribute/result getter methods, operation build methods, operation
  51     verify methods, and many more utilities from the record. This greatly
  52     reduces the boilerplate needed for defining a new op.
  53 *   **Facilitating auto-generation**: The usage of these operation information
  54     records are by no means limited to op definition itself. We can use them to
  55     drive the auto-generation of many other components, like computation graph
  56     serialization.
  57
  58 ## TableGen Syntax
  59
  60 We use TableGen as the language for specifying operation information. TableGen
  61 itself just provides syntax for writing records; the syntax and constructs
  62 allowed in a TableGen file (typically with the filename suffix `.td`) can be found
  63 [here][TableGenProgRef].
  64
  65 *   TableGen `class` is similar to C++ class; it can be templated and
  66     subclassed.
  67 *   TableGen `def` is similar to C++ object; it can be declared by specializing
  68     a TableGen `class` (e.g., `def MyDef : MyClass<...>;`) or completely
  69     independently (e.g., `def MyDef;`). It cannot be further templated or
  70     subclassed.
  71 *   TableGen `dag` is a dedicated type for directed acyclic graph of elements. A
  72     `dag` has one operator and zero or more arguments. Its syntax is `(operator
  73     arg0, arg1, argN)`. The operator can be any TableGen `def`; an argument can
  74     be anything, including `dag` itself. We can have names attached to both the
  75     operator and the arguments like `(MyOp:$op_name MyArg:$arg_name)`.
  76
  77 Please see the [language reference][TableGenProgRef] to learn about all the
  78 types and expressions supported by TableGen.
  79
  80 ## Operation Definition
  81
  82 MLIR defines several common constructs to help operation definition and provide
  83 their semantics via a special [TableGen backend][TableGenBackend]:
  84 [`OpDefinitionsGen`][OpDefinitionsGen]. These constructs are defined in
  85 [`OpBase.td`][OpBase]. The main ones are:
  86
  87 *   The `Op` class: It is the main construct for defining operations. All facts
  88     regarding the operation are specified when specializing this class, with the
  89     help of the following constructs.
  90 *   The `Dialect` class: Operations belonging to one logical group are placed in
  91     the same dialect. The `Dialect` class contains dialect-level information.
  92 *   The `OpTrait` class hierarchy: They are used to specify special properties
  93     and constraints of the operation, including whether the operation has side
  94     effect or whether its output has the same shape as the input.
  95 *   The `ins`/`outs` marker: These are two special markers builtin to the
  96     `OpDefinitionsGen` backend. They lead to the definitions of operands/attributes
  97     and results respectively.
  98 *   The `TypeConstraint` class hierarchy: They are used to specify the
  99     constraints over operands or results. A notable subclass hierarchy is
 100     `Type`, which stands for constraints for common C++ types.
 101 *   The `AttrConstraint` class hierarchy: They are used to specify the
 102     constraints over attributes. A notable subclass hierarchy is `Attr`, which
 103     stands for constraints for attributes whose values are of common types.
 104
 105 An operation is defined by specializing the `Op` class with concrete contents
 106 for all the fields it requires. For example, `tf.AvgPool` is defined as
 107
 108 ```tablegen
 109 def TF_AvgPoolOp : TF_Op<"AvgPool", [NoSideEffect]> {
 110   let summary = "Performs average pooling on the input.";
 111
 112   let description = [{
 113 Each entry in `output` is the mean of the corresponding size `ksize`
 114 window in `value`.
 115   }];
 116
 117   let arguments = (ins
 118     TF_FpTensor:$value,
 119
 120     Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$ksize,
 121     Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$strides,
 122     TF_AnyStrAttrOf<["SAME", "VALID"]>:$padding,
 123     DefaultValuedAttr<TF_ConvertDataFormatAttr, "NHWC">:$data_format
 124   );
 125
 126   let results = (outs
 127     TF_FpTensor:$output
 128   );
 129
 130   TF_DerivedOperandTypeAttr T = TF_DerivedOperandTypeAttr<0>;
 131 }
 132 ```
 133
 134 In the following we describe all the fields needed. Please see the definition of
 135 the `Op` class for the complete list of fields supported.
 136
 137 ### Operation name
 138
 139 The operation name is a unique identifier for the operation within MLIR, e.g.,
 140 `tf.Add` for addition operation in the TensorFlow dialect. This is the
 141 equivalent of the mnemonic in assembly language. It is used for parsing and
 142 printing in the textual format. It is also used for pattern matching in graph
 143 rewrites.
 144
 145 The full operation name is composed of the dialect name and the op name, with
 146 the former provided via the dialect and the latter provided as the second
 147 template parameter to the `Op` class.
 148
 149 ### Operation documentation
 150
 151 This includes both a one-line `summary` and a longer human-readable
 152 `description`. They will be used to drive automatic generation of dialect
 153 documentation. They need to be provided in the operation's definition body:
 154
 155 ```tablegen
 156 let summary = "...";
 157
 158 let description = [{
 159 ...
 160 }];
 161 ```
 162
 163 `description` should be written in Markdown syntax.
 164
 165 Placing the documentation at the beginning is recommended since it helps in
 166 understanding the operation.
 167
 168 > *   Place documentation at the beginning of the operation definition
 169 > *   The summary should be short and concise. It should be a one-liner without
 170 >     trailing punctuation. Put expanded explanation in description.
 171
 172 ### Operation arguments
 173
 174 There are two kinds of arguments: operands and attributes. Operands are runtime
 175 values produced by other ops; while attributes are compile-time known constant
 176 values, including two categories:
 177
 178 1.  Natural attributes: these attributes affect the behavior of the operations
 179     (e.g., padding for convolution);
 180 1.  Derived attributes: these attributes are not needed to define the operation
 181     but are instead derived from information of the operation. E.g., the output
 182     shape of type. This is mostly used for convenience interface generation or
 183     interaction with other frameworks/translation.
 184
 185     All derived attributes should be materializable as an Attribute. That is,
 186     even though they are not materialized, it should be possible to store as an
 187     attribute.
 188
 189 Both operands and attributes are specified inside the `dag`-typed `arguments`,
 190 led by `ins`:
 191
 192 ```tablegen
 193 let arguments = (ins
 194   <type-constraint>:$<operand-name>,
 195   ...
 196   <attr-constraint>:$<attr-name>,
 197   ...
 198 );
 199 ```
 200
 201 Here `<type-constraint>` is a TableGen `def` from the `TypeConstraint` class
 202 hierarchy. Similarly, `<attr-constraint>` is a TableGen `def` from the
 203 `AttrConstraint` class hierarchy. See [Constraints](#constraints) for more
 204 information.
 205
 206 There is no requirements on the relative order of operands and attributes; they
 207 can mix freely. The relative order of operands themselves matters. From each
 208 named argument a named getter will be generated that returns the argument with
 209 the return type (in the case of attributes the return type will be constructed
 210 from the storage type, while for operands it will be `Value`). Each attribute's
 211 raw value (e.g., as stored) can also be accessed via generated `<name>Attr`
 212 getters for use in transformation passes where the more user-friendly return
 213 type is less suitable.
 214
 215 All the arguments should be named to:
 216 - provide documentation,
 217 - drive auto-generation of getter methods, and
 218 - provide a handle to reference for other places like constraints.
 219
 220 #### Variadic operands
 221
 222 To declare a variadic operand, wrap the `TypeConstraint` for the operand with
 223 `Variadic<...>`.
 224
 225 Normally operations have no variadic operands or just one variadic operand. For
 226 the latter case, it is easy to deduce which dynamic operands are for the static
 227 variadic operand definition. However, if an operation has more than one variable
 228 length operands (either optional or variadic), it would be impossible to
 229 attribute dynamic operands to the corresponding static variadic operand
 230 definitions without further information from the operation. Therefore, either
 231 the `SameVariadicOperandSize` or `AttrSizedOperandSegments` trait is needed to
 232 indicate that all variable length operands have the same number of dynamic
 233 values.
 234
 235 #### VariadicOfVariadic operands
 236
 237 To declare a variadic operand that has a variadic number of sub-ranges, wrap the
 238 `TypeConstraint` for the operand with `VariadicOfVariadic<...,
 239 "<segment-attribute-name>">`.
 240
 241 The second field of the `VariadicOfVariadic` is the name of an `I32ElementsAttr`
 242 argument that contains the sizes of the variadic sub-ranges. This attribute will
 243 be used when determining the size of sub-ranges, or when updating the size of
 244 sub-ranges.
 245
 246 #### Optional operands
 247
 248 To declare an optional operand, wrap the `TypeConstraint` for the operand with
 249 `Optional<...>`.
 250
 251 Normally operations have no optional operands or just one optional operand. For
 252 the latter case, it is easy to deduce which dynamic operands are for the static
 253 operand definition. However, if an operation has more than one variable length
 254 operands (either optional or variadic), it would be impossible to attribute
 255 dynamic operands to the corresponding static variadic operand definitions
 256 without further information from the operation. Therefore, either the
 257 `SameVariadicOperandSize` or `AttrSizedOperandSegments` trait is needed to
 258 indicate that all variable length operands have the same number of dynamic
 259 values.
 260
 261 #### Optional attributes
 262
 263 To declare an optional attribute, wrap the `AttrConstraint` for the attribute
 264 with `OptionalAttr<...>`.
 265
 266 #### Attributes with default values
 267
 268 To declare an attribute with a default value, wrap the `AttrConstraint` for the
 269 attribute with `DefaultValuedAttr<..., "...">`.
 270
 271 The second parameter to `DefaultValuedAttr` should be a string containing the
 272 C++ default value. For example, a float default value should be specified as
 273 like `"0.5f"`, and an integer array default value should be specified as like
 274 `"{1, 2, 3}"`.
 275
 276 #### Confining attributes
 277
 278 `Confined` is provided as a general mechanism to help modelling further
 279 constraints on attributes beyond the ones brought by value types. You can use
 280 `Confined` to compose complex constraints out of more primitive ones. For
 281 example, a 32-bit integer attribute whose minimum value must be 10 can be
 282 expressed as `Confined<I32Attr, [IntMinValue<10>]>`.
 283
 284 Right now, the following primitive constraints are supported:
 285
 286 *   `IntMinValue<N>`: Specifying an integer attribute to be greater than or
 287     equal to `N`
 288 *   `IntMaxValue<N>`: Specifying an integer attribute to be less than or equal
 289     to `N`
 290 *   `ArrayMinCount<N>`: Specifying an array attribute to have at least `N`
 291     elements
 292 *   `IntArrayNthElemEq<I, N>`: Specifying an integer array attribute's `I`-th
 293     element to be equal to `N`
 294 *   `IntArrayNthElemMinValue<I, N>`: Specifying an integer array attribute's
 295     `I`-th element to be greater than or equal to `N`
 296
 297 TODO: Design and implement more primitive constraints
 298
 299 ### Operation regions
 300
 301 The regions of an operation are specified inside of the `dag`-typed `regions`,
 302 led by `region`:
 303
 304 ```tablegen
 305 let regions = (region
 306   <region-constraint>:$<region-name>,
 307   ...
 308 );
 309 ```
 310
 311 #### Variadic regions
 312
 313 Similar to the `Variadic` class used for variadic operands and results,
 314 `VariadicRegion<...>` can be used for regions. Variadic regions can currently
 315 only be specified as the last region in the regions list.
 316
 317 ### Operation results
 318
 319 Similar to operands, results are specified inside the `dag`-typed `results`, led
 320 by `outs`:
 321
 322 ```tablegen
 323 let results = (outs
 324   <type-constraint>:$<result-name>,
 325   ...
 326 );
 327 ```
 328
 329 #### Variadic results
 330
 331 Similar to variadic operands, `Variadic<...>` can also be used for results. And
 332 similarly, `SameVariadicResultSize` for multiple variadic results in the same
 333 operation.
 334
 335 ### Operation successors
 336
 337 For terminator operations, the successors are specified inside of the
 338 `dag`-typed `successors`, led by `successor`:
 339
 340 ```tablegen
 341 let successors = (successor
 342   <successor-constraint>:$<successor-name>,
 343   ...
 344 );
 345 ```
 346
 347 #### Variadic successors
 348
 349 Similar to the `Variadic` class used for variadic operands and results,
 350 `VariadicSuccessor<...>` can be used for successors. Variadic successors can
 351 currently only be specified as the last successor in the successor list.
 352
 353 ### Operation traits and constraints
 354
 355 Traits are operation properties that affect syntax or semantics. MLIR C++ models
 356 various traits in the `mlir::OpTrait` namespace.
 357
 358 Both operation traits, [interfaces](Interfaces.md/#utilizing-the-ods-framework),
 359 and constraints involving multiple operands/attributes/results are provided as
 360 the third template parameter to the `Op` class. They should be deriving from
 361 the `OpTrait` class. See [Constraints](#constraints) for more information.
 362
 363 ### Builder methods
 364
 365 For each operation, there are a few builders automatically generated based on
 366 the arguments and returns types. For example, given the following op definition:
 367
 368 ```tablegen
 369 def MyOp : ... {
 370   let arguments = (ins
 371     I32:$i32_operand,
 372     F32:$f32_operand,
 373     ...,
 374
 375     I32Attr:$i32_attr,
 376     F32Attr:$f32_attr,
 377     ...
 378   );
 379
 380   let results = (outs
 381     I32:$i32_result,
 382     F32:$f32_result,
 383     ...
 384   );
 385 }
 386 ```
 387
 388 The following builders are generated:
 389
 390 ```c++
 391 // All result-types/operands/attributes have one aggregate parameter.
 392 static void build(OpBuilder &odsBuilder, OperationState &odsState,
 393                   TypeRange resultTypes,
 394                   ValueRange operands,
 395                   ArrayRef<NamedAttribute> attributes);
 396
 397 // Each result-type/operand/attribute has a separate parameter. The parameters
 398 // for attributes are of mlir::Attribute types.
 399 static void build(OpBuilder &odsBuilder, OperationState &odsState,
 400                   Type i32_result, Type f32_result, ...,
 401                   Value i32_operand, Value f32_operand, ...,
 402                   IntegerAttr i32_attr, FloatAttr f32_attr, ...);
 403
 404 // Each result-type/operand/attribute has a separate parameter. The parameters
 405 // for attributes are raw values unwrapped with mlir::Attribute instances.
 406 // (Note that this builder will not always be generated. See the following
 407 // explanation for more details.)
 408 static void build(OpBuilder &odsBuilder, OperationState &odsState,
 409                   Type i32_result, Type f32_result, ...,
 410                   Value i32_operand, Value f32_operand, ...,
 411                   APInt i32_attr, StringRef f32_attr, ...);
 412
 413 // Each operand/attribute has a separate parameter but result type is aggregate.
 414 static void build(OpBuilder &odsBuilder, OperationState &odsState,
 415                   TypeRange resultTypes,
 416                   Value i32_operand, Value f32_operand, ...,
 417                   IntegerAttr i32_attr, FloatAttr f32_attr, ...);
 418
 419 // All operands/attributes have aggregate parameters.
 420 // Generated if return type can be inferred.
 421 static void build(OpBuilder &odsBuilder, OperationState &odsState,
 422                   ValueRange operands, ArrayRef<NamedAttribute> attributes);
 423
 424 // (And manually specified builders depending on the specific op.)
 425 ```
 426
 427 The first form provides basic uniformity so that we can create ops using the
 428 same form regardless of the exact op. This is particularly useful for
 429 implementing declarative pattern rewrites.
 430
 431 The second and third forms are good for use in manually written code, given that
 432 they provide better guarantee via signatures.
 433
 434 The third form will be generated if any of the op's attribute has different
 435 `Attr.returnType` from `Attr.storageType` and we know how to build an attribute
 436 from an unwrapped value (i.e., `Attr.constBuilderCall` is defined.)
 437 Additionally, for the third form, if an attribute appearing later in the
 438 `arguments` list has a default value, the default value will be supplied in the
 439 declaration. This works for `BoolAttr`, `StrAttr`, `EnumAttr` for now and the
 440 list can grow in the future. So if possible, the default-valued attribute should be
 441 placed at the end of the `arguments` list to leverage this feature. (This
 442 behavior is essentially due to C++ function parameter default value placement
 443 restrictions.) Otherwise, the builder of the third form will still be generated
 444 but default values for the attributes not at the end of the `arguments` list
 445 will not be supplied in the builder's signature.
 446
 447 ODS will generate a builder that doesn't require the return type specified if
 448
 449 *   Op implements InferTypeOpInterface interface;
 450 *   All return types are either buildable types or are the same as a given
 451     operand (e.g., `AllTypesMatch` constraint between operand and result);
 452
 453 And there may potentially exist other builders depending on the specific op;
 454 please refer to the
 455 [generated C++ file](#run-mlir-tblgen-to-see-the-generated-content) for the
 456 complete list.
 457
 458 #### Custom builder methods
 459
 460 However, if the above cases cannot satisfy all needs, you can define additional
 461 convenience build methods in the `builders` field as follows.
 462
 463 ```tablegen
 464 def MyOp : Op<"my_op", []> {
 465   let arguments = (ins F32Attr:$attr);
 466
 467   let builders = [
 468     OpBuilder<(ins "float":$val)>
 469   ];
 470 }
 471 ```
 472
 473 The `builders` field is a list of custom builders that are added to the Op
 474 class. In this example, we provide a convenience builder that takes a floating
 475 point value instead of an attribute. The `ins` prefix is common to many function
 476 declarations in ODS, which use a TableGen [`dag`](#tablegen-syntax). What
 477 follows is a comma-separated list of types (quoted string) and names prefixed
 478 with the `$` sign. This will generate the declaration of a builder method that
 479 looks like:
 480
 481 ```c++
 482 class MyOp : /*...*/ {
 483   /*...*/
 484   static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
 485                     float val);
 486 };
 487 ```
 488
 489 Note that the method has two additional leading arguments. These arguments are
 490 useful to construct the operation. In particular, the method must populate
 491 `state` with attributes, operands, regions and result types of the operation to
 492 be constructed. `builder` can be used to construct any IR objects that belong to
 493 the Op, such as types or nested operations. Since the type and name are
 494 generated as is in the C++ code, they should be valid C++ constructs for a type
 495 (in the namespace of the Op) and an identifier (e.g., `class` is not a valid
 496 identifier).
 497
 498 Implementations of the builder can be provided directly in ODS, using TableGen
 499 code block as follows.
 500
 501 ```tablegen
 502 def MyOp : Op<"my_op", []> {
 503   let arguments = (ins F32Attr:$attr);
 504
 505   let builders = [
 506     OpBuilder<(ins "float":$val), [{
 507       $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
 508     }]>
 509   ];
 510 }
 511 ```
 512
 513 The equivalents of `builder` and `state` arguments are available as `$_builder`
 514 and `$_state` special variables. The named arguments listed in the `ins` part
 515 are available directly, e.g. `val`. The body of the builder will be generated by
 516 substituting special variables and should otherwise be valid C++. While there is
 517 no limitation on the code size, we encourage one to define only short builders
 518 inline in ODS and put definitions of longer builders in C++ files.
 519
 520 Finally, if some arguments need a default value, they can be defined using
 521 `CArg` to wrap the type and this value as follows.
 522
 523 ```tablegen
 524 def MyOp : Op<"my_op", []> {
 525   let arguments = (ins F32Attr:$attr);
 526
 527   let builders = [
 528     OpBuilder<(ins CArg<"float", "0.5f">:$val), [{
 529       $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
 530     }]>
 531   ];
 532 }
 533 ```
 534
 535 The generated code will use default value in the declaration, but not in the
 536 definition, as required by C++.
 537
 538 ```c++
 539 /// Header file.
 540 class MyOp : /*...*/ {
 541   /*...*/
 542   static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
 543                     float val = 0.5f);
 544 };
 545
 546 /// Source file.
 547 MyOp::build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
 548             float val) {
 549   state.addAttribute("attr", builder.getF32FloatAttr(val));
 550 }
 551 ```
 552
 553 **Deprecated:** `OpBuilder` class allows one to specify the custom builder
 554 signature as a raw string, without separating parameters into different `dag`
 555 arguments. It also supports leading parameters of `OpBuilder &` and
 556 `OperationState &` types, which will be used instead of the autogenerated ones
 557 if present.
 558
 559 ### Custom parser and printer methods
 560
 561 Functions to parse and print the operation's custom assembly form.
 562
 563 ### Custom verifier code
 564
 565 Verification code will be automatically generated for
 566 [constraints](#constraints) specified on various entities of the op. To perform
 567 _additional_ verification, you can use
 568
 569 ```tablegen
 570 let hasVerifier = 1;
 571 let hasRegionVerifier = 1;
 572 ```
 573
 574 This will generate `LogicalResult verify()`/`LogicalResult verifyRegions()`
 575 method declarations on the op class that can be defined with any additional
 576 verification constraints. For verificaiton which needs to access the nested
 577 operations, you should use `hasRegionVerifier` to ensure that it won't access
 578 any ill-formed operation. Except that, The other verifications can be
 579 implemented with `hasVerifier`. Check the next section for the execution order
 580 of these verification methods.
 581
 582 #### Verification Ordering
 583
 584 The verification of an operation involves several steps,
 585
 586 1. StructuralOpTrait will be verified first, they can be run independently.
 587 2. `verifyInvariants` which is constructed by ODS, it verifies the type,
 588    attributes, .etc.
 589 3. Other Traits/Interfaces that have marked their verifier as `verifyTrait` or
 590    `verifyWithRegions=0`.
 591 4. Custom verifier which is defined in the op and has been marked `hasVerifier=1`
 592
 593 If an operation has regions, then it may have the second phase,
 594
 595 1. Traits/Interfaces that have marked their verifier as `verifyRegionTrait` or
 596    `verifyWithRegions=1`. This implies the verifier needs to access the
 597    operations in its regions.
 598 2. Custom verifier which is defined in the op and has been marked
 599    `hasRegionVerifier=1`
 600
 601 Note that the second phase will be run after the operations in the region are
 602 verified. Verifiers further down the order can rely on certain invariants being
 603 verified by a previous verifier and do not need to re-verify them.
 604
 605 #### Emitting diagnostics in custom verifiers
 606
 607 Custom verifiers should avoid printing operations using custom operation
 608 printers, because they require the printed operation (and sometimes its parent
 609 operation) to be verified first. In particular, when emitting diagnostics,
 610 custom verifiers should use the `Error` severity level, which prints operations
 611 in generic form by default, and avoid using lower severity levels (`Note`,
 612 `Remark`, `Warning`).
 613
 614 ### Declarative Assembly Format
 615
 616 The custom assembly form of the operation may be specified in a declarative
 617 string that matches the operations operands, attributes, etc. With the ability
 618 to express additional information that needs to be parsed to build the
 619 operation:
 620
 621 ```tablegen
 622 def CallOp : Std_Op<"call", ...> {
 623   let arguments = (ins FlatSymbolRefAttr:$callee, Variadic<AnyType>:$args);
 624   let results = (outs Variadic<AnyType>);
 625
 626   let assemblyFormat = [{
 627     $callee `(` $args `)` attr-dict `:` functional-type($args, results)
 628   }];
 629 }
 630 ```
 631
 632 The format is comprised of three components:
 633
 634 #### Directives
 635
 636 A directive is a type of builtin function, with an optional set of arguments.
 637 The available directives are as follows:
 638
 639 *   `attr-dict`
 640
 641     -   Represents the attribute dictionary of the operation.
 642
 643 *   `attr-dict-with-keyword`
 644
 645     -   Represents the attribute dictionary of the operation, but prefixes the
 646         dictionary with an `attributes` keyword.
 647
 648 *   `custom` < UserDirective > ( Params )
 649
 650     -   Represents a custom directive implemented by the user in C++.
 651     -   See the [Custom Directives](#custom-directives) section below for more
 652         details.
 653
 654 *   `functional-type` ( inputs , results )
 655
 656     -   Formats the `inputs` and `results` arguments as a
 657         [function type](Dialects/Builtin.md/#functiontype).
 658     -   The constraints on `inputs` and `results` are the same as the `input` of
 659         the `type` directive.
 660
 661 *   `oilist` ( \`keyword\` elements | \`otherKeyword\` elements ...)
 662
 663     -   Represents an optional order-independent list of clauses. Each clause
 664         has a keyword and corresponding assembly format.
 665     -   Each clause can appear 0 or 1 time (in any order).
 666     -   Only literals, types and variables can be used within an oilist element.
 667     -   All the variables must be optional or variadic.
 668
 669 *   `operands`
 670
 671     -   Represents all of the operands of an operation.
 672
 673 *   `ref` ( input )
 674
 675     -   Represents a reference to the a variable or directive, that must have
 676         already been resolved, that may be used as a parameter to a `custom`
 677         directive.
 678     -   Used to pass previously parsed entities to custom directives.
 679     -   The input may be any directive or variable, aside from `functional-type`
 680         and `custom`.
 681
 682 *   `regions`
 683
 684     -   Represents all of the regions of an operation.
 685
 686 *   `results`
 687
 688     -   Represents all of the results of an operation.
 689
 690 *   `successors`
 691
 692     -   Represents all of the successors of an operation.
 693
 694 *   `type` ( input )
 695
 696     -   Represents the type of the given input.
 697     -   `input` must be either an operand or result [variable](#variables), the
 698         `operands` directive, or the `results` directive.
 699
 700 *   `qualified` ( type_or_attribute )
 701
 702     -   Wraps a `type` directive or an attribute parameter.
 703     -   Used to force printing the type or attribute prefixed with its dialect
 704         and mnemonic. For example the `vector.multi_reduction` operation has a
 705         `kind` attribute ; by default the declarative assembly will print:
 706         `vector.multi_reduction <minf>, ...` but using `qualified($kind)` in the
 707         declarative assembly format will print it instead as:
 708         `vector.multi_reduction #vector.kind<minf>, ...`.
 709
 710 #### Literals
 711
 712 A literal is either a keyword or punctuation surrounded by \`\`.
 713
 714 The following are the set of valid punctuation:
 715
 716 `:`, `,`, `=`, `<`, `>`, `(`, `)`, `{`, `}`, `[`, `]`, `->`, `?`, `+`, `*`
 717
 718 The following are valid whitespace punctuation:
 719
 720 `\n`, ` `
 721
 722 The `\n` literal emits a newline an indents to the start of the operation. An
 723 example is shown below:
 724
 725 ```tablegen
 726 let assemblyFormat = [{
 727   `{` `\n` ` ` ` ` `this_is_on_a_newline` `\n` `}` attr-dict
 728 }];
 729 ```
 730
 731 ```mlir
 732 %results = my.operation {
 733   this_is_on_a_newline
 734 }
 735 ```
 736
 737 An empty literal \`\` may be used to remove a space that is inserted implicitly
 738 after certain literal elements, such as `)`/`]`/etc. For example, "`]`" may
 739 result in an output of `]` it is not the last element in the format. "`]` \`\`"
 740 would trim the trailing space in this situation.
 741
 742 #### Variables
 743
 744 A variable is an entity that has been registered on the operation itself, i.e.
 745 an argument(attribute or operand), region, result, successor, etc. In the
 746 `CallOp` example above, the variables would be `$callee` and `$args`.
 747
 748 Attribute variables are printed with their respective value type, unless that
 749 value type is buildable. In those cases, the type of the attribute is elided.
 750
 751 #### Custom Directives
 752
 753 The declarative assembly format specification allows for handling a large
 754 majority of the common cases when formatting an operation. For the operations
 755 that require or desire specifying parts of the operation in a form not supported
 756 by the declarative syntax, custom directives may be specified. A custom
 757 directive essentially allows for users to use C++ for printing and parsing
 758 subsections of an otherwise declaratively specified format. Looking at the
 759 specification of a custom directive above:
 760
 761 ```
 762 custom-directive ::= `custom` `<` UserDirective `>` `(` Params `)`
 763 ```
 764
 765 A custom directive has two main parts: The `UserDirective` and the `Params`. A
 766 custom directive is transformed into a call to a `print*` and a `parse*` method
 767 when generating the C++ code for the format. The `UserDirective` is an
 768 identifier used as a suffix to these two calls, i.e., `custom<MyDirective>(...)`
 769 would result in calls to `parseMyDirective` and `printMyDirective` within the
 770 parser and printer respectively. `Params` may be any combination of variables
 771 (i.e. Attribute, Operand, Successor, etc.), type directives, and `attr-dict`.
 772 The type directives must refer to a variable, but that variable need not also be
 773 a parameter to the custom directive.
 774
 775 The arguments to the `parse<UserDirective>` method are firstly a reference to
 776 the `OpAsmParser`(`OpAsmParser &`), and secondly a set of output parameters
 777 corresponding to the parameters specified in the format. The mapping of
 778 declarative parameter to `parse` method argument is detailed below:
 779
 780 *   Attribute Variables
 781     -   Single: `<Attribute-Storage-Type>(e.g. Attribute) &`
 782     -   Optional: `<Attribute-Storage-Type>(e.g. Attribute) &`
 783 *   Operand Variables
 784     -   Single: `OpAsmParser::UnresolvedOperand &`
 785     -   Optional: `Optional<OpAsmParser::UnresolvedOperand> &`
 786     -   Variadic: `SmallVectorImpl<OpAsmParser::UnresolvedOperand> &`
 787     -   VariadicOfVariadic:
 788         `SmallVectorImpl<SmallVector<OpAsmParser::UnresolvedOperand>> &`
 789 *   Ref Directives
 790     -   A reference directive is passed to the parser using the same mapping as
 791         the input operand. For example, a single region would be passed as a
 792         `Region &`.
 793 *   Region Variables
 794     -   Single: `Region &`
 795     -   Variadic: `SmallVectorImpl<std::unique_ptr<Region>> &`
 796 *   Successor Variables
 797     -   Single: `Block *&`
 798     -   Variadic: `SmallVectorImpl<Block *> &`
 799 *   Type Directives
 800     -   Single: `Type &`
 801     -   Optional: `Type &`
 802     -   Variadic: `SmallVectorImpl<Type> &`
 803     -   VariadicOfVariadic: `SmallVectorImpl<SmallVector<Type>> &`
 804 *   `attr-dict` Directive: `NamedAttrList &`
 805
 806 When a variable is optional, the value should only be specified if the variable
 807 is present. Otherwise, the value should remain `None` or null.
 808
 809 The arguments to the `print<UserDirective>` method is firstly a reference to the
 810 `OpAsmPrinter`(`OpAsmPrinter &`), second the op (e.g. `FooOp op` which can be
 811 `Operation *op` alternatively), and finally a set of output parameters
 812 corresponding to the parameters specified in the format. The mapping of
 813 declarative parameter to `print` method argument is detailed below:
 814
 815 *   Attribute Variables
 816     -   Single: `<Attribute-Storage-Type>(e.g. Attribute)`
 817     -   Optional: `<Attribute-Storage-Type>(e.g. Attribute)`
 818 *   Operand Variables
 819     -   Single: `Value`
 820     -   Optional: `Value`
 821     -   Variadic: `OperandRange`
 822     -   VariadicOfVariadic: `OperandRangeRange`
 823 *   Ref Directives
 824     -   A reference directive is passed to the printer using the same mapping as
 825         the input operand. For example, a single region would be passed as a
 826         `Region &`.
 827 *   Region Variables
 828     -   Single: `Region &`
 829     -   Variadic: `MutableArrayRef<Region>`
 830 *   Successor Variables
 831     -   Single: `Block *`
 832     -   Variadic: `SuccessorRange`
 833 *   Type Directives
 834     -   Single: `Type`
 835     -   Optional: `Type`
 836     -   Variadic: `TypeRange`
 837     -   VariadicOfVariadic: `TypeRangeRange`
 838 *   `attr-dict` Directive: `DictionaryAttr`
 839
 840 When a variable is optional, the provided value may be null.
 841
 842 #### Optional Groups
 843
 844 In certain situations operations may have "optional" information, e.g.
 845 attributes or an empty set of variadic operands. In these situations a section
 846 of the assembly format can be marked as `optional` based on the presence of this
 847 information. An optional group is defined as follows:
 848
 849 ```
 850 optional-group: `(` elements `)` (`:` `(` else-elements `)`)? `?`
 851 ```
 852
 853 The `elements` of an optional group have the following requirements:
 854
 855 *   The first element of the group must either be a attribute, literal, operand,
 856     or region.
 857     -   This is because the first element must be optionally parsable.
 858 *   Exactly one argument variable or type directive within the group must be
 859     marked as the anchor of the group.
 860     -   The anchor is the element whose presence controls whether the group
 861         should be printed/parsed.
 862     -   An element is marked as the anchor by adding a trailing `^`.
 863     -   The first element is *not* required to be the anchor of the group.
 864     -   When a non-variadic region anchors a group, the detector for printing
 865         the group is if the region is empty.
 866 *   Literals, variables, custom directives, and type directives are the only
 867     valid elements within the group.
 868     -   Any attribute variable may be used, but only optional attributes can be
 869         marked as the anchor.
 870     -   Only variadic or optional results and operand arguments and can be used.
 871     -   All region variables can be used. When a non-variable length region is
 872         used, if the group is not present the region is empty.
 873
 874 An example of an operation with an optional group is `func.return`, which has a
 875 variadic number of operands.
 876
 877 ```tablegen
 878 def ReturnOp : ... {
 879   let arguments = (ins Variadic<AnyType>:$operands);
 880
 881   // We only print the operands and types if there are a non-zero number
 882   // of operands.
 883   let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";
 884 }
 885 ```
 886
 887 ##### Unit Attributes
 888
 889 In MLIR, the [`unit` Attribute](Dialects/Builtin.md/#unitattr) is special in that it
 890 only has one possible value, i.e. it derives meaning from its existence. When a
 891 unit attribute is used to anchor an optional group and is not the first element
 892 of the group, the presence of the unit attribute can be directly correlated with
 893 the presence of the optional group itself. As such, in these situations the unit
 894 attribute will not be printed or present in the output and will be automatically
 895 inferred when parsing by the presence of the optional group itself.
 896
 897 For example, the following operation:
 898
 899 ```tablegen
 900 def FooOp : ... {
 901   let arguments = (ins UnitAttr:$is_read_only);
 902
 903   let assemblyFormat = "attr-dict (`is_read_only` $is_read_only^)?";
 904 }
 905 ```
 906
 907 would be formatted as such:
 908
 909 ```mlir
 910 // When the unit attribute is present:
 911 foo.op is_read_only
 912
 913 // When the unit attribute is not present:
 914 foo.op
 915 ```
 916
 917 ##### Optional "else" Group
 918
 919 Optional groups also have support for an "else" group of elements. These are
 920 elements that are parsed/printed if the `anchor` element of the optional group
 921 is *not* present. Unlike the main element group, the "else" group has no
 922 restriction on the first element and none of the elements may act as the
 923 `anchor` for the optional. An example is shown below:
 924
 925 ```tablegen
 926 def FooOp : ... {
 927   let arguments = (ins UnitAttr:$foo);
 928
 929   let assemblyFormat = "attr-dict (`foo_is_present` $foo^):(`foo_is_absent`)?";
 930 }
 931 ```
 932
 933 would be formatted as such:
 934
 935 ```mlir
 936 // When the `foo` attribute is present:
 937 foo.op foo_is_present
 938
 939 // When the `foo` attribute is not present:
 940 foo.op foo_is_absent
 941 ```
 942
 943 #### Requirements
 944
 945 The format specification has a certain set of requirements that must be adhered
 946 to:
 947
 948 1.  The output and operation name are never shown as they are fixed and cannot
 949     be altered.
 950 1.  All operands within the operation must appear within the format, either
 951     individually or with the `operands` directive.
 952 1.  All regions within the operation must appear within the format, either
 953     individually or with the `regions` directive.
 954 1.  All successors within the operation must appear within the format, either
 955     individually or with the `successors` directive.
 956 1.  All operand and result types must appear within the format using the various
 957     `type` directives, either individually or with the `operands` or `results`
 958     directives.
 959 1.  The `attr-dict` directive must always be present.
 960 1.  Must not contain overlapping information; e.g. multiple instances of
 961     'attr-dict', types, operands, etc.
 962     -   Note that `attr-dict` does not overlap with individual attributes. These
 963         attributes will simply be elided when printing the attribute dictionary.
 964
 965 ##### Type Inference
 966
 967 One requirement of the format is that the types of operands and results must
 968 always be present. In certain instances, the type of a variable may be deduced
 969 via type constraints or other information available. In these cases, the type of
 970 that variable may be elided from the format.
 971
 972 *   Buildable Types
 973
 974 Some type constraints may only have one representation, allowing for them to be
 975 directly buildable; for example the `I32` or `Index` types. Types in `ODS` may
 976 mark themselves as buildable by setting the `builderCall` field or inheriting
 977 from the `BuildableType` class.
 978
 979 *   Trait Equality Constraints
 980
 981 There are many operations that have known type equality constraints registered
 982 as traits on the operation; for example the true, false, and result values of a
 983 `select` operation often have the same type. The assembly format may inspect
 984 these equal constraints to discern the types of missing variables. The currently
 985 supported traits are: `AllTypesMatch`, `TypesMatchWith`, `SameTypeOperands`, and
 986 `SameOperandsAndResultType`.
 987
 988 *   InferTypeOpInterface
 989
 990 Operations that implement `InferTypeOpInterface` can omit their result types in
 991 their assembly format since the result types can be inferred from the operands.
 992
 993 ### `hasCanonicalizer`
 994
 995 This boolean field indicate whether canonicalization patterns have been defined
 996 for this operation. If it is `1`, then `::getCanonicalizationPatterns()` should
 997 be defined.
 998
 999 ### `hasCanonicalizeMethod`
1000
1001 When this boolean field is set to `true`, it indicates that the op implements a
1002 `canonicalize` method for simple "matchAndRewrite" style canonicalization
1003 patterns. If `hasCanonicalizer` is 0, then an implementation of
1004 `::getCanonicalizationPatterns()` is implemented to call this function.
1005
1006 ### `hasFolder`
1007
1008 This boolean field indicate whether general folding rules have been defined for
1009 this operation. If it is `1`, then `::fold()` should be defined.
1010
1011 ### Extra declarations
1012
1013 One of the goals of table-driven op definition is to auto-generate as much logic
1014 and methods needed for each op as possible. With that said, there will always be
1015 long-tail cases that won't be covered. For such cases, you can use
1016 `extraClassDeclaration`. Code in `extraClassDeclaration` will be copied
1017 literally to the generated C++ op class.
1018
1019 Note that `extraClassDeclaration` is a mechanism intended for long-tail cases by
1020 power users; for not-yet-implemented widely-applicable cases, improving the
1021 infrastructure is preferable.
1022
1023 ### Extra definitions
1024
1025 When defining base op classes in TableGen that are inherited many times by
1026 different ops, users may want to provide common definitions of utility and
1027 interface functions. However, many of these definitions may not be desirable or
1028 possible in `extraClassDeclaration`, which append them to the op's C++ class
1029 declaration. In these cases, users can add an `extraClassDefinition` to define
1030 code that is added to the generated source file inside the op's C++ namespace.
1031 The substitution `$cppClass` is replaced by the op's C++ class name.
1032
1033 ### Generated C++ code
1034
1035 [OpDefinitionsGen][OpDefinitionsGen] processes the op definition spec file and
1036 generates two files containing the corresponding C++ code: one for declarations,
1037 the other for definitions. The former is generated via the `-gen-op-decls`
1038 command-line option, while the latter is via the `-gen-op-defs` option.
1039
1040 The definition file contains all the op method definitions, which can be
1041 included and enabled by defining `GET_OP_CLASSES`. For each operation,
1042 OpDefinitionsGen generates an operation class and an
1043 [operand adaptor](#operand-adaptors) class. Besides, it also contains a
1044 comma-separated list of all defined ops, which can be included and enabled by
1045 defining `GET_OP_LIST`.
1046
1047 #### Class name and namespaces
1048
1049 For each operation, its generated C++ class name is the symbol `def`ed with
1050 TableGen with dialect prefix removed. The first `_` serves as the delimiter. For
1051 example, for `def TF_AddOp`, the C++ class name would be `AddOp`. We remove the
1052 `TF` prefix because it is for scoping ops; other dialects may as well define
1053 their own `AddOp`s.
1054
1055 The namespaces of the generated C++ class will come from the dialect's
1056 `cppNamespace` field. For example, if a dialect's `cppNamespace` is `A::B`, then
1057 an op of that dialect will be placed in `namespace A { namespace B { ... } }`.
1058 If a dialect does not specify a `cppNamespace`, we then use the dialect's name
1059 as the namespace.
1060
1061 This means the qualified name of the generated C++ class does not necessarily
1062 match exactly with the operation name as explained in
1063 [Operation name](#operation-name). This is to allow flexible naming to satisfy
1064 coding style requirements.
1065
1066 #### Operand adaptors
1067
1068 For each operation, we automatically generate an _operand adaptor_. This class
1069 solves the problem of accessing operands provided as a list of `Value`s without
1070 using "magic" constants. The operand adaptor takes a reference to an array of
1071 `Value` and provides methods with the same names as those in the operation class
1072 to access them. For example, for a binary arithmetic operation, it may provide
1073 `.lhs()` to access the first operand and `.rhs()` to access the second operand.
1074
1075 The operand adaptor class lives in the same namespace as the operation class,
1076 and has the name of the operation followed by `Adaptor` as well as an alias
1077 `Adaptor` inside the op class.
1078
1079 Operand adaptors can be used in function templates that also process operations:
1080
1081 ```c++
1082 template <typename BinaryOpTy>
1083 std::pair<Value, Value> zip(BinaryOpTy &&op) {
1084   return std::make_pair(op.lhs(), op.rhs());;
1085 }
1086
1087 void process(AddOp op, ArrayRef<Value> newOperands) {
1088   zip(op);
1089   zip(Adaptor<AddOp>(newOperands));
1090   /*...*/
1091 }
1092 ```
1093
1094 ## Constraints
1095
1096 Constraint is a core concept in table-driven operation definition: operation
1097 verification and graph operation matching are all based on satisfying
1098 constraints. So both the operation definition and rewrite rules specification
1099 significantly involve writing constraints. We have the `Constraint` class in
1100 [`OpBase.td`][OpBase] as the common base class for all constraints.
1101
1102 An operation's constraint can cover different range; it may
1103
1104 *   Only concern a single attribute (e.g. being a 32-bit integer greater than
1105     5),
1106 *   Multiple operands and results (e.g., the 1st result's shape must be the same
1107     as the 1st operand), or
1108 *   Intrinsic to the operation itself (e.g., having no side effect).
1109
1110 We call them as single-entity constraint, multi-entity constraint, and traits,
1111 respectively.
1112
1113 ### Single-entity constraint
1114
1115 Constraints scoped to a single operand, attribute, or result are specified at
1116 the entity's declaration place as described in
1117 [Operation arguments](#operation-arguments) and
1118 [Operation results](#operation-results).
1119
1120 To help modelling constraints of common types, a set of `TypeConstraint`s are
1121 created; they are the `Type` subclass hierarchy. It includes `F32` for the
1122 constraints of being a float, `TensorOf<[F32]>` for the constraints of being a
1123 float tensor, and so on.
1124
1125 Similarly, a set of `AttrConstraint`s are created for helping modelling
1126 constraints of common attribute kinds. They are the `Attr` subclass hierarchy.
1127 It includes `F32Attr` for the constraints of being a float attribute,
1128 `F32ArrayAttr` for the constraints of being a float array attribute, and so on.
1129
1130 ### Multi-entity constraint
1131
1132 Constraints involving more than one operand/attribute/result are quite common on
1133 operations, like the element type and shape relation between operands and
1134 results. These constraints should be specified as the `Op` class template
1135 parameter as described in
1136 [Operation traits and constraints](#operation-traits-and-constraints).
1137
1138 Multi-entity constraints are modeled as `PredOpTrait` (a subclass of `OpTrait`)
1139 in [`OpBase.td`][OpBase].A bunch of constraint primitives are provided to help
1140 specification. See [`OpBase.td`][OpBase] for the complete list.
1141
1142 ### Trait
1143
1144 Traits are intrinsic properties of the operation like having side effect or not,
1145 commutative or not, whether is a terminator, etc. These constraints should be
1146 specified as the `Op` class template parameter as described in
1147 [Operation traits and constraints](#operation-traits-and-constraints).
1148
1149 Traits are modeled as `NativeOpTrait` (a subclass of `OpTrait`) in
1150 [`OpBase.td`][OpBase]. They are backed and will be translated into the
1151 corresponding C++ `mlir::OpTrait` classes.
1152
1153 ### How to specify new constraint
1154
1155 To write a constraint, you need to provide its predicates and give it a
1156 descriptive name. Predicates, modeled with the `Pred` class, are the workhorse
1157 for composing constraints. The predicate for a constraint is typically built up
1158 in a nested manner, using the two categories of predicates:
1159
1160 1.  `CPred`: the primitive leaf predicate.
1161 2.  Compound predicate: a predicate composed from child predicates using
1162     predicate combiners (conjunction: `And`, disjunction: `Or`, negation: `Neg`,
1163     substitution: `SubstLeaves`, concatenation: `Concat`).
1164
1165 `CPred` is the basis for composing more complex predicates. It is the "atom"
1166 predicate from the perspective of TableGen and the "interface" between TableGen
1167 and C++. What is inside is already C++ code, which will be treated as opaque
1168 strings with special placeholders to be substituted.
1169
1170 You can put any C++ code that returns a boolean value inside a `CPred`,
1171 including evaluating expressions, calling functions, calling class methods, and
1172 so on.
1173
1174 To help interaction with the C++ environment, there are a few special
1175 placeholders provided to refer to entities in the context where this predicate
1176 is used. They serve as "hooks" to the enclosing environment. This includes
1177 `$_builder`, `$_op`, and `$_self`:
1178
1179 *   `$_builder` will be replaced by a `mlir::Builder` instance so that you can
1180     access common build methods.
1181 *   `$_op` will be replaced by the current operation so that you can access
1182     information of the current operation.
1183 *   `$_self` will be replaced with the entity this predicate is attached to.
1184     E.g., `BoolAttr` is an attribute constraint that wraps a
1185     `CPred<"$_self.isa<BoolAttr>()">`. Then for `BoolAttr:$attr`,`$_self` will be
1186     replaced by `$attr`. For type constraints, it's a little bit special since
1187     we want the constraints on each type definition reads naturally and we want
1188     to attach type constraints directly to an operand/result, `$_self` will be
1189     replaced by the operand/result's type. E.g., for `F32` in `F32:$operand`,
1190     its `$_self` will be expanded as `operand(...).getType()`.
1191
1192 TODO: Reconsider the leading symbol for special placeholders. Eventually we want
1193 to allow referencing operand/result `$-name`s; such `$-name`s can start with
1194 underscore.
1195
1196 For example, to write an attribute `attr` is an `IntegerAttr`, in C++ you can
1197 just call `attr.isa<IntegerAttr>()`. The code can be wrapped in a `CPred` as
1198 `$_self.isa<IntegerAttr>()`, with `$_self` as the special placeholder to be
1199 replaced by the current attribute `attr` at expansion time.
1200
1201 For more complicated predicates, you can wrap it in a single `CPred`, or you can
1202 use predicate combiners to combine them. For example, to write the constraint
1203 that an attribute `attr` is a 32-bit or 64-bit integer, you can write it as
1204
1205 ```tablegen
1206 And<[
1207   CPred<"$_self.isa<IntegerAttr>()">,
1208   Or<[
1209     CPred<"$_self.cast<IntegerAttr>().getType().isInteger(32)">,
1210     CPred<"$_self.cast<IntegerAttr>().getType().isInteger(64)">
1211   ]>
1212 ]>
1213 ```
1214
1215 (Note that the above is just to show with a familiar example how you can use
1216 `CPred` and predicate combiners to write complicated predicates. For integer
1217 attributes specifically, [`OpBase.td`][OpBase] already defines `I32Attr` and
1218 `I64Attr`. So you can actually reuse them to write it as `Or<[I32Attr.predicate,
1219 I64Attr.predicate]>`.)
1220
1221 TODO: Build up a library of reusable primitive constraints
1222
1223 If the predicate is very complex to write with `CPred` together with predicate
1224 combiners, you can also write it as a normal C++ function and use the `CPred` as
1225 a way to "invoke" the function. For example, to verify an attribute `attr` has
1226 some property, you can write a C++ function like
1227
1228 ```cpp
1229 bool HasSomeProperty(Attribute attr) { ... }
1230 ```
1231
1232 and then define the op as:
1233
1234 ```tablegen
1235 def HasSomeProperty : AttrConstraint<CPred<"HasSomeProperty($_self)">,
1236                                      "has some property">;
1237
1238 def MyOp : Op<...> {
1239   let arguments = (ins
1240     ...
1241     HasSomeProperty:$attr
1242   );
1243 }
1244 ```
1245
1246 As to whether we should define the predicate using a single `CPred` wrapping the
1247 whole expression, multiple `CPred`s with predicate combiners, or a single
1248 `CPred` "invoking" a function, there are no clear-cut criteria. Defining using
1249 `CPred` and predicate combiners is preferable since it exposes more information
1250 (instead hiding all the logic behind a C++ function) into the op definition spec
1251 so that it can potentially drive more auto-generation cases. But it will require
1252 a nice library of common predicates as the building blocks to avoid the
1253 duplication, which is being worked on right now.
1254
1255 ## Attribute Definition
1256
1257 An attribute is a compile-time known constant of an operation.
1258
1259 ODS provides attribute wrappers over C++ attribute classes. There are a few
1260 common C++ [attribute classes][AttrClasses] defined in MLIR's core IR library
1261 and one is free to define dialect-specific attribute classes. ODS allows one to
1262 use these attributes in TableGen to define operations, potentially with more
1263 fine-grained constraints. For example, `StrAttr` directly maps to `StringAttr`;
1264 `F32Attr`/`F64Attr` requires the `FloatAttr` to additionally be of a certain
1265 bitwidth.
1266
1267 ODS attributes are defined as having a storage type (corresponding to a backing
1268 `mlir::Attribute` that _stores_ the attribute), a return type (corresponding to
1269 the C++ _return_ type of the generated helper getters) as well as a method
1270 to convert between the internal storage and the helper method.
1271
1272 ### Attribute decorators
1273
1274 There are a few important attribute adapters/decorators/modifiers that can be
1275 applied to ODS attributes to specify common additional properties like
1276 optionality, default values, etc.:
1277
1278 *   `DefaultValuedAttr`: specifies the
1279     [default value](#attributes-with-default-values) for an attribute.
1280 *   `OptionalAttr`: specifies an attribute as [optional](#optional-attributes).
1281 *   `Confined`: adapts an attribute with
1282     [further constraints](#confining-attributes).
1283
1284 ### Enum attributes
1285
1286 Some attributes can only take values from a predefined enum, e.g., the
1287 comparison kind of a comparison op. To define such attributes, ODS provides
1288 several mechanisms: `IntEnumAttr`, and `BitEnumAttr`.
1289
1290 *   `IntEnumAttr`: each enum case is an integer, the attribute is stored as a
1291     [`IntegerAttr`][IntegerAttr] in the op.
1292 *   `BitEnumAttr`: each enum case is a either the empty case, a single bit,
1293     or a group of single bits, and the attribute is stored as a
1294     [`IntegerAttr`][IntegerAttr] in the op.
1295
1296 All these `*EnumAttr` attributes require fully specifying all of the allowed
1297 cases via their corresponding `*EnumAttrCase`. With this, ODS is able to
1298 generate additional verification to only accept allowed cases. To facilitate the
1299 interaction between `*EnumAttr`s and their C++ consumers, the
1300 [`EnumsGen`][EnumsGen] TableGen backend can generate a few common utilities: a
1301 C++ enum class, `llvm::DenseMapInfo` for the enum class, conversion functions
1302 from/to strings. This is controlled via the `-gen-enum-decls` and
1303 `-gen-enum-defs` command-line options of `mlir-tblgen`.
1304
1305 For example, given the following `EnumAttr`:
1306
1307 ```tablegen
1308 def Case15: I32EnumAttrCase<"Case15", 15>;
1309 def Case20: I32EnumAttrCase<"Case20", 20>;
1310
1311 def MyIntEnum: I32EnumAttr<"MyIntEnum", "An example int enum",
1312                            [Case15, Case20]> {
1313   let cppNamespace = "Outer::Inner";
1314   let stringToSymbolFnName = "ConvertToEnum";
1315   let symbolToStringFnName = "ConvertToString";
1316 }
1317 ```
1318
1319 The following will be generated via `mlir-tblgen -gen-enum-decls`:
1320
1321 ```c++
1322 namespace Outer {
1323 namespace Inner {
1324 // An example int enum
1325 enum class MyIntEnum : uint32_t {
1326   Case15 = 15,
1327   Case20 = 20,
1328 };
1329
1330 llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t);
1331 llvm::StringRef ConvertToString(MyIntEnum);
1332 llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef);
1333 inline constexpr unsigned getMaxEnumValForMyIntEnum() {
1334   return 20;
1335 }
1336
1337 } // namespace Inner
1338 } // namespace Outer
1339
1340 namespace llvm {
1341 template<> struct DenseMapInfo<Outer::Inner::MyIntEnum> {
1342   using StorageInfo = llvm::DenseMapInfo<uint32_t>;
1343
1344   static inline Outer::Inner::MyIntEnum getEmptyKey() {
1345     return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getEmptyKey());
1346   }
1347
1348   static inline Outer::Inner::MyIntEnum getTombstoneKey() {
1349     return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getTombstoneKey());
1350   }
1351
1352   static unsigned getHashValue(const Outer::Inner::MyIntEnum &val) {
1353     return StorageInfo::getHashValue(static_cast<uint32_t>(val));
1354   }
1355
1356   static bool isEqual(const Outer::Inner::MyIntEnum &lhs, const Outer::Inner::MyIntEnum &rhs) {
1357     return lhs == rhs;
1358   }
1359 };
1360 }
1361 ```
1362
1363 The following will be generated via `mlir-tblgen -gen-enum-defs`:
1364
1365 ```c++
1366 namespace Outer {
1367 namespace Inner {
1368 llvm::StringRef ConvertToString(MyIntEnum val) {
1369   switch (val) {
1370     case MyIntEnum::Case15: return "Case15";
1371     case MyIntEnum::Case20: return "Case20";
1372   }
1373   return "";
1374 }
1375
1376 llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef str) {
1377   return llvm::StringSwitch<llvm::Optional<MyIntEnum>>(str)
1378       .Case("Case15", MyIntEnum::Case15)
1379       .Case("Case20", MyIntEnum::Case20)
1380       .Default(llvm::None);
1381 }
1382 llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t value) {
1383   switch (value) {
1384   case 15: return MyIntEnum::Case15;
1385   case 20: return MyIntEnum::Case20;
1386   default: return llvm::None;
1387   }
1388 }
1389
1390 } // namespace Inner
1391 } // namespace Outer
1392 ```
1393
1394 Similarly for the following `BitEnumAttr` definition:
1395
1396 ```tablegen
1397 def None: BitEnumAttrCaseNone<"None">;
1398 def Bit0: BitEnumAttrCaseBit<"Bit0", 0>;
1399 def Bit1: BitEnumAttrCaseBit<"Bit1", 1>;
1400 def Bit2: BitEnumAttrCaseBit<"Bit2", 2>;
1401 def Bit3: BitEnumAttrCaseBit<"Bit3", 3>;
1402
1403 def MyBitEnum: BitEnumAttr<"MyBitEnum", "An example bit enum",
1404                            [None, Bit0, Bit1, Bit2, Bit3]>;
1405 ```
1406
1407 We can have:
1408
1409 ```c++
1410 // An example bit enum
1411 enum class MyBitEnum : uint32_t {
1412   None = 0,
1413   Bit0 = 1,
1414   Bit1 = 2,
1415   Bit2 = 4,
1416   Bit3 = 8,
1417 };
1418
1419 llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t);
1420 std::string stringifyMyBitEnum(MyBitEnum);
1421 llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef);
1422 inline MyBitEnum operator|(MyBitEnum lhs, MyBitEnum rhs) {
1423   return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) | static_cast<uint32_t>(rhs));
1424 }
1425 inline MyBitEnum operator&(MyBitEnum lhs, MyBitEnum rhs) {
1426   return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) & static_cast<uint32_t>(rhs));
1427 }
1428 inline bool bitEnumContains(MyBitEnum bits, MyBitEnum bit) {
1429   return (static_cast<uint32_t>(bits) & static_cast<uint32_t>(bit)) != 0;
1430 }
1431
1432 namespace llvm {
1433 template<> struct DenseMapInfo<::MyBitEnum> {
1434   using StorageInfo = llvm::DenseMapInfo<uint32_t>;
1435
1436   static inline ::MyBitEnum getEmptyKey() {
1437     return static_cast<::MyBitEnum>(StorageInfo::getEmptyKey());
1438   }
1439
1440   static inline ::MyBitEnum getTombstoneKey() {
1441     return static_cast<::MyBitEnum>(StorageInfo::getTombstoneKey());
1442   }
1443
1444   static unsigned getHashValue(const ::MyBitEnum &val) {
1445     return StorageInfo::getHashValue(static_cast<uint32_t>(val));
1446   }
1447
1448   static bool isEqual(const ::MyBitEnum &lhs, const ::MyBitEnum &rhs) {
1449     return lhs == rhs;
1450   }
1451 };
1452 ```
1453
1454 ```c++
1455 std::string stringifyMyBitEnum(MyBitEnum symbol) {
1456   auto val = static_cast<uint32_t>(symbol);
1457   assert(15u == (15u | val) && "invalid bits set in bit enum");
1458   // Special case for all bits unset.
1459   if (val == 0) return "None";
1460   llvm::SmallVector<llvm::StringRef, 2> strs;
1461   if (1u == (1u & val)) { strs.push_back("Bit0"); }
1462   if (2u == (2u & val)) { strs.push_back("Bit1"); }
1463   if (4u == (4u & val)) { strs.push_back("Bit2"); }
1464   if (8u == (8u & val)) { strs.push_back("Bit3"); }
1465
1466   return llvm::join(strs, "|");
1467 }
1468
1469 llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef str) {
1470   // Special case for all bits unset.
1471   if (str == "None") return MyBitEnum::None;
1472
1473   llvm::SmallVector<llvm::StringRef, 2> symbols;
1474   str.split(symbols, "|");
1475
1476   uint32_t val = 0;
1477   for (auto symbol : symbols) {
1478     auto bit = llvm::StringSwitch<llvm::Optional<uint32_t>>(symbol)
1479       .Case("Bit0", 1)
1480       .Case("Bit1", 2)
1481       .Case("Bit2", 4)
1482       .Case("Bit3", 8)
1483       .Default(llvm::None);
1484     if (bit) { val |= *bit; } else { return llvm::None; }
1485   }
1486   return static_cast<MyBitEnum>(val);
1487 }
1488
1489 llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t value) {
1490   // Special case for all bits unset.
1491   if (value == 0) return MyBitEnum::None;
1492
1493   if (value & ~(1u | 2u | 4u | 8u)) return llvm::None;
1494   return static_cast<MyBitEnum>(value);
1495 }
1496 ```
1497
1498 ## Debugging Tips
1499
1500 ### Run `mlir-tblgen` to see the generated content
1501
1502 TableGen syntax sometimes can be obscure; reading the generated content can be a
1503 very helpful way to understand and debug issues. To build `mlir-tblgen`, run
1504 `cmake --build . --target mlir-tblgen` in your build directory and find the
1505 `mlir-tblgen` binary in the `bin/` subdirectory. All the supported generators
1506 can be found via `mlir-tblgen --help`. For example, `--gen-op-decls` and
1507 `--gen-op-defs` as explained in [Generated C++ code](#generated-c-code).
1508
1509 To see the generated code, invoke `mlir-tblgen` with a specific generator by
1510 providing include paths via `-I`. For example,
1511
1512 ```sh
1513 # To see op C++ class declaration
1514 mlir-tblgen --gen-op-decls -I /path/to/mlir/include /path/to/input/td/file
1515 # To see op C++ class definition
1516 mlir-tblgen --gen-op-defs -I /path/to/mlir/include /path/to/input/td/file
1517 # To see op documentation
1518 mlir-tblgen --gen-dialect-doc -I /path/to/mlir/include /path/to/input/td/file
1519
1520 # To see op interface C++ class declaration
1521 mlir-tblgen --gen-op-interface-decls -I /path/to/mlir/include /path/to/input/td/file
1522 # To see op interface C++ class definition
1523 mlir-tblgen --gen-op-interface-defs -I /path/to/mlir/include /path/to/input/td/file
1524 # To see op interface documentation
1525 mlir-tblgen --gen-op-interface-doc -I /path/to/mlir/include /path/to/input/td/file
1526 ```
1527
1528 ## Appendix
1529
1530 ### Reporting deprecation
1531
1532 Classes/defs can be marked as deprecated by using the `Deprecate` helper class,
1533 e.g.,
1534
1535 ```tablegen
1536 def OpTraitA : NativeOpTrait<"OpTraitA">, Deprecated<"use `bar` instead">;
1537 ```
1538
1539 would result in marking `OpTraitA` as deprecated and mlir-tblgen can emit a
1540 warning (default) or error (depending on `-on-deprecated` flag) to make
1541 deprecated state known.
1542
1543 ### Requirements and existing mechanisms analysis
1544
1545 The op description should be as declarative as possible to allow a wide range of
1546 tools to work with them and query methods generated from them. In particular
1547 this means specifying traits, constraints and shape inference information in a
1548 way that is easily analyzable (e.g., avoid opaque calls to C++ functions where
1549 possible).
1550
1551 We considered the approaches of several contemporary systems and focused on
1552 requirements that were desirable:
1553
1554 *   Ops registered using a registry separate from C++ code.
1555     *   Unknown ops are allowed in MLIR, so ops need not be registered. The
1556         ability of the compiler to optimize those ops or graphs containing those
1557         ops is constrained but correct.
1558     *   The current proposal does not include a runtime op description, but it
1559         does not preclude such description, it can be added later.
1560     *   The op registry is essential for generating C++ classes that make
1561         manipulating ops, verifying correct construction etc. in C++ easier by
1562         providing a typed representation and accessors.
1563 *   The op registry will be defined in
1564     [TableGen](https://llvm.org/docs/TableGen/index.html) and be used to
1565     generate C++ classes and utility functions
1566     (builder/verifier/parser/printer).
1567     *   TableGen is a modelling specification language used by LLVM's backends
1568         and fits in well with trait-based modelling. This is an implementation
1569         decision and there are alternative ways of doing this. But the
1570         specification language is good for the requirements of modelling the
1571         traits (as seen from usage in LLVM processor backend modelling) and easy
1572         to extend, so a practical choice. If another good option comes up, we
1573         will consider it.
1574 *   MLIR allows both defined and undefined ops.
1575     *   Defined ops should have fixed semantics and could have a corresponding
1576         reference implementation defined.
1577     *   Dialects are under full control of the dialect owner and normally live
1578         with the framework of the dialect.
1579 *   The op's traits (e.g., commutative) are modelled along with the op in the
1580     registry.
1581 *   The op's operand/return type constraints are modelled along with the op in
1582     the registry (see [Shape inference](ShapeInference.md) discussion below),
1583     this allows (e.g.) optimized concise syntax in textual dumps.
1584 *   Behavior of the op is documented along with the op with a summary and a
1585     description. The description is written in markdown and extracted for
1586     inclusion in the generated LangRef section of the dialect.
1587 *   The generic assembly form of printing and parsing is available as normal,
1588     but a custom parser and printer can either be specified or automatically
1589     generated from an optional string representation showing the mapping of the
1590     "assembly" string to operands/type.
1591     *   Parser-level remappings (e.g., `eq` to enum) will be supported as part
1592         of the parser generation.
1593 *   Matching patterns are specified separately from the op description.
1594     *   Contrasted with LLVM there is no "base" set of ops that every backend
1595         needs to be aware of. Instead there are many different dialects and the
1596         transformations/legalizations between these dialects form a graph of
1597         transformations.
1598 *   Reference implementation may be provided along with the op definition.
1599
1600     *   The reference implementation may be in terms of either standard ops or
1601         other reference implementations.
1602
1603     TODO: document expectation if the dependent op's definition changes.
1604
1605 [TableGen]: https://llvm.org/docs/TableGen/index.html
1606 [TableGenProgRef]: https://llvm.org/docs/TableGen/ProgRef.html
1607 [TableGenBackend]: https://llvm.org/docs/TableGen/BackEnds.html#introduction
1608 [OpBase]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/OpBase.td
1609 [OpDefinitionsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
1610 [EnumsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/EnumsGen.cpp
1611 [StringAttr]: Dialects/Builtin.md/#stringattr
1612 [IntegerAttr]: Dialects/Builtin.md/#integertype
1613 [AttrClasses]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/Attributes.h