mlir/docs/LangRef.md

   1 # MLIR Language Reference
   2
   3 MLIR (Multi-Level IR) is a compiler intermediate representation with
   4 similarities to traditional three-address SSA representations (like
   5 [LLVM IR](http://llvm.org/docs/LangRef.html) or
   6 [SIL](https://github.com/apple/swift/blob/main/docs/SIL.rst)), but which
   7 introduces notions from polyhedral loop optimization as first-class concepts.
   8 This hybrid design is optimized to represent, analyze, and transform high level
   9 dataflow graphs as well as target-specific code generated for high performance
  10 data parallel systems. Beyond its representational capabilities, its single
  11 continuous design provides a framework to lower from dataflow graphs to
  12 high-performance target-specific code.
  13
  14 This document defines and describes the key concepts in MLIR, and is intended to
  15 be a dry reference document - the
  16 [rationale documentation](Rationale/Rationale.md),
  17 [glossary](../getting_started/Glossary.md), and other content are hosted
  18 elsewhere.
  19
  20 MLIR is designed to be used in three different forms: a human-readable textual
  21 form suitable for debugging, an in-memory form suitable for programmatic
  22 transformations and analysis, and a compact serialized form suitable for storage
  23 and transport. The different forms all describe the same semantic content. This
  24 document describes the human-readable textual form.
  25
  26 [TOC]
  27
  28 ## High-Level Structure
  29
  30 MLIR is fundamentally based on a graph-like data structure of nodes, called
  31 *Operations*, and edges, called *Values*. Each Value is the result of exactly
  32 one Operation or Block Argument, and has a *Value Type* defined by the
  33 [type system](#type-system). [Operations](#operations) are contained in
  34 [Blocks](#blocks) and Blocks are contained in [Regions](#regions). Operations
  35 are also ordered within their containing block and Blocks are ordered in their
  36 containing region, although this order may or may not be semantically meaningful
  37 in a given [kind of region](Interfaces.md/#regionkindinterfaces). Operations
  38 may also contain regions, enabling hierarchical structures to be represented.
  39
  40 Operations can represent many different concepts, from higher-level concepts
  41 like function definitions, function calls, buffer allocations, views or slices of
  42 buffers, and process creation, to lower-level concepts like target-independent
  43 arithmetic, target-specific instructions, configuration registers, and logic
  44 gates. These different concepts are represented by different operations in MLIR
  45 and the set of operations usable in MLIR can be arbitrarily extended.
  46
  47 MLIR also provides an extensible framework for transformations on operations,
  48 using familiar concepts of compiler [Passes](Passes.md). Enabling an arbitrary
  49 set of passes on an arbitrary set of operations results in a significant scaling
  50 challenge, since each transformation must potentially take into account the
  51 semantics of any operation. MLIR addresses this complexity by allowing operation
  52 semantics to be described abstractly using [Traits](Traits) and
  53 [Interfaces](Interfaces.md), enabling transformations to operate on operations
  54 more generically. Traits often describe verification constraints on valid IR,
  55 enabling complex invariants to be captured and checked (see
  56 [Op vs Operation](Tutorials/Toy/Ch-2.md/#op-vs-operation-using-mlir-operations)).
  57
  58 One obvious application of MLIR is to represent an
  59 [SSA-based](https://en.wikipedia.org/wiki/Static_single_assignment_form) IR,
  60 like the LLVM core IR, with appropriate choice of operation types to define
  61 Modules, Functions, Branches, Memory Allocation, and verification constraints to
  62 ensure the SSA Dominance property. MLIR includes a collection of dialects which
  63 defines just such structures. However, MLIR is intended to be general enough to
  64 represent other compiler-like data structures, such as Abstract Syntax Trees in
  65 a language frontend, generated instructions in a target-specific backend, or
  66 circuits in a High-Level Synthesis tool.
  67
  68 Here's an example of an MLIR module:
  69
  70 ```mlir
  71 // Compute A*B using an implementation of multiply kernel and print the
  72 // result using a TensorFlow op. The dimensions of A and B are partially
  73 // known. The shapes are assumed to match.
  74 func.func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) {
  75   // Compute the inner dimension of %A using the dim operation.
  76   %c1 = arith.constant 1 : index
  77   %n = tensor.dim %A, %c1 : tensor<100x?xf32>
  78   // Allocate addressable "buffers" and copy tensors %A and %B into them.
  79   %A_m = memref.alloc(%n) : memref<100x?xf32>
  80   bufferization.materialize_in_destination %A in writable %A_m
  81       : (tensor<100x?xf32>, memref<100x?xf32>) -> ()
  82
  83   %B_m = memref.alloc(%n) : memref<?x50xf32>
  84   bufferization.materialize_in_destination %B in writable %B_m
  85       : (tensor<?x50xf32>, memref<?x50xf32>) -> ()
  86
  87   // Call function @multiply passing memrefs as arguments,
  88   // and getting returned the result of the multiplication.
  89   %C_m = call @multiply(%A_m, %B_m)
  90           : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>)
  91
  92   memref.dealloc %A_m : memref<100x?xf32>
  93   memref.dealloc %B_m : memref<?x50xf32>
  94
  95   // Load the buffer data into a higher level "tensor" value.
  96   %C = memref.tensor_load %C_m : memref<100x50xf32>
  97   memref.dealloc %C_m : memref<100x50xf32>
  98
  99   // Call TensorFlow built-in function to print the result tensor.
 100   "tf.Print"(%C) {message = "mul result"} : (tensor<100x50xf32>) -> (tensor<100x50xf32>)
 101
 102   return %C : tensor<100x50xf32>
 103 }
 104
 105 // A function that multiplies two memrefs and returns the result.
 106 func.func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>)
 107           -> (memref<100x50xf32>)  {
 108   // Compute the inner dimension of %A.
 109   %c1 = arith.constant 1 : index
 110   %n = memref.dim %A, %c1 : memref<100x?xf32>
 111   // Allocate memory for the multiplication result.
 112   %C = memref.alloc() : memref<100x50xf32>
 113   %zero = arith.constant 0.0 : f32
 114   // Multiplication loop nest.
 115   affine.for %i = 0 to 100 {
 116      affine.for %j = 0 to 50 {
 117         memref.store %zero, %C[%i, %j] : memref<100x50xf32>
 118         affine.for %k = 0 to %n {
 119            %a_v  = memref.load %A[%i, %k] : memref<100x?xf32>
 120            %b_v  = memref.load %B[%k, %j] : memref<?x50xf32>
 121            %prod = arith.mulf %a_v, %b_v : f32
 122            %c_v  = memref.load %C[%i, %j] : memref<100x50xf32>
 123            %sum  = arith.addf %c_v, %prod : f32
 124            memref.store %sum, %C[%i, %j] : memref<100x50xf32>
 125         }
 126      }
 127   }
 128   return %C : memref<100x50xf32>
 129 }
 130 ```
 131
 132 ## Notation
 133
 134 MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip
 135 through a textual form. This is important for development of the compiler - e.g.
 136 for understanding the state of code as it is being transformed and writing test
 137 cases.
 138
 139 This document describes the grammar using
 140 [Extended Backus-Naur Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).
 141
 142 This is the EBNF grammar used in this document, presented in yellow boxes.
 143
 144 ```
 145 alternation ::= expr0 | expr1 | expr2  // Either expr0 or expr1 or expr2.
 146 sequence    ::= expr0 expr1 expr2      // Sequence of expr0 expr1 expr2.
 147 repetition0 ::= expr*  // 0 or more occurrences.
 148 repetition1 ::= expr+  // 1 or more occurrences.
 149 optionality ::= expr?  // 0 or 1 occurrence.
 150 grouping    ::= (expr) // Everything inside parens is grouped together.
 151 literal     ::= `abcd` // Matches the literal `abcd`.
 152 ```
 153
 154 Code examples are presented in blue boxes.
 155
 156 ```
 157 // This is an example use of the grammar above:
 158 // This matches things like: ba, bana, boma, banana, banoma, bomana...
 159 example ::= `b` (`an` | `om`)* `a`
 160 ```
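
As a quick check on the notation, the `example` production above can be
transcribed into a regular expression. This is a minimal Python sketch for
illustration only; the names `EXAMPLE` and `matches_example` are hypothetical,
and this is not how MLIR itself is parsed.

```python
import re

# EBNF:  example ::= `b` (`an` | `om`)* `a`
# Literals become plain text, alternation becomes `|`, and `*` keeps its meaning.
EXAMPLE = re.compile(r"b(?:an|om)*a")


def matches_example(text: str) -> bool:
    """Return True if the whole string is derivable from `example`."""
    return EXAMPLE.fullmatch(text) is not None


# The strings listed in the grammar comment are all accepted.
for word in ["ba", "bana", "boma", "banana", "banoma", "bomana"]:
    assert matches_example(word)

# Strings that cannot be derived from the production are rejected.
for word in ["b", "bna", "banan", "abana"]:
    assert not matches_example(word)
```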
 161
 162 ### Common syntax
 163
 164 The following core grammar productions are used in this document:
 165
 166 ```
 167 // TODO: Clarify the split between lexing (tokens) and parsing (grammar).
 168 digit     ::= [0-9]
 169 hex_digit ::= [0-9a-fA-F]
 170 letter    ::= [a-zA-Z]
 171 id-punct  ::= [$._-]
 172
 173 integer-literal ::= decimal-literal | hexadecimal-literal
 174 decimal-literal ::= digit+
 175 hexadecimal-literal ::= `0x` hex_digit+
 176 float-literal ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?
 177 string-literal  ::= `"` [^"\n\f\v\r]* `"`   // TODO: define escaping rules
 178 ```
 179
 180 Not listed here, but MLIR does support comments. They use standard BCPL syntax,
 181 starting with a `//` and going until the end of the line.
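
The literal productions above map almost one-to-one onto regular expressions.
The following Python sketch shows which production derives a given piece of
text. It is an illustration under stated assumptions, not MLIR's actual lexer:
the function name `classify_literal` is hypothetical, and string escaping is
not modeled because the grammar leaves it as a TODO.

```python
import re

# Regular expressions transcribed from the productions above.
DECIMAL = re.compile(r"[0-9]+")
HEXADECIMAL = re.compile(r"0x[0-9a-fA-F]+")
FLOAT = re.compile(r"[-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?")
STRING = re.compile(r'"[^"\n\f\v\r]*"')  # escaping rules are not modeled


def classify_literal(text: str) -> str:
    """Name the literal production that derives `text`, or 'none'."""
    if HEXADECIMAL.fullmatch(text):
        return "hexadecimal-literal"
    if DECIMAL.fullmatch(text):
        return "decimal-literal"
    if FLOAT.fullmatch(text):
        return "float-literal"
    if STRING.fullmatch(text):
        return "string-literal"
    return "none"


assert classify_literal("42") == "decimal-literal"
assert classify_literal("0x2A") == "hexadecimal-literal"
assert classify_literal("1.") == "float-literal"   # digits after the dot are optional
assert classify_literal("1e5") == "none"           # the dot is mandatory
assert classify_literal(".5") == "none"            # at least one leading digit is required
```

Note that, as written, the `float-literal` production requires both a leading
digit and a decimal point, so `1e5` and `.5` are not derivable from it.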
 182
 183
 184 ### Top level Productions
 185
 186 ```
 187 // Top level production
 188 toplevel ::= (operation | attribute-alias-def | type-alias-def)*
 189 ```
 190
 191 The production `toplevel` is the top level production that is parsed by any parser
 192 consuming the MLIR syntax. [Operations](#operations),
 193 [Attribute aliases](#attribute-value-aliases), and [Type aliases](#type-aliases)
 194 can be declared at the top level.
 195
 196 ### Identifiers and keywords
 197
 198 Syntax:
 199
 200 ```
 201 // Identifiers
 202 bare-id ::= (letter|[_]) (letter|digit|[_$.])*
 203 bare-id-list ::= bare-id (`,` bare-id)*
 204 value-id ::= `%` suffix-id
 205 alias-name ::= bare-id
 206 suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))
 207
 208 symbol-ref-id ::= `@` (suffix-id | string-literal) (`::` symbol-ref-id)?
 209 value-id-list ::= value-id (`,` value-id)*
 210
 211 // Uses of value, e.g. in an operand list to an operation.
 212 value-use ::= value-id (`#` decimal-literal)?
 213 value-use-list ::= value-use (`,` value-use)*
 214 ```
 215
 216 Identifiers name entities such as values, types and functions, and are chosen by
 217 the writer of MLIR code. Identifiers may be descriptive (e.g. `%batch_size`,
 218 `@matmul`), or may be non-descriptive when they are auto-generated (e.g. `%23`,
 219 `@func42`). Identifier names for values may be used in an MLIR text file but are
 220 not persisted as part of the IR - the printer will give them anonymous names
 221 like `%42`.
 222
 223 MLIR guarantees identifiers never collide with keywords by prefixing identifiers
 224 with a sigil (e.g. `%`, `#`, `@`, `^`, `!`). In certain unambiguous contexts
 225 (e.g. affine expressions), identifiers are not prefixed, for brevity. New
 226 keywords may be added to future versions of MLIR without danger of collision
 227 with existing identifiers.
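
The effect of the sigil rule can be sketched with a few regular expressions
transcribed from the identifier productions above. This Python sketch is
illustrative only: `classify_identifier` is a hypothetical name, and the
`string-literal` and `::` nesting forms of `symbol-ref-id` are deliberately
left out.

```python
import re

# suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))
SUFFIX_ID = r"(?:[0-9]+|[A-Za-z$._-][A-Za-z0-9$._-]*)"
# bare-id ::= (letter|[_]) (letter|digit|[_$.])*
BARE_ID = r"[A-Za-z_][A-Za-z0-9_$.]*"

# Only a subset of the productions is modeled here.
PATTERNS = {
    "value-id": re.compile("%" + SUFFIX_ID),
    "caret-id": re.compile(r"\^" + SUFFIX_ID),
    "symbol-ref-id": re.compile("@" + SUFFIX_ID),  # simplified form
    "bare-id": re.compile(BARE_ID),
}


def classify_identifier(text: str) -> str:
    """Return the name of the production that derives `text`, or 'none'."""
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(text):
            return name
    return "none"


assert classify_identifier("%batch_size") == "value-id"
assert classify_identifier("%23") == "value-id"
assert classify_identifier("@matmul") == "symbol-ref-id"
assert classify_identifier("^bb0") == "caret-id"
# A value may be named after a keyword without ambiguity, thanks to the sigil.
assert classify_identifier("%return") == "value-id"
assert classify_identifier("return") == "bare-id"
```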
 228
 229 Value identifiers are only [in scope](#value-scoping) for the (nested) region in
 230 which they are defined and cannot be accessed or referenced outside of that
 231 region. Argument identifiers in mapping functions are in scope for the mapping
 232 body. Particular operations may further limit which identifiers are in scope in
 233 their regions. For instance, the scope of values in a region with
 234 [SSA control flow semantics](#control-flow-and-ssacfg-regions) is constrained
 235 according to the standard definition of
 236 [SSA dominance](https://en.wikipedia.org/wiki/Dominator_\(graph_theory\)).
 237 Another example is the [IsolatedFromAbove trait](Traits/#isolatedfromabove),
 238 which restricts directly accessing values defined in containing regions.
 239
 240 Function identifiers and mapping identifiers are associated with
 241 [Symbols](SymbolsAndSymbolTables.md) and have scoping rules dependent on symbol
 242 attributes.
 243
 244 ## Dialects
 245
 246 Dialects are the mechanism by which to engage with and extend the MLIR
 247 ecosystem. They allow for defining new [operations](#operations), as well as
 248 [attributes](#attributes) and [types](#type-system). Each dialect is given a
 249 unique `namespace` that is prefixed to each defined attribute/operation/type.
 250 For example, the [Affine dialect](Dialects/Affine.md) defines the namespace:
 251 `affine`.
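
Because the namespace is a prefix of every name a dialect defines, it can be
recovered from an operation name by splitting at the first dot. The Python
sketch below illustrates this; `split_operation_name` is a hypothetical helper,
and reporting an empty namespace for names without a dot is a choice made for
this sketch rather than something this document specifies.

```python
from typing import Tuple


def split_operation_name(name: str) -> Tuple[str, str]:
    """Split 'namespace.rest' at the first dot.

    Names without a dot are reported with an empty namespace (an assumption
    of this sketch).
    """
    namespace, dot, rest = name.partition(".")
    if not dot:
        return "", name
    return namespace, rest


assert split_operation_name("affine.for") == ("affine", "for")
# Only the first dot separates the namespace; later dots belong to the name.
assert split_operation_name("llvm.sadd.with.overflow.i16") == (
    "llvm",
    "sadd.with.overflow.i16",
)
assert split_operation_name("foo_div") == ("", "foo_div")
```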
 252
 253 MLIR allows for multiple dialects, even those outside of the main tree, to
 254 co-exist within one module. Dialects are produced and consumed by
 255 certain passes. MLIR provides a [framework](DialectConversion.md) to convert
 256 between, and within, different dialects.
 257
 258 A few of the dialects supported by MLIR:
 259
 260 *   [Affine dialect](Dialects/Affine.md)
 261 *   [Func dialect](Dialects/Func.md)
 262 *   [GPU dialect](Dialects/GPU.md)
 263 *   [LLVM dialect](Dialects/LLVM.md)
 264 *   [SPIR-V dialect](Dialects/SPIR-V.md)
 265 *   [Vector dialect](Dialects/Vector.md)
 266
 267 ### Target specific operations
 268
 269 Dialects provide a modular way in which targets can expose target-specific
 270 operations directly through to MLIR. As an example, some targets go through
 271 LLVM. LLVM has a rich set of intrinsics for certain target-independent
 272 operations (e.g. addition with overflow check) as well as providing access to
 273 target-specific operations for the targets it supports (e.g. vector permutation
 274 operations). LLVM intrinsics in MLIR are represented via operations whose names
 275 start with the `llvm.` prefix.
 276
 277 Example:
 278
 279 ```mlir
 280 // LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
 281 %x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1)
 282 ```
 283
 284 These operations only work when targeting LLVM as a backend (e.g. for CPUs and
 285 GPUs), and are required to align with the LLVM definition of these intrinsics.
 286
 287 ## Operations
 288
 289 Syntax:
 290
 291 ```
 292 operation             ::= op-result-list? (generic-operation | custom-operation)
 293                           trailing-location?
 294 generic-operation     ::= string-literal `(` value-use-list? `)`  successor-list?
 295                           dictionary-properties? region-list? dictionary-attribute?
 296                           `:` function-type
 297 custom-operation      ::= bare-id custom-operation-format
 298 op-result-list        ::= op-result (`,` op-result)* `=`
 299 op-result             ::= value-id (`:` integer-literal)?
 300 successor-list        ::= `[` successor (`,` successor)* `]`
 301 successor             ::= caret-id (`:` block-arg-list)?
 302 dictionary-properties ::= `<` dictionary-attribute `>`
 303 region-list           ::= `(` region (`,` region)* `)`
 304 dictionary-attribute  ::= `{` (attribute-entry (`,` attribute-entry)*)? `}`
 305 trailing-location     ::= `loc` `(` location `)`
 306 ```
 307
 308 MLIR introduces a uniform concept called *operations* to enable describing many
 309 different levels of abstractions and computations. Operations in MLIR are fully
 310 extensible (there is no fixed list of operations) and have application-specific
 311 semantics. For example, MLIR supports
 312 [target-independent operations](Dialects/MemRef.md),
 313 [affine operations](Dialects/Affine.md), and
 314 [target-specific machine operations](#target-specific-operations).
 315
 316 The internal representation of an operation is simple: an operation is
 317 identified by a unique string (e.g. `dim`, `tf.Conv2d`, `x86.repmovsb`,
 318 `ppc.eieio`, etc.), returns zero or more results, takes zero or more operands,
 319 has storage for [properties](#properties), has a dictionary of
 320 [attributes](#attributes), has zero or more successors, and has zero or more
 321 enclosed [regions](#regions). The generic printing form includes all these
 322 elements literally, with a function type to indicate the types of the
 323 results and operands.
 324
 325 Example:
 326
 327 ```mlir
 328 // An operation that produces two results.
 329 // The results of %result can be accessed via the <name> `#` <opNo> syntax.
 330 %result:2 = "foo_div"() : () -> (f32, i32)
 331
 332 // Pretty form that defines a unique name for each result.
 333 %foo, %bar = "foo_div"() : () -> (f32, i32)
 334
 335 // Invoke a TensorFlow function called tf.scramble with two inputs
 336 // and an attribute "fruit" stored in properties.
 337 %2 = "tf.scramble"(%result#0, %bar) <{fruit = "banana"}> : (f32, i32) -> f32
 338
 339 // Invoke an operation with some discardable attributes
 340 %foo, %bar = "foo_div"() {some_attr = "value", other_attr = 42 : i64} : () -> (f32, i32)
 341 ```
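
To make the relationship between the internal representation and the generic
form concrete, here is a minimal Python sketch of a printer that follows the
`generic-operation` production. The `Operation` class and `print_generic`
function are hypothetical stand-ins, not MLIR's C++ API; regions and trailing
locations are omitted, and attribute values are kept as pre-formatted strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Operation:
    """Hypothetical, simplified stand-in for an operation."""
    name: str
    operands: List[str] = field(default_factory=list)       # value uses, e.g. "%a"
    operand_types: List[str] = field(default_factory=list)
    results: List[str] = field(default_factory=list)        # value ids, e.g. "%0"
    result_types: List[str] = field(default_factory=list)
    properties: Dict[str, str] = field(default_factory=dict)
    attributes: Dict[str, str] = field(default_factory=dict)
    successors: List[str] = field(default_factory=list)     # caret ids, e.g. "^bb1"


def _dictionary(entries: Dict[str, str]) -> str:
    return "{" + ", ".join(f"{key} = {value}" for key, value in entries.items()) + "}"


def _type_list(types: List[str]) -> str:
    # A single type is written bare here; anything else is parenthesized.
    return types[0] if len(types) == 1 else "(" + ", ".join(types) + ")"


def print_generic(op: Operation) -> str:
    """Print `op` following the order of the generic-operation production."""
    parts = []
    if op.results:
        parts.append(", ".join(op.results) + " =")
    head = f'"{op.name}"(' + ", ".join(op.operands) + ")"
    if op.successors:
        head += "[" + ", ".join(op.successors) + "]"
    parts.append(head)
    if op.properties:
        parts.append("<" + _dictionary(op.properties) + ">")
    if op.attributes:
        parts.append(_dictionary(op.attributes))
    operand_types = "(" + ", ".join(op.operand_types) + ")"
    parts.append(f": {operand_types} -> {_type_list(op.result_types)}")
    return " ".join(parts)


scramble = Operation(
    name="tf.scramble",
    operands=["%result#0", "%bar"],
    operand_types=["f32", "i32"],
    results=["%2"],
    result_types=["f32"],
    properties={"fruit": '"banana"'},
)
print(print_generic(scramble))
```

With these inputs the sketch reproduces the `tf.scramble` line from the example
above, with properties in angle brackets ahead of the function type.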
 342
 343 In addition to the basic syntax above, dialects may register known operations.
 344 This allows those dialects to support *custom assembly form* for parsing and
 345 printing operations. In the operation sets listed below, we show both forms.
 346
 347 ### Builtin Operations
 348
 349 The [builtin dialect](Dialects/Builtin.md) defines a select few operations that
 350 are widely applicable by MLIR dialects, such as a universal conversion cast
 351 operation that simplifies inter/intra dialect conversion. This dialect also
 352 defines a top-level `module` operation, that represents a useful IR container.
 353
 354 ## Blocks
 355
 356 Syntax:
 357
 358 ```
 359 block           ::= block-label operation+
 360 block-label     ::= block-id block-arg-list? `:`
 361 block-id        ::= caret-id
 362 caret-id        ::= `^` suffix-id
 363 value-id-and-type ::= value-id `:` type
 364
 365 // Non-empty list of names and types.
 366 value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)*
 367
 368 block-arg-list ::= `(` value-id-and-type-list? `)`
 369 ```
 370
 371 A *Block* is a list of operations. In
 372 [SSACFG regions](#control-flow-and-ssacfg-regions), each block represents a
 373 compiler [basic block](https://en.wikipedia.org/wiki/Basic_block) where
 374 instructions inside the block are executed in order and terminator operations
 375 implement control flow branches between basic blocks.
 376
 377 The last operation in a block must be a
 378 [terminator operation](#control-flow-and-ssacfg-regions). A region with a single
 379 block may opt out of this requirement by attaching the `NoTerminator` trait to the
 380 enclosing op. The top-level `ModuleOp` is an example of such an operation which
 381 defines this trait and whose block body does not have a terminator.
 382
 383 Blocks in MLIR take a list of block arguments, notated in a function-like way.
 384 Block arguments are bound to values specified by the semantics of individual
 385 operations. Block arguments of the entry block of a region are also arguments to
 386 the region and the values bound to these arguments are determined by the
 387 semantics of the containing operation. Block arguments of other blocks are
 388 determined by the semantics of terminator operations, e.g. Branches, which have
 389 the block as a successor. In regions with
 390 [control flow](#control-flow-and-ssacfg-regions), MLIR leverages this structure
 391 to implicitly represent the passage of control-flow dependent values without the
 392 complex nuances of PHI nodes in traditional SSA representations. Note that
 393 values which are not control-flow dependent can be referenced directly and do
 394 not need to be passed through block arguments.
 395
 396 Here is a simple example function showing branches, returns, and block
 397 arguments:
 398
 399 ```mlir
 400 func.func @simple(i64, i1) -> i64 {
 401 ^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
 402   cf.cond_br %cond, ^bb1, ^bb2
 403
 404 ^bb1:
 405   cf.br ^bb3(%a: i64)    // Branch passes %a as the argument
 406
 407 ^bb2:
 408   %b = arith.addi %a, %a : i64
 409   cf.br ^bb3(%b: i64)    // Branch passes %b as the argument
 410
 411 // ^bb3 receives an argument, named %c, from predecessors
 412 // and passes it on to ^bb4 along with %a. %a is referenced
 413 // directly from its defining operation and is not passed through
 414 // an argument of ^bb3.
 415 ^bb3(%c: i64):
 416   cf.br ^bb4(%c, %a : i64, i64)
 417
 418 ^bb4(%d : i64, %e : i64):
 419   %0 = arith.addi %d, %e : i64
 420   return %0 : i64   // Return is also a terminator.
 421 }
 422 ```
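
The way branch operands bind to block arguments in `@simple` can be mimicked
with a tiny interpreter. The Python sketch below is only an illustration of the
idea, not MLIR semantics in full: each block is a hypothetical Python function,
branch operands are passed as call arguments, and 64-bit wraparound is ignored.

```python
def run_simple(a: int, cond: bool) -> int:
    """Interpret @simple; `a` and `cond` play the role of ^bb0's arguments."""

    def bb0():
        return ("bb1", ()) if cond else ("bb2", ())   # cf.cond_br %cond, ^bb1, ^bb2

    def bb1():
        return ("bb3", (a,))                          # cf.br ^bb3(%a : i64)

    def bb2():
        b = a + a                                     # %b = arith.addi %a, %a
        return ("bb3", (b,))                          # cf.br ^bb3(%b : i64)

    def bb3(c):
        # %a is referenced directly; it is not passed through an argument of ^bb3.
        return ("bb4", (c, a))                        # cf.br ^bb4(%c, %a : i64, i64)

    def bb4(d, e):
        return ("return", (d + e,))                   # return %0

    blocks = {"bb0": bb0, "bb1": bb1, "bb2": bb2, "bb3": bb3, "bb4": bb4}
    target, operands = "bb0", ()
    while target != "return":
        # Branch operands are bound to the successor's block arguments.
        target, operands = blocks[target](*operands)
    return operands[0]


assert run_simple(5, True) == 10    # ^bb1 path: %c = %a, result is %a + %a
assert run_simple(5, False) == 15   # ^bb2 path: %c = %a + %a, result adds %a again
```

The value of `c` in `bb3` depends on which predecessor ran, which is exactly
the role a PHI node would play, yet no special merge construct is needed.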
 423
 424 **Context:** The "block argument" representation eliminates a number of special
 425 cases from the IR compared to traditional "PHI nodes are operations" SSA IRs
 426 (like LLVM). For example, the
 427 [parallel copy semantics](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.524.5461&rep=rep1&type=pdf)
 428 of SSA is immediately apparent, and function arguments are no longer a special
 429 case: they become arguments to the entry block
 430 [[more rationale](Rationale/Rationale.md/#block-arguments-vs-phi-nodes)]. Blocks
 431 are also a fundamental concept that cannot be represented by operations because
 432 values defined in an operation cannot be accessed outside the operation.
 433
 434 ## Regions
 435
 436 ### Definition
 437
 438 A region is an ordered list of MLIR [Blocks](#blocks). The semantics within a
 439 region is not imposed by the IR. Instead, the containing operation defines the
 440 semantics of the regions it contains. MLIR currently defines two kinds of
 441 regions: [SSACFG regions](#control-flow-and-ssacfg-regions), which describe
 442 control flow between blocks, and [Graph regions](#graph-regions), which do not
 443 require control flow between blocks. The kinds of regions within an operation are
 444 described using the [RegionKindInterface](Interfaces.md/#regionkindinterfaces).
 445
 446 Regions do not have a name or an address, only the blocks contained in a region
 447 do. Regions must be contained within operations and have no type or attributes.
 448 The first block in the region is a special block called the 'entry block'. The
 449 arguments to the entry block are also the arguments of the region itself. The
 450 entry block cannot be listed as a successor of any other block. The syntax for a
 451 region is as follows:
 452
 453 ```
 454 region      ::= `{` entry-block? block* `}`
 455 entry-block ::= operation+
 456 ```
 457
 458 A function body is an example of a region: it consists of a CFG of blocks and
 459 has additional semantic restrictions that other types of regions may not have.
 460 For example, in a function body, block terminators must either branch to a
 461 different block, or return from a function where the types of the `return`
 462 arguments must match the result types of the function signature. Similarly, the
 463 function arguments must match the types and count of the region arguments. In
 464 general, operations with regions can define these correspondences arbitrarily.
 465
 466 An *entry block* is a block with no label and no arguments that may occur at
 467 the beginning of a region. It enables a common pattern of using a region to
 468 open a new scope.
 469
 470
 471 ### Value Scoping
 472
 473 Regions provide hierarchical encapsulation of programs: it is impossible to
 474 reference, i.e. branch to, a block which is not in the same region as the source
 475 of the reference, i.e. a terminator operation. Similarly, regions provide a
 476 natural scoping for value visibility: values defined in a region don't escape to
 477 the enclosing region, if any. By default, operations inside a region can
 478 reference values defined outside of the region whenever it would have been legal
 479 for operands of the enclosing operation to reference those values, but this can
 480 be restricted using traits, such as
 481 [OpTrait::IsolatedFromAbove](Traits/#isolatedfromabove), or a custom
 482 verifier.
 483
 484 Example:
 485
 486 ```mlir
 487   "any_op"(%a) ({ // if %a is in-scope in the containing region...
 488      // then %a is in-scope here too.
 489     %new_value = "another_op"(%a) : (i64) -> (i64)
 490   }) : (i64) -> (i64)
 491 ```
 492
 493 MLIR defines a generalized 'hierarchical dominance' concept that operates across
 494 hierarchy and defines whether a value is 'in scope' and can be used by a
 495 particular operation. Whether a value can be used by another operation in the
 496 same region is defined by the kind of region. A value defined in a region can be
 497 used by an operation which has a parent in the same region, if and only if the
 498 parent could use the value. A value defined by an argument to a region can
 499 always be used by any operation deeply contained in the region. A value defined
 500 in a region can never be used outside of the region.
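
The nesting part of these rules can be sketched as a walk up the region tree.
The Python sketch below is a simplification under stated assumptions: the
`Region` class and `may_be_visible` function are hypothetical, only nesting and
`IsolatedFromAbove` are modeled, and ordering or dominance within a single
region (which depends on the kind of region) is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Region:
    """Hypothetical region node; `parent` is the region holding the enclosing op."""
    name: str
    parent: Optional["Region"] = None
    isolated_from_above: bool = False  # the enclosing op has IsolatedFromAbove


def may_be_visible(def_region: Region, use_region: Region) -> bool:
    """True if a value defined in `def_region` can possibly be used in `use_region`."""
    region = use_region
    while region is not None:
        if region is def_region:
            return True
        if region.isolated_from_above:
            return False  # values from enclosing regions are cut off here
        region = region.parent
    return False  # the value lives in a sibling or nested region


func_body = Region("func_body")
loop_body = Region("loop_body", parent=func_body)
isolated = Region("isolated_body", parent=func_body, isolated_from_above=True)

assert may_be_visible(func_body, loop_body)       # outer values flow inward
assert not may_be_visible(loop_body, func_body)   # inner values never escape
assert not may_be_visible(func_body, isolated)    # blocked by IsolatedFromAbove
```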
 501
 502 ### Control Flow and SSACFG Regions
 503
 504 In MLIR, control flow semantics of a region is indicated by
 505 [RegionKind::SSACFG](Interfaces.md/#regionkindinterfaces). Informally, these
 506 regions support semantics where operations in a region 'execute sequentially'.
 507 Before an operation executes, its operands have well-defined values. After an
 508 operation executes, the operands have the same values and results also have
 509 well-defined values. After an operation executes, the next operation in the
 510 block executes until the operation is the terminator operation at the end of a
 511 block, in which case some other operation will execute. The determination of the
 512 next instruction to execute is the 'passing of control flow'.
 513
 514 In general, when control flow is passed to an operation, MLIR does not restrict
 515 when control flow enters or exits the regions contained in that operation.
 516 However, when control flow enters a region, it always begins in the first block
 517 of the region, called the *entry* block. Terminator operations ending each block
 518 represent control flow by explicitly specifying the successor blocks of the
 519 block. Control flow can only pass to one of the specified successor blocks as in
 520 a `branch` operation, or back to the containing operation as in a `return`
 521 operation. Terminator operations without successors can only pass control back
 522 to the containing operation. Within these restrictions, the particular semantics
 523 of terminator operations is determined by the specific dialect operations
 524 involved. Blocks (other than the entry block) that are not listed as a successor
 525 of a terminator operation are defined to be unreachable and can be removed
 526 without affecting the semantics of the containing operation.
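
The rule for unreachable blocks can be sketched as a simple pruning loop over
successor lists. In the Python sketch below, the block names and the
`prune_unlisted` helper are hypothetical. The rule is applied literally and
repeated until nothing changes, so a block listed only by an already removed
block is removed as well, while blocks that only list each other are kept.

```python
from typing import Dict, List


def prune_unlisted(entry: str, successors: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Drop non-entry blocks that no remaining terminator lists as a successor."""
    blocks = dict(successors)
    while True:
        listed = {succ for succs in blocks.values() for succ in succs}
        dead = [block for block in blocks if block != entry and block not in listed]
        if not dead:
            return blocks
        for block in dead:
            del blocks[block]


# Successor lists of a hypothetical region; nothing branches to ^dead2,
# and ^dead1 is only listed by ^dead2.
cfg = {
    "^bb0": ["^bb1", "^bb2"],
    "^bb1": ["^bb3"],
    "^bb2": ["^bb3"],
    "^bb3": [],
    "^dead1": ["^bb3"],
    "^dead2": ["^dead1"],
}
remaining = prune_unlisted("^bb0", cfg)
assert sorted(remaining) == ["^bb0", "^bb1", "^bb2", "^bb3"]
```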
 527
 528 Although control flow always enters a region through the entry block, control
 529 flow may exit a region through any block with an appropriate terminator. The
 530 standard dialect leverages this capability to define operations with
 531 Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different
 532 blocks in the region and exiting through any block with a `return` operation.
 533 This behavior is similar to that of a function body in most programming
 534 languages. In addition, control flow may also not reach the end of a block or
 535 region, for example if a function call does not return.
 536
 537 Example:
 538
 539 ```mlir
 540 func.func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region
 541 ^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
 542   cf.cond_br %cond, ^bb1, ^bb2
 543
 544 ^bb1:
 545   // This def for %value does not dominate ^bb2
 546   %value = "op.convert"(%a) : (i64) -> i64
 547   cf.br ^bb3(%a: i64)    // Branch passes %a as the argument
 548
 549 ^bb2:
 550   accelerator.launch() { // An SSACFG region
 551     ^bb0:
 552       // Region of code nested under "accelerator.launch", it can reference %a but
 553       // not %value.
 554       %new_value = "accelerator.do_something"(%a) : (i64) -> (i64)
 555   }
 556   // %new_value cannot be referenced outside of the region
 557
 558 ^bb3:
 559   ...
 560 }
 561 ```
 562
 563 #### Operations with Multiple Regions
 564
 565 An operation containing multiple regions also completely determines the
 566 semantics of those regions. In particular, when control flow is passed to an
 567 operation, it may transfer control flow to any contained region. When control
 568 flow exits a region and is returned to the containing operation, the containing
 569 operation may pass control flow to any region in the same operation. An
 570 operation may also pass control flow to multiple contained regions concurrently.
 571 An operation may also pass control flow into regions that were specified in
 572 other operations, in particular those that defined the values or symbols the
 573 given operation uses as in a call operation. This passage of control is
 574 generally independent of passage of control flow through the basic blocks of the
 575 containing region.
 576
 577 #### Closure
 578
 579 Regions allow defining an operation that creates a closure, for example by
 580 “boxing” the body of the region into a value they produce. It remains up to the
 581 operation to define its semantics. Note that if an operation triggers
 582 asynchronous execution of the region, it is the responsibility of the
 583 operation caller to wait for the region to finish executing, guaranteeing that
 584 any directly used values remain live.
 585
 586 ### Graph Regions
 587
 588 In MLIR, graph-like semantics in a region is indicated by
 589 [RegionKind::Graph](Interfaces.md/#regionkindinterfaces). Graph regions are
 590 appropriate for concurrent semantics without control flow, or for modeling
 591 generic directed graph data structures. Graph regions are appropriate for
 592 representing cyclic relationships between coupled values where there is no
 593 fundamental order to the relationships. For instance, operations in a graph
 594 region may represent independent threads of control with values representing
 595 streams of data. As usual in MLIR, the particular semantics of a region is
 596 completely determined by its containing operation. Graph regions may only
 597 contain a single basic block (the entry block).
 598
 599 **Rationale:** Currently graph regions are arbitrarily limited to a single basic
 600 block, although there is no particular semantic reason for this limitation. This
 601 limitation has been added to make it easier to stabilize the pass infrastructure
 602 and commonly used passes for processing graph regions to properly handle
 603 feedback loops. Multi-block regions may be allowed in the future if use cases
 604 that require it arise.
 605
 606 In graph regions, MLIR operations naturally represent nodes, while each MLIR
 607 value represents a multi-edge connecting a single source node and multiple
 608 destination nodes. All values defined in the region as results of operations are
 609 in scope within the region and can be accessed by any other operation in the
 610 region. In graph regions, the order of operations within a block and the order
 611 of blocks in a region is not semantically meaningful and non-terminator
 612 operations may be freely reordered, for instance, by canonicalization. Other
 613 kinds of graphs, such as graphs with multiple source nodes and multiple
 614 destination nodes, can also be represented by representing graph edges as MLIR
 615 operations.
 616
 617 Note that cycles can occur within a single block in a graph region, or between
 618 basic blocks.
 619
 620 ```mlir
 621 "test.graph_region"() ({ // A Graph region
 622   %1 = "op1"(%1, %3) : (i32, i32) -> (i32)  // OK: %1, %3 allowed here
 623   %2 = "test.ssacfg_region"() ({
 624      %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32) // OK: %1, %2, %3, %4 all defined in the containing region
 625   }) : () -> (i32)
 626   %3 = "op2"(%1, %4) : (i32, i32) -> (i32)  // OK: %4 allowed here
 627   %4 = "op3"(%1) : (i32) -> (i32)
 628 }) : () -> ()
 629 ```
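
The difference from an SSACFG block can be sketched by checking the top-level
operations of the example under both rules. The Python sketch below is an
illustration with hypothetical helper names; it leaves out the body of the
nested `test.ssacfg_region`, block arguments, and values from enclosing
regions.

```python
from typing import List, Set, Tuple

# (result, operands) for the top-level operations of the example above.
OPS: List[Tuple[str, List[str]]] = [
    ("%1", ["%1", "%3"]),
    ("%2", []),
    ("%3", ["%1", "%4"]),
    ("%4", ["%1"]),
]


def valid_in_graph_region(ops: List[Tuple[str, List[str]]]) -> bool:
    """Every operand must be defined somewhere in the block; order is irrelevant."""
    defined = {result for result, _ in ops}
    return all(operand in defined for _, operands in ops for operand in operands)


def valid_in_single_ssacfg_block(ops: List[Tuple[str, List[str]]]) -> bool:
    """Every operand must be defined by an earlier operation in the block."""
    defined: Set[str] = set()
    for result, operands in ops:
        if any(operand not in defined for operand in operands):
            return False
        defined.add(result)
    return True


assert valid_in_graph_region(OPS)
# "op1" uses %1 (itself) and %3 (defined later), which SSACFG ordering forbids.
assert not valid_in_single_ssacfg_block(OPS)
```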
 630
 631 ### Arguments and Results
 632
 633 The arguments of the first block of a region are treated as arguments of the
 634 region. The source of these arguments is defined by the semantics of the parent
 635 operation. They may correspond to some of the values the operation itself uses.
 636
 637 Regions produce a (possibly empty) list of values. The operation semantics
 638 defines the relation between the region results and the operation results.
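
As an illustration, `scf.for` from the `scf` dialect is one operation whose
semantics define both relations. The sketch below assumes the `func`, `scf`,
and `arith` dialects are available; the function and value names are chosen
only for this example.

```mlir
func.func @accumulate(%lb: index, %ub: index, %step: index,
                      %init: f32, %x: f32) -> f32 {
  // The first block of the body region has two arguments: the induction
  // variable %iv and the loop-carried value %acc. `scf.for` defines where
  // they come from; %acc is initialized from the operand %init.
  %sum = scf.for %iv = %lb to %ub step %step
      iter_args(%acc = %init) -> (f32) {
    %next = arith.addf %acc, %x : f32
    // The region result: it feeds %acc on the next iteration and becomes
    // the result of `scf.for` after the last iteration.
    scf.yield %next : f32
  }
  return %sum : f32
}
```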
 639
 640 ## Type System
 641
 642 Each value in MLIR has a type defined by the type system. MLIR has an open type
 643 system (i.e. there is no fixed list of types), and types may have
 644 application-specific semantics. MLIR dialects may define any number of types
 645 with no restrictions on the abstractions they represent.
 646
 647 ```
 648 type ::= type-alias | dialect-type | builtin-type
 649
 650 type-list-no-parens ::=  type (`,` type)*
 651 type-list-parens ::= `(` `)`
 652                    | `(` type-list-no-parens `)`
 653
 654 // This is a common way to refer to a value with a specified type.
 655 ssa-use-and-type ::= ssa-use `:` type
 656 ssa-use ::= value-use
 657
 658 // Non-empty list of names and types.
 659 ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)*
 660
 661 function-type ::= (type | type-list-parens) `->` (type | type-list-parens)
 662 ```
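
A brief sketch of how these productions typically show up in IR, assuming the
`func` and `arith` dialects are available (the function name `@add` is
illustrative):

```mlir
// Each function argument pairs a value name with its type, in the shape of
// `ssa-use-and-type`.
func.func @add(%lhs: i32, %rhs: i32) -> i32 {
  // Generic operation form: the trailing `(i32, i32) -> i32` is a
  // `function-type` listing the operand types and the result type.
  %sum = "arith.addi"(%lhs, %rhs) : (i32, i32) -> i32
  return %sum : i32
}
```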
 663
 664 ### Type Aliases
 665
 666 ```
 667 type-alias-def ::= `!` alias-name `=` type
 668 type-alias ::= `!` alias-name
 669 ```
 670
 671 MLIR supports defining named aliases for types. A type alias is an identifier
 672 that can be used in the place of the type that it defines. These aliases *must*
 673 be defined before their uses. Alias names may not contain a '.', since those
 674 names are reserved for [dialect types](#dialect-types).
 675
 676 Example:
 677
 678 ```mlir
 679 !avx_m128 = vector<4 x f32>
 680
 681 // Using the original type.
 682 "foo"(%x) : vector<4 x f32> -> ()
 683
 684 // Using the type alias.
 685 "foo"(%x) : !avx_m128 -> ()
 686 ```
 687
 688 ### Dialect Types
 689
 690 Similarly to operations, dialects may define custom extensions to the type
 691 system.
 692
 693 ```
 694 dialect-namespace ::= bare-id
 695
 696 dialect-type ::= `!` (opaque-dialect-type | pretty-dialect-type)
 697 opaque-dialect-type ::= dialect-namespace dialect-type-body
 698 pretty-dialect-type ::= dialect-namespace `.` pretty-dialect-type-lead-ident
 699                                               dialect-type-body?
 700 pretty-dialect-type-lead-ident ::= `[A-Za-z][A-Za-z0-9._]*`
 701
 702 dialect-type-body ::= `<` dialect-type-contents+ `>`
 703 dialect-type-contents ::= dialect-type-body
 704                             | `(` dialect-type-contents+ `)`
 705                             | `[` dialect-type-contents+ `]`
 706                             | `{` dialect-type-contents+ `}`
 707                             | [^\[<({\]>)}\0]+
 708 ```
 709
 710 Dialect types are generally specified in an opaque form, where the contents
 711 of the type are defined within a body wrapped with the dialect namespace
 712 and `<>`. Consider the following examples:
 713
 714 ```mlir
 715 // A tensorflow string type.
 716 !tf<string>
 717
 718 // A type with complex components.
 719 !foo<something<abcd>>
 720
 721 // An even more complex type.
 722 !foo<"a123^^^" + bar>
 723 ```
 724
Dialect types that are simple enough may use a prettier format, which unwraps
part of the syntax into an equivalent but lighter-weight form:
 727
 728 ```mlir
 729 // A tensorflow string type.
 730 !tf.string
 731
 732 // A type with complex components.
 733 !foo.something<abcd>
 734 ```
 735
 736 See [here](DefiningDialects/AttributesAndTypes.md) to learn how to define dialect types.
 737
 738 ### Builtin Types
 739
 740 The [builtin dialect](Dialects/Builtin.md) defines a set of types that are
directly usable by any other dialect in MLIR. These types range from primitive
integer and floating-point types to function types and more.
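
For orientation, a few builtin types are sketched below in their textual form.
This is a small, non-exhaustive selection; the shapes and element types are
arbitrary choices for illustration.

```mlir
// A signless 32-bit integer and a 64-bit floating-point type.
i32
f64

// The index type, used for sizes and subscripts.
index

// A function type taking two integers and returning a float.
(i32, i32) -> f32

// Shaped types: a fixed-size vector, a tensor with one dynamic dimension,
// and a memref.
vector<4xf32>
tensor<?x8xf32>
memref<16x16xi8>
```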
 743
 744 ## Properties
 745
 746 Properties are extra data members stored directly on an Operation class. They
 747 provide a way to store [inherent attributes](#attributes) and other arbitrary
 748 data. The semantics of the data is specific to a given operation, and may be
 749 exposed through [Interfaces](Interfaces.md) accessors and other methods.
Properties can always be serialized to an Attribute in order to be printed
generically.
 752
 753 ## Attributes
 754
 755 Syntax:
 756
 757 ```
 758 attribute-entry ::= (bare-id | string-literal) `=` attribute-value
 759 attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute
 760 ```
 761
 762 Attributes are the mechanism for specifying constant data on operations in
 763 places where a variable is never allowed - e.g. the comparison predicate of a
 764 [`cmpi` operation](Dialects/ArithOps.md/#arithcmpi-arithcmpiop). Each operation has an
 765 attribute dictionary, which associates a set of attribute names to attribute
 766 values. MLIR's builtin dialect provides a rich set of
 767 [builtin attribute values](#builtin-attribute-values) out of the box (such as
 768 arrays, dictionaries, strings, etc.). Additionally, dialects can define their
 769 own [dialect attribute values](#dialect-attribute-values).
 770
 771 For dialects which haven't adopted properties yet, the top-level attribute
 772 dictionary attached to an operation has special semantics. The attribute
 773 entries are considered to be of two different kinds based on whether their
 774 dictionary key has a dialect prefix:
 775
 776 -   *inherent attributes* are inherent to the definition of an operation's
 777     semantics. The operation itself is expected to verify the consistency of
 778     these attributes. An example is the `predicate` attribute of the
 779     `arith.cmpi` op. These attributes must have names that do not start with a
 780     dialect prefix.
 781
 782 -   *discardable attributes* have semantics defined externally to the operation
    itself, but must be compatible with the operation's semantics. These
 784     attributes must have names that start with a dialect prefix. The dialect
 785     indicated by the dialect prefix is expected to verify these attributes. An
 786     example is the `gpu.container_module` attribute.
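
The two examples named above can be sketched together as follows, assuming the
`gpu`, `func`, and `arith` dialects are available; the function name is
illustrative.

```mlir
// `gpu.container_module` is a discardable attribute: its name starts with
// the `gpu` dialect prefix, and the `gpu` dialect is expected to verify it.
module attributes {gpu.container_module} {
  func.func @less_than(%a: i32, %b: i32) -> i1 {
    // The predicate (`slt` here) is an inherent attribute of `arith.cmpi`:
    // its name `predicate` has no dialect prefix and the operation itself
    // verifies it. The custom assembly spells it as a keyword.
    %lt = arith.cmpi slt, %a, %b : i32
    return %lt : i1
  }
}
```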
 787
 788 Note that attribute values are allowed to themselves be dictionary attributes,
 789 but only the top-level dictionary attribute attached to the operation is subject
 790 to the classification above.
 791
 792 When properties are adopted, only discardable attributes are stored in the
 793 top-level dictionary, while inherent attributes are stored in the properties
 794 storage.
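
In the generic textual form this split is visible: properties are printed as
attributes inside a `<{...}>` group after the operand list, while discardable
attributes stay in the `{...}` dictionary. The sketch below assumes a recent
MLIR in which the `arith` dialect has adopted properties; `my_dialect.note` is
a hypothetical discardable attribute added only for illustration.

```mlir
// Custom assembly form: the predicate is spelled as a keyword.
%lt = arith.cmpi slt, %a, %b : i32

// Generic form of a similar comparison: the inherent `predicate` attribute is
// printed inside `<{...}>` (2 is the integer encoding of `slt`), and the
// discardable attribute stays in the trailing `{...}` dictionary.
%lt2 = "arith.cmpi"(%a, %b) <{predicate = 2 : i64}> {my_dialect.note = "example"} : (i32, i32) -> i1
```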
 795
 796 ### Attribute Value Aliases
 797
 798 ```
 799 attribute-alias-def ::= `#` alias-name `=` attribute-value
 800 attribute-alias ::= `#` alias-name
 801 ```
 802
 803 MLIR supports defining named aliases for attribute values. An attribute alias is
 804 an identifier that can be used in the place of the attribute that it defines.
 805 These aliases *must* be defined before their uses. Alias names may not contain a
 806 '.', since those names are reserved for
 807 [dialect attributes](#dialect-attribute-values).
 808
 809 Example:
 810
 811 ```mlir
 812 #map = affine_map<(d0) -> (d0 + 10)>
 813
 814 // Using the original attribute.
 815 %b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a)
 816
 817 // Using the attribute alias.
 818 %b = affine.apply #map(%a)
 819 ```
 820
 821 ### Dialect Attribute Values
 822
 823 Similarly to operations, dialects may define custom attribute values.
 824
 825 ```
 826 dialect-namespace ::= bare-id
 827
 828 dialect-attribute ::= `#` (opaque-dialect-attribute | pretty-dialect-attribute)
 829 opaque-dialect-attribute ::= dialect-namespace dialect-attribute-body
 830 pretty-dialect-attribute ::= dialect-namespace `.` pretty-dialect-attribute-lead-ident
 831                                               dialect-attribute-body?
 832 pretty-dialect-attribute-lead-ident ::= `[A-Za-z][A-Za-z0-9._]*`
 833
 834 dialect-attribute-body ::= `<` dialect-attribute-contents+ `>`
 835 dialect-attribute-contents ::= dialect-attribute-body
 836                             | `(` dialect-attribute-contents+ `)`
 837                             | `[` dialect-attribute-contents+ `]`
 838                             | `{` dialect-attribute-contents+ `}`
 839                             | [^\[<({\]>)}\0]+
 840 ```
 841
 842 Dialect attributes are generally specified in an opaque form, where the contents
 843 of the attribute are defined within a body wrapped with the dialect namespace
 844 and `<>`. Consider the following examples:
 845
 846 ```mlir
 847 // A string attribute.
 848 #foo<string<"">>
 849
 850 // A complex attribute.
 851 #foo<"a123^^^" + bar>
 852 ```
 853
Dialect attributes that are simple enough may use a prettier format, which unwraps
part of the syntax into an equivalent but lighter-weight form:
 856
 857 ```mlir
 858 // A string attribute.
 859 #foo.string<"">
 860 ```
 861
See [here](DefiningDialects/AttributesAndTypes.md) to learn how to define dialect attribute values.
 863
 864 ### Builtin Attribute Values
 865
 866 The [builtin dialect](Dialects/Builtin.md) defines a set of attribute values
that are directly usable by any other dialect in MLIR. These attribute values
range from primitive integer and floating-point values to attribute
dictionaries, dense multi-dimensional arrays, and more.
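
A short, non-exhaustive sketch of several builtin attribute values, attached to
a hypothetical unregistered operation `my_dialect.op` (the operation and the
attribute names are assumptions made for illustration; parsing it with a tool
such as `mlir-opt` would require allowing unregistered dialects):

```mlir
"my_dialect.op"() {
  an_integer = 42 : i32,                                  // integer value
  a_float = 1.5 : f32,                                    // floating-point value
  a_string = "hello",                                     // string
  an_array = [1, 2, 3],                                   // array of attributes
  a_dictionary = {nested = true},                         // attribute dictionary
  a_dense_array = dense<[1, 2, 3, 4]> : tensor<4xi32>,    // dense elements
  a_unit_flag                                             // unit attribute
} : () -> ()
```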
 870
 871 ### IR Versioning
 872
A dialect can opt in to handle versioning through the
`BytecodeDialectInterface`. A few hooks are exposed to the dialect to allow
managing a version encoded into the bytecode file. The version is loaded lazily,
so that the version information can be retrieved while parsing the input IR, and
each dialect for which a version is present gets an opportunity to perform IR
upgrades post-parsing through the `upgradeFromVersion` method. Custom
Attribute and Type encodings can also be upgraded according to the dialect
version using the `readAttribute` and `readType` methods.
 881
 882 There is no restriction on what kind of information a dialect is allowed to
 883 encode to model its versioning. Currently, versioning is supported only for
 884 bytecode formats.