mlir/docs/LangRef.md

   1 # MLIR Language Reference
   2
   3 MLIR (Multi-Level IR) is a compiler intermediate representation with
   4 similarities to traditional three-address SSA representations (like
   5 [LLVM IR](http://llvm.org/docs/LangRef.html) or
   6 [SIL](https://github.com/apple/swift/blob/main/docs/SIL.rst)), but which
   7 introduces notions from polyhedral loop optimization as first-class concepts.
   8 This hybrid design is optimized to represent, analyze, and transform high level
   9 dataflow graphs as well as target-specific code generated for high performance
  10 data parallel systems. Beyond its representational capabilities, its single
  11 continuous design provides a framework to lower from dataflow graphs to
  12 high-performance target-specific code.
  13
  14 This document defines and describes the key concepts in MLIR, and is intended to
  15 be a dry reference document - the
  16 [rationale documentation](Rationale/Rationale.md),
  17 [glossary](../getting_started/Glossary.md), and other content are hosted
  18 elsewhere.
  19
  20 MLIR is designed to be used in three different forms: a human-readable textual
  21 form suitable for debugging, an in-memory form suitable for programmatic
  22 transformations and analysis, and a compact serialized form suitable for storage
  23 and transport. The different forms all describe the same semantic content. This
  24 document describes the human-readable textual form.
  25
  26 [TOC]
  27
  28 ## High-Level Structure
  29
  30 MLIR is fundamentally based on a graph-like data structure of nodes, called
  31 *Operations*, and edges, called *Values*. Each Value is the result of exactly
  32 one Operation or Block Argument, and has a *Value Type* defined by the
  33 [type system](#type-system). [Operations](#operations) are contained in
  34 [Blocks](#blocks) and Blocks are contained in [Regions](#regions). Operations
  35 are also ordered within their containing block and Blocks are ordered in their
  36 containing region, although this order may or may not be semantically meaningful
  37 in a given [kind of region](Interfaces.md/#regionkindinterfaces). Operations
  38 may also contain regions, enabling hierarchical structures to be represented.
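
As a rough illustration of this nesting, the following Python sketch models
operations, blocks, and regions as plain data classes and walks the hierarchy.
The class and function names are hypothetical and are not MLIR's C++ or Python
API; values and types are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Operation:
    name: str
    regions: List["Region"] = field(default_factory=list)


@dataclass
class Block:
    operations: List[Operation] = field(default_factory=list)


@dataclass
class Region:
    blocks: List[Block] = field(default_factory=list)


def walk(op: Operation) -> Iterator[Operation]:
    """Yield `op`, then every operation nested in its regions, in order."""
    yield op
    for region in op.regions:
        for block in region.blocks:
            for nested in block.operations:
                yield from walk(nested)


# A module containing one function whose body is a single block.
body = Block([Operation("arith.addi"), Operation("func.return")])
func = Operation("func.func", [Region([body])])
module = Operation("builtin.module", [Region([Block([func])])])
nested_names = [op.name for op in walk(module)]
```

The walk visits the containing operation before the operations nested in its
regions, mirroring the Operation → Region → Block → Operation containment
described above.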
  39
  40 Operations can represent many different concepts, from higher-level concepts
  41 like function definitions, function calls, buffer allocations, views or slices of
  42 buffers, and process creation, to lower-level concepts like target-independent
  43 arithmetic, target-specific instructions, configuration registers, and logic
  44 gates. These different concepts are represented by different operations in MLIR
  45 and the set of operations usable in MLIR can be arbitrarily extended.
  46
  47 MLIR also provides an extensible framework for transformations on operations,
  48 using familiar concepts of compiler [Passes](Passes.md). Enabling an arbitrary
  49 set of passes on an arbitrary set of operations results in a significant scaling
  50 challenge, since each transformation must potentially take into account the
  51 semantics of any operation. MLIR addresses this complexity by allowing operation
  52 semantics to be described abstractly using [Traits](Traits.md) and
  53 [Interfaces](Interfaces.md), enabling transformations to operate on operations
  54 more generically. Traits often describe verification constraints on valid IR,
  55 enabling complex invariants to be captured and checked. (see
  56 [Op vs Operation](Tutorials/Toy/Ch-2.md/#op-vs-operation-using-mlir-operations))
  57
  58 One obvious application of MLIR is to represent an
  59 [SSA-based](https://en.wikipedia.org/wiki/Static_single_assignment_form) IR,
  60 like the LLVM core IR, with appropriate choice of operation types to define
  61 Modules, Functions, Branches, Memory Allocation, and verification constraints to
  62 ensure the SSA Dominance property. MLIR includes a collection of dialects which
  63 defines just such structures. However, MLIR is intended to be general enough to
  64 represent other compiler-like data structures, such as Abstract Syntax Trees in
  65 a language frontend, generated instructions in a target-specific backend, or
  66 circuits in a High-Level Synthesis tool.
  67
  68 Here's an example of an MLIR module:
  69
  70 ```mlir
  71 // Compute A*B using an implementation of multiply kernel and print the
  72 // result using a TensorFlow op. The dimensions of A and B are partially
  73 // known. The shapes are assumed to match.
  74 func.func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) {
  75   // Compute the inner dimension of %A using the dim operation.
  76   %n = memref.dim %A, 1 : tensor<100x?xf32>
  77
  78   // Allocate addressable "buffers" and copy tensors %A and %B into them.
  79   %A_m = memref.alloc(%n) : memref<100x?xf32>
  80   memref.tensor_store %A to %A_m : memref<100x?xf32>
  81
  82   %B_m = memref.alloc(%n) : memref<?x50xf32>
  83   memref.tensor_store %B to %B_m : memref<?x50xf32>
  84
  85   // Call function @multiply passing memrefs as arguments,
  86   // and getting returned the result of the multiplication.
  87   %C_m = call @multiply(%A_m, %B_m)
  88           : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>)
  89
  90   memref.dealloc %A_m : memref<100x?xf32>
  91   memref.dealloc %B_m : memref<?x50xf32>
  92
  93   // Load the buffer data into a higher level "tensor" value.
  94   %C = memref.tensor_load %C_m : memref<100x50xf32>
  95   memref.dealloc %C_m : memref<100x50xf32>
  96
  97   // Call TensorFlow built-in function to print the result tensor.
  98   "tf.Print"(%C){message = "mul result"} : (tensor<100x50xf32>) -> (tensor<100x50xf32>)
  99
 100   return %C : tensor<100x50xf32>
 101 }
 102
 103 // A function that multiplies two memrefs and returns the result.
 104 func.func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>)
 105           -> (memref<100x50xf32>)  {
 106   // Compute the inner dimension of %A.
 107   %n = memref.dim %A, 1 : memref<100x?xf32>
 108
 109   // Allocate memory for the multiplication result.
 110   %C = memref.alloc() : memref<100x50xf32>
 111
 112   // Multiplication loop nest.
 113   affine.for %i = 0 to 100 {
 114      affine.for %j = 0 to 50 {
 115         memref.store 0 to %C[%i, %j] : memref<100x50xf32>
 116         affine.for %k = 0 to %n {
 117            %a_v  = memref.load %A[%i, %k] : memref<100x?xf32>
 118            %b_v  = memref.load %B[%k, %j] : memref<?x50xf32>
 119            %prod = arith.mulf %a_v, %b_v : f32
 120            %c_v  = memref.load %C[%i, %j] : memref<100x50xf32>
 121            %sum  = arith.addf %c_v, %prod : f32
 122            memref.store %sum, %C[%i, %j] : memref<100x50xf32>
 123         }
 124      }
 125   }
 126   return %C : memref<100x50xf32>
 127 }
 128 ```
 129
 130 ## Notation
 131
 132 MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip
 133 through a textual form. This is important for development of the compiler - e.g.
 134 for understanding the state of code as it is being transformed and writing test
 135 cases.
 136
 137 This document describes the grammar using
 138 [Extended Backus-Naur Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).
 139
 140 This is the EBNF grammar used in this document, presented in yellow boxes.
 141
 142 ```
 143 alternation ::= expr0 | expr1 | expr2  // Either expr0 or expr1 or expr2.
 144 sequence    ::= expr0 expr1 expr2      // Sequence of expr0 expr1 expr2.
 145 repetition0 ::= expr*  // 0 or more occurrences.
 146 repetition1 ::= expr+  // 1 or more occurrences.
 147 optionality ::= expr?  // 0 or 1 occurrence.
 148 grouping    ::= (expr) // Everything inside parens is grouped together.
 149 literal     ::= `abcd` // Matches the literal `abcd`.
 150 ```
 151
 152 Code examples are presented in blue boxes.
 153
 154 ```
 155 // This is an example use of the grammar above:
 156 // This matches things like: ba, bana, boma, banana, banoma, bomana...
 157 example ::= `b` (`an` | `om`)* `a`
 158 ```
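
To make the notation concrete, the `example` production can be transcribed
into a regular expression. This is only a sketch in Python (the helper name
`matches_example` is made up for illustration), not part of any MLIR tooling.

```python
import re

# Direct transcription of: example ::= `b` (`an` | `om`)* `a`
EXAMPLE = re.compile(r"b(?:an|om)*a")


def matches_example(text: str) -> bool:
    """Return True if the whole string is derivable from `example`."""
    return EXAMPLE.fullmatch(text) is not None


# The strings listed in the grammar comment are all accepted.
accepted = [w for w in ["ba", "bana", "boma", "banana", "banoma", "bomana"]
            if matches_example(w)]

# Strings outside the language: a missing trailing `a`, or a stray fragment.
rejected = [w for w in ["ban", "bama", "b"] if not matches_example(w)]
```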
 159
 160 ### Common syntax
 161
 162 The following core grammar productions are used in this document:
 163
 164 ```
 165 // TODO: Clarify the split between lexing (tokens) and parsing (grammar).
 166 digit     ::= [0-9]
 167 hex_digit ::= [0-9a-fA-F]
 168 letter    ::= [a-zA-Z]
 169 id-punct  ::= [$._-]
 170
 171 integer-literal ::= decimal-literal | hexadecimal-literal
 172 decimal-literal ::= digit+
 173 hexadecimal-literal ::= `0x` hex_digit+
 174 float-literal ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?
 175 string-literal  ::= `"` [^"\n\f\v\r]* `"`   // TODO: define escaping rules
 176 ```
 177
 178 Not listed here, but MLIR does support comments. They use standard BCPL syntax,
 179 starting with a `//` and going until the end of the line.
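
The literal productions above can be transcribed almost mechanically into
regular expressions. The following Python sketch does so; the names
`INTEGER_LITERAL`, `FLOAT_LITERAL`, `STRING_LITERAL`, and `classify` are
illustrative, and string escapes are not handled because the grammar leaves
them undefined.

```python
import re

# integer-literal ::= decimal-literal | hexadecimal-literal
INTEGER_LITERAL = re.compile(r"0x[0-9a-fA-F]+|[0-9]+")
# float-literal, copied from the production; note the mandatory decimal point.
FLOAT_LITERAL = re.compile(r"[-+]?[0-9]+[.][0-9]*(?:[eE][-+]?[0-9]+)?")
# string-literal without escape handling (assumption: no escapes at all).
STRING_LITERAL = re.compile(r'"[^"\n\f\v\r]*"')

PRODUCTIONS = (
    ("integer-literal", INTEGER_LITERAL),
    ("float-literal", FLOAT_LITERAL),
    ("string-literal", STRING_LITERAL),
)


def classify(token: str):
    """Return the name of the first production matching the whole token."""
    for name, pattern in PRODUCTIONS:
        if pattern.fullmatch(token):
            return name
    return None
```

Under this transcription, `1.` and `-3.25e-2` are float literals, while `.5`
and `1e5` are not, because the production requires digits before a decimal
point.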
 180
 181
 182 ### Top level Productions
 183
 184 ```
 185 // Top level production
 186 toplevel ::= (operation | attribute-alias-def | type-alias-def)*
 187 ```
 188
 189 The production `toplevel` is the top-level production parsed by any parser
 190 consuming the MLIR syntax. [Operations](#operations),
 191 [Attribute aliases](#attribute-value-aliases), and [Type aliases](#type-aliases)
 192 can be declared at the top level.
 193
 194 ### Identifiers and keywords
 195
 196 Syntax:
 197
 198 ```
 199 // Identifiers
 200 bare-id ::= (letter|[_]) (letter|digit|[_$.])*
 201 bare-id-list ::= bare-id (`,` bare-id)*
 202 value-id ::= `%` suffix-id
 203 alias-name ::= bare-id
 204 suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))
 205
 206 symbol-ref-id ::= `@` (suffix-id | string-literal) (`::` symbol-ref-id)?
 207 value-id-list ::= value-id (`,` value-id)*
 208
 209 // Uses of value, e.g. in an operand list to an operation.
 210 value-use ::= value-id
 211 value-use-list ::= value-use (`,` value-use)*
 212 ```
 213
 214 Identifiers name entities such as values, types and functions, and are chosen by
 215 the writer of MLIR code. Identifiers may be descriptive (e.g. `%batch_size`,
 216 `@matmul`), or may be non-descriptive when they are auto-generated (e.g. `%23`,
 217 `@func42`). Identifier names for values may be used in an MLIR text file but are
 218 not persisted as part of the IR - the printer will give them anonymous names
 219 like `%42`.
 220
 221 MLIR guarantees identifiers never collide with keywords by prefixing identifiers
 222 with a sigil (e.g. `%`, `#`, `@`, `^`, `!`). In certain unambiguous contexts
 223 (e.g. affine expressions), identifiers are not prefixed, for brevity. New
 224 keywords may be added to future versions of MLIR without danger of collision
 225 with existing identifiers.
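
The identifier productions and their sigils can be sketched as regular
expressions. The Python below is an illustrative transcription of the grammar
in this section (`kind_of` and the table name are hypothetical); it covers
only `value-id`, `symbol-ref-id`, `caret-id`, and `bare-id`.

```python
import re

# Character classes from the grammar; the hyphen is kept last so it stays literal.
LETTER, DIGIT, ID_PUNCT = "a-zA-Z", "0-9", "$._-"
BARE_ID = rf"[{LETTER}_][{LETTER}{DIGIT}_$.]*"
SUFFIX_ID = rf"(?:[{DIGIT}]+|[{LETTER}{ID_PUNCT}][{LETTER}{DIGIT}{ID_PUNCT}]*)"
STRING_LITERAL = r'"[^"\n\f\v\r]*"'
SYMBOL_PART = rf"@(?:{SUFFIX_ID}|{STRING_LITERAL})"

PRODUCTIONS = {
    "value-id": re.compile(rf"%{SUFFIX_ID}"),
    "symbol-ref-id": re.compile(rf"{SYMBOL_PART}(?:::{SYMBOL_PART})*"),
    "caret-id": re.compile(rf"\^{SUFFIX_ID}"),
    "bare-id": re.compile(BARE_ID),
}


def kind_of(text: str):
    """Return the first production that matches the whole identifier."""
    for name, pattern in PRODUCTIONS.items():
        if pattern.fullmatch(text):
            return name
    return None
```

Because each sigil selects a distinct production, a keyword such as
`affine.for` can only ever be read as a `bare-id`, never as a value or symbol
name.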
 226
 227 Value identifiers are only [in scope](#value-scoping) for the (nested) region in
 228 which they are defined and cannot be accessed or referenced outside of that
 229 region. Argument identifiers in mapping functions are in scope for the mapping
 230 body. Particular operations may further limit which identifiers are in scope in
 231 their regions. For instance, the scope of values in a region with
 232 [SSA control flow semantics](#control-flow-and-ssacfg-regions) is constrained
 233 according to the standard definition of
 234 [SSA dominance](https://en.wikipedia.org/wiki/Dominator_\(graph_theory\)).
 235 Another example is the [IsolatedFromAbove trait](Traits.md/#isolatedfromabove),
 236 which restricts directly accessing values defined in containing regions.
 237
 238 Function identifiers and mapping identifiers are associated with
 239 [Symbols](SymbolsAndSymbolTables.md) and have scoping rules dependent on symbol
 240 attributes.
 241
 242 ## Dialects
 243
 244 Dialects are the mechanism by which to engage with and extend the MLIR
 245 ecosystem. They allow for defining new [operations](#operations), as well as
 246 [attributes](#attributes) and [types](#type-system). Each dialect is given a
 247 unique `namespace` that is prefixed to each defined attribute/operation/type.
 248 For example, the [Affine dialect](Dialects/Affine.md) defines the namespace:
 249 `affine`.
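
As a small illustration of the namespace prefix, the Python sketch below
splits an operation name into its dialect namespace and the remainder. It
assumes the namespace is everything before the first `.`; the helper name
`split_op_name` is hypothetical.

```python
def split_op_name(op_name: str):
    """Split "namespace.rest" at the first dot; no dot means no namespace."""
    namespace, dot, rest = op_name.partition(".")
    return (namespace, rest) if dot else ("", op_name)


affine_for = split_op_name("affine.for")
intrinsic = split_op_name("llvm.sadd.with.overflow.i16")
unprefixed = split_op_name("foo_div")
```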
 250
 251 MLIR allows for multiple dialects, even those outside of the main tree, to
 252 co-exist together within one module. Dialects are produced and consumed by
 253 certain passes. MLIR provides a [framework](DialectConversion.md) to convert
 254 between, and within, different dialects.
 255
 256 A few of the dialects supported by MLIR:
 257
 258 *   [Affine dialect](Dialects/Affine.md)
 259 *   [Func dialect](Dialects/Func.md)
 260 *   [GPU dialect](Dialects/GPU.md)
 261 *   [LLVM dialect](Dialects/LLVM.md)
 262 *   [SPIR-V dialect](Dialects/SPIR-V.md)
 263 *   [Vector dialect](Dialects/Vector.md)
 264
 265 ### Target specific operations
 266
 267 Dialects provide a modular way in which targets can expose target-specific
 268 operations directly through to MLIR. As an example, some targets go through
 269 LLVM. LLVM has a rich set of intrinsics for certain target-independent
 270 operations (e.g. addition with overflow check) as well as providing access to
 271 target-specific operations for the targets it supports (e.g. vector permutation
 272 operations). LLVM intrinsics in MLIR are represented via operations whose
 273 names start with `llvm.`.
 274
 275 Example:
 276
 277 ```mlir
 278 // LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
 279 %x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1)
 280 ```
 281
 282 These operations only work when targeting LLVM as a backend (e.g. for CPUs and
 283 GPUs), and are required to align with the LLVM definition of these intrinsics.
 284
 285 ## Operations
 286
 287 Syntax:
 288
 289 ```
 290 operation            ::= op-result-list? (generic-operation | custom-operation)
 291                          trailing-location?
 292 generic-operation    ::= string-literal `(` value-use-list? `)`  successor-list?
 293                          region-list? dictionary-attribute? `:` function-type
 294 custom-operation     ::= bare-id custom-operation-format
 295 op-result-list       ::= op-result (`,` op-result)* `=`
 296 op-result            ::= value-id (`:` integer-literal)?
 297 successor-list       ::= `[` successor (`,` successor)* `]`
 298 successor            ::= caret-id (`:` block-arg-list)?
 299 region-list          ::= `(` region (`,` region)* `)`
 300 dictionary-attribute ::= `{` (attribute-entry (`,` attribute-entry)*)? `}`
 301 trailing-location    ::= (`loc` `(` location `)`)?
 302 ```
 303
 304 MLIR introduces a uniform concept called *operations* to enable describing many
 305 different levels of abstractions and computations. Operations in MLIR are fully
 306 extensible (there is no fixed list of operations) and have application-specific
 307 semantics. For example, MLIR supports
 308 [target-independent operations](Dialects/MemRef.md),
 309 [affine operations](Dialects/Affine.md), and
 310 [target-specific machine operations](#target-specific-operations).
 311
 312 The internal representation of an operation is simple: an operation is
 313 identified by a unique string (e.g. `dim`, `tf.Conv2d`, `x86.repmovsb`,
 314 `ppc.eieio`, etc), can return zero or more results, take zero or more operands,
 315 has a dictionary of [attributes](#attributes), has zero or more successors, and
 316 zero or more enclosed [regions](#regions). The generic printing form includes
 317 all these elements literally, with a function type to indicate the types of the
 318 results and operands.
 319
 320 Example:
 321
 322 ```mlir
 323 // An operation that produces two results.
 324 // The results of %result can be accessed via the <name> `#` <opNo> syntax.
 325 %result:2 = "foo_div"() : () -> (f32, i32)
 326
 327 // Pretty form that defines a unique name for each result.
 328 %foo, %bar = "foo_div"() : () -> (f32, i32)
 329
 330 // Invoke a TensorFlow function called tf.scramble with two inputs
 331 // and an attribute "fruit".
 332 %2 = "tf.scramble"(%result#0, %bar) {fruit = "banana"} : (f32, i32) -> f32
 333 ```
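
The way the generic form is assembled from an operation's parts can be
sketched as a small printer. This Python function is a simplified
illustration (its name and parameters are made up); it ignores successors,
regions, and trailing locations, and expects attribute values to be already
printed as text.

```python
def generic_form(name, operands, operand_types, result_types,
                 results="", attributes=None):
    """Assemble one operation in generic textual form (simplified sketch)."""
    text = f"{results} = " if results else ""
    text += f'"{name}"({", ".join(operands)})'
    if attributes:
        entries = ", ".join(f"{key} = {value}"
                            for key, value in attributes.items())
        text += " {" + entries + "}"
    # A single result type is printed bare; otherwise a parenthesized list.
    if len(result_types) == 1:
        result_text = result_types[0]
    else:
        result_text = "(" + ", ".join(result_types) + ")"
    text += " : (" + ", ".join(operand_types) + ") -> " + result_text
    return text


two_results = generic_form("foo_div", [], [], ["f32", "i32"],
                           results="%result:2")
scramble = generic_form("tf.scramble", ["%result#0", "%bar"],
                        ["f32", "i32"], ["f32"], results="%2",
                        attributes={"fruit": '"banana"'})
```

With these inputs the sketch reproduces the first and last operations of the
example above.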
 334
 335 In addition to the basic syntax above, dialects may register known operations.
 336 This allows those dialects to support *custom assembly form* for parsing and
 337 printing operations. In the operation sets listed below, we show both forms.
 338
 339 ### Builtin Operations
 340
 341 The [builtin dialect](Dialects/Builtin.md) defines a select few operations that
 342 are widely applicable by MLIR dialects, such as a universal conversion cast
 343 operation that simplifies inter/intra dialect conversion. This dialect also
 344 defines a top-level `module` operation, that represents a useful IR container.
 345
 346 ## Blocks
 347
 348 Syntax:
 349
 350 ```
 351 block           ::= block-label operation+
 352 block-label     ::= block-id block-arg-list? `:`
 353 block-id        ::= caret-id
 354 caret-id        ::= `^` suffix-id
 355 value-id-and-type ::= value-id `:` type
 356
 357 // Non-empty list of names and types.
 358 value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)*
 359
 360 block-arg-list ::= `(` value-id-and-type-list? `)`
 361 ```
 362
 363 A *Block* is a list of operations. In
 364 [SSACFG regions](#control-flow-and-ssacfg-regions), each block represents a
 365 compiler [basic block](https://en.wikipedia.org/wiki/Basic_block) where
 366 instructions inside the block are executed in order and terminator operations
 367 implement control flow branches between basic blocks.
 368
 369 The last operation in a block must be a
 370 [terminator operation](#control-flow-and-ssacfg-regions). A region with a single
 371 block may opt out of this requirement by attaching the `NoTerminator` trait to
 372 the enclosing op. The top-level `ModuleOp` is an example of such an operation:
 373 it defines this trait, and its block body does not have a terminator.
 374
 375 Blocks in MLIR take a list of block arguments, notated in a function-like way.
 376 Block arguments are bound to values specified by the semantics of individual
 377 operations. Block arguments of the entry block of a region are also arguments to
 378 the region and the values bound to these arguments are determined by the
 379 semantics of the containing operation. Block arguments of other blocks are
 380 determined by the semantics of terminator operations, e.g. Branches, which have
 381 the block as a successor. In regions with
 382 [control flow](#control-flow-and-ssacfg-regions), MLIR leverages this structure
 383 to implicitly represent the passage of control-flow dependent values without the
 384 complex nuances of PHI nodes in traditional SSA representations. Note that
 385 values which are not control-flow dependent can be referenced directly and do
 386 not need to be passed through block arguments.
 387
 388 Here is a simple example function showing branches, returns, and block
 389 arguments:
 390
 391 ```mlir
 392 func.func @simple(i64, i1) -> i64 {
 393 ^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
 394   cf.cond_br %cond, ^bb1, ^bb2
 395
 396 ^bb1:
 397   cf.br ^bb3(%a: i64)    // Branch passes %a as the argument
 398
 399 ^bb2:
 400   %b = arith.addi %a, %a : i64
 401   cf.br ^bb3(%b: i64)    // Branch passes %b as the argument
 402
 403 // ^bb3 receives an argument, named %c, from predecessors
 404 // and passes it on to bb4 along with %a. %a is referenced
 405 // directly from its defining operation and is not passed through
 406 // an argument of ^bb3.
 407 ^bb3(%c: i64):
 408   cf.br ^bb4(%c, %a : i64, i64)
 409
 410 ^bb4(%d : i64, %e : i64):
 411   %0 = arith.addi %d, %e : i64
 412   return %0 : i64   // Return is also a terminator.
 413 }
 414 ```
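
To show how branches bind block arguments, the following Python sketch
interprets the control flow of `@simple` by hand. It is an informal model
only: each block is a function whose parameters are its block arguments,
each terminator returns its successor together with the values passed to it,
and Python integers stand in for `i64` without wraparound.

```python
def run_simple(a: int, cond: bool) -> int:
    # %a and %cond are the arguments of the entry block ^bb0. Blocks
    # dominated by ^bb0 read %a directly instead of receiving it as an argument.
    def bb1():
        return "bb3", (a,)        # cf.br ^bb3(%a : i64)

    def bb2():
        b = a + a                 # %b = arith.addi %a, %a : i64
        return "bb3", (b,)        # cf.br ^bb3(%b : i64)

    def bb3(c):
        return "bb4", (c, a)      # cf.br ^bb4(%c, %a : i64, i64)

    def bb4(d, e):
        return None, (d + e,)     # return %0 : i64

    blocks = {"bb1": bb1, "bb2": bb2, "bb3": bb3, "bb4": bb4}
    # cf.cond_br %cond, ^bb1, ^bb2
    target, args = ("bb1", ()) if cond else ("bb2", ())
    while target is not None:
        target, args = blocks[target](*args)
    return args[0]


taken = run_simple(3, True)       # ^bb1 path: %c = 3, result 3 + 3
not_taken = run_simple(3, False)  # ^bb2 path: %c = 6, result 6 + 3
```

The value `%c` differs depending on the predecessor, which is exactly the
role a PHI node would play, yet no special operation is needed.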
 415
 416 **Context:** The "block argument" representation eliminates a number of special
 417 cases from the IR compared to traditional "PHI nodes are operations" SSA IRs
 418 (like LLVM). For example, the
 419 [parallel copy semantics](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.524.5461&rep=rep1&type=pdf)
 420 of SSA is immediately apparent, and function arguments are no longer a special
 421 case: they become arguments to the entry block
 422 [[more rationale](Rationale/Rationale.md/#block-arguments-vs-phi-nodes)]. Blocks
 423 are also a fundamental concept that cannot be represented by operations because
 424 values defined in an operation cannot be accessed outside the operation.
 425
 426 ## Regions
 427
 428 ### Definition
 429
 430 A region is an ordered list of MLIR [Blocks](#blocks). The semantics within a
 431 region is not imposed by the IR. Instead, the containing operation defines the
 432 semantics of the regions it contains. MLIR currently defines two kinds of
 433 regions: [SSACFG regions](#control-flow-and-ssacfg-regions), which describe
 434 control flow between blocks, and [Graph regions](#graph-regions), which do not
 435 require control flow between blocks. The kinds of regions within an operation are
 436 described using the [RegionKindInterface](Interfaces.md/#regionkindinterfaces).
 437
 438 Regions do not have a name or an address, only the blocks contained in a region
 439 do. Regions must be contained within operations and have no type or attributes.
 440 The first block in the region is a special block called the 'entry block'. The
 441 arguments to the entry block are also the arguments of the region itself. The
 442 entry block cannot be listed as a successor of any other block. The syntax for a
 443 region is as follows:
 444
 445 ```
 446 region      ::= `{` entry-block? block* `}`
 447 entry-block ::= operation+
 448 ```
 449
 450 A function body is an example of a region: it consists of a CFG of blocks and
 451 has additional semantic restrictions that other types of regions may not have.
 452 For example, in a function body, block terminators must either branch to a
 453 different block, or return from a function where the types of the `return`
 454 arguments must match the result types of the function signature. Similarly, the
 455 function arguments must match the types and count of the region arguments. In
 456 general, operations with regions can define these correspondences arbitrarily.
 457
 458 In the grammar above, `entry-block` denotes an entry block written with no
 459 label and therefore no arguments; it may occur only at the beginning of a
 460 region. It enables a common pattern of using a region to open a new scope.
 461
 462
 463 ### Value Scoping
 464
 465 Regions provide hierarchical encapsulation of programs: it is impossible to
 466 reference, i.e. branch to, a block which is not in the same region as the source
 467 of the reference, i.e. a terminator operation. Similarly, regions provide a
 468 natural scoping for value visibility: values defined in a region don't escape to
 469 the enclosing region, if any. By default, operations inside a region can
 470 reference values defined outside of the region whenever it would have been legal
 471 for operands of the enclosing operation to reference those values, but this can
 472 be restricted using traits, such as
 473 [OpTrait::IsolatedFromAbove](Traits.md/#isolatedfromabove), or a custom
 474 verifier.
 475
 476 Example:
 477
 478 ```mlir
 479   "any_op"(%a) ({ // if %a is in-scope in the containing region...
 480      // then %a is in-scope here too.
 481     %new_value = "another_op"(%a) : (i64) -> (i64)
 482   }) : (i64) -> (i64)
 483 ```
 484
 485 MLIR defines a generalized 'hierarchical dominance' concept that operates across
 486 hierarchy and defines whether a value is 'in scope' and can be used by a
 487 particular operation. Whether a value can be used by another operation in the
 488 same region is defined by the kind of region. A value defined in a region can be
 489 used by an operation which has a parent in the same region, if and only if the
 490 parent could use the value. A value defined by an argument to a region can
 491 always be used by any operation deeply contained in the region. A value defined
 492 in a region can never be used outside of the region.
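
The region-nesting part of this rule can be sketched as a walk up the region
hierarchy. The Python below is a deliberately partial model with hypothetical
names: it checks only that the using region is the defining region or is
nested inside it. It does not model ordering or dominance within a region,
nor restrictions such as `IsolatedFromAbove`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Region:
    name: str
    # Region containing the operation that owns this region, if any.
    parent: Optional["Region"] = None


def may_be_visible(defining: Region, using: Region) -> bool:
    """Necessary (not sufficient) condition for a value to be in scope."""
    region = using
    while region is not None:
        if region is defining:
            return True
        region = region.parent
    return False


outer = Region("outer")
inner = Region("inner", parent=outer)
sibling = Region("sibling", parent=outer)
```

Here a value defined in `outer` may be visible in `inner`, but a value
defined in `inner` is never visible in `outer` or in `sibling`.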
 493
 494 ### Control Flow and SSACFG Regions
 495
 496 In MLIR, control flow semantics of a region is indicated by
 497 [RegionKind::SSACFG](Interfaces.md/#regionkindinterfaces). Informally, these
 498 regions support semantics where operations in a region 'execute sequentially'.
 499 Before an operation executes, its operands have well-defined values. After an
 500 operation executes, the operands have the same values and results also have
 501 well-defined values. After an operation executes, the next operation in the
 502 block executes until the operation is the terminator operation at the end of a
 503 block, in which case some other operation will execute. The determination of the
 504 next operation to execute is the 'passing of control flow'.
 505
 506 In general, when control flow is passed to an operation, MLIR does not restrict
 507 when control flow enters or exits the regions contained in that operation.
 508 However, when control flow enters a region, it always begins in the first block
 509 of the region, called the *entry* block. Terminator operations ending each block
 510 represent control flow by explicitly specifying the successor blocks of the
 511 block. Control flow can only pass to one of the specified successor blocks as in
 512 a `branch` operation, or back to the containing operation as in a `return`
 513 operation. Terminator operations without successors can only pass control back
 514 to the containing operation. Within these restrictions, the particular semantics
 515 of terminator operations is determined by the specific dialect operations
 516 involved. Blocks (other than the entry block) that are not listed as a successor
 517 of a terminator operation are defined to be unreachable and can be removed
 518 without affecting the semantics of the containing operation.
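
The notion of unreachable blocks can be illustrated with a simple worklist
traversal from the entry block. This Python sketch uses a hypothetical
successor table keyed by block name; it is not how any MLIR pass is
implemented.

```python
def reachable_blocks(entry, successors):
    """Return the set of blocks reachable from `entry` via successor edges."""
    seen, worklist = {entry}, [entry]
    while worklist:
        block = worklist.pop()
        for succ in successors.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return seen


# ^bb9 branches to ^bb3 but is not a successor of any block, so it is unreachable.
cfg = {
    "bb0": ["bb1", "bb2"],
    "bb1": ["bb3"],
    "bb2": ["bb3"],
    "bb3": [],
    "bb9": ["bb3"],
}
live = reachable_blocks("bb0", cfg)
dead = sorted(set(cfg) - live)
```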
 519
 520 Although control flow always enters a region through the entry block, control
 521 flow may exit a region through any block with an appropriate terminator. The
 522 `func` dialect leverages this capability to define operations with
 523 Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different
 524 blocks in the region and exiting through any block with a `return` operation.
 525 This behavior is similar to that of a function body in most programming
 526 languages. In addition, control flow may also not reach the end of a block or
 527 region, for example if a function call does not return.
 528
 529 Example:
 530
 531 ```mlir
 532 func.func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region
 533 ^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
 534   cf.cond_br %cond, ^bb1, ^bb2
 535
 536 ^bb1:
 537   // This def for %value does not dominate ^bb2
 538   %value = "op.convert"(%a) : (i64) -> i64
 539   cf.br ^bb3(%a: i64)    // Branch passes %a as the argument
 540
 541 ^bb2:
 542   accelerator.launch() { // An SSACFG region
 543     ^bb0:
 544       // Region of code nested under "accelerator.launch", it can reference %a but
 545       // not %value.
 546       %new_value = "accelerator.do_something"(%a) : (i64) -> (i64)
 547   }
 548   // %new_value cannot be referenced outside of the region
 549
 550 ^bb3:
 551   ...
 552 }
 553 ```
 554
 555 #### Operations with Multiple Regions
 556
 557 An operation containing multiple regions also completely determines the
 558 semantics of those regions. In particular, when control flow is passed to an
 559 operation, it may transfer control flow to any contained region. When control
 560 flow exits a region and is returned to the containing operation, the containing
 561 operation may pass control flow to any region in the same operation. An
 562 operation may also pass control flow to multiple contained regions concurrently.
 563 An operation may also pass control flow into regions that were specified in
 564 other operations, in particular those that defined the values or symbols the
 565 given operation uses as in a call operation. This passage of control is
 566 generally independent of passage of control flow through the basic blocks of the
 567 containing region.
 568
 569 #### Closure
 570
 571 Regions allow defining an operation that creates a closure, for example by
 572 “boxing” the body of the region into a value they produce. It remains up to the
 573 operation to define its semantics. Note that if an operation triggers
 574 asynchronous execution of the region, it is the responsibility of the
 575 operation's caller to wait for the region to finish executing, guaranteeing
 576 that any directly used values remain live.
 577
 578 ### Graph Regions
 579
 580 In MLIR, graph-like semantics in a region is indicated by
 581 [RegionKind::Graph](Interfaces.md/#regionkindinterfaces). Graph regions are
 582 appropriate for concurrent semantics without control flow, or for modeling
 583 generic directed graph data structures. Graph regions are appropriate for
 584 representing cyclic relationships between coupled values where there is no
 585 fundamental order to the relationships. For instance, operations in a graph
 586 region may represent independent threads of control with values representing
 587 streams of data. As usual in MLIR, the particular semantics of a region is
 588 completely determined by its containing operation. Graph regions may only
 589 contain a single basic block (the entry block).
 590
 591 **Rationale:** Currently graph regions are arbitrarily limited to a single basic
 592 block, although there is no particular semantic reason for this limitation. This
 593 limitation has been added to make it easier to stabilize the pass infrastructure
 594 and commonly used passes for processing graph regions to properly handle
 595 feedback loops. Multi-block regions may be allowed in the future if use cases
 596 that require it arise.
 597
 598 In graph regions, MLIR operations naturally represent nodes, while each MLIR
 599 value represents a multi-edge connecting a single source node and multiple
 600 destination nodes. All values defined in the region as results of operations are
 601 in scope within the region and can be accessed by any other operation in the
 602 region. In graph regions, the order of operations within a block and the order
 603 of blocks in a region is not semantically meaningful and non-terminator
 604 operations may be freely reordered, for instance, by canonicalization. Other
 605 kinds of graphs, such as graphs with multiple source nodes and multiple
 606 destination nodes, can also be represented by representing graph edges as MLIR
 607 operations.
 608
 609 Note that cycles can occur within a single block in a graph region, or between
 610 basic blocks.
 611
 612 ```mlir
 613 "test.graph_region"() ({ // A Graph region
 614   %1 = "op1"(%1, %3) : (i32, i32) -> (i32)  // OK: %1, %3 allowed here
 615   %2 = "test.ssacfg_region"() ({
 616      %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32) // OK: %1, %2, %3, %4 all defined in the containing region
 617   }) : () -> (i32)
 618   %3 = "op2"(%1, %4) : (i32, i32) -> (i32)  // OK: %4 allowed here
 619   %4 = "op3"(%1) : (i32) -> (i32)
 620 }) : () -> ()
 621 ```
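
To see the cycles in this example, the top-level operations can be modeled as
a table mapping each result to its operands. The Python sketch below (names
are illustrative, and the nested `test.ssacfg_region` operation is left out)
checks whether a value transitively depends on itself.

```python
def on_cycle(value, operands):
    """Return True if `value` transitively depends on itself."""
    stack, seen = list(operands.get(value, ())), set()
    while stack:
        current = stack.pop()
        if current == value:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(operands.get(current, ()))
    return False


# Results of the top-level ops in the graph region above and their operands.
graph_operands = {"%1": ["%1", "%3"], "%3": ["%1", "%4"], "%4": ["%1"]}
cyclic = sorted(v for v in graph_operands if on_cycle(v, graph_operands))

# An acyclic chain for comparison: no value depends on itself.
chain_operands = {"%a": [], "%b": ["%a"]}
```

All three values participate in a cycle, which would be illegal under SSA
dominance in an SSACFG region but is permitted in a graph region.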
 622
 623 ### Arguments and Results
 624
 625 The arguments of the first block of a region are treated as arguments of the
 626 region. The source of these arguments is defined by the semantics of the parent
 627 operation. They may correspond to some of the values the operation itself uses.
 628
 629 Regions produce a (possibly empty) list of values. The operation semantics
 630 defines the relation between the region results and the operation results.
 631
 632 ## Type System
 633
 634 Each value in MLIR has a type defined by the type system. MLIR has an open type
 635 system (i.e. there is no fixed list of types), and types may have
 636 application-specific semantics. MLIR dialects may define any number of types
 637 with no restrictions on the abstractions they represent.
 638
 639 ```
 640 type ::= type-alias | dialect-type | builtin-type
 641
 642 type-list-no-parens ::=  type (`,` type)*
 643 type-list-parens ::= `(` `)`
 644                    | `(` type-list-no-parens `)`
 645
 646 // This is a common way to refer to a value with a specified type.
 647 ssa-use-and-type ::= ssa-use `:` type
 648 ssa-use ::= value-use
 649
 650 // Non-empty list of names and types.
 651 ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)*
 652
 653 function-type ::= (type | type-list-parens) `->` (type | type-list-parens)
 654 ```
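
The generic operation form is one place where these productions appear: the
trailing type of an operation is a `function-type`. The following sketch uses
hypothetical unregistered operations (`foo.nop`, `foo.add`, `foo.divmod`) and
assumes `%a` and `%b` are `i32` values defined earlier.

```mlir
// Both sides are an empty type-list-parens.
"foo.nop"() : () -> ()

// Parenthesized operand types, single result type without parentheses.
%sum = "foo.add"(%a, %b) : (i32, i32) -> i32

// Parenthesized list of two result types.
%quot, %rem = "foo.divmod"(%a, %b) : (i32, i32) -> (i32, i32)
```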
 655
 656 ### Type Aliases
 657
 658 ```
 659 type-alias-def ::= '!' alias-name '=' type
 660 type-alias ::= '!' alias-name
 661 ```
 662
 663 MLIR supports defining named aliases for types. A type alias is an identifier
 664 that can be used in the place of the type that it defines. These aliases *must*
 665 be defined before their uses. Alias names may not contain a '.', since those
 666 names are reserved for [dialect types](#dialect-types).
 667
 668 Example:
 669
 670 ```mlir
 671 !avx_m128 = vector<4 x f32>
 672
 673 // Using the original type.
 674 "foo"(%x) : vector<4 x f32> -> ()
 675
 676 // Using the type alias.
 677 "foo"(%x) : !avx_m128 -> ()
 678 ```
 679
 680 ### Dialect Types
 681
 682 Similarly to operations, dialects may define custom extensions to the type
 683 system.
 684
 685 ```
 686 dialect-namespace ::= bare-id
 687
 688 dialect-type ::= '!' (opaque-dialect-type | pretty-dialect-type)
 689 opaque-dialect-type ::= dialect-namespace dialect-type-body
 690 pretty-dialect-type ::= dialect-namespace '.' pretty-dialect-type-lead-ident
 691                                               dialect-type-body?
 692 pretty-dialect-type-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*'
 693
 694 dialect-type-body ::= '<' dialect-type-contents+ '>'
 695 dialect-type-contents ::= dialect-type-body
 696                             | '(' dialect-type-contents+ ')'
 697                             | '[' dialect-type-contents+ ']'
 698                             | '{' dialect-type-contents+ '}'
 699                             | '[^\[<({\]>)}\0]+'
 700 ```
 701
 702 Dialect types are generally specified in an opaque form, where the contents
 703 of the type are defined within a body wrapped with the dialect namespace
 704 and `<>`. Consider the following examples:
 705
 706 ```mlir
 707 // A tensorflow string type.
 708 !tf<string>
 709
 710 // A type with complex components.
 711 !foo<something<abcd>>
 712
 713 // An even more complex type.
 714 !foo<"a123^^^" + bar>
 715 ```
 716
Dialect types that are simple enough may use a prettier format, which unwraps
part of the syntax into an equivalent but lighter-weight form:
 719
 720 ```mlir
 721 // A tensorflow string type.
 722 !tf.string
 723
 724 // A type with complex components.
 725 !foo.something<abcd>
 726 ```
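
Read against the grammar above, the pretty form moves a leading identifier of
the body out of the angle brackets, so a body that does not begin with an
identifier has no pretty spelling. A small sketch, reusing the hypothetical
`foo` dialect from the examples above (the `pair` type is likewise made up for
illustration):

```mlir
// These two spellings denote the same type.
!foo<something<abcd>>
!foo.something<abcd>

// Nested delimiters inside the body must be balanced.
!foo.pair<[a, b], (c, {d})>

// The body starts with a string literal, not an identifier, so only the
// opaque form is available.
!foo<"a123^^^" + bar>
```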
 727
See the [attributes and types documentation](AttributesAndTypes.md) to learn
how to define dialect types.
 729
 730 ### Builtin Types
 731
 732 The [builtin dialect](Dialects/Builtin.md) defines a set of types that are
directly usable by any other dialect in MLIR. These types range from primitive
integer and floating-point types to function types and more.
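
A few builtin types in use, as a non-exhaustive sketch. The operation `foo.op`
and the alias `!predicate` are hypothetical, and the operands `%a`, `%b`, and
`%c` are assumed to be defined earlier. The builtin dialect documentation
remains the reference for the full list and exact syntax.

```mlir
// A type alias for a function type.
!predicate = (i32) -> i1

// Integer, floating-point, and index operands; tensor, vector, and memref
// results.
%t, %v, %m = "foo.op"(%a, %b, %c) : (i32, f64, index)
    -> (tensor<?x8xf32>, vector<4xf32>, memref<16x16xf32>)
```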
 735
 736 ## Attributes
 737
 738 Syntax:
 739
 740 ```
 741 attribute-entry ::= (bare-id | string-literal) `=` attribute-value
 742 attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute
 743 ```
 744
 745 Attributes are the mechanism for specifying constant data on operations in
places where a variable is never allowed, e.g. the comparison predicate of a
[`cmpi` operation](Dialects/ArithmeticOps.md#arithcmpi-mlirarithcmpiop). Each operation has an
attribute dictionary, which associates a set of attribute names with attribute
 749 values. MLIR's builtin dialect provides a rich set of
 750 [builtin attribute values](#builtin-attribute-values) out of the box (such as
 751 arrays, dictionaries, strings, etc.). Additionally, dialects can define their
 752 own [dialect attribute values](#dialect-attribute-values).
 753
 754 The top-level attribute dictionary attached to an operation has special
 755 semantics. The attribute entries are considered to be of two different kinds
 756 based on whether their dictionary key has a dialect prefix:
 757
 758 -   *inherent attributes* are inherent to the definition of an operation's
 759     semantics. The operation itself is expected to verify the consistency of
 760     these attributes. An example is the `predicate` attribute of the
 761     `arith.cmpi` op. These attributes must have names that do not start with a
 762     dialect prefix.
 763
 764 -   *discardable attributes* have semantics defined externally to the operation
    itself, but must be compatible with the operation's semantics. These
 766     attributes must have names that start with a dialect prefix. The dialect
 767     indicated by the dialect prefix is expected to verify these attributes. An
 768     example is the `gpu.container_module` attribute.
 769
 770 Note that attribute values are allowed to themselves be dictionary attributes,
 771 but only the top-level dictionary attribute attached to the operation is subject
 772 to the classification above.
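
The classification can be sketched on a generic-form operation. The operation
`foo.op`, the dialect prefix `bar`, and all attribute names below are
hypothetical; `%a` is assumed to be an `i32` value defined earlier.

```mlir
"foo.op"(%a) {
  // No dialect prefix: inherent, verified by `foo.op` itself.
  mode = "fast",
  // Dialect prefix `bar`: discardable, verified by the `bar` dialect.
  bar.hint = 4 : i64,
  // A nested dictionary is an ordinary attribute value; its entries are not
  // classified as inherent or discardable.
  options = {bar.level = 2 : i64, verbose = true}
} : (i32) -> ()
```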
 773
 774 ### Attribute Value Aliases
 775
 776 ```
 777 attribute-alias-def ::= '#' alias-name '=' attribute-value
 778 attribute-alias ::= '#' alias-name
 779 ```
 780
 781 MLIR supports defining named aliases for attribute values. An attribute alias is
 782 an identifier that can be used in the place of the attribute that it defines.
 783 These aliases *must* be defined before their uses. Alias names may not contain a
 784 '.', since those names are reserved for
 785 [dialect attributes](#dialect-attribute-values).
 786
 787 Example:
 788
 789 ```mlir
 790 #map = affine_map<(d0) -> (d0 + 10)>
 791
 792 // Using the original attribute.
 793 %b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a)
 794
 795 // Using the attribute alias.
 796 %b = affine.apply #map(%a)
 797 ```
 798
 799 ### Dialect Attribute Values
 800
 801 Similarly to operations, dialects may define custom attribute values.
 802
 803 ```
 804 dialect-namespace ::= bare-id
 805
 806 dialect-attribute ::= '#' (opaque-dialect-attribute | pretty-dialect-attribute)
 807 opaque-dialect-attribute ::= dialect-namespace dialect-attribute-body
 808 pretty-dialect-attribute ::= dialect-namespace '.' pretty-dialect-attribute-lead-ident
 809                                               dialect-attribute-body?
 810 pretty-dialect-attribute-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*'
 811
 812 dialect-attribute-body ::= '<' dialect-attribute-contents+ '>'
 813 dialect-attribute-contents ::= dialect-attribute-body
 814                             | '(' dialect-attribute-contents+ ')'
 815                             | '[' dialect-attribute-contents+ ']'
 816                             | '{' dialect-attribute-contents+ '}'
 817                             | '[^\[<({\]>)}\0]+'
 818 ```
 819
 820 Dialect attributes are generally specified in an opaque form, where the contents
 821 of the attribute are defined within a body wrapped with the dialect namespace
 822 and `<>`. Consider the following examples:
 823
 824 ```mlir
 825 // A string attribute.
 826 #foo<string<"">>
 827
 828 // A complex attribute.
 829 #foo<"a123^^^" + bar>
 830 ```
 831
Dialect attributes that are simple enough may use a prettier format, which
unwraps part of the syntax into an equivalent but lighter-weight form:
 834
 835 ```mlir
 836 // A string attribute.
 837 #foo.string<"">
 838 ```
 839
See the [attributes and types documentation](AttributesAndTypes.md) to learn
how to define dialect attribute values.
 841
 842 ### Builtin Attribute Values
 843
 844 The [builtin dialect](Dialects/Builtin.md) defines a set of attribute values
that are directly usable by any other dialect in MLIR. These attribute values
range from primitive integer and floating-point values to attribute
dictionaries, dense multi-dimensional arrays, and more.
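
A non-exhaustive sketch of builtin attribute values attached to a hypothetical
operation `foo.op` (the attribute names are made up for illustration):

```mlir
"foo.op"() {
  count = 42 : i64,                 // integer
  scale = 1.5 : f32,                // floating point
  label = "example",                // string
  dims = [2, 3],                    // array
  options = {verbose = true},       // dictionary
  weights = dense<[[1.0, 2.0], [3.0, 4.0]]> : tensor<2x2xf32>  // dense elements
} : () -> ()
```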