flang/docs/HighLevelFIR.md

   1 The approach of FIR and lowering design so far was to start with the minimal set
   2 of IR operations that could allow implementing the core aspects of Fortran (like
   3 memory allocations, array addressing, runtime descriptors, and structured
   4 control flow operations). One notable aspect of the current FIR is that array
   5 and character operations are buffered (some storage is allocated for the result,
   6 and the storage is addressed to implement the operation).  While this proved
   7 functional so far, the code lowering expressions and assignments from the
   8 front-end representations (the evaluate::Expr and parser nodes) to FIR has
   9 significantly grown in complexity while it still lacks some F95 features around
  10 character array expressions or FORALL. This is mainly explained by the fact that
  11 the representation level gap is big, and a lot is happening in lowering.  It
  12 appears more and more that some intermediate steps would help to split concerns
  13 between translating the front-end representation to MLIR, implementing some
  14 Fortran concepts at a lower-level (like character or derived type assignments),
  15 and how bufferizations of character and array expressions should be done.
  16
  17 This document proposes the addition of two concepts and a set of related
  18 operations in a new dialect HLFIR to allow a simpler lowering to a higher-level
  19 FIR representation that would later be lowered to the current FIR representation
  20 via MLIR translation passes.  As a result of these additions, it is likely that
  21 the fir.array_load/fir.array_merge_store and related array operations could be
  22 removed from FIR since array assignment analysis could directly happen on the
  23 higher-level FIR representation.
  24
  25
  26 The main principles of the new lowering design are:
  27 -   Make expression lowering context independent and rather naive
  28 -   Do not materialize temporaries while lowering to FIR
  29 -   Preserve Fortran semantics/information for high-level optimizations
  30
  31 The core impact on lowering will be:
  32 -   Lowering expressions and assignments in the exact same way, regardless of
  33     whether it is an array assignment context and/or an expression inside a
  34     forall.
  35 -   Lowering transformational intrinsics in a verbatim way (no runtime calls and
  36     memory aspects yet).
  37 -   Lowering character expressions in a verbatim way (no memcpy/runtime calls
  38     and memory aspects yet).
  39 -   Argument association side effects will be delayed (copy-in/copy-out) to help
  40     inlining/function specialization to get rid of them when they are not
  41     relevant.
  42
  43
  44 ## Variable and Expression value concepts in HLFIR
  45
  46 ## Strengthening the variable concept
  47
  48 Fortran variables are currently represented in FIR as mlir::Value with reference
  49 or box type coming from special operations or block arguments. They are either
  50 the result of a fir.alloca, fir.allocmem, or fir.address_of operations with the
  51 mangled name of the variable as attribute, or they are function block arguments
  52 with the mangled name of the variable as attribute.
  53
  54 Fortran variables are defined with a Fortran type (both dynamic and static) that
  55 may have type parameters, a rank and shape (including lower bounds), and some
  56 attributes (like TARGET, OPTIONAL, VOLATILE...). All this information is
  57 currently not represented in FIR. Instead, lowering keeps track of all this
  58 information in the fir::ExtendedValue lowering data structure and uses it when
  59 needed. If unused in lowering, some information about variables is lost (like
  60 non-constant array bound expressions). In the IR, only the static type, the
  61 compile time constant extents, and compile time character lengths can be
  62 retrieved from the mlir::Value of a variable in the general case (more can be
  63 retrieved if the variable is tracked via a fir.box, but not if it is a bare
  64 memory reference).
  65
  66 This makes reasoning about Fortran variables in FIR harder, and in general
  67 forces lowering to apply all decisions related to the information that is lost
  68 in FIR. A more problematic point is that it does not allow generating debug
  69 information for the variables from FIR, since the bounds and type parameters
  70 information is not tightly linked to the base mlir::Value.
  71
  72 The proposal is to add a hlfir.declare operation that would anchor the
  73 fir::ExtendedValue information in the IR. A variable will be represented by a
  74 single SSA value with a memory type (fir.ref<T>, fir.ptr<T>, fir.heap<T>,
  75 fir.box<T>, fir.boxchar or fir.ref<fir.box<T>>). Not all memory types will be
  76 allowed for a variable: it should allow retrieving all the shape, type
  77 parameters, and dynamic type information without requiring to look-up for any
  78 defining operations. For instance, `fir.ref<fir.array<?xf32>>` will not be
  79 allowed as an HLFIR variable, and fir.box<fir.array<?xf32>> will be used
  80 instead. However, `fir.ref<fir.array<100xf32>>` will be allowed to represent an
  81 array whose lower bounds are all ones (if the lower bounds are different than
  82 one, a fir.box will still be needed).  The hlfir.declare operation will be
  83 responsible for producing the SSA value with the right memory type given the
  84 variable specifications. One notable point is that, except for the POINTER and
  85 ALLOCATABLE attributes that are retrievable from the SSA value type, other
  86 Fortran attributes (OPTIONAL, TARGET, VOLATILE...) will not be retrievable from
  87 the SSA value alone, and it will be required to access the defining
  88 hlfir.declare to get the full picture.
  89
  90 This means that semantically relevant attributes will need to be set by
  91 lowering on operations using variables when that is relevant (for instance when
  92 using an OPTIONAL variable in an intrinsic where it is allowed to be absent).
  93 Then, the optimizations passes will be allowed to look for the defining
  94 hlfir.declare, but not to assume that it must be visible.  For instance, simple
  95 contiguity of fir.box can be deduced in certain case from the hlfir.declare,
  96 and if the hlfir.declare cannot be found, transformation passes will have to
  97 assume that the variable may be non-contiguous.
  98
  99 In practice, it is expected that the passes will be able to leverage
 100 hlfir.declare in most cases, but that guaranteeing that it will always be the
 101 case would constraint the IR and optimizations too much.  The goal is also to
 102 remove the fir.box usages when possible while lowering to FIR, to avoid
 103 needlessly creating runtime descriptors for variables that do not really
 104 require it.
 105
 106 The hlfir.declare operation and restrained memory types will allow:
 107 - Pushing higher-level Fortran concepts into FIR operations (like array
 108   assignments or transformational intrinsics).
 109 - Generating debug information for the variables based on the hlfir.declare
 110   operation.
 111 - Generic Fortran aliasing analysis (currently implemented only around array
 112   assignments with the fir.array_load concept).
 113
 114 The hlfir.declare will have a sibling fir.declare operation in FIR that will
 115 allow keeping variable information until debug info is generated. The main
 116 difference is that the fir.declare will simply return its first operand,
 117 making its codegen a no-op, while hlfir.declare might change the type of
 118 its first operand to return an HLFIR variable compatible type.
 119 The fir.declare op is the only operation described by this change that will be
 120 added to FIR. The rational for this is that it is intended to survive until
 121 LLVM dialect codegeneration so that debug info generation can use them and
 122 alias information can take advantage of them even on FIR.
 123
 124 Note that Fortran variables are not necessarily named objects, they can also be
 125 the result of function references returning POINTERs. hlfir.declare will also
 126 accept such variables to be described in the IR (a unique name will be built
 127 from the caller scope name and the function name.). In general, fir.declare
 128 will allow to view every memory storage as a variable, and this will be used to
 129 describe and use compiler created array temporaries.
 130
 131 ## Adding an expression value concept in HLFIR
 132
 133 Currently, Fortran expressions can be represented as SSA values for scalar
 134 logical, integer, real, and complex expressions. Scalar character or
 135 derived-type expressions and all array expressions are buffered in lowering:
 136 their results are directly given a memory storage in lowering and are
 137 manipulated as variables.
 138
 139 While this keeps FIR simple, this makes the amount of IR generated for these
 140 expressions higher, and in general makes later optimization passes job harder
 141 since they present non-trivial patterns (with memory operations) and cannot be
 142 eliminated by naive dead code elimination when the result is unused. This also
 143 forces lowering to combine elemental array expressions into single loop nests to
 144 avoid bufferizing all array sub-expressions (which would yield terrible
 145 performance). These combinations, which are implemented using C++ lambdas in
 146 lowering makes lowering code harder to understand. It also makes the expression
 147 lowering code context dependent (especially designators lowering). The lowering
 148 code paths may be different when lowering a syntactically similar expression in
 149 an elemental expression context, in a forall context, or in a normal context.
 150
 151 Some of the combinations described in [Array Composition](ArrayComposition.md)
 152 are currently not implemented in lowering because they are less trivial
 153 optimizations, and do not really belong in lowering. However, deploying such
 154 combinations on the generated FIR with bufferizations requires the usage of
 155 non-trivial pattern matching and rewrites (recognizing temporary allocation,
 156 usage, and related runtime calls). Note that the goal of such combination is not
 157 only about inlining transformational runtime calls, it is mainly about never
 158 generating a temporary for an array expression sub-operand that is a
 159 transformational intrinsic call matching certain criteria. So the optimization
 160 pass will not only need to recognize the intrinsic call, it must understand the
 161 context it is being called in.
 162
 163 The usage of memory manipulations also makes some of the alias analysis more
 164 complex, especially when dealing with foralls (the alias analysis cannot simply
 165 follow an operand tree, it must understand indirect dependencies from operations
 166 stored in memory).
 167
 168 The proposal is to add a !hlfir.expr<T> SSA value type concept, and set of
 169 character operations (concatenation, TRIM, MAX, MIN, comparisons...), a set of
 170 array transformational operations (SUM, MATMUL, TRANSPOSE, ...), and a generic
 171 hlfir.elemental operation. The hlfir.expr<T> type is not intended to be used
 172 with scalar types that already have SSA value types (e.g., integer or real
 173 scalars).  Instead, these existing SSA types will implicitly be considered as
 174 being expressions when used in high-level FIR operations, which will simplify
 175 interfacing with other dialects that define operations with these types (e.g.,
 176 the arith dialect).
 177
 178 These hlfir.expr values could then be placed in memory when needed (assigned to
 179 a variable, passed as a procedure argument, or an IO output item...) via
 180 hlfir.assign or hlfir.associate operations that will later be described.
 181
 182 When no special optimization pass is run, a translation pass would lower the
 183 operations producing hlfir.expr to buffer allocations and memory operations just
 184 as in the currently generated FIR.
 185
 186 However, these high-level operations should allow the writing of optimization
 187 passes combining chains of operations producing hlfir.expr into optimized forms
 188 via pattern matching on the operand tree.
 189
 190 The hlfir.elemental operation will be discussed in more detail below. It allows
 191 simplifying lowering while keeping the ability to combine elemental
 192 sub-expressions into a single loop nest. It should also allow rewriting some of
 193 the transformational intrinsic operations to functions of the indices as
 194 described in [Array Composition](ArrayComposition.md).
 195
 196 ## Proposed design for HLFIR (High-Level Fortran IR)
 197
 198 ### HLFIR Operations and Types
 199
 200 #### Introduce a hlfir.expr<T> type
 201
 202 Motivation: avoid the need to materialize expressions in temporaries while
 203 lowering.
 204
 205 Syntax: ``` !hlfir.expr<[extent x]* T [, class]> ```
 206
 207 - `[extent x]*` represents the shape for arrays similarly to !fir.array<> type,
 208   except that the shape cannot be assumed rank (!hlfir.expr<..xT> is invalid).
 209   This restriction can be added because it is impossible to create an assumed
 210   rank expression in Fortran that is not a variable.
 211 - `T` is the element type of the static type
 212 - `class` flag can be set to denote that this a polymorphic expression (that the
 213   dynamic type should not be assumed to be the static type).
 214
 215
 216 examples: !hlfir.expr<fir.char<?>>, !hlfir.expr<10xi32>,
 217 !hlfir.expr<?x10x?xfir.complex<4>>
 218
 219 T in scalar hlfir.expr<T> can be:
 220 -   A character type (fir.char<10, kind>, fir.char<?, kind>)
 221 -   A derived type: (fir.type<t{...}>)
 222
 223 T in an array hlfir.expr< e1 x ex2 ..  : T> can be:
 224 -   A character or derived type
 225 -   A logical type (fir.logical<kind>)
 226 -   An integer type (i1, i32, ….)
 227 -   A floating point type (f32, f16…)
 228 -   A complex type (fir.complex<4> or mlir::complex<f32>...)
 229
 230 Some expressions may be polymorphic (for instance, MERGE can be used on
 231 polymorphic entities). The hlfir.expr type has an optional "class" flag to
 232 denote this: hlfir.expr<T, class>.
 233
 234 Note that the ALLOCATABLE, POINTER, TARGET, VOLATILE, ASYNCHRONOUS, OPTIONAL
 235 aspects do not apply to expressions, they apply to variables.
 236
 237 It is possible to query the following about an expression:
 238 -   What is the extent : via hlfir.get_extent %expr, dim
 239 -   What are the length parameters: via hlfir.get_typeparam %expr [, param_name]
 240 -   What is the dynamic type: via hlfir.get_dynamic_type %expr
 241
 242 It is possible to get the value of an array expression element:
 243 - %element = hlfir.apply %expr, %i, %j : (!hlfir.expr<T>, index index) ->
 244   hlfir.expr<ScalarType> | AnyConstantSizeScalarType
 245
 246 It is not directly possible to take an address for the expression, but an
 247 expression value can be associated to a new variable whose address can be used
 248 (required when passing the expression in a user call, or to concepts that are
 249 kept low level in FIR, like IO runtime calls).  The variable created may be a
 250 compiler created temporary, or may relate to a Fortran source variable if this
 251 mechanism is used to implement ASSOCIATE.
 252
 253 -   %var = hlfir.associate %expr [attributes about the association]->
 254     AnyMemoryOrBoxType
 255 -   hlfir.end_association %var
 256
 257 The intention is that the hlfir.expr<T> is the result of an operation, and
 258 should most often not be a block argument. This is because the hlfir.expr is
 259 mostly intended to allow combining chains of operations into more optimal
 260 forms. But it is possible to represent any expression result via a Fortran
 261 runtime descriptor (fir.box<T>), implying that if a hlfir.expr<T> is passed as
 262 a block argument, the expression bufferization pass will evaluate the operation
 263 producing the expression in a temporary, and transform the block operand into a
 264 fir.box describing the temporary. Clean-up for the temporary will be inserted
 265 after the last use of the hlfir.expr. Note that, at least at first, lowering
 266 may help FIR to find the last use of a hlfir.expr by explicitly inserting a
 267 hlfir.finalize %expr operation that may turn into a no-op if the expression is
 268 not later materialized in memory.
 269
 270 It is nonetheless not intended that such abstract types be used as block
 271 arguments to avoid introducing allocations and descriptor manipulations.
 272
 273 #### hlfir.declare operation
 274
 275 Motivation: represent variables, linking together a memory storage, shape,
 276 length parameters, attributes and the variable name.
 277
 278 Syntax:
 279 ```
 280 %var = hlfir.declare %base [shape %extent1, %extent2, ...] [lbs %lb1, %lb2, ...] [typeparams %l1, ...] {fir.def = mangled_variable_name, attributes} : [(....) ->] T1, T2
 281 ```
 282
 283 %var#0 will have a FIR memory type that is allowed for HLFIR variables. %var#1
 284 will have the same type as %base, it is intended to be used when lowering HLFIR
 285 to FIR in order to avoid creating unnecessary fir.box (that would become
 286 runtime descriptors). When an HLFIR operation has access to the defining
 287 hlfir.declare of its variable operands, the operation codegen will be allowed
 288 to replace the %var#0 reference by the simpler %var#1 reference.
 289
 290 - Extents should only be provided if %base is not a fir.box and the entity is an
 291   array.
 292 - lower bounds should only be provided if the entity is an array and the lower
 293   bounds are not default (all ones). It should also not be provided for POINTERs
 294   and ALLOCATABLES since the lower bounds may change.
 295 - type parameters should be provided for entities with length parameters, unless
 296   the entity is a CHARACTER where the length is constant in %base type.
 297 - The attributes will include the Fortran attributes: TARGET (fir.target),
 298   POINTER (fir.ptr), ALLOCATABLE (fir.alloc), CONTIGUOUS (fir.contiguous),
 299   OPTIONAL (fir.optional), VOLATILE (fir.volatile), ASYNCHRONOUS (fir.async).
 300   They will also indicate when an entity is part of an equivalence by giving the
 301   equivalence name (fir.equiv = mangled_equivalence_name).
 302
 303 hlfir.declare will be used for all Fortran variables, except the ones created via
 304 the ASSOCIATE construct that will use hlfir.associate described below.
 305
 306 hlfir.declare will also be used when creating compiler created temporaries, in
 307 which case the fir.tmp attribute will be given.
 308
 309 Examples:
 310
 311 | FORTRAN                                   | HLFIR                                                                                                                                                                                                                    |
 312 | ----------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
 313 | REAL :: X                                 | %mem = fir.alloca f32 <br> %x = hlfir.declare %mem {fir.def = "\_QPfooEx"} : fir.ref<f32>, fir.ref<f32>                                                                                                                  |
 314 | REAL, TARGET :: X(10)                     | %mem = fir.alloca f32 <br> %nval = fir.load %n <br> %x = hlfir.declare %mem {fir.def = "\_QPfooEx", fir.target} : fir.ref<fir.array<10xf32>>, fir.ref<fir.array<10xf32>>                                                 |
 315 | REAL :: X(N)                              | %mem = // … alloc or dummy argument <br> %nval = fir.load %n : i64 <br> %x = hlfir.declare %mem shape %nval {fir.def = "\_QPfooEx"} : (i64) -> fir.box<fir.array<?xf32>>, fir.ref<fir.array<?xf32>>                      |
 316 | REAL :: X(0:)                             | %mem = // … dummy argument <br> %c0 = arith.constant 0 : index <br> %x = hlfir.declare %mem lbs %c0 {fir.def = "\_QPfooEx"} : (index) -> fir.box<fir.array<?xf32>>, fir.box<fir.array<?xf32>>                            |
 317 | <br>REAL, POINTER :: X(:)                 | %mem = // … dummy argument, or local, or global <br> %x = hlfir.declare %mem {fir.def = "\_QPfooEx", fir.ptr} :  fir.ref<fir.box<fir.ptr<fir.array<?xf32>>>>, fir.ref<fir.box<fir.ptr<fir.array<?xf32>>>>                |
 318 | REAL, ALLOCATABLE :: X(:)                 | %mem = // … dummy argument, or local, or global <br> %x = hlfir.declare %mem {fir.def = "\_QPfooEx", fir.alloc} :  fir.ref<fir.box<fir.heap<fir.array<?xf32>>>>, fir.ref<fir.box<fir.heap<fir.array<?xf32>>>>            |
 319 | CHARACTER(10) :: C                        | %mem = //  … dummy argument, or local, or global <br> %c = hlfir.declare %mem lbs %c0 {fir.def = "\_QPfooEc"} :  fir.ref<fir.char<10>>, fir.ref<fir.char<10>>                                                            |
 320 | CHARACTER(\*) :: C                        | %unbox = fir.unbox %bochar (fir.boxchar<1>) -> (fir.ref<fir.char<?>>, index) <br> %c = hlfir.declare %unbox#0 typeparams %unbox#1 {fir.def = "\_QPfooEc"} : (index) ->  fir.boxchar<1>, fir.ref<fir.char<?>>             |
 321 | CHARACTER(\*), OPTIONAL, ALLOCATABLE :: C | %mem = // … dummy argument <br> %c = hlfir.declare %mem {fir.def = "\_QPfooEc", fir.alloc, fir.optional, fir.assumed\_len\_alloc} :  fir.ref<fir.box<fir.heap<fir.char<?>>>>, fir.ref<fir.box<fir.heap<fir.char<?>>>>    |
 322 | TYPE(T) :: X                              | %mem = //  … dummy argument, or local, or global <br> %x = hlfir.declare %mem {fir.def = "\_QPfooEx"} : fir.ref<fir.type<t{...}>>, fir.ref<fir.type<t{...}>>                                                             |
 323 | TYPE(T(L)) :: X                           | %mem = //  … dummy argument, or local, or global <br> %lval = fir.load %l <br> %x = hlfir.declare %mem typeparams %lval {fir.def = "\_QPfooEx"} : fir.box<fir.type<t{...}>>, fir.box<fir.type<t{...}>>                   |
 324 | CLASS(\*), POINTER :: X                   | %mem = //  … dummy argument, or local, or global <br> %x = hlfir.declare %mem {fir.def = "\_QPfooEx", fir.ptr} : fir.class<fir.ptr<None>>  fir.class<fir.ptr<None>>                                                      |
 325 | REAL :: X(..)                             | %mem = //  … dummy argument <br> %x = hlfir.declare %mem {fir.def = "\_QPfooEx"} : fir.box<fir.array<..xf32>>, fir.box<fir.array<..xf32>>                                                                                |
 326
 327 #### fir.declare operation
 328
 329 Motivation: keep variable information available in FIR, at least with
 330 the intent to be able to produce debug information.
 331
 332 Syntax:
 333 ```
 334 %var = fir.declare %base [shape %extent1, %extent2, ...] [lbs %lb1, %lb2, ...] [typeparams %l1, ...] {fir.def = mangled_variable_name, attributes} : [(....) ->] T
 335 ```
 336
 337 %var will have the same type as %base. When no debug info is generated, the
 338 operation can be replaced by %base when lowering to LLVM. Otherwise, the
 339 operation is similar to hlfir.declare and will be produced from it.
 340
 341 #### hlfir.associate operation
 342
 343 Motivation: represent Fortran associations (both from variables and expressions)
 344 and allow keeping actual/dummy argument association information after inlining.
 345
 346 Syntax:
 347 ```
 348 %var = hlfir.associate %expr_or_var {fir.def = mangled_uniq_name, attributes} (AnyExprOrVarType) -> AnyVarType
 349 ```
 350
 351 hlfir.associate is used to represent the following associations:
 352 - Dummy/Actual association on the caller side (the callee side uses
 353   hlfir.declare).
 354 - Host association in block constructs when VOLATILE/ASYNC attributes are added
 355   locally
 356 - ASSOCIATE construct (both from variable and expressions).
 357
 358 When the operand is a variable, hlfir.associate allows changing the attributes
 359 of the variable locally, and to encode certain side-effects (like
 360 copy-in/copy-out when going from a non-contiguous variable to a contiguous
 361 variable, with the help of the related hlfir.end_association operation).
 362
 363 When the operand is an expression, hlfir.associate allows associating a storage
 364 location to an expression value.
 365
 366 A hlfir.associate must be followed by a related hlfir.end_association that will
 367 allow inserting any necessary finalization or copy-out later.
 368
 369 #### hlfir.end_association operation
 370
 371 Motivation: mark the place where some association should end and some side
 372 effects might need to occur.
 373
 374 The hlfir.end_associate is a placeholder to later insert
 375 deallocation/finalization if the variable was associated with an expression,
 376 and to insert copy-out/deallocation if the variable was associated with another
 377 variable with a copy-in.
 378
 379 Syntax:
 380 ```
 381 hlfir.end_association %var [%original_variable] {attributes}
 382 ```
 383
 384
 385 The attributes can be:
 386 -   copy_out (copy out the associated variable back into the original variable
 387     if a copy-in occurred)
 388 -   finalize_copy_in (deallocate the temporary storage for the associated
 389     variable if a copy-in occurred but the associated variable was not modified
 390     (e.g., it is intent(in))).
 391 -   finalize: indicate that a finalizer should be run on the entity associated
 392     with the variable (There is currently no way to deduce this only from the
 393     variable type in FIR). It will give the finalizer mangled name so that it
 394     can be later called.
 395
 396 If the copy_out or finalize_copy_in attribute is set, “original_variable” (the
 397 argument of the hlfir.associate that produced %var) must be provided. The
 398 rationale is that the original variable address is needed to verify if a
 399 temporary was created, and if needed, to copy the data back to it.
 400
 401 #### hlfir.finalize
 402
 403 Motivation: mark end of life of local variables
 404
 405 Mark the place where a local variable will go out of scope. The main goal is to
 406 retain this information even after local variables are inlined.
 407
 408 Syntax:
 409 ```
 410 hlfir.finalize %var {attributes}
 411 ```
 412
 413 The attributes can be:
 414 -   finalize: indicate that a finalizer should be run on the entity associated
 415     with the variable (There is currently no way to deduce this only from the
 416     variable type in FIR).
 417
 418 Note that finalization will not free the local variable storage if it was
 419 allocated on the heap. If lowering created the storage passed to hlfir.declare
 420 via a fir.allocmem, lowering should insert a fir.freemem after the
 421 hlfir.finalize.  This could help making fir.allocmem to fir.alloca promotion
 422 simpler, and also because finalization may be run without the intent to
 423 deallocate the variable storage (like on INTENT(OUT) dummies).
 424
 425
 426 #### hlfir.designate
 427
 428 Motivation: Represent designators at a high-level and allow representing some
 429 information about derived type components that would otherwise be lost, like
 430 component lower bounds.
 431
 432 Represent Fortran designators in a verbatim way: both triplet, and component
 433 parts.
 434
 435 Syntax:
 436 ```
 437 %var = hlfir.designate %base [“component”,] [(%i, %k:l%:%m)] [substr ub, lb] [imag|real] [shape extent1, extent2, ....] [lbs lb1, lb2, .....] [typeparams %l1, ...] {attributes}
 438 ```
 439
 440 hlfir.designate is intended to encode a single part-ref (as defined by the
 441 fortran standard). That means that a(:)%x(i, j, k) must be split into two
 442 hlfir.designate: one for a(:), and one for x(i, j, k).  If the base is ranked,
 443 and the component is an array, the subscripts are mandatory and must not
 444 contain triplets. This ensures that the result of a fir.designator cannot be a
 445 "super-array".
 446
 447 The subscripts passed to hlfir.designate must be based on the base lower bounds
 448 (one by default).
 449
 450 A substring is built by providing the lower and upper character indices after
 451 `substr`. Implicit substring bounds must be made explicit by lowering.  It is
 452 not possible to provide substr if a component is already provided. Instead the
 453 related Fortran designator must be split into two fir.designator. This is
 454 because the component character length will be needed to compute the right
 455 stride, and it might be lost if not placed on the first designator typeparams.
 456
 457 Real and Imaginary complex parts are represented by an optional imag or real
 458 tag. It can be added even if there is already a component.
 459
 460 The shape, lower bound, and type parameter operands represent the output entity
 461 properties. The point of having those made explicit is to allow early folding
 462 and hoisting of array section shape and length parameters (which especially in
 463 FORALL contexts, can simplify later assignment temporary insertion a lot). Also,
 464 if lower bounds of a derived type component array could not be added here, they
 465 would be lost since they are not represented by other means in FIR (the fir.type
 466 does not include this information).
 467
 468 hlfir.designate is not intended to describe vector subscripted variables.
 469 Instead, lowering will have to introduce loops to do element by element
 470 addressing. See the Examples section. This helps keeping hlfir.designate simple,
 471 and since the contexts where a vector subscripted entity is considered to be a
 472 variable (in the sense that it can be modified) are very limited, it seems
 473 reasonable to have lowering deal with this aspect. For instance, a vector
 474 subscripted entity cannot be passed as a variable, it cannot be a pointer
 475 assignment target, and when it appears as an associated entity in an ASSOCIATE,
 476 the related variable cannot be modified.
 477
 478 #### hlfir.assign
 479
 480 Motivation: represent assignment at a high-level (mainly a change for array and
 481 character assignment) so that optimization pass can clearly reason about it
 482 (value propagation, inserting temporary for right-hand side evaluation only when
 483 needed), and that lowering does not have to implement it all.
 484
 485 Syntax:
 486 ```
 487 hlfir.assign %expr_or_var to %var [attributes]
 488 ```
 489
 490 The attributes can be:
 491
 492 -   realloc: mark that assignment has F2003 semantics and that the left-hand
 493     side may have to be deallocated/reallocated…
 494 -   use_assign=@function: mark a user defined assignment
 495 -   no_overlap: mark that an assignment does not need a temporary (added by an
 496     analysis pass).
 497 -   unordered : mark that an assignment can happen in any element order (not
 498     true if there is an impure elemental function being called).
 499
 500 This will replace the current array_load/array_access/array_merge semantics.
 501 Instead, a more generic alias analysis will be performed on the LHS and RHS to
 502 detect aliasing, and a temporary inserted if needed. The alias analysis will
 503 look at all the memory references in the RHS operand tree and base overlap
 504 decisions on the related variable declaration operations. This same analysis
 505 should later allow moving/merging some expression evaluation between different
 506 statements.
 507
 508 Note about user defined assignments: semantics is resolving them and building
 509 the related subroutine call. So a fir.call could directly be made in lowering if
 510 the right hand side was always evaluated in a temporary. The motivation to use
 511 hlfir.assign is to help the temporary removal, and also to deal with two edge
 512 cases: user assignment in a FORALL (the forall pass will need to understand that
 513 this an assignment), and allocatable assignment mixed with user assignment
 514 (implementing this as a call in lowering would require lowering the whole
 515 reallocation logic in lowering already, duplicating the fact that hlfir.assign
 516 should deal with it).
 517
 518 #### hlfir.ptr_assign
 519
 520 Motivation: represent pointer assignment without lowering the exact pointer
 521 implementation (descriptor address, fir.ref<fir.box> or simple pointer scalar
 522 fir.llvm_ptr<fir.ptr>).
 523
 524 Syntax:
 525 ```
 526 hlfir.ptr_assign %var [[reshape %reshape] | [lbounds %lb1, …., %lbn]] to %ptr
 527 ```
 528
 529 It is important to keep pointer assignment at a high-level so that they can
 530 later correctly be processed in hlfir.forall.
 531
 532 #### hlfir.allocate
 533
 534 Motivation: keep POINTER and ALLOCATABLE allocation explicit in HLFIR, while
 535 allowing later lowering to either inlined fir.allocmem or Fortran runtime
 536 calls. Generating runtime calls allow the runtime to do Fortran specific
 537 bookkeeping or flagging and to provide better runtime error reports.
 538
 539 The main difference with the ALLOCATE statement is that one distinct
 540 hlfir.allocate has to be created for each element of the allocation-list.
 541 Otherwise, it is a naive lowering of the ALLOCATE statement.
 542
 543 Syntax:
 544 ```
 545 %stat = hlfir.allocate %var [%shape] [%type_params] [[src=%source] | [mold=%mold]] [errmsg =%errmsg]
 546 ```
 547
 548 #### hlfir.deallocate
 549
 550 Motivation: keep deallocation explicit in HLFIR, while allowing later lowering
 551 to Fortran runtime calls to allow the runtime to do Fortran specific
 552 bookkeeping or flagging of allocations.
 553
 554 Similarly to hlfir.allocate, one operation must be created for each
 555 allocate-object-list object.
 556
 557 Syntax:
 558 ```
 559 %stat = hlfir.deallocate %var [errmsg=err].
 560 ```
 561
 562 ####  hlfir.elemental
 563
 564 Motivation: represent elemental operations without defining array level
 565 operations for each of them, and allow the representation of array expressions
 566 as function of the indices.
 567
 568 The hlfir.elemental operation can be seen as a closure: it is defining a
 569 function of the indices that returns the value of the element of the
 570 represented array expression at the given indices. This an operation with an
 571 MLIR region. It allows detailing how an elemental expression is implemented at
 572 the element level, without yet requiring materializing the operands and result
 573 in memory.  The hlfir.expr<T> elements value can be obtained using hlfir.apply.
 574
 575 The element result is built with a fir.result op, whose result type can be a
 576 scalar hlfir.expr<T> or any scalar constant size types (e.g. i32, or f32).
 577
 578 Syntax:
 579 ```
 580 %op = hlfir.elemental (%indices) %shape [%type_params] [%dynamic_type] {
 581   ….
 582   fir.result %result_element
 583 }
 584 ```
 585
 586
 587 Note that %indices are not operands, they are the elemental region block
 588 arguments, representing the array iteration space in a one based fashion.
 589 The choice of using one based indicies is to match Fortran default for
 590 array variables, so that there is no need to generate bound adjustments
 591 when working with one based array variables in an expression.
 592
 593 Illustration: “A + B” represented with a hlfir.elemental.
 594
 595 ```
 596 %add = hlfir.elemental (%i:index, %j:index) shape %shape (!fir.shape<2>) -> !hlfir.expr<?x?xf32> {
 597   %belt = hlfir.designate %b, %i, %j : (!fir.box<!fir.array<?x?xf32>>, index, index) -> !fir.ref<f32>
 598   %celt = hlfir.designate %c, %i, %j : (!fir.box<!fir.array<?x?xf32>>, index, index) -> !fir.ref<f32>
 599   %bval = fir.load %belt : (!fir.ref<f32>) -> f32
 600   %cval = fir.load %celt : (!fir.ref<f32>) -> f32
 601   %add = arith.addf %bval, %cval : f32
 602   fir.result %res : f32
 603 }
 604 ```
 605
 606 In contexts where it can be proved that the array operands were not modified
 607 between the hlfir.elemental and the hlfir.apply, the region of the
 608 hlfir.elemental can be inlined at the hlfir.apply. Otherwise, if there is no
 609 such guarantee, or if the hlfir.elemental is not “visible” (because its result
 610 is passed as a block argument), the hlfir.elemental will be lowered to an array
 611 temporary. This will be done as a HLFIR to HLFIR optimization pass. Note that
 612 MLIR inlining could be used if hlfir.elemental implemented the
 613 CallableInterface and hlfir.apply the CallInterface.  But MLIR generic inlining
 614 is probably too generic for this case: no recursion is possible here, the call
 615 graphs are trivial, and using MLIR inlining here could introduce later
 616 conflicts or make normal function inlining more complex because FIR inlining
 617 hooks would already be used.
 618
 619 hlfir.elemental allows delaying elemental array expression buffering and
 620 combination. Its generic aspect has two advantages:
 621 - It avoids defining one operation per elemental operation or intrinsic,
 622   instead, the related arith dialect operations can be used directly in the
 623   elemental regions. This avoids growing HLFIR and having to maintain about a
 624   hundred operations.
 625 - It allows representing transformational intrinsics as functions of the indices
 626   while doing optimization as described in
 627   [Array Composition](ArrayComposition.md). This because the indices can be
 628   transformed inside the region before being applied to array variables
 629   according to any kind of transformation (semi-affine or not).
 630
 631
 632 #### Introducing the hlfir.apply operation
 633
 634 Motivation: provide a way to get the element of an array expression
 635 (hlfir.expr<?x…xT>)
 636
 637 This is the addressing equivalent for expressions. A notable difference is that
 638 it can only take simple scalar indices (no triplets) because it is not clear
 639 why supporting triplets would be needed, and keeping the indexing simple makes
 640 inlining of hlfir.elemental much easier.
 641
 642 If hlfir.elemental inlining is not performed, or if the hlfir.expr<T> array
 643 expression is produced by another operation (like fir.intrinsic) that is not
 644 rewritten, hlfir.apply will be lowered to an actual addressing operation that
 645 will address the temporary that was created for the hlfir.expr<T> value that
 646 was materialized in memory.
 647
 648 hlfir.apply indices will be one based to make further lowering simpler.
 649
 650 Syntax:
 651 ```
 652 %element = hlfir.apply %array_expr %i, %j: (hlfir.expr<?x?xi32>) -> i32
 653 ```
 654
 655
 656 #### Introducing operations for transformational intrinsic functions
 657
 658 Motivation: Represent transformational intrinsics functions at a high-level so
 659 that they can be manipulated easily by the optimizer, and do not require
 660 materializing the result as a temporary in lowering.
 661
 662 An operation will be added for each Fortran transformational functions (SUM,
 663 MATMUL, TRANSPOSE....). It translates the Fortran expression verbatim: it takes
 664 the same number of arguments as the Fortran intrinsics and returns a
 665 hlfir.expr<T>. The arguments may be hlfir.expr<T>, simple scalar types (e.g.,
 666 i32, f32), or variables.
 667
 668 The exception being that the arguments that are statically absent would be
 669 passed to it (passing results of fir.absent operation), so that the arguments
 670 can be identified via their positions.
 671
 672 This operation is meant for the transformational intrinsics, not the elemental
 673 intrinsics, that will be implemented using hlfir.elemental + mlir math dialect
 674 operations, nor the intrinsic subroutines (like random_seed or system_clock),
 675 that will be directly lowered in lowering.
 676
 677 Syntax:
 678 ```
 679 %res = hlfir."intrinsic_name" %expr_or_var, ...
 680 ```
 681
 682 These operations will all inherit a same operation base in tablegen to make
 683 their definition and identification easy.
 684
 685 Without any optimization, codegen would then translate the operations to
 686 exactly the same FIR as currently generated by IntrinsicCall.cpp (runtime calls
 687 or inlined code with temporary allocation for array results). The fact that
 688 they are the verbatim Fortran translations should allow to move the lowering
 689 code to a translation pass without massive changes.
 690
 691 An operation will at least be created for each of the following transformational
 692 intrinsics: all, any, count, cshift, dot_product, eoshift, findloc, iall, iany,
 693 iparity, matmul, maxloc, maxval, minloc, minval, norm2, pack, parity, product,
 694 reduce, repeat, reshape, spread, sum, transfer, transpose, trim, unpack.
 695
 696 For the following transformational intrinsics, the current lowering to runtime
 697 call will probably be used since there is little point to keep them high level:
 698 - command_argument_count, get_team, null, num_images, team_number, this_image
 699   that are more program related (and cannot appear for instance in constant
 700   expressions)
 701 - selected_char_kind, selected_int_kind, selected_real_kind that returns scalar
 702   integers
 703
 704 #### Introducing operations for character operations and elemental intrinsic functions
 705
 706
 707 Motivation: represent character operations without requiring the operand and
 708 results to be materialized in memory.
 709
 710 fir.char_op is intended to represent:
 711 -  Character concatenation (//)
 712 -  Character MIN/MAX
 713 -  Character MERGE
 714 -  “SET_LENGTH”
 715 -  Character conversions
 716 -  REPEAT
 717 -  INDEX
 718 -  CHAR
 719 -  Character comparisons
 720 -  LEN_TRIM
 721
 722 The arguments must be scalars, the elemental aspect should be handled by a
 723 hlfir.elemental operation.
 724
 725 Syntax:
 726 ```
 727 %res = hlfir.“char_op” %expr_or_var
 728 ```
 729
 730 Just like for the transformational intrinsics, if no optimization occurs, these
 731 operations will be lowered to memory operations with temporary results (if the
 732 result is a character), using the same generation code as the one currently used
 733 in lowering.
 734
 735 #### hlfir.array_ctor
 736
 737 Motivation: represent array constructor without creating temporary
 738
 739 Many array constructors have a limited number of elements (less than 10), the
 740 current lowering of array constructor is rather complex because it must deal
 741 with the generic cases.
 742
 743 Having a representation to represent array constructor will allow an easier
 744 lowering of array constructor, and make array ctor a lot easier to manipulate.
 745 For instance, for small array constructors, loops could could be unrolled with
 746 the array ctor elements without ever creating a dynamically allocated array
 747 temporary and loop nest using it.
 748
 749 Syntax:
 750 ```
 751 %array_ctor = hlfir.array_ctor %expr1, %expr2 ….
 752 ```
 753
 754 Note that hlfir.elemental could be used to implement some ac-implied-do,
 755 although this is not yet clarified since ac-implied-do may contain more than
 756 one scalar element (they may contain a list of scalar and array values, which
 757 would render the representation in a hlfir.elemental tricky, but maybe not
 758 impossible using if/then/else and hlfir.elemental nests using the index value).
 759 One big issue though is that hlfir.elemental requires the result shape to be
 760 pre-computed (it is an operand), and with an ac-implied-do containing user
 761 transformational calls returning allocatable or pointer arrays, it is
 762 impossible to pre-evaluate the shape without evaluating all the function calls
 763 entirely (and therefore all the array constructor elements).
 764
 765 #### hlfir.get_extent
 766
 767 Motivation: inquire about the extent of a hlfir.expr, variable, or fir.shape
 768
 769 Syntax:
 770 ```
 771 %extent = hlfir.get_extent %shape_expr_or_var, dim
 772 ```
 773
 774 dim is a constant integer attribute.
 775
 776 This allows inquiring about the extents of expressions whose shape may not be
 777 yet computable without generating detailed, low level operations (e.g, for some
 778 transformational intrinsics), or to avoid going into low level details for
 779 pointer and allocatable variables (where the descriptor needs to be read and
 780 loaded).
 781
 782 #### hlfir.get_typeparam
 783
 784 Motivation: inquire about the type parameters of a hlfir.expr, or variable.
 785
 786 Syntax:
 787 ```
 788 %param = hlfir.get_typeparam %expr_or_var [, param_name]
 789 ```
 790 - param_name is an optional string attribute that must contain the length
 791   parameter name if %expr_or_var is a derived type.
 792
 793 ####  hlfir.get_dynamic_type
 794
 795 Motivation: inquire about the dynamic type of a polymorphic hlfir.expr or
 796 variable.
 797
 798 Syntax:
 799 ```
 800 %dynamic_type = hlfir.get_dynamic_type %expr_or_var
 801 ```
 802
 803 #### hlfir.get_lbound
 804
 805 Motivation: inquire about the lower bounds of variables without digging into
 806 the implementation details of pointers and allocatables.
 807
 808 Syntax:
 809 ```
 810 %lb = hlfir.get_lbound %var, n
 811 ```
 812
 813 Note: n is an integer constant attribute for the (zero based) dimension.
 814
 815 ####  hlfir.shape_meet
 816
 817 Motivation: represent conformity requirement/information between two array
 818 operands so that later optimization can choose the best shape information
 819 source, or insert conformity runtime checks.
 820
 821 Syntax:
 822 ```
 823 %shape = hlfir.shape_meet %shape1, %shape2
 824 ```
 825
 826 Suppose A(n), B(m) are two explicit shape arrays. Currently, when A+B is
 827 lowered, lowering chose which operand shape gives the result shape information,
 828 and it is later not retrievable that both n and m can be used. If lowering
 829 chose n, but m later gets folded thanks to inlining or constant propagation, the
 830 optimization passes have no way to use this constant information to optimize the
 831 result storage allocation or vectorization of A+B.  hlfir.shape_meet intends to
 832 delay this choice until constant propagation or inlining can provide better
 833 information about n and m.
 834
 835 #### hlfir.forall
 836
 837 Motivation: segregate the Forall lowering complexity in its own unit.
 838
 839 Forall is tough to lower because:
 840 -   Lowering it in an optimal way requires analyzing several assignments/mask
 841     expressions.
 842 -   The shape of the temporary needed to store intermediate evaluation values is
 843     not a Fortran array in the general case, and cannot in the general case be
 844     maximized/pre-computed without executing the forall to compute the bounds of
 845     inner forall, and the shape of the assignment operands that may depend on
 846     the bound values.
 847 -   Mask expressions evaluation should be affected by previous assignment
 848     statements, but not by the following ones. Array temporaries may be
 849     required for the masks to cover this.
 850 -   On top of the above points, Forall can contain user assignments, pointer
 851     assignments, and assignment to whole allocatable.
 852
 853
 854 The hlfir.forall syntax would be exactly the one of a fir.do_loop. The
 855 difference would be that hlfir.assign and hlfir.ptr_assign inside hlfir.forall
 856 have specific semantics (the same as in Fortran):
 857 -   Given one hlfir.assign, all the iteration values of the LHS/RHS must be
 858     evaluated before the assignment of any value is done.
 859 -   Given two hlfir.assign, the first hlfir.assign must be fully performed
 860     before any evaluation of the operands of the second assignment is done.
 861 -   Masks (fir.if arguments), if any, should be evaluated before any nested
 862     assignments. Any assignments syntactically before the where mask occurrence
 863     must be performed before the mask evaluation.
 864
 865 Note that forall forbids impure function calls, hence, no calls should modify
 866 any other expression evaluation and can be removed if unused.
 867
 868 The translation of hlfir.forall will happen by:
 869 -   1. Determining if the where masks value may be modified by any assignments
 870     - Yes, pre-compute all masks in a pre-run of the forall loop, creating
 871       a “forall temps” (we may need a FIR concept to help here).
 872     - No, Do nothing (or indicate it is safe to evaluate masks while evaluating
 873       the rest).
 874 -   2. Determining if a hlfir.assign operand expression depends on the
 875        previous hlfir.assign left-hand side base value.
 876     - Yes, split the hlfir.assign into their own nest of hlfir.forall loops.
 877     - No, do nothing (or indicate it is safe to evaluate the assignment while
 878       evaluating previous assignments)
 879 -   3. For each assignments, check if the RHS/LHS operands value may depend
 880      on the LHS base:
 881     - Yes, split the forall loops. Insert a “forall temps” before the loops for
 882       the “smallest” part that may overlap (which may be the whole RHS, or some
 883       RHS sub-part, or some LHS indices). In the first nest, evaluate this
 884       overlapping part into the temp. In the next forall loop nest, modify the
 885       assignment to use the temporary, and add the [no_overlap] flag to indicate
 886       no further temporary is needed. Insert code to finalize the temp after its
 887       usage.
 888
 889 ## New HLFIR Transformation Passes
 890
 891 ### Mandatory Passes (translation towards lower-level representation)
 892
 893 Note that these passes could be implemented as a single MLIR pass, or successive
 894 passes.
 895
 896 -   Forall rewrites (getting rid of hlfir.forall)
 897 -   Array assignment rewrites (getting rid of array hlfir.assign)
 898 -   Bufferization: expression temporary materialization (getting rid of
 899     hlfir.expr, and all the operations that may produce it like transformational
 900     intrinsics and hlfir.elemental, hlfir.apply).
 901 -   Call interface argument association lowering (getting rid of hlfir.associate
 902     and hlfir.end_associate)
 903 -   Lowering high level operations using variables into FIR operations
 904     operating on memory (translating hlfir.designate, scalar hlfir.assign,
 905     hlfir.finalize into fir.array_coor, fir.do_loop, fir.store, fir.load.
 906     fir.embox/fir.rebox operations).
 907
 908 Note that these passes do not have to be the first one run after lowering. It is
 909 intended that CSE, DCE, algebraic simplification, inlining and some other new
 910 high-level optimization passes discused below be run before doing any of these
 911 translations.
 912
 913 After that, the current FIR pipeline could be used to continue lowering towards
 914 LLVM.
 915
 916 ### Optimization Passes
 917
 918 -   Elemental expression inlining (inlining of hlfir.elemental in hlfir.apply)
 919 -   User function Inlining
 920 -   Transformational intrinsic rewrites as hlfir.elemental expressions
 921 -   Assignments propagation
 922 -   Shape/Rank/dynamic type propagation
 923
 924 These high level optimization passes can be run any number of times in any
 925 order.
 926
 927 ## Transition Plan
 928
 929 The new higher-level steps proposed in this document will require significant
 930 refactoring of lowering. Codegen should not be impacted since the current FIR
 931 will remain untouched.
 932
 933 A lot of the code in lowering generating Fortran features (like an intrinsic or
 934 how to do assignments) is based on the fir::ExtendedValue concept. This
 935 currently is a collection of mlir::Value that allows describing a Fortran object
 936 (either a variable or an evaluated expression result). The variable and
 937 expression concepts described above should allow to keep an interface very
 938 similar to the fir::ExtendedValue, but having the fir::ExtendedValue wrap a
 939 single value or mlir::Operation* from which all of the object entity
 940 information can be inferred.
 941
 942 That way, all the helpers currently generating FIR from fir::ExtendedValue could
 943 be kept and used with the new variable and expression concepts with as little
 944 modification as possible.
 945
 946 The proposed plan is to:
 947 - 1. Introduce the new HLFIR operations.
 948 - 2. Refactor fir::ExtendedValue so that it can work with the new variable and
 949      expression concepts (requires part of 1.).
 950 - 3. Introduce the new translation passes, using the fir::ExtendedValue helpers
 951      (requires 1.).
 952 - 3.b Introduce the new optimization passes (requires 1.).
 953 - 4. Introduce the fir.declare and hlfir.finalize usage in lowering (requires 1.
 954      and 2. and part of 3.).
 955
 956 The following steps might have to be done in parallel of the current lowering,
 957 to avoid disturbing the work on performance until the new lowering is complete
 958 and on par.
 959
 960 - 5. Introduce hlfir.designate and hlfir.associate usage in lowering.
 961 - 6. Introduce lowering to hlfir.assign (with RHS that is not a hlfir.expr),
 962      hlfir.ptr_assign.
 963 - 7. Introduce lowering to hlfir.expr and related operations.
 964 - 8. Introduce lowering to hlfir.forall.
 965
 966 At that point, lowering using the high-level FIR should be in place, allowing
 967 extensive testing.
 968 - 9. Debugging correctness.
 969 - 10. Debugging execution performance.
 970
 971 The plan is to do these steps incrementally upstream, but for lowering this will
 972 most likely be safer to do have the new expression lowering implemented in
 973 parallel upstream, and to add an option to use the new lowering rather than to
 974 directly modify the current expression lowering and have it step by step
 975 equivalent functionally and performance wise.
 976
 977 ## Examples
 978
 979 ### Example 1: simple array assignment
 980
 981 ```Fortran
 982 subroutine foo(a, b)
 983   real :: a(:), b(:)
 984   a = b
 985 end subroutine
 986 ```
 987
 988 Lowering output:
 989
 990 ```HLFIR
 991 func.func @_QPfoo(%arg0: !fir.box<!fir.array<?xf32>>, %arg1: !fir.box<!fir.array<?xf32>>) {
 992   %a = hlfir.declare %arg0 {fir.def = "_QPfooEa"} : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
 993   %b = hlfir.declare %arg1 {fir.def = "_QPfooEb"} : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
 994   hlfir.assign %b#0 to %a#0 : !fir.box<!fir.array<?xf32>>
 995   return
 996 }
 997 ```
 998
 999 HLFIR array assignment lowering pass:
1000 -   Query: can %b value depend on %a? No, they are two different argument
1001     associated variables that are neither target nor pointers.
1002 -   Lower to assignment to loop:
1003
1004 ```HFLIR
1005 func.func @_QPfoo(%arg0: !fir.box<!fir.array<?xf32>>, %arg1: !fir.box<!fir.array<?xf32>>) {
1006   %a = hlfir.declare %arg0 {fir.def = "_QPfooEa"} : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
1007   %b = hlfir.declare %arg1 {fir.def = "_QPfooEb"} : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
1008
1009   %ashape = hlfir.shape_of %a#0
1010   %bshape = hlfir.shape_of %b#0
1011   %shape = hlfir.shape_meet %ashape, %bshape
1012   %extent = hlfir.get_extent %shape, 0
1013
1014   %c1 = arith.constant 1 : index
1015
1016   fir.do_loop %i = %c1 to %extent step %c1 unordered {
1017     %belt = hlfir.designate %b#0, %i
1018     %aelt = hlfir.designate %a#0, %i
1019     hlfir.assign %belt to %aelt : fir.ref<f32>, fir.ref<f32>
1020   }
1021   return
1022 }
1023 ```
1024
1025 HLFIR variable operations to memory translation pass:
1026 -   hlfir.designate is rewritten into fir.array_coor operation on the variable
1027     associated memory buffer, and returns the element address
1028 -   For numerical scalar, hlfir.assign is rewritten to fir.store (and fir.load
1029     of the operand if needed), for derived type and characters, memory copy
1030     (and padding for characters) is done.
1031 -   hlfir.shape_of are lowered to fir.box_dims, here, no constant information
1032     was obtained from any of the source shape, so hlfir.shape_meet is a no-op,
1033     selecting the first shape (a conformity runtime check could be inserted
1034     under debug options).
1035 -   hlfir.declare are translated into fir.declare that are no-ops and will allow
1036     generating debug information for LLVM.
1037
1038 This pass would wrap operations defining variables (hlfir.declare/hlfir.designate)
1039 as fir::ExtendedValue, and use all the current helpers operating on it
1040 (e.g.: fir::factory::genScalarAssignment).
1041
1042 ```
1043 func.func @_QPfoo(%arg0: !fir.box<!fir.array<?xf32>>, %arg1:
1044   !fir.box<!fir.array<?xf32>>) {
1045   %a = fir.declare %arg0 {fir.def = "_QPfooEa"} : !fir.box<!fir.array<?xf32>>
1046   %b = fir.declare %arg1 {fir.def = "_QPfooEb"} : !fir.box<!fir.array<?xf32>>
1047   %c1 = arith.constant 1 : index
1048   %dims = fir.box_dims %a, 1
1049   fir.do_loop %i = %c1 to %dims#1 step %c1 unordered {
1050     %belt = fir.array_coor %b, %i : (!fir.box<!fir.array<?xf32>>, index) -> fir.ref<f32>
1051     %aelt = fir.array_coor %a, %i : (!fir.box<!fir.array<?xf32>>, index) -> fir.ref<f32>
1052     %bval = fir.load %belt : f32
1053     fir.store %bval to %aelt : fir.ref<f32>
1054   }
1055   return
1056 }
1057 ```
1058
1059 This reaches the current FIR level (except fir.declare that can be kept until
1060 LLVM codegen and dropped on the floor if there is no debug information
1061 generated).
1062
1063 ### Example 2: array assignment with elemental expression
1064
1065 ```Fortran
1066 subroutine foo(a, b, p, c)
1067   real, target :: a(:)
1068   real :: b(:), c(100)
1069   real, pointer :: p(:)
1070   a = b*p + c
1071 end subroutine
1072 ```
1073
1074 Lowering output:
1075
1076 ```HLFIR
1077 func.func @_QPfoo(%arg0: !fir.box<!fir.array<?xf32>>, %arg1: !fir.box<!fir.array<?xf32>>, %arg2: !fir.box<!fir.ptr<!fir.array<?xf32>>>, %arg3: !fir.ref<!fir.array<100xf32>>) {
1078   %a = hlfir.declare %arg0 {fir.def = "_QPfooEa"} {fir.target} : !fir.box<!fir.array<?xf32>, !fir.box<!fir.array<?xf32>
1079   %b =  hlfir.declare %arg1 {fir.def = "_QPfooEb"} : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
1080   %p = hlfir.declare %arg2 {fir.def = "_QPfooEp", fir.ptr} : !fir.box<!fir.ptr<!fir.array<?xf32>>>, !fir.box<!fir.ptr<!fir.array<?xf32>>>
1081   %c =  hlfir.declare %arg3 {fir.def = "_QPfooEc"} : !fir.ref<!fir.array<100xf32>>, !fir.ref<!fir.array<100xf32>>
1082   %bshape = hlfir.shape_of %b#0
1083   %pshape = hlfir.shape_of %p#0
1084   %shape1 = hlfir.shape_meet %bshape, %pshape
1085   %mul = hlfir.elemental(%i:index) %shape1 {
1086     %belt = hlfir.designate %b#0, %i
1087     %p_lb = hlfir.get_lbound %p#0, 1
1088     %i_zero = arith.subi %i, %c1
1089     %i_p = arith.addi %i_zero,  %p_lb
1090     %pelt = hlfir.designate %p#0, %i_p
1091     %bval = fir.load %belt : f32
1092     %pval = fir.load %pelt : f32
1093     %mulres = arith.mulf %bval, %pval : f32
1094      fir.result %mulres : f32
1095   }
1096   %cshape = hlfir.shape_of %c
1097   %shape2 = hlfir.shape_meet %cshape, %shape1
1098   %add =  hlfir.elemental(%i:index) %shape2 {
1099     %mulval = hlfir.apply %mul, %i : f32
1100     %celt = hlfir.designate %c#0, %i
1101     %cval = fir.load %celt
1102     %add_res = arith.addf %mulval, %cval
1103     fir.result %add_res
1104   }
1105   hlfir.assign %add to %a#0 : hlfir.expr<?xf32>, !fir.box<!fir.array<?xf32>
1106   return
1107 }
1108 ```
1109
1110 Step 1: hlfir.elemental inlining: inline the first hlfir.elemental into the
1111 second one at the hlfir.apply.
1112
1113
1114 ```HLFIR
1115 func.func @_QPfoo(%arg0: !fir.box<!fir.array<?xf32>>, %arg1: !fir.box<!fir.array<?xf32>>, %arg2: !fir.box<!fir.ptr<!fir.array<?xf32>>>, %arg3: !fir.ref<!fir.array<100xf32>>) {
1116   %a = hlfir.declare %arg0 {fir.def = "_QPfooEa"} {fir.target} : !fir.box<!fir.array<?xf32>, !fir.box<!fir.array<?xf32>
1117   %b =  hlfir.declare %arg1 {fir.def = "_QPfooEb"} : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
1118   %p = hlfir.declare %arg2 {fir.def = "_QPfooEp", fir.ptr} : !fir.box<!fir.ptr<!fir.array<?xf32>>>, !fir.box<!fir.ptr<!fir.array<?xf32>>>
1119   %c =  hlfir.declare %arg3 {fir.def = "_QPfooEc"} : !fir.ref<!fir.array<100xf32>>, !fir.ref<!fir.array<100xf32>>
1120   %bshape = hlfir.shape_of %b#0
1121   %pshape = hlfir.shape_of %p#0
1122   %shape1 = hlfir.shape_meet %bshape, %pshape
1123   %cshape = hlfir.shape_of %c
1124   %shape2 = hlfir.shape_meet %cshape, %shape1
1125   %add =  hlfir.elemental(%i:index) %shape2 {
1126     %belt = hlfir.designate %b#0, %i
1127     %p_lb = hlfir.get_lbound %p#0, 1
1128     %i_zero = arith.subi %i, %c1
1129     %i_p = arith.addi %i_zero,  %p_lb
1130     %pelt = hlfir.designate %p#0, %i_p
1131     %bval = fir.load %belt : f32
1132     %pval = fir.load %pelt : f32
1133     %mulval = arith.mulf %bval, %pval : f32
1134     %celt = hlfir.designate %c#0, %i
1135     %cval = fir.load %celt
1136     %add_res = arith.addf %mulval, %cval
1137     fir.result %add_res
1138   }
1139   hlfir.assign %add to %a#0 : hlfir.expr<?xf32>, !fir.box<!fir.array<?xf32>
1140   return
1141 }
1142 ```
1143
1144 Step2: alias analysis around the array assignment:
1145
1146 -   May %add value depend on %a variable?
1147 -   Gather variable and function calls in %add operand tree (visiting
1148     hlfir.elemental regions)
1149 -   Gather references to %b, %p, and %c. %p is a pointer variable according to
1150     its defining operations. It may alias with %a that is a target. -> answer
1151     yes.
1152 -   Insert temporary, and duplicate array assignments, that can be lowered to
1153     loops at that point
1154
1155 Note that the alias analysis could have already occurred without inlining the
1156 %add hlfir.elemental.
1157
1158
1159 ```HLFIR
1160 func.func @_QPfoo(%arg0: !fir.box<!fir.array<?xf32>>, %arg1: !fir.box<!fir.array<?xf32>>, %arg2: !fir.box<!fir.ptr<!fir.array<?xf32>>>, %arg3: !fir.ref<!fir.array<100xf32>>) {
1161   %a = hlfir.declare %arg0 {fir.def = "_QPfooEa"} {fir.target} : !fir.box<!fir.array<?xf32>, !fir.box<!fir.array<?xf32>
1162   %b =  hlfir.declare %arg1 {fir.def = "_QPfooEb"} : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
1163   %p = hlfir.declare %arg2 {fir.def = "_QPfooEp", fir.ptr} : !fir.box<!fir.ptr<!fir.array<?xf32>>>, !fir.box<!fir.ptr<!fir.array<?xf32>>>
1164   %c =  hlfir.declare %arg3 {fir.def = "_QPfooEc"} : !fir.ref<!fir.array<100xf32>>, !fir.ref<!fir.array<100xf32>>
1165   %bshape = hlfir.shape_of %b#0
1166   %pshape = hlfir.shape_of %p#0
1167   %shape1 = hlfir.shape_meet %bshape, %pshape
1168   %cshape = hlfir.shape_of %c
1169   %shape2 = hlfir.shape_meet %cshape, %shape1
1170   %add =  hlfir.elemental(%i:index) %shape2 {
1171     %belt = hlfir.designate %b#0, %i
1172     %p_lb = hlfir.get_lbound %p#0, 1
1173     %i_zero = arith.subi %i, %c1
1174     %i_p = arith.addi %i_zero, %p_lb
1175     %pelt = hlfir.designate %p#0, %i_p
1176     %bval = fir.load %belt : f32
1177     %pval = fir.load %pelt : f32
1178     %mulval = arith.mulf %bval, %pval : f32
1179     %celt = hlfir.designate %c#0, %i
1180     %cval = fir.load %celt
1181     %add_res = arith.addf %mulval, %cval
1182     fir.result %add_res
1183   }
1184   %extent = hlfir.get_extent %shape2, 0: (fir.shape<1>) -> index
1185   %tempstorage = fir.allocmem %extent : fir.heap<fir.array<?xf32>>
1186   %temp = hlfir.declare %tempstorage, shape %extent {fir.def = QPfoo.temp001} : (index) -> fir.box<fir.array<?xf32>>, fir.heap<fir.array<?xf32>>
1187   hlfir.assign %add to %temp#0 no_overlap : hlfir.expr<?xf32>, !fir.box<!fir.array<?xf32>>
1188   hlfir.assign %temp to %a#0 : no_overlap  : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
1189   hlfir.finalize %temp#0
1190   fir.freemem %tempstorage
1191   return
1192 }
1193 ```
1194
1195 Step 4: Lower assignments to regular loops since they have the no_overlap
1196 attribute, and inline the hlfir.elemental into the first loop nest.
1197
1198 ```HLFIR
1199 func.func @_QPfoo(%arg0: !fir.box<!fir.array<?xf32>>, %arg1: !fir.box<!fir.array<?xf32>>, %arg2: !fir.box<!fir.ptr<!fir.array<?xf32>>>, %arg3: !fir.ref<!fir.array<100xf32>>) {
1200   %a = hlfir.declare %arg0 {fir.def = "_QPfooEa"} {fir.target} : !fir.box<!fir.array<?xf32>, !fir.box<!fir.array<?xf32>
1201   %b =  hlfir.declare %arg1 {fir.def = "_QPfooEb"} : !fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>
1202   %p = hlfir.declare %arg2 {fir.def = "_QPfooEp", fir.ptr} : !fir.box<!fir.ptr<!fir.array<?xf32>>>, !fir.box<!fir.ptr<!fir.array<?xf32>>>
1203   %c =  hlfir.declare %arg3 {fir.def = "_QPfooEc"} : !fir.ref<!fir.array<100xf32>>, !fir.ref<!fir.array<100xf32>>
1204   %bshape = hlfir.shape_of %b#0
1205   %pshape = hlfir.shape_of %p#0
1206   %shape1 = hlfir.shape_meet %bshape, %pshape
1207   %cshape = hlfir.shape_of %c
1208   %shape2 = hlfir.shape_meet %cshape, %shape1
1209   %extent = hlfir.get_extent %shape2, 0: (fir.shape<1>) -> index
1210   %tempstorage = fir.allocmem %extent : fir.heap<fir.array<?xf32>>
1211   %temp = hlfir.declare %tempstorage, shape %extent {fir.def = QPfoo.temp001} : (index) -> fir.box<fir.array<?xf32>>, fir.heap<fir.array<?xf32>>
1212   fir.do_loop %i = %c1 to %shape2 step %c1 unordered {
1213     %belt = hlfir.designate %b#0, %i
1214     %p_lb = hlfir.get_lbound %p#0, 1
1215     %i_zero = arith.subi %i, %c1
1216     %i_p = arith.addi %i_zero,  %p_lb
1217     %pelt = hlfir.designate %p#0, %i_p
1218     %bval = fir.load %belt : f32
1219     %pval = fir.load %pelt : f32
1220     %mulval = arith.mulf %bval, %pval : f32
1221     %celt = hlfir.designate %c#0, %i
1222     %cval = fir.load %celt
1223     %add_res = arith.addf %mulval, %cval
1224     %tempelt = hlfir.designate %temp#0, %i
1225     hlfir.assign %add_res to %tempelt : f32, fir.ref<f32>
1226   }
1227   fir.do_loop %i = %c1 to %shape2 step %c1 unordered {
1228     %aelt = hlfir.designate %a#0, %i
1229     %tempelt = hlfir.designate %temp#0, %i
1230     hlfir.assign %add_res to %tempelt : f32, fir.ref<f32>
1231   }
1232   hlfir.finalize %temp#0
1233   fir.freemem %tempstorage
1234   return
1235 }
1236 ```
1237
1238 Step 5 (may also occur earlier or several times): shape propagation.
1239 -   %shape2 can be inferred from %cshape that has constant shape: the
1240     hlfir.shape_meet results can be replaced by it, and if the option is set,
1241     conformance checks can be added for %a, %b and %p.
1242 -   %temp is small, and its fir.allocmem can be promoted to a stack allocation
1243
1244 ```HLFIR
1245 func.func @_QPfoo(%arg0: !fir.box<!fir.array<?xf32>>, %arg1: !fir.box<!fir.array<?xf32>>, %arg2: !fir.box<!fir.ptr<!fir.array<?xf32>>>, %arg3: !fir.ref<!fir.array<100xf32>>) {
1246   // .....
1247   %cshape = fir.shape %c100
1248   %extent = %c100
1249   // updated fir.alloca
1250   %tempstorage = fir.alloca %extent : fir.ref<fir.array<100xf32>>
1251   %temp = hlfir.declare %tempstorage, shape %extent {fir.def = QPfoo.temp001} : (index) -> fir.box<fir.array<?xf32>>, fir.heap<fir.array<?xf32>>
1252   fir.do_loop %i = %c1 to %c100 step %c1 unordered {
1253     // ...
1254   }
1255   fir.do_loop %i = %c1 to %c100 step %c1 unordered {
1256     // ...
1257   }
1258   hlfir.finalize %temp#0
1259   // deleted fir.freemem %tempstorage
1260   return
1261 }
1262 ```
1263
1264 Step 6: lower hlfir.designate/hlfir.assign in a translation pass:
1265
1266 At this point, the representation is similar to the current representation after
1267 the array value copy pass, and the existing FIR flow is used (lowering
1268 fir.do_loop to cfg and doing codegen to LLVM).
1269
1270 ### Example 3: assignments with vector subscript
1271
1272 ```Fortran
1273 subroutine foo(a, b, v)
1274   real :: a(*), b(*)
1275   integer :: v(:)
1276   a(v) = b(v)
1277 end subroutine
1278 ```
1279
1280 Lowering of vector subscripted entities would happen as follow:
1281 - vector subscripted entities would be lowered as a hlfir.elemental implementing
1282   the vector subscript addressing.
1283 - If the vector appears in a context where it can be modified (which can only
1284   be an assignment LHS, or in input IO), lowering could transform the
1285   hlfir.elemental into hlfir.forall (for assignments), or a fir.iter_while (for
1286   input IO) by inlining the elemental body into the created loops, and
1287   identifying the hlfir.designate producing the result.
1288
1289 ```HFLFIR
1290 func.func @_QPfoo(%arg0: !fir.ref<!fir.array<?xf32>>, %arg1: !fir.ref<!fir.array<?xf32>>, %arg2: !fir.box<<!fir.array<?xi32>>) {
1291   %a = hlfir.declare %arg0 {fir.def = "_QPfooEa"} : !fir.box<!fir.array<?xf32>>, !fir.ref<!fir.array<?xf32>>
1292   %b = hlfir.declare %arg1 {fir.def = "_QPfooEb"} : !fir.box<!fir.array<?xf32>>, !fir.ref<!fir.array<?xf32>>
1293   %v = hlfir.declare %arg2 {fir.def = "_QPfooEv"} : !fir.box<!fir.array<?xi32>>, !fir.box<!fir.array<?xi32>>
1294   %vshape = hlfir.shape_of %v : fir.shape<1>
1295   %bsection =  hlfir.elemental(%i:index) %vshape : (fir.shape<1>) -> hlfir.expr<?xf32> {
1296     %v_elt = hlfir.designate %v#0, %i : (!fir.box<!fir.array<?xi32>>, index) -> fir.ref<i32>
1297     %v_val = fir.load %v_elt : fir.ref<i32>
1298     %cast = fir.convert %v_val : (i32) -> index
1299     %b_elt = hlfir.designate %b#0, %v_val : (!fir.ref<!fir.array<?xf32>>, index) -> fir.ref<f32>
1300     %b_val = fir.load %b_elt : fir.ref<f32>
1301     fir.result %b_elt
1302   }
1303   %extent = hlfir.get_extent %vshape, 0 : (fir.shape<1>) -> index
1304   %c1 = arith.constant 1 : index
1305   hlfir.forall (%i from %c1 to %extent step %c1) {
1306     %b_section_val = hlfir.apply %bsection, %i : (hlfir.expr<?xf32>, index) -> f32
1307     %v_elt = hlfir.designate %v#0, %i : (!fir.box<!fir.array<?xi32>>, index) -> fir.ref<i32>
1308     %v_val = fir.load %v_elt : fir.ref<i32>
1309     %cast = fir.convert %v_val : (i32) -> index
1310     %a_elt = hlfir.designate %a#0, %v_val : (!fir.ref<!fir.array<?xf32>>, index) -> fir.ref<f32>
1311     hlfir.assign %b_section_val to %a_elt  : f32, fir.ref<f32>
1312   }
1313   return
1314 }
1315 ```
1316
1317 This would then be lowered as described in the examples above (hlfir.elemental
1318 will be inlined, hlfir.forall will be rewritten into normal loops taking into
1319 account the alias analysis, and hlfir.assign/hlfir.designate operations will be
1320 lowered to fir.array_coor and fir.store operations).
1321
1322 # Alternatives that were not retained
1323
1324 ## Using a non-MLIR based mutable CFG representation
1325
1326 An option would have been to extend the PFT to describe expressions in a way
1327 that can be annotated and modified with the ability to introduce temporaries.
1328 This has been rejected because this would imply a whole new set of
1329 infrastructure and data structures while FIR is already using MLIR
1330 infrastructure, so enriching FIR seems a smoother approach and will benefit from
1331 the MLIR infrastructure experience that was gained.
1332
1333 ## Using symbols for HLFIR variables
1334
1335 ### Using attributes as pseudo variable symbols
1336
1337 Instead of restricting the memory types an HLFIR variable can have, it was
1338 force the defining operation of HLFIR variable SSA values to always be
1339 retrievable. The idea was to add a fir.ref attribute that would repeat the name
1340 of the HLFIR variable. Using such an attribute would prevent MLIR from merging
1341 two operations using different variables when merging IR blocks. (which is the
1342 main reason why the defining op may become inaccessible). The advantage of
1343 forcing the defining operation to be retrievable is that it allowed all Fortran
1344 information of variables (like attributes) to always be accessible in HLFIR
1345 when looking at their uses, and avoids requiring the introduction of fir.box
1346 usages for simply contiguous variables. The big drawback is that this implies
1347 naming all HLFIR variables, and there are many more of them than there are
1348 Fortran named variables. Naming designators with unique names was not very
1349 natural, and would make designator CSE harder. It also made inlining harder,
1350 because inlining HLFIR code without any fir.def/fir.ref attributes renaming
1351 would break the name uniqueness, which could lead to some operations using
1352 different variables to be merged, and to break the assumption that parent
1353 operations must be visible. Renaming would be possible, but would increase
1354 complexity and risks. Besides, inlining may not be the only transformation
1355 doing code motion, and whose complexity would be increased by the naming
1356 constraints.
1357
1358
1359 ### Using MLIR symbols for variables
1360
1361 Using MLIR symbols for HLFIR variables has been rejected because MLIR symbols
1362 are mainly intended to deal with globals and functions that may refer to each
1363 other before being defined. Their processing is not as light as normal values,
1364 and would require to turn every FIR operation with a region into an MLIR symbol
1365 table. This would especially be annoying since fir.designator also produces
1366 variables with their own properties, which would imply creating a lot of MLIR
1367 symbols. All the operations that both accept variable and expression operands
1368 would also either need to be more complex in order to both accept SSA values or
1369 MLIR symbol operands (or some fir.as_expr %var operation should be added to
1370 turn a variable into an expression). Given all variable definitions will
1371 dominate their uses, it seems better to use an SSA model with named attributes.
1372 Using SSA values also makes the transition and mixture with lower-level FIR
1373 operations smoother: a variable SSA usage can simply be replaced by lower-level
1374 FIR operations using the same SSA value.
1375
1376 ## Using some existing MLIR dialects for the high-level Fortran.
1377
1378 ### Why not using Linalg dialect?
1379
1380 The linalg dialects offers a powerful way to represent array operations: the
1381 linalg.generic operation takes a set of input and output arrays, a related set
1382 of affine maps to represent how these inputs/outputs are to be addressed, and a
1383 region detailing what operation should happen at each iteration point, given the
1384 input and output array elements. It seems mainly intended to optimize matmul,
1385 dot, and sum.
1386
1387 Issues:
1388
1389 -   The linalg dialect is tightly linked to the tensor/memref concepts that
1390     cannot represent byte stride based discontinuity and would most likely
1391     require FIR to use MLIR memref descriptor format to take advantage of it.
1392 -   It is not clear whether all Fortran array expression addressing can be
1393     represented as semi affine maps. For instance, vector subscripted entities
1394     can probably not, which may force creating temporaries for the related
1395     designator expressions to fit in this framework. Fortran has a lot more
1396     transformational intrinsics than matmul, dot, and sum that can and should
1397     still be optimized.
1398
1399 So while there may be benefits to use linalg at the optimization level (like
1400 rewriting fir.sum/fir.matmul to a linalg sum, with dialect types plumbing
1401 around the operand and results, to get tiling done by linalg), using it as a
1402 lowering target would not cover all Fortran needs (especially for the non
1403 semi-affine cases).
1404 So using linalg is for now left as an optimization pass opportunity in some
1405 cases that could be experimented.
1406
1407 ### Why not using Shape dialect?
1408
1409 MLIR shape dialect gives a set of operations to manipulate shapes. The
1410 shape.meet operation is exactly similar with hlfir.shape_meet, except that it
1411 returns a tensor or a shape.shape.
1412
1413 The main issue with using the shape dialect is that it is dependent on tensors.
1414 Bringing the tensor toolchain in flang for the sole purpose of manipulating
1415 shape is not seen as beneficial given that the only thing Fortran needs is
1416 shape.meet The shape dialect is a lot more complex because it is intended to
1417 deal with computations involving dynamically ranked entity, which is not the
1418 case in Fortran (assumed rank usage in Fortran is greatly limited).
1419
1420 ## Using embox/rebox and box as an alternative to fir.declare/hlfir.designate and hlfir.expr/ variable concept
1421
1422 All Fortran entities (*) can be described at runtime by a fir.box, except for
1423 some attributes that are not part of the runtime descriptors (like TARGET,
1424 OPTIONAL or VOLATILE).  In that sense, it would be possible to have
1425 fir.declare, hlfir.designate, and hlfir.associate be replaced by embox/rebox,
1426 and also to have all operation creating hlfir.expr to create fir.box.
1427
1428 This was rejected because this would lack clarity, and make embox/rebox
1429 semantics way too complex (their codegen is already non-trivial), and also
1430 because it would then not really be possible to know if a fir.box is an
1431 expression or a variable when it is an operand, which would make reasoning
1432 harder: this would already imply that expressions have been buffered, and it is
1433 not clear when looking at a fir.box if the value it describe may change or not,
1434 while a hlfir.expr value cannot change, which allows moving its usages more
1435 easily.
1436
1437 This would also risk generating too many runtime descriptors read and writes
1438 that could make later optimizations harder.
1439
1440 Hence, while this would be functionally possible, this makes the reasoning about
1441 the IR harder and would not benefit high-level optimizations.
1442
1443 (*) This not true for vector subscripted variables, but the proposed plan will
1444 also not allow creating vector subscripted variables as the result of a
1445 hlfir.designate. Lowering will deal with the assignment and input IO special
1446 case using hlfir.elemental.