flang/docs/Extensions.md

   1 <!--===- docs/Extensions.md
   2
   3    Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   4    See https://llvm.org/LICENSE.txt for license information.
   5    SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
   6
   7 -->
   8
   9 # Fortran Extensions supported by Flang
  10
  11 ```eval_rst
  12 .. contents::
  13    :local:
  14 ```
  15
  16 As a general principle, this compiler will accept by default and
  17 without complaint many legacy features, extensions to the standard
  18 language, and features that have been deleted from the standard,
  19 so long as the recognition of those features would not cause a
  20 standard-conforming program to be rejected or misinterpreted.
  21
  22 Other non-standard features, which do conflict with the current
  23 standard specification of the Fortran programming language, are
  24 accepted if enabled by command-line options.
  25
  26 ## Intentional violations of the standard
  27
  28 * Scalar `INTEGER` actual argument expressions (not variables!)
  29   are converted to the kinds of scalar `INTEGER` dummy arguments
  30   when the interface is explicit and the kinds differ.
  31   This conversion allows the results of the intrinsics like
  32   `SIZE` that (as mentioned below) may return non-default
  33   `INTEGER` results by default to be passed.  A warning is
  34   emitted when truncation is possible.  These conversions
  35   are not applied in calls to non-intrinsic generic procedures.
  36 * We are not strict on the contents of `BLOCK DATA` subprograms
  37   so long as they contain no executable code, no internal subprograms,
  38   and allocate no storage outside a named `COMMON` block.  (C1415)
  39 * Delimited list-directed (and NAMELIST) character output is required
  40   to emit contiguous doubled instances of the delimiter character
  41   when it appears in the output value.  When fixed-size records
  42   are being emitted, as is the case with internal output, this
  43   is not possible when the problematic character falls on the last
  44   position of a record.  No two other Fortran compilers do the same
  45   thing in this situation so there is no good precedent to follow.
  46   Because it seems least wrong, we emit one copy of the delimiter as
  47   the last character of the current record and another as the first
  48   character of the next record.  (The second-least-wrong alternative
  49   might be to flag a runtime error, but that seems harsh since it's
  50   not an explicit error in the standard, and the output may not have
  51   to be usable later as input anyway.)
  52   Consequently, the output is not suitable for use as list-directed or
  53   NAMELIST input.  If a later standard clarifies this case, this
  54   behavior will change as needed to conform.
  55 ```
  56 character(11) :: buffer(3)
  57 character(10) :: quotes = '""""""""""'
  58 write(buffer,*,delim="QUOTE") quotes
  59 print "('>',a10,'<')", buffer
  60 end
  61 ```
  62 * The name of the control variable in an implied DO loop in an array
  63   constructor or DATA statement has a scope over the value-list only,
  64   not the bounds of the implied DO loop.  It is not advisable to use
  65   an object of the same name as the index variable in a bounds
  66   expression, but it will work, instead of being needlessly undefined.
  67 * If both the `COUNT=` and the `COUNT_MAX=` optional arguments are
  68   present on the same call to the intrinsic subroutine `SYSTEM_CLOCK`,
  69   we require that their types have the same integer kind, since the
  70   kind of these arguments is used to select the clock rate.
  71   In common with some other compilers, the clock is in milliseconds
  72   for kinds <= 4 and nanoseconds otherwise where the target system
  73   supports these rates.
  74 * If a dimension of a descriptor has zero extent in a call to
  75   `CFI_section`, `CFI_setpointer` or `CFI_allocate`, the lower
  76   bound on that dimension will be set to 1 for consistency with
  77   the `LBOUND()` intrinsic function.
  78
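As a hedged illustration of the first item in this list, the sketch below passes the result of `SIZE` with an explicit `KIND=8` to a dummy argument of default `INTEGER` kind through an explicit interface. The names `kind_conversion_demo` and `show_count` are invented for this example. Other compilers may reject the call, and this compiler may warn about possible truncation.

```fortran
program kind_conversion_demo
  implicit none
  real :: a(10)
  ! size(a, kind=8) is an INTEGER(8) expression, not a variable, so per the
  ! text it is converted to the default INTEGER kind of the dummy argument.
  call show_count(size(a, kind=8))
contains
  ! Internal procedure, so its interface is explicit at the call site.
  subroutine show_count(n)
    integer, intent(in) :: n
    print *, 'element count:', n
  end subroutine show_count
end program kind_conversion_demo
```

Passing an `INTEGER(8)` variable instead of an expression would not be converted, since the rule covers expressions only.
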
  79 ## Extensions, deletions, and legacy features supported by default
  80
  81 * Tabs in source
  82 * `<>` as synonym for `.NE.` and `/=`
  83 * `$` and `@` as legal characters in names
  84 * Initialization in type declaration statements using `/values/`
  85 * Kind specification with `*`, e.g. `REAL*4`
  86 * `DOUBLE COMPLEX`
  87 * Signed complex literal constants
  88 * DEC `STRUCTURE`, `RECORD`, with `%FILL`; but `UNION` and `MAP`
  89   are not yet supported throughout compilation, and elicit a
  90   "not yet implemented" message.
  91 * Structure field access with `.field`
  92 * `BYTE` as synonym for `INTEGER(KIND=1)`
  93 * Quad precision REAL literals with `Q`
  94 * `X` prefix/suffix as synonym for `Z` on hexadecimal literals
  95 * `B`, `O`, `Z`, and `X` accepted as suffixes as well as prefixes
  96 * Triplets allowed in array constructors
  97 * `%LOC`, `%VAL`, and `%REF`
  98 * Leading comma allowed before I/O item list
  99 * Empty parentheses allowed in `PROGRAM P()`
 100 * Missing parentheses allowed in `FUNCTION F`
 101 * Cray-based `POINTER(p,x)` and `LOC()` intrinsic (with `%LOC()` as
 102   an alias)
 103 * Arithmetic `IF`.  (Which branch should NaN take? Fall through?)
 104 * `ASSIGN` statement, assigned `GO TO`, and assigned format
 105 * `PAUSE` statement
 106 * Hollerith literals and edit descriptors
 107 * `NAMELIST` allowed in the execution part
 108 * Omitted colons on type declaration statements with attributes
 109 * COMPLEX constructor expression, e.g. `(x+y,z)`
 110 * `+` and `-` before all primary expressions, e.g. `x*-y`
 111 * `.NOT. .NOT.` accepted
 112 * `NAME=` as synonym for `FILE=`
 113 * Data edit descriptors without width or other details
 114 * `D` lines in fixed form as comments or debug code
 115 * `CARRIAGECONTROL=` on the OPEN and INQUIRE statements
 116 * `CONVERT=` on the OPEN and INQUIRE statements
 117 * `DISPOSE=` on the OPEN and INQUIRE statements
 118 * Leading semicolons are ignored before any statement that
 119   could have a label
 120 * The character `&` in column 1 in fixed form source is a variant form
 121   of continuation line.
 122 * Character literals as elements of an array constructor without an explicit
 123   type specifier need not have the same length; the longest literal determines
 124   the length parameter of the implicit type, not the first.
 125 * Outside a character literal, a comment after a continuation marker (&)
 126   need not begin with a comment marker (!).
 127 * Classic C-style /*comments*/ are skipped, so multi-language header
 128   files are easier to write and use.
 129 * $ and \ edit descriptors are supported in FORMAT to suppress newline
 130   output on user prompts.
 131 * Tabs in format strings (not `FORMAT` statements) are allowed on output.
 132 * REAL and DOUBLE PRECISION variables and bounds in DO loops
 133 * Integer literals without explicit kind specifiers that are out of range
 134   for the default kind of INTEGER are assumed to have the least larger kind
 135   that can hold them, if one exists.
 136 * BOZ literals can be used as INTEGER values in contexts where the type is
 137   unambiguous: the right-hand sides of assignments and initializations
 138   of INTEGER entities, as actual arguments to a few intrinsic functions
 139   (ACHAR, BTEST, CHAR), and as actual arguments of references to
 140   procedures with explicit interfaces whose corresponding dummy
 141   argument has a numeric type to which the BOZ literal may be
 142   converted.  BOZ literals are interpreted as default INTEGER only
 143   when they appear as the first items of array constructors with no
 144   explicit type.  Otherwise, they generally cannot be used if the type would
 145   not be known (e.g., `IAND(X'1',X'2')`).
 146 * BOZ literals can also be used as REAL values in some contexts where the
 147   type is unambiguous, such as initializations of REAL parameters.
 148 * EQUIVALENCE of numeric and character sequences (a ubiquitous extension),
 149   as well as of sequences of non-default kinds of numeric types
 150   with each other.
 151 * Values for whole anonymous parent components in structure constructors
 152   (e.g., `EXTENDEDTYPE(PARENTTYPE(1,2,3))` rather than `EXTENDEDTYPE(1,2,3)`
 153    or `EXTENDEDTYPE(PARENTTYPE=PARENTTYPE(1,2,3))`).
 154 * Some intrinsic functions are specified in the standard as requiring the
 155   same type and kind for their arguments (viz., ATAN with two arguments,
 156   ATAN2, DIM, HYPOT, MAX, MIN, MOD, and MODULO);
 157   we allow distinct types to be used, promoting
 158   the arguments as if they were operands to an intrinsic `+` operator,
 159   and defining the result type accordingly.
 160 * DOUBLE COMPLEX intrinsics DREAL, DCMPLX, DCONJG, and DIMAG.
 161 * The DFLOAT intrinsic function.
 162 * INT_PTR_KIND intrinsic returns the kind of c_intptr_t.
 163 * Restricted specific conversion intrinsics FLOAT, SNGL, IDINT, IFIX, DREAL,
 164   and DCMPLX accept arguments of any kind instead of only the default kind or
 165   double precision kind. Their result kinds remain as specified.
 166 * Specific intrinsics AMAX0, AMAX1, AMIN0, AMIN1, DMAX1, DMIN1, MAX0, MAX1,
 167   MIN0, and MIN1 accept more argument types than specified. They are replaced by
 168   the related generics followed by conversions to the specified result types.
 169 * When a scalar CHARACTER actual argument of the same kind is known to
 170   have a length shorter than the associated dummy argument, it is extended
 171   on the right with blanks, similar to assignment.
 172 * When a dummy argument is `POINTER` or `ALLOCATABLE` and is `INTENT(IN)`, we
 173   relax enforcement of some requirements on actual arguments that must otherwise
 174   hold true for definable arguments.
 175 * Assignment of `LOGICAL` to `INTEGER` and vice versa (but not other types) is
 176   allowed.  The values are normalized.
 177 * Static initialization of `LOGICAL` with `INTEGER` is allowed in `DATA` statements
 178   and object initializers.
 179   The results are *not* normalized to canonical `.TRUE.`/`.FALSE.`.
 180   Static initialization of `INTEGER` with `LOGICAL` is also permitted.
 181 * An effectively empty source file (no program unit) is accepted and
 182   produces an empty relocatable output file.
 183 * A `RETURN` statement may appear in a main program.
 184 * DATA statement initialization is allowed for procedure pointers outside
 185   structure constructors.
 186 * Nonstandard intrinsic functions: ISNAN, SIZEOF
 187 * A forward reference to a default INTEGER scalar dummy argument is
 188   permitted to appear in a specification expression, such as an array
 189   bound, in a scope with IMPLICIT NONE(TYPE) if the name
 190   of the dummy argument would have caused it to be implicitly typed
 191   as default INTEGER if IMPLICIT NONE(TYPE) were absent.
 192 * OPEN(ACCESS='APPEND') is interpreted as OPEN(POSITION='APPEND')
 193   to ease porting from Sun Fortran.
 194 * Intrinsic subroutines EXIT([status]) and ABORT()
 195 * The definition of simple contiguity in 9.5.4 applies only to arrays;
 196   we also treat scalars as being trivially contiguous, so that they
 197   can be used in contexts like data targets in pointer assignments
 198   with bounds remapping.
 199 * We support some combinations of specific procedures in generic
 200   interfaces that a strict reading of the standard would preclude
 201   when their calls must nonetheless be distinguishable.
 202   Specifically, `ALLOCATABLE` dummy arguments are distinguishing
 203   if an actual argument acceptable to one could not be passed to
 204   the other and vice versa (because exactly one is polymorphic or
 205   exactly one is unlimited polymorphic).
 206 * External unit 0 is predefined and connected to the standard error output,
 207   and defined as `ERROR_UNIT` in the intrinsic `ISO_FORTRAN_ENV` module.
 208 * Objects in blank COMMON may be initialized.
 209 * Multiple specifications of the SAVE attribute on the same object
 210   are allowed, with a warning.
 211 * Specific intrinsic functions BABS, IIABS, JIABS, KIABS, ZABS, and CDABS.
 212 * A `POINTER` component's type need not be a sequence type when
 213   the component appears in a derived type with `SEQUENCE`.
 214   (This case should probably be an exception to constraint C740 in
 215   the standard.)
 216 * Format expressions that have type but are not character and not
 217   integer scalars are accepted so long as they are simply contiguous.
 218   This legacy extension supports pre-Fortran'77 usage in which
 219   variables initialized in DATA statements with Hollerith literals
 220   are used as modifiable formats.
 221 * At runtime, `NAMELIST` input will skip over `NAMELIST` groups
 222   with other names, and will treat text before and between groups
 223   as if they were comment lines, even if not begun with `!`.
 224 * Commas are required in FORMAT statements and character variables
 225   only when they prevent ambiguity.
 226
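A few of the default-enabled extensions listed above can be combined in one small program. This is a minimal sketch, assuming free-form source; the program name `legacy_demo` and the variable names are invented for illustration, and the comments restate what the list says, not verified output.

```fortran
program legacy_demo
  implicit none
  real*8 :: x                          ! kind specification with "*"
  byte :: b                            ! BYTE as a synonym for INTEGER(KIND=1)
  character(:), allocatable :: names(:)
  x = 1.5d0
  b = 7
  ! The literals differ in length; per the list above, the longest one
  ! ('bcd') determines the length of the constructed array.
  names = ['a', 'bcd', 'ef']
  ! "<>" is accepted as a synonym for .NE. and /=
  if (x <> 2.0d0) print *, 'x differs from 2'
  print *, len(names), b
end program legacy_demo
```

None of these constructs is portable to a strictly standard-conforming compiler, so new code is usually better served by the standard spellings (`REAL(KIND=8)`, `INTEGER(KIND=1)`, `/=`).
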
 227 ### Extensions supported when enabled by options
 228
 229 * C-style backslash escape sequences in quoted CHARACTER literals
 230   (but not Hollerith) [-fbackslash]
 231 * Logical abbreviations `.T.`, `.F.`, `.N.`, `.A.`, `.O.`, and `.X.`
 232   [-flogical-abbreviations]
 233 * `.XOR.` as a synonym for `.NEQV.` [-fxor-operator]
 234 * The default `INTEGER` type is required by the standard to occupy
 235   the same amount of storage as the default `REAL` type.  Default
 236   `REAL` is of course 32-bit IEEE-754 floating-point today.  This legacy
 237   rule imposes an artificially small constraint in some cases
 238   where Fortran mandates that something have the default `INTEGER`
 239   type: specifically, the results of references to the intrinsic functions
 240   `SIZE`, `STORAGE_SIZE`, `LBOUND`, `UBOUND`, `SHAPE`, and the location reductions
 241   `FINDLOC`, `MAXLOC`, and `MINLOC` in the absence of an explicit
 242   `KIND=` actual argument.  We return `INTEGER(KIND=8)` by default in
 243   these cases when the `-flarge-sizes` option is enabled.
 244   `SIZEOF` and `C_SIZEOF` always return `INTEGER(KIND=8)`.
 245 * Treat each specification-part as if it has `IMPLICIT NONE`
 246   [-fimplicit-none-type-always]
 247 * Ignore occurrences of `IMPLICIT NONE` and `IMPLICIT NONE(TYPE)`
 248   [-fimplicit-none-type-never]
 249 * Old-style `PARAMETER pi=3.14` statement without parentheses
 250   [-falternative-parameter-statement]
 251
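The option-controlled extensions can be tried with a short program such as the sketch below. The driver name `flang` and the file name in the build comment are assumptions; only the option spellings come from the list above.

```fortran
! Build sketch (driver and file names are assumptions):
!   flang -fxor-operator -flarge-sizes options_demo.f90
program options_demo
  implicit none
  logical :: p, q
  real :: a(4)
  p = .true.
  q = .false.
  ! .XOR. is described above as a synonym for .NEQV. under -fxor-operator.
  print *, p .xor. q
  ! Per the list above, SIZE without KIND= yields INTEGER(KIND=8)
  ! when -flarge-sizes is enabled, and default INTEGER otherwise.
  print *, kind(size(a))
end program options_demo
```

Without `-fxor-operator`, the `.xor.` line is expected to be rejected, since the extension is described as off by default.
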
 252 ### Extensions and legacy features deliberately not supported
 253
 254 * `.LG.` as synonym for `.NE.`
 255 * `REDIMENSION`
 256 * Allocatable `COMMON`
 257 * Expressions in formats
 258 * `ACCEPT` as synonym for `READ *`
 259 * `TYPE` as synonym for `PRINT`
 260 * `ARRAY` as synonym for `DIMENSION`
 261 * `VIRTUAL` as synonym for `DIMENSION`
 262 * `ENCODE` and `DECODE` as synonyms for internal I/O
 263 * `IMPLICIT AUTOMATIC`, `IMPLICIT STATIC`
 264 * Default exponent of zero, e.g. `3.14159E`
 265 * Characters in defined operators that are neither letters nor digits
 266 * `B` suffix on unquoted octal constants
 267 * `Z` prefix on unquoted hexadecimal constants (dangerous)
 268 * `T` and `F` as abbreviations for `.TRUE.` and `.FALSE.` in DATA (PGI/XLF)
 269 * Use of host FORMAT labels in internal subprograms (PGI-only feature)
 270 * ALLOCATE(TYPE(derived)::...) as variant of correct ALLOCATE(derived::...) (PGI only)
 271 * Defining an explicit interface for a subprogram within itself (PGI only)
 272 * USE association of a procedure interface within that same procedure's definition
 273 * NULL() as a structure constructor expression for an ALLOCATABLE component (PGI).
 274 * Conversion of LOGICAL to INTEGER in expressions.
 275 * IF (integer expression) THEN ... END IF  (PGI/Intel)
 276 * Comparison of LOGICAL with ==/.EQ. rather than .EQV. (also .NEQV.) (PGI/Intel)
 277 * Procedure pointers in COMMON blocks (PGI/Intel)
 278 * Underindexing multi-dimensional arrays (e.g., A(1) rather than A(1,1)) (PGI only)
 279 * Legacy PGI `NCHARACTER` type and `NC` Kanji character literals
 280 * Using non-integer expressions for array bounds (e.g., REAL A(3.14159)) (PGI/Intel)
 281 * Mixing INTEGER types as operands to bit intrinsics (e.g., IAND); only two
 282   compilers support it, and they disagree on sign extension.
 283 * Module & program names that conflict with an object inside the unit (PGI only).
 284 * When the same name is brought into scope via USE association from
 285   multiple modules, the name must refer to a generic interface; PGI
 286   allows a name to be a procedure from one module and a generic interface
 287   from another.
 288 * Type parameter declarations must come first in a derived type definition;
 289   some compilers allow them to follow `PRIVATE`, or be intermixed with the
 290   component declarations.
 291 * Wrong argument types in calls to specific intrinsics that have different names than the
 292   related generics. Some accepted exceptions are listed above in the allowed extensions.
 293   PGI, Intel, and XLF support this in ways that are not numerically equivalent.
 294   PGI converts the arguments while Intel and XLF replace the specific by the related generic.
 295
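For code that relied on some of the unsupported extensions above, standard-conforming rewrites are usually short. The sketch below shows three of them; the program and variable names are invented for illustration.

```fortran
program portable_rewrites
  implicit none
  logical :: a, b
  integer :: i, as_int
  a = .true.
  b = .false.
  i = 3
  ! Instead of comparing LOGICAL values with == or .EQ., use .EQV./.NEQV.
  if (a .neqv. b) print *, 'a and b differ'
  ! Instead of IF (integer expression) THEN, compare explicitly.
  if (i /= 0) then
    print *, 'i is nonzero'
  end if
  ! Instead of using a LOGICAL inside an INTEGER expression, convert with MERGE.
  as_int = i + merge(1, 0, a)
  print *, as_int
end program portable_rewrites
```
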
 296 ## Preprocessing behavior
 297
 298 * The preprocessor is always run, whatever the filename extension may be.
 299 * We respect Fortran comments in macro actual arguments (like GNU, Intel, NAG;
 300   unlike PGI and XLF) on the principle that macro calls should be treated
 301   like function references.  Fortran's line continuation methods also work.
 302
 303 ## Standard features not silently accepted
 304
 305 * Fortran explicitly ignores type declaration statements when they
 306   attempt to type the name of a generic intrinsic function (8.2 p3).
 307   One can declare `CHARACTER::COS` and still get a real result
 308   from `COS(3.14159)`, for example.  f18 will complain when a
 309   generic intrinsic function's inferred result type does not
 310   match an explicit declaration.  This message is a warning.
 311
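The case described above can be reproduced with a program like the following sketch. The program name is invented, and the comments restate the behavior described in the text, not verified compiler output.

```fortran
program generic_intrinsic_decl
  implicit none
  ! The standard ignores this attempt to type the generic intrinsic COS;
  ! per the text above, the compiler warns about the mismatch.
  character :: cos
  ! The reference still resolves to the intrinsic and yields a REAL result.
  print *, cos(3.14159)
end program generic_intrinsic_decl
```
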
 312 ## Standard features that might as well not be
 313
 314 * f18 supports designators with constant expressions, properly
 315   constrained, as initial data targets for data pointers in
 316   initializers of variable and component declarations and in
 317   `DATA` statements; e.g., `REAL, POINTER :: P => T(1:10:2)`.
 318   This Fortran 2008 feature might as well be viewed as an
 319   extension; no other compiler that we've tested can handle
 320   it yet.
 321
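A minimal sketch of such an initial data target follows, using the same form as the `P => T(1:10:2)` example above. The program name is invented. The target needs both the `TARGET` and `SAVE` attributes; here `SAVE` is implied by the initialization.

```fortran
program pointer_init_demo
  implicit none
  real, target :: t(10) = 0.0
  ! Initial data target: a designator with constant subscripts.
  real, pointer :: p(:) => t(1:10:2)
  t(3) = 42.0
  ! p covers t(1), t(3), t(5), t(7), t(9), so p(2) aliases t(3).
  print *, size(p), p(2)
end program pointer_init_demo
```
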
 322 ## Behavior in cases where the standard is ambiguous or indefinite
 323
 324 * When an inner procedure of a subprogram uses the value or an attribute
 325   of an undeclared name in a specification expression and that name does
 326   not appear in the host, it is not clear in the standard whether that
 327   name is an implicitly typed local variable of the inner procedure or a
 328   host association with an implicitly typed local variable of the host.
 329   For example:
 330 ```
 331 module module
 332  contains
 333   subroutine host(j)
 334     ! Although "m" never appears in the specification or executable
 335     ! parts of this subroutine, both of its contained subroutines
 336     ! might be accessing it via host association.
 337     integer, intent(in out) :: j
 338     call inner1(j)
 339     call inner2(j)
 340    contains
 341     subroutine inner1(n)
 342       integer(kind(m)), intent(in) :: n
 343       m = n + 1
 344     end subroutine
 345     subroutine inner2(n)
 346       integer(kind(m)), intent(out) :: n
 347       n = m + 2
 348     end subroutine
 349   end subroutine
 350 end module
 351
 352 program demo
 353   use module
 354   integer :: k
 355   k = 0
 356   call host(k)
 357   print *, k, " should be 3"
 358 end
 359
 360 ```
 361
 362   Other Fortran compilers disagree in their interpretations of this example;
 363   some seem to treat the references to `m` as if they were host associations
 364   to an implicitly typed variable (and print `3`), while others seem to
 365   treat them as references to implicitly typed local variables, and
 366   load uninitialized values.
 367
 368   In f18, we chose to emit an error message for this case since the standard
 369   is unclear, the usage is not portable, and the issue can be easily resolved
 370   by adding a declaration.
 371
 372 * In subclause 7.5.6.2 of Fortran 2018 the standard defines a partial ordering
 373   of the final subroutine calls for finalizable objects, their non-parent
 374   components, and then their parent components.
 375   (The object is finalized, then the non-parent components of each element,
 376   and then the parent component.)
 377   Some have argued that the standard permits an implementation
 378   to finalize the parent component before finalizing an allocatable component in
 379   the context of deallocation, and the next revision of the language may codify
 380   this option.
 381   In the interest of avoiding needless confusion, this compiler implements what
 382   we believe to be the least surprising order of finalization.
 383   Specifically: all non-parent components are finalized before
 384   the parent, allocatable or not;
 385   all finalization takes place before any deallocation;
 386   and no object or subobject will be finalized more than once.
 387
 388 * When `RECL=` is set via the `OPEN` statement for a sequential formatted input
 389   file, it functions as an effective maximum record length.
 390   Longer records, if any, will appear as if they had been truncated to
 391   the value of `RECL=`.
 392   (Other compilers ignore `RECL=`, signal an error, or apply effective truncation
 393   to some forms of input in this situation.)
 394   For sequential formatted output, `RECL=` serves as a limit on record lengths
 395   that raises an error when it is exceeded.
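
Returning to the first case in this section (the undeclared name `m`), the ambiguity disappears once the host declares the variable. The sketch below is one way to do that; the module name `host_demo` is invented, and the rest follows the original example.

```fortran
module host_demo
 contains
  subroutine host(j)
    integer, intent(in out) :: j
    ! Explicit declaration: both inner procedures now clearly access
    ! the host's m via host association.
    integer :: m
    call inner1(j)
    call inner2(j)
   contains
    subroutine inner1(n)
      integer(kind(m)), intent(in) :: n
      m = n + 1
    end subroutine
    subroutine inner2(n)
      integer(kind(m)), intent(out) :: n
      n = m + 2
    end subroutine
  end subroutine
end module

program demo
  use host_demo
  integer :: k
  k = 0
  call host(k)
  print *, k   ! k is 3: inner1 sets m = 1, inner2 sets n = m + 2
end
```

If each inner procedure should have its own local `m` instead, declaring `m` inside `inner1` and `inner2` makes that intent explicit.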