<!--===- docs/Overview.md

   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   See https://llvm.org/LICENSE.txt for license information.
   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-->

# Overview of Compiler Phases

The Flang compiler transforms Fortran source code into an executable file.
This transformation proceeds in three high-level phases: analysis, lowering,
and code generation/linking.

The first high-level phase (analysis) transforms Fortran source code into a
decorated parse tree and a symbol table. During this phase, all errors in the
user's program are detected and reported.

The second high-level phase (lowering) changes the decorated parse tree and
symbol table into the Fortran Intermediate Representation (FIR), which is a
dialect of LLVM's Multi-Level Intermediate Representation, or MLIR. It then
runs a series of passes on the FIR code that verify its validity, perform a
series of optimizations, and finally transform it into LLVM's Intermediate
Representation, or LLVM IR.

The third high-level phase generates machine code and invokes a linker to
produce an executable file.

This document describes the first two high-level phases, each of which is
broken down into more detailed phases. For each detailed phase, it lists the
inputs and outputs and shows how to produce a readable version of the outputs.

Each detailed phase produces either correct output or fatal errors.

# Analysis

This high-level phase validates that the program is correct and creates all of
the information needed for lowering.

## Prescan and Preprocess

See [Preprocessing.md](Preprocessing.md).

**Input:** Fortran source and header files, command line macro definitions,
  set of enabled compiler directives (to be treated as directives rather than
  comments).

**Output:**
- A "cooked" character stream: the entire program as a contiguous stream of
  normalized Fortran source.
  Extraneous whitespace and comments are removed (except comments that are
  enabled compiler directives) and case is normalized. Also,
  directives are processed and macros expanded.
- Provenance information mapping each character back to the source it came from.
  Subsequent phases that need source locations use it, for example for
  error messages, optimization reports, and debugging information.
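
The relationship between the cooked character stream and its provenance can be illustrated with a small Python sketch. This is a toy model, not Flang's prescanner: the names `Origin` and `cook` are hypothetical, lower case is assumed as the normalized case, and macros, continuation lines, and character literals are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Origin:
    """Where one cooked character came from (hypothetical record type)."""
    file: str
    line: int    # 1-based source line
    column: int  # 1-based source column


def cook(file, text, sentinels=("!$omp", "!$acc")):
    """Return (cooked_text, provenance) for free-form source.

    Simplifications: no macros, no continuation lines, and no character
    literals (a '!' inside a string would be mistaken for a comment).
    """
    cooked = []
    provenance = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        bang = raw.find("!")
        # Drop an ordinary comment, but keep an enabled directive.
        if bang != -1 and not raw[bang:].lower().startswith(sentinels):
            raw = raw[:bang]
        previous_blank = True  # starts True so leading blanks are skipped
        kept = []
        for col, ch in enumerate(raw, start=1):
            if ch.isspace():
                if previous_blank:
                    continue   # collapse a run of blanks
                ch = " "
            previous_blank = ch == " "
            kept.append((ch.lower(), Origin(file, line_no, col)))
        if kept and kept[-1][0] == " ":
            kept.pop()         # drop a trailing blank
        if not kept:
            continue           # blank or comment-only line
        kept.append(("\n", Origin(file, line_no, len(raw) + 1)))
        for ch, origin in kept:
            cooked.append(ch)
            provenance.append(origin)
    return "".join(cooked), provenance
```

Because `provenance` has exactly one entry per cooked character, a later phase that holds only an offset into the cooked stream can recover the original file, line, and column by indexing, which is the property that error messages and debug information depend on.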

**Entry point:** `parser::Parsing::Prescan`

**Commands:**
- `flang-new -fc1 -E src.f90` dumps the cooked character stream
- `flang-new -fc1 -fdebug-dump-provenance src.f90` dumps provenance
  information

## Parsing

**Input:** Cooked character stream

**Output:** A parse tree for each Fortran program unit in the source code,
representing a syntactically correct program and rooted at the program unit. See
[Parsing.md](Parsing.md) and [ParserCombinators.md](ParserCombinators.md).

**Entry point:** `parser::Parsing::Parse`

**Commands:**
- `flang-new -fc1 -fdebug-dump-parse-tree-no-sema src.f90` dumps the parse tree
- `flang-new -fc1 -fdebug-unparse src.f90` converts the parse tree to normalized Fortran
- `flang-new -fc1 -fdebug-dump-parsing-log src.f90` runs an instrumented parse and dumps the log
- `flang-new -fc1 -fdebug-measure-parse-tree src.f90` measures the parse tree

## Semantic processing

**Input:** the parse tree, the cooked character stream, and provenance
information

**Output:**
* the symbol table
* the decorated parse tree
* module files (see [ModFiles.md](ModFiles.md))
* the intrinsic procedure table
* the target characteristics
* the runtime derived type tables (see [RuntimeTypeInfo.md](RuntimeTypeInfo.md))

**Entry point:** `semantics::Semantics::Perform`

For more detail on semantic analysis, see [Semantics.md](Semantics.md).
Semantic processing performs several tasks:
* validates labels (see [LabelResolution.md](LabelResolution.md))
* canonicalizes DO statements
* canonicalizes OpenACC and OpenMP code
* resolves names, building a tree of scopes and symbols
* rewrites the parse tree to correct parsing mistakes (when needed) once semantic information is available to clarify the program's meaning
* checks the validity of declarations
* analyzes expressions and statements, emitting error messages where appropriate
* creates module files if the source code contains modules (see [ModFiles.md](ModFiles.md))

In the course of semantic analysis, the compiler:
* creates the symbol table
* decorates the parse tree with semantic information (such as pointers into the symbol table)
* creates the intrinsic procedure table
* folds constant expressions

At the end of semantic processing, all validation of the user's program is complete. This is the last detailed phase of analysis processing.
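
The idea of resolving names against a tree of scopes and symbols can be sketched in a few lines of Python. This is only an illustration of the data structure: the `Scope` class and its methods are hypothetical, and real Fortran name resolution (implicit typing, use association, generics, and so on) is far more involved.

```python
class Scope:
    """One node in a tree of scopes; each scope owns its own symbols."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}   # symbol name -> dict of attributes
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def declare(self, symbol, **attrs):
        """Add a symbol to this scope; a duplicate is a user error."""
        if symbol in self.symbols:
            raise ValueError(
                f"'{symbol}' is already declared in scope '{self.name}'")
        self.symbols[symbol] = attrs

    def resolve(self, symbol):
        """Search this scope, then each enclosing scope in turn.

        Returns (scope, attributes) for the first match, or None.
        """
        scope = self
        while scope is not None:
            if symbol in scope.symbols:
                return scope, scope.symbols[symbol]
            scope = scope.parent
        return None


# A module 'm' containing a subroutine 's'.
global_scope = Scope("global")
module_scope = Scope("m", global_scope)
sub_scope = Scope("s", module_scope)
module_scope.declare("n", type="integer")
sub_scope.declare("x", type="real")
```

Here `sub_scope.resolve("n")` finds the symbol in the enclosing module scope, while `module_scope.resolve("x")` returns `None` because lookups never descend into child scopes.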

**Commands:**
- `flang-new -fc1 -fdebug-dump-parse-tree src.f90` dumps the parse tree after semantic analysis
- `flang-new -fc1 -fdebug-dump-symbols src.f90` dumps the symbol table
- `flang-new -fc1 -fdebug-dump-all src.f90` dumps both the parse tree and the symbol table

# Lowering

Lowering takes the parse tree and symbol table produced by analysis and
produces LLVM IR.

## Create the lowering bridge

**Inputs:**
- The parse tree
- The symbol table
- The default KINDs for intrinsic types (specified by default or command line option)
- The intrinsic procedure table (created during semantic processing)
- The target characteristics (created during semantic processing)
- The cooked character stream
- The target triple: CPU type, vendor, operating system
- The mapping from Fortran KIND values to FIR KIND values

The lowering bridge is a container that holds all of the information needed for lowering.

**Output:** A container with all of the information needed for lowering

**Entry point:** `lower::LoweringBridge::create`

## Initial lowering

**Input:** the lowering bridge

**Output:** A Fortran IR (FIR) representation of the program.

**Entry point:** `lower::LoweringBridge::lower`

The compiler takes the information in the lowering bridge and creates a
pre-FIR tree, or PFT. The PFT is a list of programs and modules. The programs
and modules contain lists of function-like units. The function-like units
contain a list of evaluations. All of these contain pointers back into the
parse tree. The compiler then walks the PFT, generating FIR.
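
The nesting described above can be pictured with a short Python sketch. The class names (`ProgramOrModule`, `FunctionLikeUnit`, `Evaluation`) and the `walk` function are hypothetical stand-ins, not Flang's actual C++ types. Plain strings play the role of parse tree nodes, and where the real compiler would generate FIR, this sketch only records the visit order.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Evaluation:
    parse_node: Any  # pointer back into the parse tree


@dataclass
class FunctionLikeUnit:
    name: str
    parse_node: Any
    evaluations: List[Evaluation] = field(default_factory=list)


@dataclass
class ProgramOrModule:
    name: str
    parse_node: Any
    functions: List[FunctionLikeUnit] = field(default_factory=list)


def walk(pft):
    """Yield one line per PFT node, outermost first, in visit order."""
    for unit in pft:
        yield f"unit {unit.name} <- {unit.parse_node}"
        for func in unit.functions:
            yield f"  function {func.name} <- {func.parse_node}"
            for evaluation in func.evaluations:
                yield f"    evaluation <- {evaluation.parse_node}"


# A PFT for a program with one function-like unit and two evaluations.
pft = [
    ProgramOrModule("hello", "program-unit", [
        FunctionLikeUnit("hello", "program-stmt", [
            Evaluation("print-stmt"),
            Evaluation("end-program-stmt"),
        ]),
    ]),
]
```

Keeping a `parse_node` reference at every level mirrors the statement that all PFT nodes point back into the parse tree, so the walker can always reach the original syntax and, through it, source locations.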

**Commands:**
- `flang-new -fc1 -fdebug-dump-pft src.f90` dumps the pre-FIR tree
- `flang-new -fc1 -emit-mlir src.f90` dumps the FIR to the file src.mlir

## Transformation passes

**Input:** initial version of the FIR code

**Output:** An LLVM IR representation of the program

**Entry point:** `mlir::PassManager::run`

The compiler then runs a series of passes over the FIR code. The first is a
verification pass. It's followed by a series of transformation passes that
perform various optimizations and transformations. The final pass creates an
LLVM IR representation of the program.
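
The shape of such a pipeline (verify first, then transform, then convert) can be mimicked with a toy Python example. Nothing here corresponds to real MLIR passes or FIR operations: the operation names, the pass functions, and `run_pipeline` are all invented for illustration.

```python
KNOWN_OPS = {"const", "add", "nop"}


def verify(module):
    """Verification pass: reject operations the toy IR does not define."""
    for op in module:
        if op[0] not in KNOWN_OPS:
            raise ValueError(f"unknown operation: {op[0]}")
    return module


def drop_nops(module):
    """Transformation pass: remove operations that do nothing."""
    return [op for op in module if op[0] != "nop"]


def fold_adds(module):
    """Transformation pass: replace 'const a, const b, add' with 'const a+b'."""
    out = []
    for op in module:
        if (op[0] == "add" and len(out) >= 2
                and out[-1][0] == "const" and out[-2][0] == "const"):
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("const", a + b))
        else:
            out.append(op)
    return out


def convert(module):
    """Final pass: turn each operation into a line of lower-level text."""
    return [" ".join(str(part) for part in op) for op in module]


def run_pipeline(module, passes):
    """Run the passes in order, feeding each result to the next pass."""
    for run_pass in passes:
        module = run_pass(module)
    return module


result = run_pipeline(
    [("const", 2), ("nop",), ("const", 3), ("add",)],
    [verify, drop_nops, fold_adds, convert],
)
```

Pass order matters even in this toy: `fold_adds` only matches two constants immediately followed by an `add`, so it finds nothing to fold unless `drop_nops` has already run.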

**Commands:**
- `flang-new -mmlir --mlir-print-ir-after-all -S src.f90` dumps the FIR code after each pass to standard error
- `flang-new -fc1 -emit-llvm src.f90` dumps the LLVM IR to src.ll

# Object code generation and linking

After the LLVM IR is created, the flang driver invokes LLVM's existing
infrastructure to generate object code and invoke a linker to create the
executable file.