==========================
Using the New Pass Manager
==========================

.. contents::
    :local:

Overview
========

For an overview of the new pass manager, see the `blog post
<https://blog.llvm.org/posts/2021-03-26-the-new-pass-manager/>`_.

Adding Passes to a Pass Manager
===============================

For how to write a new PM pass, see :doc:`this page <WritingAnLLVMNewPMPass>`.

To add a pass to a new PM pass manager, the important thing is to match the
pass type and the pass manager type. For example, a ``FunctionPassManager``
can only contain function passes:

.. code-block:: c++

  FunctionPassManager FPM;
  // InstSimplifyPass is a function pass
  FPM.addPass(InstSimplifyPass());

If you want to add a loop pass that runs on all loops in a function to a
``FunctionPassManager``, the loop pass must be wrapped in a function pass
adaptor that goes through all the loops in the function and runs the loop
pass on each one.

.. code-block:: c++

  FunctionPassManager FPM;
  // LoopRotatePass is a loop pass
  FPM.addPass(createFunctionToLoopPassAdaptor(LoopRotatePass()));

The IR hierarchy in terms of the new PM is Module -> (CGSCC ->) Function ->
Loop, where going through a CGSCC is optional.

.. code-block:: c++

  FunctionPassManager FPM;
  // loop -> function
  FPM.addPass(createFunctionToLoopPassAdaptor(LoopFooPass()));

  CGSCCPassManager CGPM;
  // loop -> function -> cgscc
  CGPM.addPass(createCGSCCToFunctionPassAdaptor(
      createFunctionToLoopPassAdaptor(LoopFooPass())));
  // function -> cgscc
  CGPM.addPass(createCGSCCToFunctionPassAdaptor(FunctionFooPass()));

  ModulePassManager MPM;
  // loop -> function -> module
  MPM.addPass(createModuleToFunctionPassAdaptor(
      createFunctionToLoopPassAdaptor(LoopFooPass())));
  // function -> module
  MPM.addPass(createModuleToFunctionPassAdaptor(FunctionFooPass()));

  // loop -> function -> cgscc -> module
  MPM.addPass(createModuleToCGSCCPassAdaptor(createCGSCCToFunctionPassAdaptor(
      createFunctionToLoopPassAdaptor(LoopFooPass()))));
  // function -> cgscc -> module
  MPM.addPass(createModuleToCGSCCPassAdaptor(
      createCGSCCToFunctionPassAdaptor(FunctionFooPass())));

A pass manager of a specific IR unit is also a pass of that kind. For
example, a ``FunctionPassManager`` is a function pass, meaning it can be
added to a ``ModulePassManager``:

.. code-block:: c++

  ModulePassManager MPM;

  FunctionPassManager FPM;
  // InstSimplifyPass is a function pass
  FPM.addPass(InstSimplifyPass());

  MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));

Generally you want to group CGSCC/function/loop passes together in a pass
manager, as opposed to adding adaptors for each pass to the containing upper
level pass manager. For example,

.. code-block:: c++

  ModulePassManager MPM;
  MPM.addPass(createModuleToFunctionPassAdaptor(FunctionPass1()));
  MPM.addPass(createModuleToFunctionPassAdaptor(FunctionPass2()));
  // M is the Module to run on, MAM is its ModuleAnalysisManager
  MPM.run(M, MAM);

will run ``FunctionPass1`` on each function in a module, then run
``FunctionPass2`` on each function in the module. In contrast,

.. code-block:: c++

  ModulePassManager MPM;

  FunctionPassManager FPM;
  FPM.addPass(FunctionPass1());
  FPM.addPass(FunctionPass2());

  MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));

will run ``FunctionPass1`` and ``FunctionPass2`` on the first function in a
module, then run both passes on the second function in the module, and so on.
This is better for cache locality around LLVM data structures. The same
applies to the other IR types, and in some cases it can even affect the
quality of optimization. For example, running all loop passes on a loop may
allow a later loop to be optimized more than if each loop pass were run
separately.
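
The difference in run order can be illustrated with a toy model that does not
use LLVM at all. This is only a sketch: ``ToyFunction``, ``ToyPass``,
``runSeparateAdaptors`` and ``runGroupedPassManager`` are hypothetical names
invented for this example, and the two loops merely mimic the iteration order
described above.

.. code-block:: c++

  #include <functional>
  #include <string>
  #include <vector>

  // Toy stand-ins, not LLVM types: a "function" is just a name, and a "pass"
  // is a callable that records that it ran on a given function.
  using ToyFunction = std::string;
  using ToyLog = std::vector<std::string>;
  using ToyPass = std::function<void(const ToyFunction &, ToyLog &)>;

  // Returns a pass that logs "<PassName>(<function>)" when it runs.
  ToyPass makeToyPass(const std::string &PassName) {
    return [PassName](const ToyFunction &F, ToyLog &Log) {
      Log.push_back(PassName + "(" + F + ")");
    };
  }

  // One adaptor per pass: each pass visits every function before the next
  // pass starts.
  ToyLog runSeparateAdaptors(const std::vector<ToyFunction> &Functions,
                             const std::vector<ToyPass> &Passes) {
    ToyLog Log;
    for (const ToyPass &P : Passes)
      for (const ToyFunction &F : Functions)
        P(F, Log);
    return Log;
  }

  // One adaptor around a grouped pass manager: every pass runs on a function
  // before moving on to the next function.
  ToyLog runGroupedPassManager(const std::vector<ToyFunction> &Functions,
                               const std::vector<ToyPass> &Passes) {
    ToyLog Log;
    for (const ToyFunction &F : Functions)
      for (const ToyPass &P : Passes)
        P(F, Log);
    return Log;
  }

For functions ``f`` and ``g`` and passes ``P1`` and ``P2``,
``runSeparateAdaptors`` records ``P1(f), P1(g), P2(f), P2(g)``, while
``runGroupedPassManager`` records ``P1(f), P2(f), P1(g), P2(g)``.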

Inserting Passes into Default Pipelines
=======================================

Rather than manually adding passes to a pass manager, the typical way of
creating a pass manager is to use a ``PassBuilder`` and call a method such as
``PassBuilder::buildPerModuleDefaultPipeline()``, which creates a typical
pipeline for a given optimization level.

Sometimes either frontends or backends will want to inject passes into the
pipeline. For example, frontends may want to add instrumentation, and target
backends may want to add passes that lower custom intrinsics. For these
cases, ``PassBuilder`` exposes callbacks that allow injecting passes into
certain parts of the pipeline. For example,

.. code-block:: c++

  PassBuilder PB;
  PB.registerPipelineStartEPCallback([&](ModulePassManager &MPM,
                                         PassBuilder::OptimizationLevel Level) {
      MPM.addPass(FooPass());
  });

will add ``FooPass`` near the very beginning of the pipeline for pass
managers created by that ``PassBuilder``. See the documentation for
``PassBuilder`` for the various places that passes can be added.
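
Putting the pieces above together, a minimal sketch of building and running a
default pipeline with one injected pass might look like the following.
``FooPass`` and ``optimizeModule`` are hypothetical names used only for
illustration, and ``O2`` is an arbitrary choice. The
``PassBuilder::OptimizationLevel`` spelling follows this document; newer LLVM
releases spell it ``OptimizationLevel``, so adjust for the release in use.

.. code-block:: c++

  #include "llvm/Analysis/CGSCCPassManager.h"
  #include "llvm/Analysis/LoopAnalysisManager.h"
  #include "llvm/IR/Module.h"
  #include "llvm/IR/PassManager.h"
  #include "llvm/Passes/PassBuilder.h"

  using namespace llvm;

  // Hypothetical module pass, defined here only so the sketch is complete.
  struct FooPass : PassInfoMixin<FooPass> {
    PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM) {
      // This sketch does not change the IR, so every analysis is preserved.
      return PreservedAnalyses::all();
    }
  };

  // Hypothetical helper: runs the default O2 pipeline, plus FooPass, on M.
  void optimizeModule(Module &M) {
    // One analysis manager per IR unit.
    LoopAnalysisManager LAM;
    FunctionAnalysisManager FAM;
    CGSCCAnalysisManager CGAM;
    ModuleAnalysisManager MAM;

    PassBuilder PB;

    // Register the callback before building the pipeline.
    PB.registerPipelineStartEPCallback(
        [](ModulePassManager &MPM, PassBuilder::OptimizationLevel Level) {
          MPM.addPass(FooPass());
        });

    // Register the analyses and the proxies between the analysis managers.
    PB.registerModuleAnalyses(MAM);
    PB.registerCGSCCAnalyses(CGAM);
    PB.registerFunctionAnalyses(FAM);
    PB.registerLoopAnalyses(LAM);
    PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

    ModulePassManager MPM =
        PB.buildPerModuleDefaultPipeline(PassBuilder::OptimizationLevel::O2);
    MPM.run(M, MAM);
  }

The callback is registered on the ``PassBuilder`` rather than on the resulting
``ModulePassManager``, so every pipeline built by ``PB`` afterwards picks it
up.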

If a ``PassBuilder`` has a corresponding ``TargetMachine`` for a backend, it
will call ``TargetMachine::registerPassBuilderCallbacks()`` to allow the
backend to inject passes into the pipeline. This is equivalent to the legacy
PM's ``TargetMachine::adjustPassManager()``.

Clang's ``BackendUtil.cpp`` shows examples of a frontend adding (mostly
sanitizer) passes to various parts of the pipeline.
``AMDGPUTargetMachine::registerPassBuilderCallbacks()`` is an example of a
backend adding passes to various parts of the pipeline.

Using Analyses
==============

LLVM provides many analyses that passes can use, such as a dominator tree.
Calculating these can be expensive, so the new pass manager has
infrastructure to cache analyses and reuse them when possible.

When a pass runs on some IR, it also receives an analysis manager which it can
query for analyses. Querying for an analysis will cause the manager to check if
it has already computed the result for the requested IR. If it already has and
the result is still valid, it will return that. Otherwise it will construct a
new result by calling the analysis's ``run()`` method, cache it, and return it.
You can also ask the analysis manager to only return an analysis if it's
already cached.
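
As a sketch of the two kinds of query, a function pass might ask for the
dominator tree as follows. ``FooFunctionPass`` is a hypothetical pass name;
``getResult()``, ``getCachedResult()`` and ``DominatorTreeAnalysis`` are the
existing LLVM names.

.. code-block:: c++

  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/PassManager.h"

  using namespace llvm;

  // Hypothetical function pass used only for illustration.
  struct FooFunctionPass : PassInfoMixin<FooFunctionPass> {
    PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {
      // Only returns a result if a valid one is already cached; otherwise
      // this is a null pointer and nothing is computed.
      DominatorTree *CachedDT = AM.getCachedResult<DominatorTreeAnalysis>(F);

      // Returns the cached result if there is one; otherwise computes the
      // dominator tree for F, caches it, and returns it.
      DominatorTree &DT = AM.getResult<DominatorTreeAnalysis>(F);

      (void)CachedDT;
      (void)DT;

      // This sketch does not change the IR, so every analysis is preserved.
      return PreservedAnalyses::all();
    }
  };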

The analysis manager only provides analysis results for the same IR type as
what the pass runs on. For example, a function pass receives an analysis
manager that only provides function-level analyses. This works for many
passes which work on a fixed scope. However, some passes want to peek up or
down the IR hierarchy. For example, an SCC pass may want to look at function
analyses for the functions inside the SCC. Or it may want to look at some
immutable global analysis. In these cases, the analysis manager can provide a
proxy to an outer or inner level analysis manager. For example, to get a
``FunctionAnalysisManager`` from a ``CGSCCAnalysisManager``, you can call

.. code-block:: c++

  FunctionAnalysisManager &FAM =
      AM.getResult<FunctionAnalysisManagerCGSCCProxy>(InitialC, CG)
          .getManager();

and use ``FAM`` as a typical ``FunctionAnalysisManager`` that a function pass
would have access to. To get access to an outer level IR analysis, you can
call

.. code-block:: c++

  const auto &MAMProxy =
      AM.getResult<ModuleAnalysisManagerCGSCCProxy>(InitialC, CG);
  FooAnalysisResult *AR = MAMProxy.getCachedResult<FooAnalysis>(M);

Getting direct access to an outer level IR analysis manager is not allowed.
This restriction keeps potential future pass concurrency in mind, for example
parallelizing function passes over different functions in a CGSCC or module.
Since passes can ask for a cached analysis result, allowing passes to trigger
outer level analysis computation could result in non-determinism if
concurrency were supported. Therefore a pass running on inner level IR cannot
change the state of outer level IR analyses. Another limitation is that outer
level IR analyses that are used must be immutable, or else they could be
invalidated by changes to inner level IR. Outer analyses unused by inner
passes can and often will be invalidated by changes to inner level IR. These
invalidations happen after the inner pass manager finishes, so accessing
mutable analyses would give invalid results.

The exception to the above is accessing function analyses in loop passes.
Loop passes inherently require modifying the function the loop is in, and
that includes some function analyses the loop analyses depend on. This rules
out future concurrency over separate loops in a function, but that's a
tradeoff due to how tightly a loop and its function are coupled. To make sure
the function analyses that loop passes use stay valid, they are manually
updated in the loop passes so that invalidation is not necessary. There is a
set of common function analyses that loop passes and analyses have access to,
which is passed into loop passes as a ``LoopStandardAnalysisResults``
parameter. Other function analyses are not accessible from loop passes.
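
A loop pass receiving these results might be sketched as follows.
``LoopFooPass`` here is a hypothetical pass that only reads the analyses; the
members shown (``DT``, ``LI``, ``SE``) are a subset of what
``LoopStandardAnalysisResults`` provides.

.. code-block:: c++

  #include "llvm/Analysis/LoopAnalysisManager.h"
  #include "llvm/Analysis/LoopInfo.h"
  #include "llvm/Analysis/ScalarEvolution.h"
  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/PassManager.h"
  #include "llvm/Transforms/Scalar/LoopPassManager.h"

  using namespace llvm;

  // Hypothetical loop pass used only for illustration.
  struct LoopFooPass : PassInfoMixin<LoopFooPass> {
    PreservedAnalyses run(Loop &L, LoopAnalysisManager &AM,
                          LoopStandardAnalysisResults &AR, LPMUpdater &U) {
      // Function-level analyses are handed in through AR rather than being
      // queried from a FunctionAnalysisManager.
      DominatorTree &DT = AR.DT;
      LoopInfo &LI = AR.LI;
      ScalarEvolution &SE = AR.SE;

      (void)DT;
      (void)LI;
      (void)SE;

      // This sketch does not change the IR. A pass that does change it is
      // responsible for keeping the analyses in AR up to date.
      return PreservedAnalyses::all();
    }
  };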

As with any caching mechanism, we need some way to tell analysis managers
when results are no longer valid. Much of the analysis manager complexity
comes from trying to invalidate as few analysis results as possible to keep
compile times as low as possible.

There are two ways to deal with potentially invalid analysis results. One is
to simply force clear the results. This should generally only be used when
the IR that the result is keyed on becomes invalid. For example, a function
is deleted, or a CGSCC has become invalid due to call graph changes.

The typical way to invalidate analysis results is for a pass to declare what
types of analyses it preserves and what types it does not. When transforming
IR, a pass can either update analyses alongside the IR transformation, or
tell the analysis manager that analyses are no longer valid and should be
invalidated. If a pass wants to keep some specific analysis up to date, such
as when updating it would be faster than invalidating and recalculating it,
the analysis itself may have methods to update it for specific
transformations, or there may be helper updaters like ``DomTreeUpdater`` for
a ``DominatorTree``. Otherwise, to mark some analysis as no longer valid, the
pass can return a ``PreservedAnalyses`` with the proper analyses invalidated.

.. code-block:: c++

  // We've made no transformations that can affect any analyses.
  return PreservedAnalyses::all();

  // We've made transformations and don't want to bother to update any analyses.
  return PreservedAnalyses::none();

  // We've specifically updated the dominator tree alongside any
  // transformations, but other analysis results may be invalid.
  PreservedAnalyses PA;
  PA.preserve<DominatorTreeAnalysis>();
  return PA;

  // We haven't made any control flow changes, so any analyses that only care
  // about the control flow are still valid.
  PreservedAnalyses PA;
  PA.preserveSet<CFGAnalyses>();
  return PA;

The pass manager will call the analysis manager's ``invalidate()`` method
with the pass's returned ``PreservedAnalyses``. This can also be done
manually within the pass:

.. code-block:: c++

  PreservedAnalyses FooModulePass::run(Module &M, ModuleAnalysisManager &AM) {
    auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();

    // Invalidate all analysis results for function F (some function in M)
    FAM.invalidate(F, PreservedAnalyses::none());

    // Invalidate all analysis results
    AM.invalidate(M, PreservedAnalyses::none());

    ...
  }

This is especially important when a pass removes a function and then adds
one. The analysis manager may store a pointer to a function that has been
deleted, and if the pass creates a new function before invalidating analysis
results, the new function may be at the same address as the old one, causing
invalid cached results. This is also useful for being more precise about
invalidation. Selectively invalidating analysis results only for functions
modified in an SCC pass can allow more analysis results to remain. But except
for complex fine-grained invalidation with inner proxies, passes should
typically just return a proper ``PreservedAnalyses`` and let the pass manager
deal with proper invalidation.

Implementing Analysis Invalidation
==================================

By default, an analysis is invalidated if ``PreservedAnalyses`` says that
analyses on the IR unit it runs on are not preserved (see
``AnalysisResultModel::invalidate()``). An analysis can implement
``invalidate()`` to be more conservative when it comes to invalidation. For
example,

.. code-block:: c++

  bool FooAnalysisResult::invalidate(Function &F, const PreservedAnalyses &PA,
                                     FunctionAnalysisManager::Invalidator &) {
    auto PAC = PA.getChecker<FooAnalysis>();
    // the default would be:
    // return !(PAC.preserved() || PAC.preservedSet<AllAnalysesOn<Function>>());
    return !(PAC.preserved() || PAC.preservedSet<AllAnalysesOn<Function>>() ||
             PAC.preservedSet<CFGAnalyses>());
  }

says that if the ``PreservedAnalyses`` specifically preserves
``FooAnalysis``, or if ``PreservedAnalyses`` preserves all analyses (implicit
in ``PAC.preserved()``), or if ``PreservedAnalyses`` preserves all function
analyses, or ``PreservedAnalyses`` preserves all analyses that only care
about the CFG, the ``FooAnalysisResult`` should not be invalidated.

If an analysis is stateless and generally shouldn't be invalidated, use the
following:

.. code-block:: c++

  bool FooAnalysisResult::invalidate(Function &F, const PreservedAnalyses &PA,
                                     FunctionAnalysisManager::Invalidator &) {
    // Check whether the analysis has been explicitly invalidated. Otherwise, it's
    // stateless and remains preserved.
    auto PAC = PA.getChecker<FooAnalysis>();
    return !PAC.preservedWhenStateless();
  }

If an analysis depends on other analyses, those analyses also need to be
checked if they are invalidated:

.. code-block:: c++

  bool FooAnalysisResult::invalidate(Function &F, const PreservedAnalyses &PA,
                                     FunctionAnalysisManager::Invalidator &Inv) {
    auto PAC = PA.getChecker<FooAnalysis>();
    if (!PAC.preserved() && !PAC.preservedSet<AllAnalysesOn<Function>>())
      return true;

    // Check transitive dependencies.
    return Inv.invalidate<BarAnalysis>(F, PA) ||
           Inv.invalidate<BazAnalysis>(F, PA);
  }

Combining invalidation and analysis manager proxies results in some
complexity. For example, when we invalidate all analyses in a module pass,
we have to make sure that we also invalidate function analyses accessible via
any existing inner proxies. The inner proxy's ``invalidate()`` first checks
if the proxy itself should be invalidated. If so, that means the proxy may
contain pointers to IR that is no longer valid, meaning that the inner proxy
needs to completely clear all relevant analysis results. Otherwise the proxy
simply forwards the invalidation to the inner analysis manager.

Generally for outer proxies, analysis results from the outer analysis manager
should be immutable, so invalidation shouldn't be a concern. However, it is
possible for some inner analysis to depend on some outer analysis, and when
the outer analysis is invalidated, we need to make sure that dependent inner
analyses are also invalidated. This actually happens with alias analysis
results. Alias analysis is a function-level analysis, but there are
module-level implementations of specific types of alias analysis. Currently
``GlobalsAA`` is the only module-level alias analysis and it generally is not
invalidated so this is not so much of a concern. See
``OuterAnalysisManagerProxy::Result::registerOuterAnalysisInvalidation()``
for more details.

Invoking ``opt``
================

To use the legacy pass manager:

.. code-block:: shell

  $ opt -enable-new-pm=0 -pass1 -pass2 /tmp/a.ll -S

This will be removed once the legacy pass manager is deprecated and removed for
the optimization pipeline.

To use the new PM:

.. code-block:: shell

  $ opt -passes='pass1,pass2' /tmp/a.ll -S

The new PM typically requires explicit pass nesting. For example, to run a
function pass, then a module pass, we need to wrap the function pass in a module
adaptor:

.. code-block:: shell

  $ opt -passes='function(no-op-function),no-op-module' /tmp/a.ll -S

A more complete example, using ``-debug-pass-manager`` to show the execution
order:

.. code-block:: shell

  $ opt -passes='no-op-module,cgscc(no-op-cgscc,function(no-op-function,loop(no-op-loop))),function(no-op-function,loop(no-op-loop))' /tmp/a.ll -S -debug-pass-manager

Improper nesting can lead to error messages such as

.. code-block:: shell

  $ opt -passes='no-op-function,no-op-module' /tmp/a.ll -S
  opt: unknown function pass 'no-op-module'

The nesting is: module (-> cgscc) -> function -> loop, where the CGSCC nesting
is optional.

There are a couple of special cases for easier typing:

* If the first pass is not a module pass, a pass manager of the first pass is
  implicitly created.

  * For example, the following are equivalent:

.. code-block:: shell

  $ opt -passes='no-op-function,no-op-function' /tmp/a.ll -S
  $ opt -passes='function(no-op-function,no-op-function)' /tmp/a.ll -S

* If there is an adaptor for a pass that lets it fit in the previous pass
  manager, that is implicitly created.

  * For example, the following are equivalent:

.. code-block:: shell

  $ opt -passes='no-op-function,no-op-loop' /tmp/a.ll -S
  $ opt -passes='no-op-function,loop(no-op-loop)' /tmp/a.ll -S

For a list of available passes and analyses, including the IR unit (module,
CGSCC, function, loop) they operate on, run

.. code-block:: shell

  $ opt --print-passes

or take a look at ``PassRegistry.def``.

To make sure an analysis named ``foo`` is available before a pass, add
``require<foo>`` to the pass pipeline. This adds a pass that simply requests
that the analysis is run. This pass is also subject to proper nesting. For
example, to make sure some function analysis is already computed for all
functions before a module pass:

.. code-block:: shell

  $ opt -passes='function(require<my-function-analysis>),my-module-pass' /tmp/a.ll -S

Status of the New and Legacy Pass Managers
==========================================

LLVM currently contains two pass managers, the legacy PM and the new PM. The
optimization pipeline (aka the middle-end) works with both the legacy PM and
the new PM, whereas the backend target-dependent code generation only works
with the legacy PM.

For the optimization pipeline, the new PM is the default PM. The legacy PM is
available for the optimization pipeline either by setting the CMake flag
``-DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=OFF`` when building LLVM, or by
various compiler/linker flags, e.g. ``-flegacy-pass-manager`` for ``clang``.

There will be efforts to deprecate and remove the legacy PM for the
optimization pipeline in the future.

Some IR passes are considered part of the backend codegen pipeline even if
they are LLVM IR passes (whereas all MIR passes are codegen passes). This
includes anything added via ``TargetPassConfig`` hooks, e.g.
``TargetPassConfig::addCodeGenPrepare()``. As mentioned before, passes added
in ``TargetMachine::adjustPassManager()`` are part of the optimization
pipeline, and should have a corresponding line in
``TargetMachine::registerPassBuilderCallbacks()``.

Currently there are efforts to make the codegen pipeline work with the new
PM.