   1 ===============================
   2 ORC Design and Implementation
   3 ===============================
   4
   5 .. contents::
   6    :local:
   7
   8 Introduction
   9 ============
  10
This document aims to provide a high-level overview of the design and
implementation of the ORC JIT APIs. Except where otherwise stated, all
discussion refers to the modern ORCv2 APIs (available since LLVM 7). Clients
wishing to transition from ORCv1 should see Section
:ref:`transitioning_orcv1_to_orcv2`.
  15
  16 Use-cases
  17 =========
  18
  19 ORC provides a modular API for building JIT compilers. There are a number
  20 of use cases for such an API. For example:
  21
1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
   compiled from a toy language: Kaleidoscope.

2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
   evaluation. In this use case, cross compilation allows expressions compiled
   in the debugger process to be executed on the debug target process, which may
   be on a different device/architecture.

3. High-performance JITs (e.g. JVMs, Julia) that want to make use of LLVM's
   optimizations within an existing JIT infrastructure.

4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.
  34
  35 By adopting a modular, library-based design we aim to make ORC useful in as many
  36 of these contexts as possible.
  37
  38 Features
  39 ========
  40
  41 ORC provides the following features:
  42
  43 **JIT-linking**
  44   ORC provides APIs to link relocatable object files (COFF, ELF, MachO) [1]_
  45   into a target process at runtime. The target process may be the same process
  46   that contains the JIT session object and jit-linker, or may be another process
  47   (even one running on a different machine or architecture) that communicates
  48   with the JIT via RPC.
  49
  50 **LLVM IR compilation**
  51   ORC provides off the shelf components (IRCompileLayer, SimpleCompiler,
  52   ConcurrentIRCompiler) that make it easy to add LLVM IR to a JIT'd process.
  53
  54 **Eager and lazy compilation**
  55   By default, ORC will compile symbols as soon as they are looked up in the JIT
  56   session object (``ExecutionSession``). Compiling eagerly by default makes it
  57   easy to use ORC as an in-memory compiler for an existing JIT (similar to how
  58   MCJIT is commonly used). However ORC also provides built-in support for lazy
  59   compilation via lazy-reexports (see :ref:`Laziness`).
  60
**Support for Custom Compilers and Program Representations**
  Clients can supply custom compilers for each symbol that they define in their
  JIT session. ORC will run the user-supplied compiler when a definition of
  a symbol is needed. ORC is actually fully language agnostic: LLVM IR is not
  treated specially, and is supported via the same wrapper mechanism (the
  ``MaterializationUnit`` class) that is used for custom compilers.
  67
**Concurrent JIT'd code** and **Concurrent Compilation**
  JIT'd code may be executed in multiple threads, may spawn new threads, and may
  re-enter ORC (e.g. to request lazy compilation) concurrently from multiple
  threads. Compilers launched by ORC can run concurrently (provided the client
  sets up an appropriate dispatcher). Built-in dependency tracking ensures that
  ORC does not release pointers to JIT'd code or data until all dependencies
  have also been JIT'd and they are safe to call or use.
  75
**Removable Code**
  Resources for JIT'd program representations can be tracked and removed (see
  `How to remove code`_).
  78
  79 **Orthogonality** and **Composability**
  80   Each of the features above can be used independently. It is possible to put
  81   ORC components together to make a non-lazy, in-process, single threaded JIT
  82   or a lazy, out-of-process, concurrent JIT, or anything in between.
  83
  84 LLJIT and LLLazyJIT
  85 ===================
  86
  87 ORC provides two basic JIT classes off-the-shelf. These are useful both as
  88 examples of how to assemble ORC components to make a JIT, and as replacements
  89 for earlier LLVM JIT APIs (e.g. MCJIT).
  90
The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
compilation of LLVM IR and linking of relocatable object files. All operations
are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
as soon as you attempt to look up its address). LLJIT is a suitable replacement
for MCJIT in most cases (note: some more advanced features, e.g.
JITEventListeners, are not supported yet).
  97
The LLLazyJIT class extends LLJIT and adds a CompileOnDemandLayer to enable
lazy compilation of LLVM IR. When an LLVM IR module is added via the
addLazyIRModule method, function bodies in that module will not be compiled
until they are first called. LLLazyJIT aims to provide a replacement for LLVM's
original (pre-MCJIT) JIT API.
 103
LLJIT and LLLazyJIT instances can be created using their respective builder
classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
module ``M`` loaded on a ThreadSafeContext ``Ctx``:
 107
 108 .. code-block:: c++
 109
  // Try to detect the host arch and construct an LLJIT instance.
  auto JIT = LLJITBuilder().create();

  // If we could not construct an instance, return an error.
  if (!JIT)
    return JIT.takeError();

  // Add the module. JIT is an Expected<> wrapping a pointer to the LLJIT
  // instance, so it is dereferenced before calling LLJIT methods.
  if (auto Err = (*JIT)->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
    return Err;

  // Look up the JIT'd code entry point.
  auto EntrySym = (*JIT)->lookup("entry");
  if (!EntrySym)
    return EntrySym.takeError();

  // Cast the entry point address to a function pointer.
  auto *Entry = (void(*)())EntrySym->getAddress();

  // Call into JIT'd code.
  Entry();
 131
 132 The builder classes provide a number of configuration options that can be
 133 specified before the JIT instance is constructed. For example:
 134
 135 .. code-block:: c++
 136
  // Build an LLLazyJIT instance that uses four worker threads for compilation,
  // and jumps to a specific error handler (rather than null) on lazy compile
  // failures.

  void handleLazyCompileFailure() {
    // JIT'd code will jump here if lazy compilation fails, giving us an
    // opportunity to exit or throw an exception into JIT'd code.
    throw JITFailed();
  }

  auto JIT = LLLazyJITBuilder()
               .setNumCompileThreads(4)
               .setLazyCompileFailureAddr(
                   pointerToJITTargetAddress(&handleLazyCompileFailure))
               .create();

  // ...
 154
 155 For users wanting to get started with LLJIT a minimal example program can be
 156 found at ``llvm/examples/HowToUseLLJIT``.
 157
 158 Design Overview
 159 ===============
 160
 161 ORC's JIT program model aims to emulate the linking and symbol resolution
 162 rules used by the static and dynamic linkers. This allows ORC to JIT
 163 arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
 164 clang) that uses constructs like symbol linkage and visibility, and weak [3]_
 165 and common symbol definitions.
 166
To see how this works, imagine a program ``myapp`` which links against a pair
of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
program might look like:
 170
 171 .. code-block:: bash
 172
  $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
  $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
  $ clang++ -o myapp main.cpp -L. -lA -lB
  $ ./myapp
 177
In ORC, this would translate into API calls on a hypothetical CXXCompileLayer
(with error checking omitted for brevity) as:
 180
 181 .. code-block:: c++
 182
  ExecutionSession ES;
  RTDyldObjectLinkingLayer ObjLinkingLayer(
      ES, []() { return std::make_unique<SectionMemoryManager>(); });
  CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);

  // Create JITDylib "A" and add code to it using the CXX layer.
  auto &LibA = ES.createJITDylib("A");
  CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
  CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));

  // Create JITDylib "B" and add code to it using the CXX layer.
  auto &LibB = ES.createJITDylib("B");
  CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
  CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));

  // Create and specify the search order for the main JITDylib. This is
  // equivalent to a "links against" relationship in a command-line link.
  auto &MainJD = ES.createJITDylib("main");
  MainJD.addToLinkOrder(LibA);
  MainJD.addToLinkOrder(LibB);
  CXXLayer.add(MainJD, MemoryBuffer::getFile("main.cpp"));

  // Look up the JIT'd main, cast it to a function pointer, then call it.
  auto MainSym = ExitOnErr(ES.lookup({&MainJD}, "main"));
  auto *Main = (int(*)(int, char*[]))MainSym.getAddress();

  int Result = Main(...);
 210
This example tells us nothing about *how* or *when* compilation will happen.
That will depend on the implementation of the hypothetical CXXCompileLayer.
The same linker-based symbol resolution rules will apply regardless of that
implementation, however. For example, if a1.cpp and a2.cpp both define a
function "foo" then ORCv2 will generate a duplicate definition error. On the
other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different
dynamic libraries may define the same symbol). If main.cpp refers to "foo", it
should bind to the definition in LibA rather than the one in LibB, since
main.cpp is part of the "main" dylib, and the main dylib links against LibA
before LibB.
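
The resolution rules above can be illustrated with a small standalone model.
The sketch below is not ORC code: ``ToyDylib`` and its members are hypothetical
names built only on the C++ standard library, and it assumes that a dylib is
searched first, followed by the dylibs in its link order.

.. code-block:: c++

  #include <map>
  #include <stdexcept>
  #include <string>
  #include <vector>

  // Toy stand-in for a JITDylib (hypothetical, not the real ORC class):
  // a symbol table plus an ordered list of other dylibs to search.
  struct ToyDylib {
    std::string Name;
    std::map<std::string, int> Symbols;      // symbol name -> fake address
    std::vector<const ToyDylib *> LinkOrder; // searched after this dylib

    // Defining the same symbol twice in one dylib is an error.
    void define(const std::string &Sym, int Addr) {
      if (!Symbols.emplace(Sym, Addr).second)
        throw std::runtime_error("duplicate definition of " + Sym + " in " +
                                 Name);
    }

    // Search this dylib first, then each dylib in link order. Returns a
    // pointer to the fake address, or nullptr if the symbol is not found.
    const int *lookup(const std::string &Sym) const {
      auto I = Symbols.find(Sym);
      if (I != Symbols.end())
        return &I->second;
      for (const ToyDylib *JD : LinkOrder) {
        auto J = JD->Symbols.find(Sym);
        if (J != JD->Symbols.end())
          return &J->second;
      }
      return nullptr;
    }
  };

In this model, defining ``foo`` in both a toy ``LibA`` and a toy ``LibB`` is
accepted, a lookup of ``foo`` from a dylib whose link order is LibA then LibB
finds the LibA definition, and a second definition of ``foo`` in LibA throws.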
 221
 222 Many JIT clients will have no need for this strict adherence to the usual
 223 ahead-of-time linking rules, and should be able to get by just fine by putting
 224 all of their code in a single JITDylib. However, clients who want to JIT code
 225 for languages/projects that traditionally rely on ahead-of-time linking (e.g.
 226 C++) will find that this feature makes life much easier.
 227
 228 Symbol lookup in ORC serves two other important functions, beyond providing
 229 addresses for symbols: (1) It triggers compilation of the symbol(s) searched for
 230 (if they have not been compiled already), and (2) it provides the
 231 synchronization mechanism for concurrent compilation. The pseudo-code for the
 232 lookup process is:
 233
 234 .. code-block:: none
 235
 236   construct a query object from a query set and query handler
 237   lock the session
 238   lodge query against requested symbols, collect required materializers (if any)
 239   unlock the session
 240   dispatch materializers (if any)
 241
 242 In this context a materializer is something that provides a working definition
 243 of a symbol upon request. Usually materializers are just wrappers for compilers,
 244 but they may also wrap a jit-linker directly (if the program representation
 245 backing the definitions is an object file), or may even be a class that writes
 246 bits directly into memory (for example, if the definitions are
 247 stubs). Materialization is the blanket term for any actions (compiling, linking,
 248 splatting bits, registering with runtimes, etc.) that are required to generate a
 249 symbol definition that is safe to call or access.
 250
 251 As each materializer completes its work it notifies the JITDylib, which in turn
 252 notifies any query objects that are waiting on the newly materialized
 253 definitions. Each query object maintains a count of the number of symbols that
 254 it is still waiting on, and once this count reaches zero the query object calls
 255 the query handler with a *SymbolMap* (a map of symbol names to addresses)
 256 describing the result. If any symbol fails to materialize the query immediately
 257 calls the query handler with an error.
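
The bookkeeping described above can be sketched as a small standalone class.
This is an illustrative, single-threaded model only: ``ToyQuery``,
``ToySymbolMap`` and the handler signature are hypothetical, and the real query
objects are synchronized through the session lock. The model tracks the set of
outstanding symbols rather than a bare count, which is equivalent for this
purpose.

.. code-block:: c++

  #include <functional>
  #include <map>
  #include <set>
  #include <string>
  #include <utility>

  using ToySymbolMap = std::map<std::string, int>; // name -> fake address

  // Toy query: waits on a set of symbols and calls its handler exactly once.
  class ToyQuery {
  public:
    using Handler = std::function<void(bool Ok, const ToySymbolMap &)>;

    ToyQuery(std::set<std::string> Symbols, Handler H)
        : Pending(std::move(Symbols)), OnComplete(std::move(H)) {}

    // Called as each symbol finishes materializing.
    void notifyResolved(const std::string &Name, int Addr) {
      if (Done || Pending.erase(Name) == 0)
        return; // already finished, or not a symbol we are waiting on
      Result[Name] = Addr;
      if (Pending.empty()) { // nothing left to wait for: report the result
        Done = true;
        OnComplete(true, Result);
      }
    }

    // Called if any symbol fails: the handler sees the error immediately.
    void notifyFailed() {
      if (Done)
        return;
      Done = true;
      OnComplete(false, ToySymbolMap());
    }

  private:
    std::set<std::string> Pending;
    ToySymbolMap Result;
    Handler OnComplete;
    bool Done = false;
  };

A query on ``foo`` and ``bar`` in this model does not call its handler after
only ``foo`` resolves. The handler runs once ``bar`` resolves too, or as soon
as ``notifyFailed`` is called.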
 258
 259 The collected materialization units are sent to the ExecutionSession to be
 260 dispatched, and the dispatch behavior can be set by the client. By default each
 261 materializer is run on the calling thread. Clients are free to create new
 262 threads to run materializers, or to send the work to a work queue for a thread
 263 pool (this is what LLJIT/LLLazyJIT do).
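
As a rough illustration of a client-configurable dispatch policy, the
standalone sketch below models a session whose default policy runs each
materializer on the calling thread. ``ToySession``, ``ToyMaterializer`` and
``setDispatcher`` are hypothetical names chosen for this example and are not
ORC APIs.

.. code-block:: c++

  #include <functional>
  #include <utility>

  // Hypothetical stand-in for a materializer: any unit of work.
  using ToyMaterializer = std::function<void()>;

  class ToySession {
  public:
    using Dispatcher = std::function<void(ToyMaterializer)>;

    // Clients may replace the dispatch policy, e.g. to push work onto a
    // queue that is drained by worker threads.
    void setDispatcher(Dispatcher D) { Dispatch = std::move(D); }

    // Hand one materializer to the current policy.
    void dispatch(ToyMaterializer M) { Dispatch(std::move(M)); }

  private:
    // Default policy: run immediately on the calling thread.
    Dispatcher Dispatch = [](ToyMaterializer M) { M(); };
  };

With the default policy, ``dispatch`` returns only after the work has run. A
client that installs a dispatcher which stores the materializer in a queue
decides when, and on which thread, the work actually happens.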
 264
 265 Top Level APIs
 266 ==============
 267
 268 Many of ORC's top-level APIs are visible in the example above:
 269
 270 - *ExecutionSession* represents the JIT'd program and provides context for the
 271   JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
 272   materializers.
 273
 274 - *JITDylibs* provide the symbol tables.
 275
 276 - *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
 277   allow clients to add uncompiled program representations supported by those
 278   compilers to JITDylibs.
 279
 280 - *ResourceTrackers* allow you to remove code.
 281
Several other important APIs are used implicitly. JIT clients need not be aware
of them, but Layer authors will use them:
 284
 285 - *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
 286   program representation (in this example, C++ source) in a MaterializationUnit,
 287   which is then stored in the JITDylib. MaterializationUnits are responsible for
 288   describing the definitions they provide, and for unwrapping the program
 289   representation and passing it back to the layer when compilation is required
 290   (this ownership shuffle makes writing thread-safe layers easier, since the
 291   ownership of the program representation will be passed back on the stack,
 292   rather than having to be fished out of a Layer member, which would require
 293   synchronization).
 294
 295 - *MaterializationResponsibility* - When a MaterializationUnit hands a program
 296   representation back to the layer it comes with an associated
 297   MaterializationResponsibility object. This object tracks the definitions
 298   that must be materialized and provides a way to notify the JITDylib once they
 299   are either successfully materialized or a failure occurs.
 300
 301 Absolute Symbols, Aliases, and Reexports
 302 ========================================
 303
 304 ORC makes it easy to define symbols with absolute addresses, or symbols that
 305 are simply aliases of other symbols:
 306
 307 Absolute Symbols
 308 ----------------
 309
 310 Absolute symbols are symbols that map directly to addresses without requiring
 311 further materialization, for example: "foo" = 0x1234. One use case for
 312 absolute symbols is allowing resolution of process symbols. E.g.
 313
.. code-block:: c++

  JD.define(absoluteSymbols(SymbolMap({
      { Mangle("printf"),
        { pointerToJITTargetAddress(&printf),
          JITSymbolFlags::Callable } }
    })));
 321
 322 With this mapping established code added to the JIT can refer to printf
 323 symbolically rather than requiring the address of printf to be "baked in".
 324 This in turn allows cached versions of the JIT'd code (e.g. compiled objects)
 325 to be re-used across JIT sessions as the JIT'd code no longer changes, only the
 326 absolute symbol definition does.
 327
 328 For process and library symbols the DynamicLibrarySearchGenerator utility (See
 329 :ref:`How to Add Process and Library Symbols to JITDylibs
 330 <ProcessAndLibrarySymbols>`) can be used to automatically build absolute
 331 symbol mappings for you. However the absoluteSymbols function is still useful
 332 for making non-global objects in your JIT visible to JIT'd code. For example,
 333 imagine that your JIT standard library needs access to your JIT object to make
 334 some calls. We could bake the address of your object into the library, but then
 335 it would need to be recompiled for each session:
 336
.. code-block:: c++

  // From standard library for JIT'd code:

  class MyJIT {
  public:
    void log(const char *Msg);
  };

  void log(const char *Msg) { ((MyJIT*)0x1234)->log(Msg); }
 347
 348 We can turn this into a symbolic reference in the JIT standard library:
 349
.. code-block:: c++

  extern MyJIT *__MyJITInstance;

  void log(const char *Msg) { __MyJITInstance->log(Msg); }
 355
 356 And then make our JIT object visible to the JIT standard library with an
 357 absolute symbol definition when the JIT is started:
 358
.. code-block:: c++

  MyJIT J = ...;

  auto &JITStdLibJD = ... ;

  JITStdLibJD.define(absoluteSymbols(SymbolMap({
      { Mangle("__MyJITInstance"),
        { pointerToJITTargetAddress(&J), JITSymbolFlags() } }
    })));
 369
 370 Aliases and Reexports
 371 ---------------------
 372
 373 Aliases and reexports allow you to define new symbols that map to existing
 374 symbols. This can be useful for changing linkage relationships between symbols
 375 across sessions without having to recompile code. For example, imagine that
 376 JIT'd code has access to a log function, ``void log(const char*)`` for which
 377 there are two implementations in the JIT standard library: ``log_fast`` and
 378 ``log_detailed``. Your JIT can choose which one of these definitions will be
 379 used when the ``log`` symbol is referenced by setting up an alias at JIT startup
 380 time:
 381
.. code-block:: c++

  auto &JITStdLibJD = ... ;

  auto LogImplementationSymbol =
    Verbose ? Mangle("log_detailed") : Mangle("log_fast");

  JITStdLibJD.define(
    symbolAliases(SymbolAliasMap({
        { Mangle("log"),
          { LogImplementationSymbol,
            JITSymbolFlags::Exported | JITSymbolFlags::Callable } }
      })));
 395
 396 The ``symbolAliases`` function allows you to define aliases within a single
 397 JITDylib. The ``reexports`` function provides the same functionality, but
 398 operates across JITDylib boundaries. E.g.
 399
.. code-block:: c++

  auto &JD1 = ... ;
  auto &JD2 = ... ;

  // Make 'bar' in JD2 an alias for 'foo' from JD1.
  JD2.define(
    reexports(JD1, SymbolAliasMap({
        { Mangle("bar"), { Mangle("foo"), JITSymbolFlags::Exported } }
      })));
 410
 411 The reexports utility can be handy for composing a single JITDylib interface by
 412 re-exporting symbols from several other JITDylibs.
 413
 414 .. _Laziness:
 415
 416 Laziness
 417 ========
 418
Laziness in ORC is provided by a utility called "lazy reexports". A lazy
reexport is similar to a regular reexport or alias: It provides a new name for
an existing symbol. Unlike regular reexports however, lookups of lazy reexports
do not trigger immediate materialization of the reexported symbol. Instead, they
only trigger materialization of a function stub. This function stub is
initialized to point at a *lazy call-through*, which provides reentry into the
JIT. If the stub is called at runtime then the lazy call-through will look up
the reexported symbol (triggering materialization for it if necessary), update
the stub (to call directly to the reexported symbol on subsequent calls), and
then return via the reexported symbol. By re-using the existing symbol lookup
mechanism, lazy reexports inherit the same concurrency guarantees: calls to lazy
reexports can be made from multiple threads concurrently, and the reexported
symbol can be in any state of compilation (uncompiled, already in the process
of being compiled, or already compiled) and the call will succeed. This allows
laziness to be safely mixed with features like remote compilation, concurrent
compilation, concurrent JIT'd code, and speculative compilation.
 435
There is one other key difference between regular reexports and lazy reexports
that some clients must be aware of: The address of a lazy reexport will be
*different* from the address of the reexported symbol (whereas a regular
reexport is guaranteed to have the same address as the reexported symbol).
Clients who care about pointer equality will generally want to use the address
of the reexport as the canonical address of the reexported symbol. This will
allow the address to be taken without forcing materialization of the reexported
symbol.
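
The stub and call-through behaviour can be modelled in a few lines of
standalone C++. This is a conceptual sketch only: ``ToyLazyStub`` is a
hypothetical class, it is single-threaded, and it uses a plain function pointer
where a real stub is patched machine code.

.. code-block:: c++

  #include <functional>
  #include <utility>

  // Toy model of a lazy reexport stub (hypothetical, not ORC's implementation).
  class ToyLazyStub {
  public:
    using Fn = int (*)(int);
    // Stands in for "look up the reexported symbol, materializing it if
    // necessary" and returns the address of the real body.
    using Materializer = std::function<Fn()>;

    explicit ToyLazyStub(Materializer M) : Materialize(std::move(M)) {}

    int call(int Arg) {
      if (!Target)              // first call: take the call-through path
        Target = Materialize(); // materialize the body and update the stub
      return Target(Arg);       // later calls go straight to the body
    }

    bool isMaterialized() const { return Target != nullptr; }

  private:
    Materializer Materialize;
    Fn Target = nullptr;
  };

Constructing a ``ToyLazyStub`` does not run the materializer. The first
``call`` runs it once, and later calls reuse the stored target. The stub object
and the body are distinct entities, which mirrors the address difference
described above.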
 443
 444 Usage example:
 445
If JITDylib ``JD`` contains definitions for symbols ``foo_body`` and
``bar_body``, we can create lazy entry points ``foo`` and ``bar`` in JITDylib
``JD2`` by calling:

.. code-block:: c++

  auto ReexportFlags = JITSymbolFlags::Exported | JITSymbolFlags::Callable;
  JD2.define(
    lazyReexports(CallThroughMgr, StubsMgr, JD,
                  SymbolAliasMap({
                    { Mangle("foo"), { Mangle("foo_body"), ReexportFlags } },
                    { Mangle("bar"), { Mangle("bar_body"), ReexportFlags } }
                  })));
 459
 460 A full example of how to use lazyReexports with the LLJIT class can be found at
 461 ``llvm/examples/OrcV2Examples/LLJITWithLazyReexports``.
 462
 463 Supporting Custom Compilers
 464 ===========================
 465
 466 TBD.
 467
 468 .. _transitioning_orcv1_to_orcv2:
 469
 470 Transitioning from ORCv1 to ORCv2
 471 =================================
 472
 473 Since LLVM 7.0, new ORC development work has focused on adding support for
 474 concurrent JIT compilation. The new APIs (including new layer interfaces and
 475 implementations, and new utilities) that support concurrency are collectively
 476 referred to as ORCv2, and the original, non-concurrent layers and utilities
 477 are now referred to as ORCv1.
 478
 479 The majority of the ORCv1 layers and utilities were renamed with a 'Legacy'
 480 prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM
 481 12.0 ORCv1 will be removed entirely.
 482
 483 Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the
 484 ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly
 485 substituted. However there are some design differences between ORCv1 and ORCv2
 486 to be aware of:
 487
  1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules
     (and other program representations, e.g. Object Files) are no longer added
     directly to JIT classes or layers. Instead, they are added to ``JITDylib``
     instances *by* layers. The ``JITDylib`` determines *where* the definitions
     reside, the layers determine *how* the definitions will be compiled.
     Linkage relationships between ``JITDylibs`` determine how inter-module
     references are resolved, and symbol resolvers are no longer used. See the
     section `Design Overview`_ for more details.
 496
 497      Unless multiple JITDylibs are needed to model linkage relationships, ORCv1
 498      clients should place all code in a single JITDylib.
 499      MCJIT clients should use LLJIT (see `LLJIT and LLLazyJIT`_), and can place
 500      code in LLJIT's default created main JITDylib (See
 501      ``LLJIT::getMainJITDylib()``).
 502
 503   2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession
 504      manages the string pool, error reporting, synchronization, and symbol
 505      lookup.
 506
 507   3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than
 508      string values in order to reduce memory overhead and improve lookup
 509      performance. See the subsection `How to manage symbol strings`_.
 510
 511   4. IR layers require ThreadSafeModule instances, rather than
 512      std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that
 513      Modules that use the same LLVMContext are not accessed concurrently.
 514      See `How to use ThreadSafeModule and ThreadSafeContext`_.
 515
  5. Symbol lookup is no longer handled by layers. Instead, there is a
     ``lookup`` method on ExecutionSession that takes a list of JITDylibs to
     scan.
 518
 519      .. code-block:: c++
 520
 521        ExecutionSession ES;
 522        JITDylib &JD1 = ...;
 523        JITDylib &JD2 = ...;
 524
 525        auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main"));
 526
 527   6. The removeModule/removeObject methods are replaced by
 528      ``ResourceTracker::remove``.
 529      See the subsection `How to remove code`_.
 530
 531 For code examples and suggestions of how to use the ORCv2 APIs, please see
 532 the section `How-tos`_.
 533
 534 How-tos
 535 =======
 536
 537 How to manage symbol strings
 538 ----------------------------
 539
 540 Symbol strings in ORC are uniqued to improve lookup performance, reduce memory
 541 overhead, and allow symbol names to function as efficient keys. To get the
 542 unique ``SymbolStringPtr`` for a string value, call the
 543 ``ExecutionSession::intern`` method:
 544
 545   .. code-block:: c++
 546
 547     ExecutionSession ES;
 548     /// ...
 549     auto MainSymbolName = ES.intern("main");
 550
 551 If you wish to perform lookup using the C/IR name of a symbol you will also
 552 need to apply the platform linker-mangling before interning the string. On
 553 Linux this mangling is a no-op, but on other platforms it usually involves
 554 adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is
 555 based on the DataLayout for the target. Given a DataLayout and an
 556 ExecutionSession, you can create a MangleAndInterner function object that
 557 will perform both jobs for you:
 558
 559   .. code-block:: c++
 560
 561     ExecutionSession ES;
 562     const DataLayout &DL = ...;
 563     MangleAndInterner Mangle(ES, DL);
 564
 565     // ...
 566
 567     // Portable IR-symbol-name lookup:
 568     auto Sym = ES.lookup({&MainJD}, Mangle("main"));
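
  To illustrate why uniqued strings make efficient keys, and how mangling
  combines with interning, here is a small standalone model. It is only a
  sketch: ``ToyStringPool`` and ``ToyMangleAndInterner`` are hypothetical, and
  the prefix is passed in directly rather than derived from a DataLayout.

  .. code-block:: c++

    #include <set>
    #include <string>
    #include <utility>

    // Toy string pool: equal strings intern to the same pointer, so symbol
    // names can be compared by pointer rather than by contents.
    class ToyStringPool {
    public:
      const std::string *intern(const std::string &S) {
        // Elements of a std::set keep a stable address once inserted.
        return &*Pool.insert(S).first;
      }

    private:
      std::set<std::string> Pool;
    };

    // Toy mangler: prepends a platform prefix (assumed here to be "_" for a
    // Darwin-like target and "" for a Linux-like target), then interns.
    class ToyMangleAndInterner {
    public:
      ToyMangleAndInterner(ToyStringPool &P, std::string Prefix)
          : Pool(P), Prefix(std::move(Prefix)) {}

      const std::string *operator()(const std::string &Name) {
        return Pool.intern(Prefix + Name);
      }

    private:
      ToyStringPool &Pool;
      std::string Prefix;
    };

  In this model, interning ``main`` twice yields the same pointer, and a
  mangler constructed with the ``_`` prefix maps ``main`` to the interned
  ``_main``.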
 569
 570 How to create JITDylibs and set up linkage relationships
 571 --------------------------------------------------------
 572
 573 In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by
 574 calling the ``ExecutionSession::createJITDylib`` method with a unique name:
 575
 576   .. code-block:: c++
 577
 578     ExecutionSession ES;
 579     auto &JD = ES.createJITDylib("libFoo.dylib");
 580
The JITDylib is owned by the ``ExecutionSession`` instance and will be freed
when it is destroyed.
 583
 584 How to remove code
 585 ------------------
 586
 587 To remove an individual module from a JITDylib it must first be added using an
 588 explicit ``ResourceTracker``. The module can then be removed by calling
 589 ``ResourceTracker::remove``:
 590
  .. code-block:: c++

    auto &JD = ... ;
    auto M = ... ;

    // createResourceTracker returns a reference-counted pointer to the tracker.
    auto RT = JD.createResourceTracker();
    Layer.add(RT, std::move(M)); // Add M to JD, tracking resources with RT

    RT->remove(); // Remove M from JD (error handling omitted).
 600
 601 Modules added directly to a JITDylib will be tracked by that JITDylib's default
 602 resource tracker.
 603
 604 All code can be removed from a JITDylib by calling ``JITDylib::clear``. This
 605 leaves the cleared JITDylib in an empty but usable state.
 606
 607 JITDylibs can be removed by calling ``ExecutionSession::removeJITDylib``. This
 608 clears the JITDylib and then puts it into a defunct state. No further operations
 609 can be performed on the JITDylib, and it will be destroyed as soon as the last
 610 handle to it is released.
 611
 612 An example of how to use the resource management APIs can be found at
 613 ``llvm/examples/OrcV2Examples/LLJITRemovableCode``.
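
The relationship between trackers, the default tracker, and ``clear`` can be
pictured with a standalone model. This is an illustrative sketch rather than
ORC code: ``ToyJITDylib`` is hypothetical, trackers are plain integer ids, and
removal is a method on the dylib instead of on the tracker.

  .. code-block:: c++

    #include <iterator>
    #include <map>
    #include <string>

    // Toy JITDylib that records which tracker owns each symbol.
    class ToyJITDylib {
    public:
      using Tracker = int; // opaque tracker id; 0 is the default tracker

      Tracker createResourceTracker() { return NextTracker++; }
      Tracker getDefaultResourceTracker() const { return 0; }

      // Definitions added without an explicit tracker use the default one.
      void define(const std::string &Sym, Tracker RT = 0) { Owner[Sym] = RT; }

      // Remove every symbol owned by the given tracker.
      void remove(Tracker RT) {
        for (auto I = Owner.begin(); I != Owner.end();)
          I = (I->second == RT) ? Owner.erase(I) : std::next(I);
      }

      // Remove everything, leaving the dylib empty but usable.
      void clear() { Owner.clear(); }

      bool contains(const std::string &Sym) const {
        return Owner.count(Sym) != 0;
      }

    private:
      std::map<std::string, Tracker> Owner;
      Tracker NextTracker = 1;
    };

Removing a tracker in this model drops only the symbols added with it.
Symbols that were added with the default tracker stay until the default
tracker is removed or ``clear`` is called.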
 614
 615
 616 How to add the support for custom program representation
 617 --------------------------------------------------------
In order to add support for a custom program representation, a custom
``MaterializationUnit`` for the program representation and a custom ``Layer``
are needed. The Layer will have two operations: ``add`` and ``emit``. The
``add`` operation takes an instance of your program representation, builds one
of your custom ``MaterializationUnits`` to hold it, then adds it to a
``JITDylib``. The ``emit`` operation takes a ``MaterializationResponsibility``
object and an instance of your program representation and materializes it,
usually by compiling it and handing the resulting object off to an
``ObjectLinkingLayer``.

Your custom ``MaterializationUnit`` will have two operations: ``materialize``
and ``discard``. The ``materialize`` function will be called for you when any
symbol provided by the unit is looked up, and it should just call the ``emit``
function on your layer, passing in the given ``MaterializationResponsibility``
and the wrapped program representation. The ``discard`` function will be called
if some weak symbol provided by your unit is not needed (because the JIT found
an overriding definition). You can use this to drop your definition early, or
just ignore it and let the linker drop the definition later.
 633
Here is an example of an ``AstLayer``:

  .. code-block:: c++

    // ... In your JIT class
    AstLayer astLayer;
    // ...

    // Forward declaration: AstMaterializationUnit refers to AstLayer, which is
    // defined below. In a real program the member functions that use AstLayer
    // must be defined after the AstLayer class.
    class AstLayer;

    class AstMaterializationUnit : public llvm::orc::MaterializationUnit {
    public:
      AstMaterializationUnit(AstLayer &l, Ast &ast)
          : llvm::orc::MaterializationUnit(l.getInterface(ast)), astLayer(l),
            ast(ast) {}

      llvm::StringRef getName() const override {
        return "AstMaterializationUnit";
      }

      void materialize(
          std::unique_ptr<llvm::orc::MaterializationResponsibility> r) override {
        astLayer.emit(std::move(r), ast);
      }

    private:
      void discard(const llvm::orc::JITDylib &jd,
                   const llvm::orc::SymbolStringPtr &sym) override {
        llvm_unreachable("functions are not overridable");
      }

      AstLayer &astLayer;
      Ast &ast;
    };

    class AstLayer {
      llvm::orc::IRLayer &baseLayer;
      llvm::orc::MangleAndInterner &mangler;

    public:
      AstLayer(llvm::orc::IRLayer &baseLayer,
               llvm::orc::MangleAndInterner &mangler)
          : baseLayer(baseLayer), mangler(mangler) {}

      llvm::Error add(llvm::orc::ResourceTrackerSP &rt, Ast &ast) {
        return rt->getJITDylib().define(
            std::make_unique<AstMaterializationUnit>(*this, ast), rt);
      }

      void emit(std::unique_ptr<llvm::orc::MaterializationResponsibility> mr,
                Ast &ast) {
        // compileAst is just a function that compiles the given AST and
        // returns an `llvm::orc::ThreadSafeModule`.
        baseLayer.emit(std::move(mr), compileAst(ast));
      }

      llvm::orc::MaterializationUnit::Interface getInterface(Ast &ast) {
        llvm::orc::SymbolFlagsMap Symbols;
        // Find all the symbols in the AST and for each of them
        // add it to the Symbols map.
        Symbols[mangler(someNameFromAST)] = llvm::JITSymbolFlags(
            llvm::JITSymbolFlags::Exported | llvm::JITSymbolFlags::Callable);
        return llvm::orc::MaterializationUnit::Interface(std::move(Symbols),
                                                         nullptr);
      }
    };
 694
Take a look at the source code of `Building A JIT's Chapter 4 <tutorial/BuildingAJIT4.html>`_ for a complete example.
 696
 697 How to use ThreadSafeModule and ThreadSafeContext
 698 -------------------------------------------------
 699
 700 ThreadSafeModule and ThreadSafeContext are wrappers around Modules and
 701 LLVMContexts respectively. A ThreadSafeModule is a pair of a
 702 std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A
 703 ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock.
 704 This design serves two purposes: providing a locking scheme and lifetime
 705 management for LLVMContexts. The ThreadSafeContext may be locked to prevent
 706 accidental concurrent access by two Modules that use the same LLVMContext.
 707 The underlying LLVMContext is freed once all ThreadSafeContext values pointing
 708 to it are destroyed, allowing the context memory to be reclaimed as soon as
 709 the Modules referring to it are destroyed.
 710
 711 ThreadSafeContexts can be explicitly constructed from a
 712 std::unique_ptr<LLVMContext>:
 713
 714   .. code-block:: c++
 715
 716     ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
 717
 718 ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module>
 719 and a ThreadSafeContext value. ThreadSafeContext values may be shared between
 720 multiple ThreadSafeModules:
 721
 722   .. code-block:: c++
 723
 724     ThreadSafeModule TSM1(
 725       std::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);
 726
 727     ThreadSafeModule TSM2(
 728       std::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);
 729
 730 Before using a ThreadSafeContext, clients should ensure that either the context
 731 is only accessible on the current thread, or that the context is locked. In the
 732 example above (where the context is never locked) we rely on the fact that
 733 ``TSM1``, ``TSM2`` and ``TSCtx`` are all created on one thread. If a context is
 734 going to be shared between threads then it must be locked before accessing or
 735 creating any Modules attached to it. E.g.
 736
 737   .. code-block:: c++
 738
 739     ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
 740
 741     ThreadPool TP(NumThreads);
 742     JITStack J;
 743
 744     for (auto &ModulePath : ModulePaths) {
 745       TP.async(
 746         [&]() {
 747           auto Lock = TSCtx.getLock();
 748           auto M = loadModuleOnContext(ModulePath, TSCtx.getContext());
 749           J.addModule(ThreadSafeModule(std::move(M), TSCtx));
 750         });
 751     }
 752
 753     TP.wait();
 754
 755 To make exclusive access to Modules easier to manage, the ThreadSafeModule class
 756 provides a convenience function, ``withModuleDo``, that implicitly (1) locks the
 757 associated context, (2) runs a given function object, (3) unlocks the context,
 758 and (4) returns the result generated by the function object. E.g.
 759
 760   .. code-block:: c++
 761
 762     ThreadSafeModule TSM = getModule(...);
 763
 764     // Count the functions in the module:
 765     size_t NumFunctionsInModule =
 766       TSM.withModuleDo(
 767         [](Module &M) { // <- Context locked before entering lambda.
 768           return M.size();
 769         } // <- Context unlocked after leaving.
 770       );
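
     The lock, run, unlock, return sequence can be modelled with the standard
     library alone. The sketch below is an illustration only: ``ToyContext``,
     ``ToyModule`` and ``ToyThreadSafeModule`` are hypothetical stand-ins, not LLVM
     types, and the real classes may differ in their details.

       .. code-block:: c++

         #include <memory>
         #include <mutex>
         #include <string>
         #include <utility>
         #include <vector>

         // Hypothetical stand-in for a context: the only state modelled is the lock.
         struct ToyContext {
           std::mutex Mutex;
         };

         // Hypothetical stand-in for a module: just a list of function names.
         struct ToyModule {
           std::vector<std::string> Functions;
         };

         class ToyThreadSafeModule {
         public:
           ToyThreadSafeModule(std::unique_ptr<ToyModule> M,
                               std::shared_ptr<ToyContext> Ctx)
               : Ctx(std::move(Ctx)), M(std::move(M)) {}

           // Lock the shared context, run F on the module, and return F's result.
           // The lock_guard releases the lock when this function returns.
           template <typename Func> decltype(auto) withModuleDo(Func &&F) {
             std::lock_guard<std::mutex> Lock(Ctx->Mutex);
             return F(*M);
           }

         private:
           // Declared before M so that the module is destroyed before this
           // module's reference to the shared context is dropped.
           std::shared_ptr<ToyContext> Ctx;
           std::unique_ptr<ToyModule> M;
         };

     Because the context is held through a shared pointer in this model, it stays
     alive until the last module referring to it has been destroyed, which mirrors
     the lifetime rule described at the start of this section.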
 771
 772 Clients wishing to maximize possibilities for concurrent compilation will want
 773 to create every new ThreadSafeModule on a new ThreadSafeContext. For this
 774 reason a convenience constructor for ThreadSafeModule is provided that implicitly
 775 constructs a new ThreadSafeContext value from a std::unique_ptr<LLVMContext>:
 776
 777   .. code-block:: c++
 778
 779     // Maximize concurrency opportunities by loading every module on a
 780     // separate context.
 781     for (const auto &IRPath : IRPaths) {
 782       auto Ctx = std::make_unique<LLVMContext>();
 783       auto M = std::make_unique<Module>("M", *Ctx);
 784       CompileLayer.add(MainJD, ThreadSafeModule(std::move(M), std::move(Ctx)));
 785     }
 786
 787 Clients who plan to run single-threaded may choose to save memory by loading
 788 all modules on the same context:
 789
 790   .. code-block:: c++
 791
 792     // Save memory by using one context for all Modules:
 793     ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
 794     for (const auto &IRPath : IRPaths) {
 795       ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx);
 796       CompileLayer.add(MainJD, std::move(TSM));
 797     }
 798
 799 .. _ProcessAndLibrarySymbols:
 800
 801 How to Add Process and Library Symbols to JITDylibs
 802 ===================================================
 803
 804 JIT'd code may need to access symbols in the host program or in supporting
 805 libraries. The best way to enable this is to reflect these symbols into your
 806 JITDylibs so that they appear the same as any other symbol defined within the
 807 execution session (i.e. they are findable via ``ExecutionSession::lookup``, and
 808 so visible to the JIT linker during linking).
 809
 810 One way to reflect external symbols is to add them manually using the
 811 absoluteSymbols function:
 812
 813   .. code-block:: c++
 814
 815     const DataLayout &DL = getDataLayout();
 816     MangleAndInterner Mangle(ES, DL);
 817
 818     auto &JD = ES.createJITDylib("main");
 819
 820     JD.define(
 821       absoluteSymbols({
 822         { Mangle("puts"), pointerToJITTargetAddress(&puts)},
 823         { Mangle("gets"), pointerToJITTargetAddress(&gets)}
 824       }));
 825
 826 Using absoluteSymbols is reasonable if the set of symbols to be reflected is
 827 small and fixed. On the other hand, if the set of symbols is large or variable
 828 it may make more sense to have the definitions added for you on demand by a
 829 *definition generator*. A definition generator is an object that can be attached
 830 to a JITDylib, receiving a callback whenever a lookup within that JITDylib fails
 831 to find one or more symbols. The definition generator is given a chance to
 832 produce a definition of the missing symbol(s) before the lookup proceeds.
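
     The callback-on-miss behaviour can be pictured with a small standalone model.
     This is a sketch under stated assumptions rather than ORC code:
     ``ToySymbolTable`` and its members are hypothetical names, and it handles one
     symbol per lookup where ORC works on sets of symbols.

       .. code-block:: c++

         #include <cstdint>
         #include <functional>
         #include <map>
         #include <string>
         #include <utility>
         #include <vector>

         // Hypothetical stand-in for a JITDylib's symbol table; not an ORC type.
         class ToySymbolTable {
         public:
           using Address = std::uint64_t;
           // A generator is handed the table and the missing name, and may add a
           // definition for that name.
           using Generator =
               std::function<void(ToySymbolTable &, const std::string &)>;

           void define(const std::string &Name, Address Addr) {
             Symbols[Name] = Addr;
           }

           void addGenerator(Generator G) { Generators.push_back(std::move(G)); }

           // Returns a pointer to the address, or nullptr if the symbol is still
           // undefined after every generator has been given a chance.
           const Address *lookup(const std::string &Name) {
             auto It = Symbols.find(Name);
             if (It != Symbols.end())
               return &It->second;
             // Generators run only when the table has no definition for Name.
             for (auto &G : Generators) {
               G(*this, Name);
               It = Symbols.find(Name);
               if (It != Symbols.end())
                 return &It->second;
             }
             return nullptr;
           }

         private:
           std::map<std::string, Address> Symbols;
           std::vector<Generator> Generators;
         };

     In this model a generator that defines ``puts`` on request is called the first
     time ``puts`` is looked up; later lookups find the cached definition and do
     not call the generator again.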
 833
 834 ORC provides the ``DynamicLibrarySearchGenerator`` utility for reflecting symbols
 835 from the process (or a specific dynamic library) for you. For example, to reflect
 836 the whole interface of a runtime library:
 837
 838   .. code-block:: c++
 839
 840     const DataLayout &DL = getDataLayout();
 841     auto &JD = ES.createJITDylib("main");
 842
 843     if (auto DLSGOrErr =
 844         DynamicLibrarySearchGenerator::Load("/path/to/lib",
 845                                             DL.getGlobalPrefix()))
 846       JD.addGenerator(std::move(*DLSGOrErr));
 847     else
 848       return DLSGOrErr.takeError();
 849
 850     // IR added to JD can now link against all symbols exported by the library
 851     // at '/path/to/lib'.
 852     CompileLayer.add(JD, loadModule(...));
 853
 854 The ``DynamicLibrarySearchGenerator`` utility can also be constructed with a
 855 filter function to restrict the set of symbols that may be reflected. For
 856 example, to expose an allowed set of symbols from the main process:
 857
 858   .. code-block:: c++
 859
 860     const DataLayout &DL = getDataLayout();
 861     MangleAndInterner Mangle(ES, DL);
 862
 863     auto &JD = ES.createJITDylib("main");
 864
 865     DenseSet<SymbolStringPtr> AllowList({
 866         Mangle("puts"),
 867         Mangle("gets")
 868       });
 869
 870     // Use GetForCurrentProcess with a predicate function that checks the
 871     // allowed list.
 872     JD.addGenerator(cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(
 873           DL.getGlobalPrefix(),
 874           [&](const SymbolStringPtr &S) { return AllowList.count(S); })));
 875
 876     // IR added to JD can now link against any symbols exported by the process
 877     // and contained in the list.
 878     CompileLayer.add(JD, loadModule(...));
 879
 880 References to process or library symbols could also be hardcoded into your IR
 881 or object files using the symbols' raw addresses; however, symbolic resolution
 882 using the JIT symbol tables should be preferred: it keeps the IR and objects
 883 readable and reusable in subsequent JIT sessions. Hardcoded addresses are
 884 difficult to read, and usually only good for one session.
 885
 886 Roadmap
 887 =======
 888
 889 ORC is still undergoing active development. Some current and planned work is
 890 listed below.
 891
 892 Current Work
 893 ------------
 894
 895 1. **TargetProcessControl: Improvements to in-tree support for out-of-process
 896    execution**
 897
 898    The ``TargetProcessControl`` API provides various operations on the JIT
 899    target process (the one which will execute the JIT'd code), including
 900    memory allocation, memory writes, function execution, and process queries
 901    (e.g. for the target triple). By targeting this API new components can be
 902    developed which will work equally well for in-process and out-of-process
 903    JITing.
 904
 905
 906 2. **ORC RPC based TargetProcessControl implementation**
 907
 908    An ORC RPC based implementation of the ``TargetProcessControl`` API is
 909    currently under development to enable easy out-of-process JITing via
 910    file descriptors / sockets.
 911
 912 3. **Core State Machine Cleanup**
 913
 914    The core ORC state machine is currently implemented between JITDylib and
 915    ExecutionSession. Methods are slowly being moved to ``ExecutionSession``. This
 916    will tidy up the code base, and also allow us to support asynchronous removal
 917    of JITDylibs (in practice deleting an associated state object in
 918    ExecutionSession and leaving the JITDylib instance in a defunct state until
 919    all references to it have been released).
 920
 921 Near Future Work
 922 ----------------
 923
 924 1. **ORC JIT Runtime Libraries**
 925
 926    We need a runtime library for JIT'd code. This would include things like
 927    TLS registration, reentry functions, registration code for language runtimes
 928    (e.g. Objective C and Swift) and other JIT specific runtime code. This should
 929    be built in a similar manner to compiler-rt (possibly even as part of it).
 930
 931 2. **Remote jit_dlopen / jit_dlclose**
 932
 933    To more fully mimic the environment that static programs operate in we would
 934    like JIT'd code to be able to "dlopen" and "dlclose" JITDylibs, running all of
 935    their initializers/deinitializers on the current thread. This would require
 936    support from the runtime library described above.
 937
 938 3. **Debugging support**
 939
 940    ORC currently supports the GDBRegistrationListener API when using RuntimeDyld
 941    as the underlying JIT linker. We will need a new solution for JITLink based
 942    platforms.
 943
 944 Further Future Work
 945 -------------------
 946
 947 1. **Speculative Compilation**
 948
 949    ORC's support for concurrent compilation allows us to easily enable
 950    *speculative* JIT compilation: compilation of code that is not needed yet,
 951    but which we have reason to believe will be needed in the future. This can be
 952    used to hide compile latency and improve JIT throughput. A proof-of-concept
 953    example of speculative compilation with ORC has already been developed (see
 954    ``llvm/examples/SpeculativeJIT``). Future work on this is likely to focus on
 955    re-using and improving existing profiling support (currently used by PGO) to
 956    feed speculation decisions, as well as built-in tools to simplify use of
 957    speculative compilation.
 958
 959 .. [1] Formats/architectures vary in terms of supported features. MachO and
 960        ELF tend to have better support than COFF. Patches very welcome!
 961
 962 .. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
 963        ``RemoteObjectServerLayer`` do not have counterparts in the new
 964        system. In the case of ``LazyEmittingLayer`` it was simply no longer
 965        needed: in ORCv2, deferring compilation until symbols are looked up is
 966        the default. The removal of ``RemoteObjectClientLayer`` and
 967        ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
 968        across processes; however, this functionality appears not to have been
 969        used.
 970
 971 .. [3] Weak definitions are currently handled correctly within dylibs, but if
 972        multiple dylibs provide a weak definition of a symbol then each will end
 973        up with its own definition (similar to how weak definitions are handled
 974        in Windows DLLs). This will be fixed in the future.