===============================
ORC Design and Implementation
===============================

This document aims to provide a high-level overview of the design and
implementation of the ORC JIT APIs. Except where otherwise stated, all
discussion applies to the design of the APIs as of LLVM Version 10 (ORCv2).
ORC provides a modular API for building JIT compilers. There is a wide range
of use cases for such an API. For example:

1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
compiled from a toy language: Kaleidoscope.

2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
evaluation. In this use case, cross compilation allows expressions compiled
in the debugger process to be executed on the debug target process, which may
be on a different device/architecture.

3. High-performance JITs (e.g. JVMs, Julia) may use ORC to make use of LLVM's
optimizations within an existing JIT infrastructure.

4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter, may be
built on top of ORC.
By adopting a modular, library-based design we aim to make ORC useful in as many
of these contexts as possible.

ORC provides the following features:
*JIT linking*
  ORC provides APIs to link relocatable object files (COFF, ELF, MachO) [1]_
  into a target process at runtime. The target process may be the same process
  that contains the JIT session object and jit-linker, or may be another process
  (even one running on a different machine or architecture) that communicates
  with the JIT via RPC.

*LLVM IR compilation*
  ORC provides off-the-shelf components (IRCompileLayer, SimpleCompiler,
  ConcurrentIRCompiler) that make it easy to add LLVM IR to a JIT'd process.
*Eager and lazy compilation*
  By default, ORC will compile symbols as soon as they are looked up in the JIT
  session object (``ExecutionSession``). Compiling eagerly by default makes it
  easy to use ORC as a simple in-memory compiler within an existing JIT
  infrastructure. However, ORC also provides support for lazy compilation via
  lazy-reexports (see :ref:`Laziness`).

*Support for Custom Compilers and Program Representations*
  Clients can supply custom compilers for each symbol that they define in their
  JIT session. ORC will run the user-supplied compiler when a definition of
  a symbol is needed. ORC is actually fully language agnostic: LLVM IR is not
  treated specially, and is supported via the same wrapper mechanism (the
  ``MaterializationUnit`` class) that is used for custom compilers.
*Concurrent JIT'd code* and *Concurrent Compilation*
  JIT'd code may spawn multiple threads, and may re-enter the JIT (e.g. for lazy
  compilation) concurrently from multiple threads. The ORC APIs also support
  running multiple compilers concurrently. Built-in dependency tracking (via the
  JIT linker) ensures that ORC does not release code for execution until it is
  safe to call.
*Orthogonality* and *Composability*
  Each of the features above can be used (or not) independently. It is possible
  to put ORC components together to make a non-lazy, in-process, single-threaded
  JIT or a lazy, out-of-process, concurrent JIT, or anything in between.
LLJIT and LLLazyJIT
===================
ORC provides two basic JIT classes off-the-shelf. These are useful both as
examples of how to assemble ORC components to make a JIT, and as replacements
for earlier LLVM JIT APIs (e.g. MCJIT).
The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
compilation of LLVM IR and linking of relocatable object files. All operations
are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
as soon as you attempt to look up its address). LLJIT is a suitable replacement
for MCJIT in most cases (note: some more advanced features, e.g.
JITEventListeners, are not supported yet).
The LLLazyJIT extends LLJIT and adds a CompileOnDemandLayer to enable lazy
compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
method, function bodies in that module will not be compiled until they are first
called. LLLazyJIT aims to provide a replacement for LLVM's original (pre-MCJIT)
JIT API.
LLJIT and LLLazyJIT instances can be created using their respective builder
classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
module ``M`` loaded on a ThreadSafeContext ``Ctx``:
// Try to detect the host arch and construct an LLJIT instance.
auto JIT = LLJITBuilder().create();

// If we could not construct an instance, return an error.
if (!JIT)
  return JIT.takeError();

// Add the module.
if (auto Err = JIT->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
  return Err;

// Look up the JIT'd code entry point.
auto EntrySym = JIT->lookup("entry");
if (!EntrySym)
  return EntrySym.takeError();

// Cast the entry point address to a function pointer.
auto *Entry = (void(*)())EntrySym.getAddress();

// Call into JIT'd code.
Entry();
The builder classes provide a number of configuration options that can be
specified before the JIT instance is constructed. For example:
// Build an LLLazyJIT instance that uses four worker threads for compilation,
// and jumps to a specific error handler (rather than null) on lazy compile
// failures:

void handleLazyCompileFailure() {
  // JIT'd code will jump here if lazy compilation fails, giving us an
  // opportunity to exit or throw an exception into JIT'd code.
}

auto JIT = LLLazyJITBuilder()
             .setNumCompileThreads(4)
             .setLazyCompileFailureAddr(
                 toJITTargetAddress(&handleLazyCompileFailure))
             .create();
For users wanting to get started with LLJIT, a minimal example program can be
found at ``llvm/examples/HowToUseLLJIT``.
Design Overview
===============
ORC's JIT'd program model aims to emulate the linking and symbol resolution
rules used by the static and dynamic linkers. This allows ORC to JIT
arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
clang) that uses constructs like symbol linkage and visibility, and weak [3]_
and common symbol definitions.

To see how this works, imagine a program ``foo`` which links against a pair
of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
program might look like:
$ clang++ -shared -o libA.dylib a1.cpp a2.cpp
$ clang++ -shared -o libB.dylib b1.cpp b2.cpp
$ clang++ -o myapp myapp.cpp -L. -lA -lB
In ORC, this would translate into API calls on a "CXXCompilingLayer" (with error
checking omitted for brevity) as:
ExecutionSession ES;
RTDyldObjectLinkingLayer ObjLinkingLayer(
    ES, []() { return std::make_unique<SectionMemoryManager>(); });
CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);
// Create JITDylib "A" and add code to it using the CXX layer.
auto &LibA = ES.createJITDylib("A");
CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));

// Create JITDylib "B" and add code to it using the CXX layer.
auto &LibB = ES.createJITDylib("B");
CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));

// Specify the search order for the main JITDylib. This is equivalent to a
// "links against" relationship in a command-line link.
ES.getMainJITDylib().setSearchOrder({{&LibA, false}, {&LibB, false}});
CXXLayer.add(ES.getMainJITDylib(), MemoryBuffer::getFile("main.cpp"));

// Look up the JIT'd main, cast it to a function pointer, then call it.
auto MainSym = ExitOnErr(ES.lookup({&ES.getMainJITDylib()}, "main"));
auto *Main = (int(*)(int, char*[]))MainSym.getAddress();

int Result = Main(...);
This example tells us nothing about *how* or *when* compilation will happen.
That will depend on the implementation of the hypothetical CXXCompilingLayer.
The same linker-based symbol resolution rules will apply regardless of that
implementation, however. For example, if a1.cpp and a2.cpp both define a
function "foo" then ORCv2 will generate a duplicate definition error. On the
other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different
dynamic libraries may define the same symbol). If main.cpp refers to "foo", it
should bind to the definition in LibA rather than the one in LibB, since
main.cpp is part of the "main" dylib, and the main dylib links against LibA
first.
Many JIT clients will have no need for this strict adherence to the usual
ahead-of-time linking rules, and should be able to get by just fine by putting
all of their code in a single JITDylib. However, clients who want to JIT code
for languages/projects that traditionally rely on ahead-of-time linking (e.g.
C++) will find that this feature makes life much easier.
Symbol lookup in ORC serves two other important functions, beyond providing
addresses for symbols: (1) It triggers compilation of the symbol(s) searched for
(if they have not been compiled already), and (2) it provides the
synchronization mechanism for concurrent compilation. The pseudo-code for the
lookup process is:

construct a query object from a query set and query handler

lodge query against requested symbols, collect required materializers (if any)

dispatch materializers (if any)
In this context a materializer is something that provides a working definition
of a symbol upon request. Usually materializers are just wrappers for compilers,
but they may also wrap a jit-linker directly (if the program representation
backing the definitions is an object file), or may even be a class that writes
bits directly into memory (for example, if the definitions are
stubs). Materialization is the blanket term for any actions (compiling, linking,
splatting bits, registering with runtimes, etc.) that are required to generate a
symbol definition that is safe to call or access.
As each materializer completes its work it notifies the JITDylib, which in turn
notifies any query objects that are waiting on the newly materialized
definitions. Each query object maintains a count of the number of symbols that
it is still waiting on, and once this count reaches zero the query object calls
the query handler with a *SymbolMap* (a map of symbol names to addresses)
describing the result. If any symbol fails to materialize the query immediately
calls the query handler with an error.
The collected materialization units are sent to the ExecutionSession to be
dispatched, and the dispatch behavior can be set by the client. By default each
materializer is run on the calling thread. Clients are free to create new
threads to run materializers, or to send the work to a work queue for a thread
pool (this is what LLJIT/LLLazyJIT do).
Many of ORC's top-level APIs are visible in the example above:

- *ExecutionSession* represents the JIT'd program and provides context for the
  JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
  materializers.
- *JITDylibs* provide the symbol tables.

- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
  allow clients to add uncompiled program representations supported by those
  compilers to JITDylibs.
Several other important APIs are used implicitly. JIT clients need not be aware
of them, but Layer authors will use them:
- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
  program representation (in this example, C++ source) in a MaterializationUnit,
  which is then stored in the JITDylib. MaterializationUnits are responsible for
  describing the definitions they provide, and for unwrapping the program
  representation and passing it back to the layer when compilation is required
  (this ownership shuffle makes writing thread-safe layers easier, since the
  ownership of the program representation will be passed back on the stack,
  rather than having to be fished out of a Layer member, which would require
  locking).
- *MaterializationResponsibility* - When a MaterializationUnit hands a program
  representation back to the layer it comes with an associated
  MaterializationResponsibility object. This object tracks the definitions
  that must be materialized and provides a way to notify the JITDylib once they
  are either successfully materialized or a failure occurs.
Absolute Symbols, Aliases, and Reexports
========================================

ORC makes it easy to define symbols with absolute addresses, or symbols that
are simply aliases of other symbols:
Absolute Symbols
----------------
Absolute symbols are symbols that map directly to addresses without requiring
further materialization, for example: "foo" = 0x1234. One use case for
absolute symbols is allowing resolution of process symbols. E.g.

JD.define(absoluteSymbols(SymbolMap({
    { Mangle("printf"),
      { pointerToJITTargetAddress(&printf),
        JITSymbolFlags::Callable } }
  })));
With this mapping established, code added to the JIT can refer to printf
symbolically rather than requiring the address of printf to be "baked in".
This in turn allows cached versions of the JIT'd code (e.g. compiled objects)
to be re-used across JIT sessions as the JIT'd code no longer changes, only the
absolute symbol definition does.
For process and library symbols the DynamicLibrarySearchGenerator utility (See
:ref:`How to Add Process and Library Symbols to JITDylibs
<ProcessAndLibrarySymbols>`) can be used to automatically build absolute
symbol mappings for you. However, the absoluteSymbols function is still useful
for making non-global objects in your JIT visible to JIT'd code. For example,
imagine that your JIT standard library needs access to your JIT object to make
some calls. We could bake the address of your object into the library, but then
it would need to be recompiled for each session:
// From standard library for JIT'd code:

class MyJIT {
public:
  void log(const char *Msg);
};

void log(const char *Msg) { ((MyJIT*)0x1234)->log(Msg); }
We can turn this into a symbolic reference in the JIT standard library:

extern MyJIT *__MyJITInstance;

void log(const char *Msg) { __MyJITInstance->log(Msg); }

And then make our JIT object visible to the JIT standard library with an
absolute symbol definition when the JIT is started:
MyJIT J;
auto &JITStdLibJD = ... ;

JITStdLibJD.define(absoluteSymbols(SymbolMap({
    { Mangle("__MyJITInstance"),
      { pointerToJITTargetAddress(&J), JITSymbolFlags() } }
  })));
Aliases and Reexports
---------------------

Aliases and reexports allow you to define new symbols that map to existing
symbols. This can be useful for changing linkage relationships between symbols
across sessions without having to recompile code. For example, imagine that
JIT'd code has access to a log function, ``void log(const char*)``, for which
there are two implementations in the JIT standard library: ``log_fast`` and
``log_detailed``. Your JIT can choose which one of these definitions will be
used when the ``log`` symbol is referenced by setting up an alias at JIT startup
time:
auto &JITStdLibJD = ... ;

auto LogImplementationSymbol =
  Verbose ? Mangle("log_detailed") : Mangle("log_fast");

JITStdLibJD.define(
  symbolAliases(SymbolAliasMap({
      { Mangle("log"),
        { LogImplementationSymbol,
          JITSymbolFlags::Exported | JITSymbolFlags::Callable } }
    })));
The ``symbolAliases`` function allows you to define aliases within a single
JITDylib. The ``reexports`` function provides the same functionality, but
operates across JITDylib boundaries. E.g.
// Make 'bar' in JD2 an alias for 'foo' from JD1.
JD2.define(
  reexports(JD1, SymbolAliasMap({
      { Mangle("bar"), { Mangle("foo"), JITSymbolFlags::Exported } }
    })));
The reexports utility can be handy for composing a single JITDylib interface by
re-exporting symbols from several other JITDylibs.
.. _Laziness:

Laziness
========
Laziness in ORC is provided by a utility called "lazy reexports". A lazy
reexport is similar to a regular reexport or alias: It provides a new name for
an existing symbol. Unlike regular reexports however, lookups of lazy reexports
do not trigger immediate materialization of the reexported symbol. Instead, they
only trigger materialization of a function stub. This function stub is
initialized to point at a *lazy call-through*, which provides reentry into the
JIT. If the stub is called at runtime then the lazy call-through will look up
the reexported symbol (triggering materialization for it if necessary), update
the stub (to call directly to the reexported symbol on subsequent calls), and
then return via the reexported symbol. By re-using the existing symbol lookup
mechanism, lazy reexports inherit the same concurrency guarantees: calls to lazy
reexports can be made from multiple threads concurrently, the reexported
symbol can be in any state of compilation (uncompiled, already in the process of
being compiled, or already compiled), and the call will still succeed. This
allows laziness to be safely mixed with features like remote compilation,
concurrent compilation, concurrent JIT'd code, and speculative compilation.
There is one other key difference between regular reexports and lazy reexports
that some clients must be aware of: The address of a lazy reexport will be
*different* from the address of the reexported symbol (whereas a regular
reexport is guaranteed to have the same address as the reexported symbol).
Clients who care about pointer equality will generally want to use the address
of the reexport as the canonical address of the reexported symbol. This will
allow the address to be taken without forcing materialization of the reexport.
If JITDylib ``JD`` contains definitions for symbols ``foo_body`` and
``bar_body``, we can create lazy entry points ``Foo`` and ``Bar`` in JITDylib
``JD2`` by calling the ``lazyReexports`` function:

auto ReexportFlags = JITSymbolFlags::Exported | JITSymbolFlags::Callable;
JD2.define(
  lazyReexports(CallThroughMgr, StubsMgr, JD,
                SymbolAliasMap({
                  { Mangle("foo"), { Mangle("foo_body"), ReexportFlags } },
                  { Mangle("bar"), { Mangle("bar_body"), ReexportFlags } }
                })));
A full example of how to use lazyReexports with the LLJIT class can be found at
``llvm_project/llvm/examples/LLJITExamples/LLJITWithLazyReexports``.

Supporting Custom Compilers
===========================
Transitioning from ORCv1 to ORCv2
=================================

Since LLVM 7.0, new ORC development work has focused on adding support for
concurrent JIT compilation. The new APIs (including new layer interfaces and
implementations, and new utilities) that support concurrency are collectively
referred to as ORCv2, and the original, non-concurrent layers and utilities
are now referred to as ORCv1.

The majority of the ORCv1 layers and utilities were renamed with a 'Legacy'
prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM
10.0 ORCv1 will be removed entirely.
Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the
ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly
substituted. However, there are some design differences between ORCv1 and ORCv2
to be aware of:
1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules
   (and other program representations, e.g. Object Files) are no longer added
   directly to JIT classes or layers. Instead, they are added to ``JITDylib``
   instances *by* layers. The ``JITDylib`` determines *where* the definitions
   reside, the layers determine *how* the definitions will be compiled.
   Linkage relationships between ``JITDylibs`` determine how inter-module
   references are resolved, and symbol resolvers are no longer used. See the
   section `Design Overview`_ for more details.

   Unless multiple JITDylibs are needed to model linkage relationships, ORCv1
   clients should place all code in the main JITDylib (returned by
   ``ExecutionSession::getMainJITDylib()``). MCJIT clients should use LLJIT
   (see `LLJIT and LLLazyJIT`_).
2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession
   manages the string pool, error reporting, synchronization, and symbol
   lookup.
3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than
   string values in order to reduce memory overhead and improve lookup
   performance. See the subsection `How to manage symbol strings`_.

4. IR layers require ThreadSafeModule instances, rather than
   std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that
   Modules that use the same LLVMContext are not accessed concurrently.
   See `How to use ThreadSafeModule and ThreadSafeContext`_.
5. Symbol lookup is no longer handled by layers. Instead, there is a
   ``lookup`` method on ``ExecutionSession`` that takes a list of JITDylibs
   to scan:

   auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main"));
6. Module removal is not yet supported. There is no equivalent of the
   layer concept removeModule/removeObject methods. Work on resource tracking
   and removal in ORCv2 is ongoing.

For code examples and suggestions of how to use the ORCv2 APIs, please see
the section `How-tos`_.
How-tos
=======
How to manage symbol strings
----------------------------

Symbol strings in ORC are uniqued to improve lookup performance, reduce memory
overhead, and allow symbol names to function as efficient keys. To get the
unique ``SymbolStringPtr`` for a string value, call the
``ExecutionSession::intern`` method:

auto MainSymbolName = ES.intern("main");
If you wish to perform lookup using the C/IR name of a symbol you will also
need to apply the platform linker-mangling before interning the string. On
Linux this mangling is a no-op, but on other platforms it usually involves
adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is
based on the DataLayout for the target. Given a DataLayout and an
ExecutionSession, you can create a MangleAndInterner function object that
will perform both jobs for you:

ExecutionSession ES;
const DataLayout &DL = ...;
MangleAndInterner Mangle(ES, DL);

// Portable IR-symbol-name lookup:
auto Sym = ES.lookup({&ES.getMainJITDylib()}, Mangle("main"));
How to create JITDylibs and set up linkage relationships
--------------------------------------------------------

In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by
calling the ``ExecutionSession::createJITDylib`` method with a unique name:

auto &JD = ES.createJITDylib("libFoo.dylib");
The JITDylib is owned by the ``ExecutionSession`` instance and will be freed
when it is destroyed.

A JITDylib representing the JIT main program is created by the ExecutionSession
by default. A reference to it can be obtained by calling
``ExecutionSession::getMainJITDylib()``:

auto &MainJD = ES.getMainJITDylib();
How to use ThreadSafeModule and ThreadSafeContext
-------------------------------------------------

ThreadSafeModule and ThreadSafeContext are wrappers around Modules and
LLVMContexts respectively. A ThreadSafeModule is a pair of a
std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A
ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock.
This design serves two purposes: providing a locking scheme and lifetime
management for LLVMContexts. The ThreadSafeContext may be locked to prevent
accidental concurrent access by two Modules that use the same LLVMContext.
The underlying LLVMContext is freed once all ThreadSafeContext values pointing
to it are destroyed, allowing the context memory to be reclaimed as soon as
the Modules referring to it are destroyed.
ThreadSafeContexts can be explicitly constructed from a
std::unique_ptr<LLVMContext>:

ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module>
and a ThreadSafeContext value. ThreadSafeContext values may be shared between
multiple ThreadSafeModules:

ThreadSafeModule TSM1(
  std::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);

ThreadSafeModule TSM2(
  std::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);
Before using a ThreadSafeContext, clients should ensure that either the context
is only accessible on the current thread, or that the context is locked. In the
example above (where the context is never locked) we rely on the fact that
``TSM1``, ``TSM2``, and ``TSCtx`` are all created on one thread. If a context is
going to be shared between threads then it must be locked before accessing
or creating any Modules attached to it. E.g.
ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

ThreadPool TP(NumThreads);
JITStack J;

for (auto &ModulePath : ModulePaths) {
  TP.async(
    [&]() {
      auto Lock = TSCtx.getLock();
      auto M = loadModuleOnContext(ModulePath, TSCtx.getContext());
      J.addModule(ThreadSafeModule(std::move(M), TSCtx));
    });
}

TP.wait();
To make exclusive access to Modules easier to manage the ThreadSafeModule class
provides a convenience function, ``withModuleDo``, that implicitly (1) locks the
associated context, (2) runs a given function object, (3) unlocks the context,
and (4) returns the result generated by the function object. E.g.
ThreadSafeModule TSM = getModule(...);

// Count the functions in the module:
size_t NumFunctionsInModule =
  TSM.withModuleDo(
    [](Module &M) { // <- Context locked before entering lambda.
      return M.size();
    } // <- Context unlocked after leaving.
  );
Clients wishing to maximize possibilities for concurrent compilation will want
to create every new ThreadSafeModule on a new ThreadSafeContext. For this
reason a convenience constructor for ThreadSafeModule is provided that
implicitly constructs a new ThreadSafeContext value from a
std::unique_ptr<LLVMContext>:
// Maximize concurrency opportunities by loading every module on a
// separate context.
for (const auto &IRPath : IRPaths) {
  auto Ctx = std::make_unique<LLVMContext>();
  auto M = std::make_unique<Module>("M", *Ctx);
  CompileLayer.add(ES.getMainJITDylib(),
                   ThreadSafeModule(std::move(M), std::move(Ctx)));
}
Clients who plan to run single-threaded may choose to save memory by loading
all modules on the same context:
// Save memory by using one context for all Modules:
ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
for (const auto &IRPath : IRPaths) {
  ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx);
  CompileLayer.add(ES.getMainJITDylib(), std::move(TSM));
}
.. _ProcessAndLibrarySymbols:

How to Add Process and Library Symbols to the JITDylibs
=======================================================
JIT'd code typically needs access to symbols in the host program or in
supporting libraries. References to process symbols can be "baked in" to code
as it is compiled by turning external references into pre-resolved integer
constants, however this ties the JIT'd code to the current process's virtual
memory layout (meaning that it cannot be cached between runs) and makes
debugging lower level program representations difficult (as all external
references are opaque integer values). A better solution is to maintain symbolic
external references and let the jit-linker bind them for you at runtime. To
allow the JIT linker to find these external definitions their addresses must
be added to a JITDylib that the JIT'd definitions link against.
Adding definitions for external symbols could be done using the absoluteSymbols
function described above:

const DataLayout &DL = getDataLayout();
MangleAndInterner Mangle(ES, DL);

auto &JD = ES.getMainJITDylib();

JD.define(absoluteSymbols(SymbolMap({
    { Mangle("puts"), pointerToJITTargetAddress(&puts) },
    { Mangle("gets"), pointerToJITTargetAddress(&gets) }
  })));
Manually adding absolute symbols for a large or changing interface is
cumbersome, however, so ORC provides an alternative to generate new definitions
on demand: *definition generators*. If a definition generator is attached to a
JITDylib, then any unsuccessful lookup on that JITDylib will fall back to
calling the definition generator, and the definition generator may choose to
generate a new definition for the missing symbols. Of particular use here is the
``DynamicLibrarySearchGenerator`` utility. This can be used to reflect the whole
exported symbol set of the process or a specific dynamic library, or a subset
of either of these determined by a predicate.
For example, to load the whole interface of a runtime library:

const DataLayout &DL = getDataLayout();
auto &JD = ES.getMainJITDylib();

JD.setGenerator(DynamicLibrarySearchGenerator::Load("/path/to/lib",
                                                    DL.getGlobalPrefix()));

// IR added to JD can now link against all symbols exported by the library
// at '/path/to/lib'.
CompileLayer.add(JD, loadModule(...));
Or, to expose a whitelisted set of symbols from the main process:

const DataLayout &DL = getDataLayout();
MangleAndInterner Mangle(ES, DL);

auto &JD = ES.getMainJITDylib();

DenseSet<SymbolStringPtr> Whitelist({
    Mangle("puts"),
    Mangle("gets")
  });

// Use GetForCurrentProcess with a predicate function that checks the
// whitelist.
JD.setGenerator(
  DynamicLibrarySearchGenerator::GetForCurrentProcess(
    DL.getGlobalPrefix(),
    [&](const SymbolStringPtr &S) { return Whitelist.count(S); }));
// IR added to JD can now link against any symbols exported by the process
// and contained in the whitelist.
CompileLayer.add(JD, loadModule(...));
Future Features
===============
TBD: Speculative compilation. Object Caches.

.. [1] Formats/architectures vary in terms of supported features. MachO and
   ELF tend to have better support than COFF. Patches very welcome!
.. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
   ``RemoteObjectServerLayer`` do not have counterparts in the new
   system. In the case of ``LazyEmittingLayer`` it was simply no longer
   needed: in ORCv2, deferring compilation until symbols are looked up is
   the default. The removal of ``RemoteObjectClientLayer`` and
   ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
   across processes, however this functionality appears not to have been
   used.
.. [3] Weak definitions are currently handled correctly within dylibs, but if
   multiple dylibs provide a weak definition of a symbol then each will end
   up with its own definition (similar to how weak definitions are handled
   in Windows DLLs). This will be fixed in the future.