===============================
ORC Design and Implementation
===============================

Introduction
============

This document aims to provide a high-level overview of the design and
implementation of the ORC JIT APIs. Except where otherwise stated, all discussion
refers to the modern ORCv2 APIs (available since LLVM 7). Clients wishing to
transition from ORCv1 should see Section :ref:`transitioning_orcv1_to_orcv2`.

Use-cases
=========

ORC provides a modular API for building JIT compilers. There are a number
of use cases for such an API. For example:

1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
   compiled from a toy language: Kaleidoscope.

2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
   evaluation. In this use case, cross compilation allows expressions compiled
   in the debugger process to be executed on the debug target process, which may
   be on a different device/architecture.

3. High-performance JITs (e.g. JVMs, Julia) that want to make use of LLVM's
   optimizations within an existing JIT infrastructure.

4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter.

By adopting a modular, library-based design we aim to make ORC useful in as many
of these contexts as possible.

Features
========

ORC provides the following features:

**JIT-linking**
  ORC provides APIs to link relocatable object files (COFF, ELF, MachO) [1]_
  into a target process at runtime. The target process may be the same process
  that contains the JIT session object and jit-linker, or may be another process
  (even one running on a different machine or architecture) that communicates
  with the JIT via RPC.

**LLVM IR compilation**
  ORC provides off-the-shelf components (IRCompileLayer, SimpleCompiler,
  ConcurrentIRCompiler) that make it easy to add LLVM IR to a JIT'd process.

**Eager and lazy compilation**
  By default, ORC will compile symbols as soon as they are looked up in the JIT
  session object (``ExecutionSession``). Compiling eagerly by default makes it
  easy to use ORC as an in-memory compiler for an existing JIT (similar to how
  MCJIT is commonly used). However, ORC also provides built-in support for lazy
  compilation via lazy-reexports (see :ref:`Laziness`).

**Support for Custom Compilers and Program Representations**
  Clients can supply custom compilers for each symbol that they define in their
  JIT session. ORC will run the user-supplied compiler when a definition of
  a symbol is needed. ORC is actually fully language agnostic: LLVM IR is not
  treated specially, and is supported via the same wrapper mechanism (the
  ``MaterializationUnit`` class) that is used for custom compilers.

**Concurrent JIT'd code** and **Concurrent Compilation**
  JIT'd code may be executed in multiple threads, may spawn new threads, and may
  re-enter ORC (e.g. to request lazy compilation) concurrently from multiple
  threads. Compilers launched by ORC can run concurrently (provided the client
  sets up an appropriate dispatcher). Built-in dependency tracking ensures that
  ORC does not release pointers to JIT'd code or data until all dependencies
  have also been JIT'd and they are safe to call or use.

**Removable Code**
  Resources for JIT'd program representations can be removed (see
  *ResourceTrackers*).

**Orthogonality** and **Composability**
  Each of the features above can be used independently. It is possible to put
  ORC components together to make a non-lazy, in-process, single threaded JIT
  or a lazy, out-of-process, concurrent JIT, or anything in between.

LLJIT and LLLazyJIT
===================

ORC provides two basic JIT classes off-the-shelf. These are useful both as
examples of how to assemble ORC components to make a JIT, and as replacements
for earlier LLVM JIT APIs (e.g. MCJIT).

The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
compilation of LLVM IR and linking of relocatable object files. All operations
are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
as soon as you attempt to look up its address). LLJIT is a suitable replacement
for MCJIT in most cases (note: some more advanced features, e.g.
JITEventListeners, are not supported yet).

The LLLazyJIT extends LLJIT and adds a CompileOnDemandLayer to enable lazy
compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
method, function bodies in that module will not be compiled until they are first
called. LLLazyJIT aims to provide a replacement of LLVM's original (pre-MCJIT)
JIT API.

LLJIT and LLLazyJIT instances can be created using their respective builder
classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
module ``M`` loaded on a ThreadSafeContext ``Ctx``:

.. code-block:: c++

  // Try to detect the host arch and construct an LLJIT instance.
  auto JIT = LLJITBuilder().create();

  // If we could not construct an instance, return an error.
  if (!JIT)
    return JIT.takeError();

  // Add the module.
  if (auto Err = (*JIT)->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
    return Err;

  // Look up the JIT'd code entry point.
  auto EntrySym = (*JIT)->lookup("entry");
  if (!EntrySym)
    return EntrySym.takeError();

  // Cast the entry point address to a function pointer.
  auto *Entry = EntrySym->toPtr<void (*)()>();

  // Call into JIT'd code.
  Entry();

The builder classes provide a number of configuration options that can be
specified before the JIT instance is constructed. For example:

.. code-block:: c++

  // Build an LLLazyJIT instance that uses four worker threads for compilation,
  // and jumps to a specific error handler (rather than null) on lazy compile
  // failures.

  void handleLazyCompileFailure() {
    // JIT'd code will jump here if lazy compilation fails, giving us an
    // opportunity to exit or throw an exception into JIT'd code.
  }

  auto JIT = LLLazyJITBuilder()
               .setNumCompileThreads(4)
               .setLazyCompileFailureAddr(
                   ExecutorAddr::fromPtr(&handleLazyCompileFailure))
               .create();

For users wanting to get started with LLJIT, a minimal example program can be
found at ``llvm/examples/HowToUseLLJIT``.

Design Overview
===============

ORC's JIT program model aims to emulate the linking and symbol resolution
rules used by the static and dynamic linkers. This allows ORC to JIT
arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
clang) that uses constructs like symbol linkage and visibility, and weak [3]_
and common symbol definitions.

To see how this works, imagine a program ``myapp`` which links against a pair
of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
program might look like:

.. code-block:: bash

  $ clang++ -shared -o libA.dylib a1.cpp a2.cpp
  $ clang++ -shared -o libB.dylib b1.cpp b2.cpp
  $ clang++ -o myapp main.cpp -L. -lA -lB

In ORC, this would translate into API calls on a hypothetical CXXCompileLayer
(with error checking omitted for brevity) as:

.. code-block:: c++

  ExecutionSession ES(cantFail(SelfExecutorProcessControl::Create()));
  RTDyldObjectLinkingLayer ObjLinkingLayer(
      ES, []() { return std::make_unique<SectionMemoryManager>(); });
  CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);

  // Create JITDylib "A" and add code to it using the CXX layer.
  auto &LibA = ES.createJITDylib("A");
  CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
  CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));

  // Create JITDylib "B" and add code to it using the CXX layer.
  auto &LibB = ES.createJITDylib("B");
  CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
  CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));

  // Create and specify the search order for the main JITDylib. This is
  // equivalent to a "links against" relationship in a command-line link.
  auto &MainJD = ES.createJITDylib("main");
  MainJD.addToLinkOrder(LibA);
  MainJD.addToLinkOrder(LibB);
  CXXLayer.add(MainJD, MemoryBuffer::getFile("main.cpp"));

  // Look up the JIT'd main, cast it to a function pointer, then call it.
  auto MainSym = ExitOnErr(ES.lookup({&MainJD}, "main"));
  auto *Main = MainSym.getAddress().toPtr<int (*)(int, char *[])>();

  int Result = Main(...);

This example tells us nothing about *how* or *when* compilation will happen.
That will depend on the implementation of the hypothetical CXXCompileLayer.
The same linker-based symbol resolution rules will apply regardless of that
implementation, however. For example, if a1.cpp and a2.cpp both define a
function "foo" then ORCv2 will generate a duplicate definition error. On the
other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different
dynamic libraries may define the same symbol). If main.cpp refers to "foo", it
should bind to the definition in LibA rather than the one in LibB, since
main.cpp is part of the "main" dylib, and the main dylib links against LibA
before LibB.

Many JIT clients will have no need for this strict adherence to the usual
ahead-of-time linking rules, and should be able to get by just fine by putting
all of their code in a single JITDylib. However, clients who want to JIT code
for languages/projects that traditionally rely on ahead-of-time linking (e.g.
C++) will find that this feature makes life much easier.

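The resolution rules described above can be sketched with a small,
self-contained model. This is an illustrative toy rather than ORC's
implementation: ``ToyDylib`` and ``toyLookup`` are hypothetical names, addresses
are plain integers, and the model assumes that a dylib is always searched before
the entries in its link order.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a JITDylib: a symbol table plus a link order.
struct ToyDylib {
  std::string Name;
  std::map<std::string, int> Symbols;      // symbol name -> fake address
  std::vector<const ToyDylib *> LinkOrder; // dylibs this one links against

  // Defining the same name twice in one dylib is a duplicate definition.
  void define(const std::string &Sym, int Addr) {
    if (!Symbols.emplace(Sym, Addr).second)
      throw std::runtime_error("duplicate definition of " + Sym + " in " + Name);
  }
};

// Search JD itself, then each dylib in its link order. The first match wins.
// Returns a pointer to the fake address, or nullptr if the symbol is missing.
const int *toyLookup(const ToyDylib &JD, const std::string &Sym) {
  std::vector<const ToyDylib *> Search(1, &JD);
  Search.insert(Search.end(), JD.LinkOrder.begin(), JD.LinkOrder.end());
  for (const ToyDylib *D : Search) {
    auto It = D->Symbols.find(Sym);
    if (It != D->Symbols.end())
      return &It->second;
  }
  return nullptr;
}
```

In this model, defining "foo" in both ``LibA`` and ``LibB`` is accepted, a
lookup from a dylib whose link order is ``{&LibA, &LibB}`` binds to the ``LibA``
definition, and a second definition of "foo" inside ``LibA`` throws.
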
Symbol lookup in ORC serves two other important functions, beyond providing
addresses for symbols: (1) It triggers compilation of the symbol(s) searched for
(if they have not been compiled already), and (2) it provides the
synchronization mechanism for concurrent compilation. The pseudo-code for the
lookup process is:

.. code-block:: none

  construct a query object from a query set and query handler
  lock the session
  lodge query against requested symbols, collect required materializers (if any)
  unlock the session
  dispatch materializers (if any)

In this context a materializer is something that provides a working definition
of a symbol upon request. Usually materializers are just wrappers for compilers,
but they may also wrap a jit-linker directly (if the program representation
backing the definitions is an object file), or may even be a class that writes
bits directly into memory (for example, if the definitions are
stubs). Materialization is the blanket term for any actions (compiling, linking,
splatting bits, registering with runtimes, etc.) that are required to generate a
symbol definition that is safe to call or access.

As each materializer completes its work it notifies the JITDylib, which in turn
notifies any query objects that are waiting on the newly materialized
definitions. Each query object maintains a count of the number of symbols that
it is still waiting on, and once this count reaches zero the query object calls
the query handler with a *SymbolMap* (a map of symbol names to addresses)
describing the result. If any symbol fails to materialize the query immediately
calls the query handler with an error.

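The counting behavior of a query object can be illustrated with a minimal
sketch. The names here (``ToyQuery``, ``ToySymbolMap``, ``notifyResolved``,
``notifyFailed``) are hypothetical and are not ORC APIs, and the sketch is
single-threaded for simplicity: it assumes the caller provides whatever locking
a real session would.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ToySymbolMap = std::map<std::string, std::uint64_t>;

// Hypothetical query object: waits on a set of symbols and calls its handler
// exactly once, either with the full result or with a failure flag.
class ToyQuery {
public:
  using Handler = std::function<void(bool Ok, const ToySymbolMap &Result)>;

  ToyQuery(const std::vector<std::string> &Names, Handler H)
      : Pending(Names.size()), OnComplete(std::move(H)) {}

  // One requested symbol has been materialized: record it and count down.
  void notifyResolved(const std::string &Name, std::uint64_t Addr) {
    if (Done)
      return;
    if (Result.emplace(Name, Addr).second && --Pending == 0)
      finish(true);
  }

  // Any failure completes the query immediately with an error.
  void notifyFailed() {
    if (!Done)
      finish(false);
  }

private:
  void finish(bool Ok) {
    Done = true;
    OnComplete(Ok, Result);
  }

  std::size_t Pending; // number of symbols still outstanding
  Handler OnComplete;
  ToySymbolMap Result;
  bool Done = false;
};
```

The handler only runs once the outstanding count reaches zero, so a caller that
requests several symbols sees a single callback covering all of them.
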
The collected materialization units are sent to the ExecutionSession to be
dispatched, and the dispatch behavior can be set by the client. By default each
materializer is run on the calling thread. Clients are free to create new
threads to run materializers, or to send the work to a work queue for a thread
pool (this is what LLJIT/LLLazyJIT do).

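The two dispatch policies mentioned above can be modeled as interchangeable
function objects. This is a sketch under stated assumptions:
``ToyTask``, ``ToyDispatcher``, ``makeInlineDispatcher`` and ``ToyWorkQueue``
are hypothetical names, and the queue is drained on the calling thread to keep
the example deterministic (a real client would hand the queue to worker
threads).

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

using ToyTask = std::function<void()>;
using ToyDispatcher = std::function<void(ToyTask)>;

// Default-style policy: run each materializer immediately on the caller.
inline ToyDispatcher makeInlineDispatcher() {
  return [](ToyTask T) { T(); };
}

// Alternative policy: park the work on a queue to be run later.
class ToyWorkQueue {
public:
  // The returned dispatcher refers to this queue, so the queue must outlive it.
  ToyDispatcher dispatcher() {
    return [this](ToyTask T) { Tasks.push_back(std::move(T)); };
  }

  // Run everything queued so far and return the number of tasks run.
  std::size_t drain() {
    std::size_t N = 0;
    while (!Tasks.empty()) {
      ToyTask T = std::move(Tasks.front());
      Tasks.pop_front();
      T();
      ++N;
    }
    return N;
  }

private:
  std::deque<ToyTask> Tasks;
};
```

Because both policies share one signature, the code that collects materializers
does not need to know whether they run inline or on a pool.
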
Top Level APIs
==============

Many of ORC's top-level APIs are visible in the example above:

- *ExecutionSession* represents the JIT'd program and provides context for the
  JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
  materializers.

- *JITDylibs* provide the symbol tables.

- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
  allow clients to add uncompiled program representations supported by those
  compilers to JITDylibs.

- *ResourceTrackers* allow you to remove code.

Several other important APIs are used implicitly. JIT clients need not be aware
of them, but Layer authors will use them:

- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
  program representation (in this example, C++ source) in a MaterializationUnit,
  which is then stored in the JITDylib. MaterializationUnits are responsible for
  describing the definitions they provide, and for unwrapping the program
  representation and passing it back to the layer when compilation is required
  (this ownership shuffle makes writing thread-safe layers easier, since the
  ownership of the program representation will be passed back on the stack,
  rather than having to be fished out of a Layer member, which would require
  synchronization).

- *MaterializationResponsibility* - When a MaterializationUnit hands a program
  representation back to the layer it comes with an associated
  MaterializationResponsibility object. This object tracks the definitions
  that must be materialized and provides a way to notify the JITDylib once they
  are either successfully materialized or a failure occurs.

Absolute Symbols, Aliases, and Reexports
========================================

ORC makes it easy to define symbols with absolute addresses, or symbols that
are simply aliases of other symbols:

Absolute Symbols
----------------

Absolute symbols are symbols that map directly to addresses without requiring
further materialization, for example: "foo" = 0x1234. One use case for
absolute symbols is allowing resolution of process symbols. E.g.

.. code-block:: c++

  JD.define(absoluteSymbols(SymbolMap({
      { Mangle("printf"),
        { ExecutorAddr::fromPtr(&printf),
          JITSymbolFlags::Callable } }
    })));

With this mapping established, code added to the JIT can refer to printf
symbolically rather than requiring the address of printf to be "baked in".
This in turn allows cached versions of the JIT'd code (e.g. compiled objects)
to be re-used across JIT sessions as the JIT'd code no longer changes, only the
absolute symbol definition does.

For process and library symbols the DynamicLibrarySearchGenerator utility (See
:ref:`How to Add Process and Library Symbols to JITDylibs
<ProcessAndLibrarySymbols>`) can be used to automatically build absolute
symbol mappings for you. However, the absoluteSymbols function is still useful
for making non-global objects in your JIT visible to JIT'd code. For example,
imagine that your JIT standard library needs access to your JIT object to make
some calls. We could bake the address of your object into the library, but then
it would need to be recompiled for each session:

.. code-block:: c++

  // From standard library for JIT'd code:
  class MyJIT {
  public:
    void log(const char *Msg);
  };
  void log(const char *Msg) { ((MyJIT*)0x1234)->log(Msg); }

We can turn this into a symbolic reference in the JIT standard library:

.. code-block:: c++

  // The symbol's address is the address of the JIT object itself.
  extern MyJIT __MyJITInstance;

  void log(const char *Msg) { __MyJITInstance.log(Msg); }

And then make our JIT object visible to the JIT standard library with an
absolute symbol definition when the JIT is started:

.. code-block:: c++

  MyJIT J = ...;

  auto &JITStdLibJD = ... ;

  JITStdLibJD.define(absoluteSymbols(SymbolMap({
      { Mangle("__MyJITInstance"),
        { ExecutorAddr::fromPtr(&J), JITSymbolFlags() } }
    })));

Aliases and Reexports
---------------------

Aliases and reexports allow you to define new symbols that map to existing
symbols. This can be useful for changing linkage relationships between symbols
across sessions without having to recompile code. For example, imagine that
JIT'd code has access to a log function, ``void log(const char*)`` for which
there are two implementations in the JIT standard library: ``log_fast`` and
``log_detailed``. Your JIT can choose which one of these definitions will be
used when the ``log`` symbol is referenced by setting up an alias at JIT startup
time:

.. code-block:: c++

  auto &JITStdLibJD = ... ;

  auto LogImplementationSymbol =
      Verbose ? Mangle("log_detailed") : Mangle("log_fast");

  JITStdLibJD.define(
      symbolAliases(SymbolAliasMap({
          { Mangle("log"),
            { LogImplementationSymbol,
              JITSymbolFlags::Exported | JITSymbolFlags::Callable } }
      })));

The ``symbolAliases`` function allows you to define aliases within a single
JITDylib. The ``reexports`` function provides the same functionality, but
operates across JITDylib boundaries. E.g.

.. code-block:: c++

  // Make 'bar' in JD2 an alias for 'foo' from JD1.
  JD2.define(
      reexports(JD1, SymbolAliasMap({
          { Mangle("bar"), { Mangle("foo"), JITSymbolFlags::Exported } }
      })));

The reexports utility can be handy for composing a single JITDylib interface by
re-exporting symbols from several other JITDylibs.

.. _Laziness:

Laziness
========

Laziness in ORC is provided by a utility called "lazy reexports". A lazy
reexport is similar to a regular reexport or alias: It provides a new name for
an existing symbol. Unlike regular reexports however, lookups of lazy reexports
do not trigger immediate materialization of the reexported symbol. Instead, they
only trigger materialization of a function stub. This function stub is
initialized to point at a *lazy call-through*, which provides reentry into the
JIT. If the stub is called at runtime then the lazy call-through will look up
the reexported symbol (triggering materialization for it if necessary), update
the stub (to call directly to the reexported symbol on subsequent calls), and
then return via the reexported symbol. By re-using the existing symbol lookup
mechanism, lazy reexports inherit the same concurrency guarantees: calls to lazy
reexports can be made from multiple threads concurrently, and the reexported
symbol can be in any state of compilation (uncompiled, already in the process of
being compiled, or already compiled) and the call will succeed. This allows
laziness to be safely mixed with features like remote compilation, concurrent
compilation, concurrent JIT'd code, and speculative compilation.

There is one other key difference between regular reexports and lazy reexports
that some clients must be aware of: The address of a lazy reexport will be
*different* from the address of the reexported symbol (whereas a regular
reexport is guaranteed to have the same address as the reexported symbol).
Clients who care about pointer equality will generally want to use the address
of the reexport as the canonical address of the reexported symbol. This will
allow the address to be taken without forcing materialization of the reexported
symbol.

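The stub and call-through mechanism described above can be modeled without any
JIT machinery. The following is a minimal, single-threaded sketch and not ORC's
implementation: ``ToyLazyJIT``, ``Stub``, ``addBody`` and ``lazyReexport`` are
hypothetical names, "compiling" is simulated by running a callback, and
functions are assumed to have the signature ``int(int)``.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

class ToyLazyJIT {
public:
  using Fn = std::function<int(int)>;

  // A stub is an indirection cell: callers always go through Target.
  struct Stub {
    Fn Target;
    int call(int Arg) {
      Fn Current = Target; // copy first: Target may be rewritten mid-call
      return Current(Arg);
    }
  };

  // Register an uncompiled body. Compile stands in for real materialization.
  void addBody(const std::string &Name, std::function<Fn()> Compile) {
    Uncompiled[Name] = std::move(Compile);
  }

  // Define StubName as a lazy entry point for BodyName. Nothing is compiled.
  Stub &lazyReexport(const std::string &StubName, const std::string &BodyName) {
    Stub &S = Stubs[StubName]; // std::map nodes have stable addresses
    S.Target = [this, &S, BodyName](int Arg) { // the "lazy call-through"
      Fn Body = materialize(BodyName); // look up, compiling if necessary
      S.Target = Body;                 // later calls go straight to the body
      return Body(Arg);                // complete the original call
    };
    return S;
  }

  int compileCount() const { return Compiles; }

private:
  Fn materialize(const std::string &Name) {
    auto It = Compiled.find(Name);
    if (It == Compiled.end()) {
      ++Compiles;
      It = Compiled.emplace(Name, Uncompiled.at(Name)()).first;
    }
    return It->second;
  }

  std::map<std::string, std::function<Fn()>> Uncompiled;
  std::map<std::string, Fn> Compiled;
  std::map<std::string, Stub> Stubs;
  int Compiles = 0;
};
```

Creating the stub leaves ``compileCount()`` at zero; the first ``call`` compiles
the body once, and later calls reuse it. The stub object is distinct from the
body, which mirrors the address difference noted above.
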
Usage example:

If JITDylib ``JD`` contains definitions for symbols ``foo_body`` and
``bar_body``, we can create lazy entry points ``foo`` and ``bar`` in JITDylib
``JD2`` by calling ``lazyReexports``:

.. code-block:: c++

  auto ReexportFlags = JITSymbolFlags::Exported | JITSymbolFlags::Callable;
  JD2.define(
      lazyReexports(CallThroughMgr, StubsMgr, JD,
                    SymbolAliasMap({
                      { Mangle("foo"), { Mangle("foo_body"), ReexportFlags } },
                      { Mangle("bar"), { Mangle("bar_body"), ReexportFlags } }
                    })));

A full example of how to use lazyReexports with the LLJIT class can be found at
``llvm/examples/OrcV2Examples/LLJITWithLazyReexports``.

Supporting Custom Compilers
===========================

.. _transitioning_orcv1_to_orcv2:

Transitioning from ORCv1 to ORCv2
=================================

Since LLVM 7.0, new ORC development work has focused on adding support for
concurrent JIT compilation. The new APIs (including new layer interfaces and
implementations, and new utilities) that support concurrency are collectively
referred to as ORCv2, and the original, non-concurrent layers and utilities
are now referred to as ORCv1.

The majority of the ORCv1 layers and utilities were renamed with a 'Legacy'
prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM
12.0 ORCv1 will be removed entirely.

Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the
ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly
substituted. However, there are some design differences between ORCv1 and ORCv2
to be aware of:

1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules
   (and other program representations, e.g. Object Files) are no longer added
   directly to JIT classes or layers. Instead, they are added to ``JITDylib``
   instances *by* layers. The ``JITDylib`` determines *where* the definitions
   reside, the layers determine *how* the definitions will be compiled.
   Linkage relationships between ``JITDylibs`` determine how inter-module
   references are resolved, and symbol resolvers are no longer used. See the
   section `Design Overview`_ for more details.

   Unless multiple JITDylibs are needed to model linkage relationships, ORCv1
   clients should place all code in a single JITDylib.
   MCJIT clients should use LLJIT (see `LLJIT and LLLazyJIT`_), and can place
   code in LLJIT's default created main JITDylib (See
   ``LLJIT::getMainJITDylib()``).

2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession
   manages the string pool, error reporting, synchronization, and symbol
   lookup.

3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than
   string values in order to reduce memory overhead and improve lookup
   performance. See the subsection `How to manage symbol strings`_.

4. IR layers require ThreadSafeModule instances, rather than
   std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that
   Modules that use the same LLVMContext are not accessed concurrently.
   See `How to use ThreadSafeModule and ThreadSafeContext`_.

5. Symbol lookup is no longer handled by layers. Instead, there is a
   ``lookup`` method on ``ExecutionSession`` that takes a list of JITDylibs to
   scan.

   .. code-block:: c++

     auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main"));

6. The removeModule/removeObject methods are replaced by
   ``ResourceTracker::remove``.
   See the subsection `How to remove code`_.

For code examples and suggestions of how to use the ORCv2 APIs, please see
the section `How-tos`_.

How-tos
=======

How to manage symbol strings
----------------------------

Symbol strings in ORC are uniqued to improve lookup performance, reduce memory
overhead, and allow symbol names to function as efficient keys. To get the
unique ``SymbolStringPtr`` for a string value, call the
``ExecutionSession::intern`` method:

.. code-block:: c++

  auto MainSymbolName = ES.intern("main");

If you wish to perform lookup using the C/IR name of a symbol you will also
need to apply the platform linker-mangling before interning the string. On
Linux this mangling is a no-op, but on other platforms it usually involves
adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is
based on the DataLayout for the target. Given a DataLayout and an
ExecutionSession, you can create a MangleAndInterner function object that
will perform both jobs for you:

.. code-block:: c++

  const DataLayout &DL = ...;
  MangleAndInterner Mangle(ES, DL);

  // Portable IR-symbol-name lookup:
  auto Sym = ES.lookup({&MainJD}, Mangle("main"));

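To illustrate why uniqued strings make efficient keys, and how mangling
composes with interning, here is a self-contained toy model. It is a sketch
only: ``ToyStringPool`` and ``ToyMangleAndInterner`` are hypothetical names, and
the mangling is reduced to an optional one-character global prefix, which is an
assumption that covers the Linux and Darwin cases mentioned above but not every
platform.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical string pool: interning returns a stable pointer, so two symbol
// names are equal exactly when their pointers are equal.
class ToyStringPool {
public:
  const std::string *intern(const std::string &S) {
    return &*Pool.insert(S).first; // std::set elements never move
  }
  std::size_t size() const { return Pool.size(); }

private:
  std::set<std::string> Pool;
};

// Hypothetical mangle-and-intern helper: prepend the global prefix (if any),
// then intern the mangled name.
class ToyMangleAndInterner {
public:
  ToyMangleAndInterner(ToyStringPool &Pool, char GlobalPrefix)
      : Pool(Pool), GlobalPrefix(GlobalPrefix) {}

  const std::string *operator()(const std::string &IRName) const {
    std::string Mangled;
    if (GlobalPrefix != '\0')
      Mangled += GlobalPrefix;
    Mangled += IRName;
    return Pool.intern(Mangled);
  }

private:
  ToyStringPool &Pool;
  char GlobalPrefix;
};
```

With a prefix of ``'_'`` the name "main" interns as "_main"; with no prefix it
interns unchanged. Repeated requests for the same mangled name return the same
pointer, so comparisons reduce to pointer equality.
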
How to create JITDylibs and set up linkage relationships
--------------------------------------------------------

In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by
calling the ``ExecutionSession::createJITDylib`` method with a unique name:

.. code-block:: c++

  auto &JD = ES.createJITDylib("libFoo.dylib");

The JITDylib is owned by the ``ExecutionSession`` instance and will be freed
when it is destroyed.

How to remove code
------------------

To remove an individual module from a JITDylib it must first be added using an
explicit ``ResourceTracker``. The module can then be removed by calling
``ResourceTracker::remove``:

.. code-block:: c++

  auto &JD = ... ;
  auto M = ... ;

  auto RT = JD.createResourceTracker();
  Layer.add(RT, std::move(M)); // Add M to JD, tracking resources with RT

  RT->remove(); // Remove M from JD.

Modules added directly to a JITDylib will be tracked by that JITDylib's default
resource tracker.

All code can be removed from a JITDylib by calling ``JITDylib::clear``. This
leaves the cleared JITDylib in an empty but usable state.

JITDylibs can be removed by calling ``ExecutionSession::removeJITDylib``. This
clears the JITDylib and then puts it into a defunct state. No further operations
can be performed on the JITDylib, and it will be destroyed as soon as the last
handle to it is released.

An example of how to use the resource management APIs can be found at
``llvm/examples/OrcV2Examples/LLJITRemovableCode``.

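The removal rules in this section can be summarized with a small bookkeeping
model. This is an illustrative sketch, not ORC's implementation:
``ToyTrackedDylib`` and its members are hypothetical, trackers are reduced to
integer ids, and the model only records which tracker each symbol is charged to.

```cpp
#include <cstddef>
#include <map>
#include <string>

class ToyTrackedDylib {
public:
  using TrackerId = int;
  static const TrackerId DefaultTracker = 0;

  // Hand out a fresh tracker id for code that should be removable on its own.
  TrackerId createResourceTracker() { return NextId++; }

  // Define Name, charging it to RT (the default tracker if none is given).
  void define(const std::string &Name, TrackerId RT = DefaultTracker) {
    Owner[Name] = RT;
  }

  // Remove every symbol charged to RT and return how many were removed.
  std::size_t remove(TrackerId RT) {
    std::size_t N = 0;
    for (auto It = Owner.begin(); It != Owner.end();) {
      if (It->second == RT) {
        It = Owner.erase(It);
        ++N;
      } else {
        ++It;
      }
    }
    return N;
  }

  // Remove everything, leaving the dylib empty but still usable.
  void clear() { Owner.clear(); }

  bool contains(const std::string &Name) const { return Owner.count(Name) != 0; }

private:
  std::map<std::string, TrackerId> Owner; // symbol name -> owning tracker
  TrackerId NextId = 1;
};
```

Removing a tracker only affects the symbols added through it; symbols charged to
the default tracker survive until ``clear`` is called, after which new
definitions can still be added.
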
How to add the support for custom program representation
--------------------------------------------------------

In order to add the support for a custom program representation, a custom
``MaterializationUnit`` for the program representation, and a custom ``Layer``
are needed. The Layer will have two operations: ``add`` and ``emit``. The
``add`` operation takes an instance of your program representation, builds one
of your custom ``MaterializationUnits`` to hold it, then adds it to a
``JITDylib``. The ``emit`` operation takes a ``MaterializationResponsibility``
object and an instance of your program representation and materializes it,
usually by compiling it and handing the resulting object off to an
``ObjectLinkingLayer``.

Your custom ``MaterializationUnit`` will have two operations: ``materialize``
and ``discard``. The ``materialize`` function will be called for you when any
symbol provided by the unit is looked up, and it should just call the ``emit``
function on your layer, passing in the given ``MaterializationResponsibility``
and the wrapped program representation. The ``discard`` function will be called
if some weak symbol provided by your unit is not needed (because the JIT found
an overriding definition). You can use this to drop your definition early, or
just ignore it and let the linker drop the definition later.

Here is an example of an ASTLayer:

.. code-block:: c++

  // ... In your JIT class
  AstLayer astLayer;
  // ...

  class AstMaterializationUnit : public orc::MaterializationUnit {
  public:
    AstMaterializationUnit(AstLayer &l, Ast &ast)
        : llvm::orc::MaterializationUnit(l.getInterface(ast)), astLayer(l),
          ast(ast) {}

    llvm::StringRef getName() const override {
      return "AstMaterializationUnit";
    }

    void materialize(std::unique_ptr<orc::MaterializationResponsibility> r) override {
      astLayer.emit(std::move(r), ast);
    }

  private:
    void discard(const llvm::orc::JITDylib &jd, const llvm::orc::SymbolStringPtr &sym) override {
      llvm_unreachable("functions are not overridable");
    }

    AstLayer &astLayer;
    Ast &ast;
  };

  class AstLayer {
    llvm::orc::IRLayer &baseLayer;
    llvm::orc::MangleAndInterner &mangler;

  public:
    AstLayer(llvm::orc::IRLayer &baseLayer, llvm::orc::MangleAndInterner &mangler)
        : baseLayer(baseLayer), mangler(mangler) {}

    llvm::Error add(llvm::orc::ResourceTrackerSP &rt, Ast &ast) {
      return rt->getJITDylib().define(std::make_unique<AstMaterializationUnit>(*this, ast), rt);
    }

    void emit(std::unique_ptr<orc::MaterializationResponsibility> mr, Ast &ast) {
      // compileAst is just a function that compiles the given AST and returns
      // a `llvm::orc::ThreadSafeModule`
      baseLayer.emit(std::move(mr), compileAst(ast));
    }

    llvm::orc::MaterializationUnit::Interface getInterface(Ast &ast) {
      SymbolFlagsMap Symbols;
      // Find all the symbols in the AST and for each of them
      // add it to the Symbols map.
      Symbols[mangler(someNameFromAST)] =
          JITSymbolFlags(JITSymbolFlags::Exported | JITSymbolFlags::Callable);
      return MaterializationUnit::Interface(std::move(Symbols), nullptr);
    }
  };

Take a look at the source code of `Building A JIT's Chapter 4 <tutorial/BuildingAJIT4.html>`_ for a complete example.

How to use ThreadSafeModule and ThreadSafeContext
-------------------------------------------------

ThreadSafeModule and ThreadSafeContext are wrappers around Modules and
LLVMContexts respectively. A ThreadSafeModule is a pair of a
std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A
ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock.
This design serves two purposes: providing a locking scheme and lifetime
management for LLVMContexts. The ThreadSafeContext may be locked to prevent
accidental concurrent access by two Modules that use the same LLVMContext.
The underlying LLVMContext is freed once all ThreadSafeContext values pointing
to it are destroyed, allowing the context memory to be reclaimed as soon as
the Modules referring to it are destroyed.

ThreadSafeContexts can be explicitly constructed from a
std::unique_ptr<LLVMContext>:

.. code-block:: c++

  ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module>
and a ThreadSafeContext value. ThreadSafeContext values may be shared between
multiple ThreadSafeModules:

.. code-block:: c++

  ThreadSafeModule TSM1(
      std::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);

  ThreadSafeModule TSM2(
      std::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);

Before using a ThreadSafeContext, clients should ensure that either the context
is only accessible on the current thread, or that the context is locked. In the
example above (where the context is never locked) we rely on the fact that
``TSM1``, ``TSM2``, and ``TSCtx`` are all created on one thread. If a context is
going to be shared between threads then it must be locked before accessing
or creating any Modules attached to it. E.g.

.. code-block:: c++

  ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

  ThreadPool TP(NumThreads);
  JITStack J;

  for (auto &ModulePath : ModulePaths) {
    TP.async([&]() {
      auto Lock = TSCtx.getLock();
      auto M = loadModuleOnContext(ModulePath, TSCtx.getContext());
      J.addModule(ThreadSafeModule(std::move(M), TSCtx));
    });
  }

  TP.wait();

To make exclusive access to Modules easier to manage the ThreadSafeModule class
provides a convenience function, ``withModuleDo``, that implicitly (1) locks the
associated context, (2) runs a given function object, (3) unlocks the context,
and (4) returns the result generated by the function object. E.g.

.. code-block:: c++

  ThreadSafeModule TSM = getModule(...);

  size_t NumFunctionsInModule =
      TSM.withModuleDo(
          [](Module &M) { // <- Context locked before entering lambda.
            return M.size();
          } // <- Context unlocked after leaving.
      );

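The four steps performed by ``withModuleDo`` follow the familiar scoped-lock
pattern. The sketch below models that pattern with standard library types only.
It is an assumption-laden toy rather than the LLVM classes: ``ToyContext``,
``ToyModule`` and ``ToyThreadSafeModule`` are hypothetical names, and a module
is reduced to a list of function names.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical context: the lock that guards everything created on it.
struct ToyContext {
  std::mutex Lock;
};

// Hypothetical module: just a list of function names.
struct ToyModule {
  std::vector<std::string> Functions;
};

// A module paired with a (possibly shared) context.
class ToyThreadSafeModule {
public:
  ToyThreadSafeModule(std::unique_ptr<ToyModule> M,
                      std::shared_ptr<ToyContext> Ctx)
      : M(std::move(M)), Ctx(std::move(Ctx)) {}

  // (1) lock the context, (2) run F on the module, (3) unlock when the guard
  // goes out of scope, and (4) return whatever F returned.
  template <typename Func>
  auto withModuleDo(Func &&F) -> decltype(F(std::declval<ToyModule &>())) {
    std::lock_guard<std::mutex> Guard(Ctx->Lock);
    return F(*M);
  }

private:
  std::unique_ptr<ToyModule> M;
  std::shared_ptr<ToyContext> Ctx; // keeps the context alive with the module
};
```

Holding the context through a ``std::shared_ptr`` also models the lifetime rule
described earlier: the context is released when the last module referring to it
is destroyed.
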
Clients wishing to maximize possibilities for concurrent compilation will want
to create every new ThreadSafeModule on a new ThreadSafeContext. For this
reason a convenience constructor for ThreadSafeModule is provided that implicitly
constructs a new ThreadSafeContext value from a std::unique_ptr<LLVMContext>:

.. code-block:: c++

  // Maximize concurrency opportunities by loading every module on a
  // separate context.
  for (const auto &IRPath : IRPaths) {
    auto Ctx = std::make_unique<LLVMContext>();
    auto M = std::make_unique<Module>("M", *Ctx);
    CompileLayer.add(MainJD, ThreadSafeModule(std::move(M), std::move(Ctx)));
  }

Clients who plan to run single-threaded may choose to save memory by loading
all modules on the same context:

.. code-block:: c++

  // Save memory by using one context for all Modules:
  ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
  for (const auto &IRPath : IRPaths) {
    ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx);
    CompileLayer.add(MainJD, std::move(TSM));
  }

799 .. _ProcessAndLibrarySymbols:
801 How to Add Process and Library Symbols to JITDylibs
802 ===================================================
804 JIT'd code may need to access symbols in the host program or in supporting
805 libraries. The best way to enable this is to reflect these symbols into your
806 JITDylibs so that they appear the same as any other symbol defined within the
807 execution session (i.e. they are findable via `ExecutionSession::lookup`, and
808 so visible to the JIT linker during linking).
810 One way to reflect external symbols is to add them manually using the
811 absoluteSymbols function:
815 const DataLayout &DL = getDataLayout();
816 MangleAndInterner Mangle(ES, DL);
818 auto &JD = ES.createJITDylib("main");
822 { Mangle("puts"), ExecutorAddr::fromPtr(&puts)},
823 { Mangle("gets"), ExecutorAddr::fromPtr(&getS)}
826 Using absoluteSymbols is reasonable if the set of symbols to be reflected is
827 small and fixed. On the other hand, if the set of symbols is large or variable
828 it may make more sense to have the definitions added for you on demand by a
*definition generator*. A definition generator is an object that can be attached
830 to a JITDylib, receiving a callback whenever a lookup within that JITDylib fails
831 to find one or more symbols. The definition generator is given a chance to
832 produce a definition of the missing symbol(s) before the lookup proceeds.
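
The callback flow described above can be illustrated with a small, self-contained model. The sketch below is not ORC code: ``ToyDylib``, ``Generator``, and ``makeTableGenerator`` are hypothetical names invented for illustration, addresses are plain integers, and each lookup handles a single name. It only shows the order of events: a lookup misses, each attached generator gets a chance to add a definition, and the lookup then proceeds.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a JITDylib's symbol table. None of these names are ORC APIs.
class ToyDylib {
public:
  using Address = std::uint64_t;
  // A generator receives the names that were not found and may define them.
  using Generator =
      std::function<void(ToyDylib &, const std::vector<std::string> &)>;

  void define(const std::string &Name, Address Addr) { Symbols[Name] = Addr; }
  void addGenerator(Generator G) { Generators.push_back(std::move(G)); }

  // Look up Name. On a miss, give each generator in turn a chance to define
  // it before the lookup proceeds.
  std::optional<Address> lookup(const std::string &Name) {
    auto It = Symbols.find(Name);
    if (It == Symbols.end()) {
      for (auto &G : Generators) {
        G(*this, {Name});
        It = Symbols.find(Name);
        if (It != Symbols.end())
          break; // A generator supplied the definition.
      }
    }
    if (It == Symbols.end())
      return std::nullopt; // No definition and no generator could help.
    return It->second;
  }

private:
  std::map<std::string, Address> Symbols;
  std::vector<Generator> Generators;
};

// Hypothetical generator backed by a fixed table of "host" symbols.
inline ToyDylib::Generator
makeTableGenerator(std::map<std::string, ToyDylib::Address> Host) {
  return [Host = std::move(Host)](ToyDylib &JD,
                                  const std::vector<std::string> &Missing) {
    for (const auto &Name : Missing) {
      auto It = Host.find(Name);
      if (It != Host.end())
        JD.define(Name, It->second); // Reflect the host symbol on demand.
    }
  };
}
```

In this model, a name present only in the generator's table resolves on first lookup and is then cached in the symbol table, while a name unknown to every generator still fails to resolve.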
834 ORC provides the ``DynamicLibrarySearchGenerator`` utility for reflecting symbols
835 from the process (or a specific dynamic library) for you. For example, to reflect
836 the whole interface of a runtime library:
840 const DataLayout &DL = getDataLayout();
841 auto &JD = ES.createJITDylib("main");
if (auto DLSGOrErr =
      DynamicLibrarySearchGenerator::Load("/path/to/lib",
                                          DL.getGlobalPrefix()))
  JD.addGenerator(std::move(*DLSGOrErr));
else
  return DLSGOrErr.takeError();
850 // IR added to JD can now link against all symbols exported by the library
851 // at '/path/to/lib'.
852 CompileLayer.add(JD, loadModule(...));
854 The ``DynamicLibrarySearchGenerator`` utility can also be constructed with a
855 filter function to restrict the set of symbols that may be reflected. For
856 example, to expose an allowed set of symbols from the main process:
860 const DataLayout &DL = getDataLayout();
861 MangleAndInterner Mangle(ES, DL);
863 auto &JD = ES.createJITDylib("main");
DenseSet<SymbolStringPtr> AllowList({
    Mangle("puts"),
    Mangle("gets")
  });
// Use GetForCurrentProcess with a predicate function that checks the
// allow list.
872 JD.addGenerator(cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(
873 DL.getGlobalPrefix(),
874 [&](const SymbolStringPtr &S) { return AllowList.count(S); })));
876 // IR added to JD can now link against any symbols exported by the process
877 // and contained in the list.
878 CompileLayer.add(JD, loadModule(...));
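
The effect of the filter can be modelled without any LLVM types. The following is a minimal sketch under stated assumptions, not the ``DynamicLibrarySearchGenerator`` implementation: ``ToyResolver``, ``ToyFilter``, and ``makeFilteredResolver`` are hypothetical names, and the process's exported symbols are represented by an ordinary map. It shows that a symbol is reflected only if it is both exported by the process and accepted by the predicate.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <utility>

// Hypothetical types for illustration only; none of these are ORC APIs.
using ToyAddress = std::uint64_t;
using ToyResolver =
    std::function<std::optional<ToyAddress>(const std::string &)>;
using ToyFilter = std::function<bool(const std::string &)>;

// Wrap a table of process symbols so that only names accepted by Allow can
// be resolved. An empty Allow means "no filtering".
inline ToyResolver
makeFilteredResolver(std::map<std::string, ToyAddress> Process,
                     ToyFilter Allow) {
  return [Process = std::move(Process), Allow = std::move(Allow)](
             const std::string &Name) -> std::optional<ToyAddress> {
    if (Allow && !Allow(Name))
      return std::nullopt; // Rejected by the filter: never reflected.
    auto It = Process.find(Name);
    if (It == Process.end())
      return std::nullopt; // Allowed, but not exported by the process.
    return It->second;
  };
}
```

With an allow list of ``puts`` and ``gets`` and a process that exports ``puts`` and ``system``, only ``puts`` resolves: ``system`` is exported but filtered out, and ``gets`` is allowed but not exported.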
880 References to process or library symbols could also be hardcoded into your IR
or object files using the symbols' raw addresses. However, symbolic resolution
882 using the JIT symbol tables should be preferred: it keeps the IR and objects
883 readable and reusable in subsequent JIT sessions. Hardcoded addresses are
884 difficult to read, and usually only good for one session.
ORC is still undergoing active development. Some current and future work is listed below.
1. **TargetProcessControl: Improvements to in-tree support for out-of-process execution**
898 The ``TargetProcessControl`` API provides various operations on the JIT
899 target process (the one which will execute the JIT'd code), including
900 memory allocation, memory writes, function execution, and process queries
(e.g. for the target triple). By targeting this API, new components can be
developed which will work equally well for in-process and out-of-process JITs.
906 2. **ORC RPC based TargetProcessControl implementation**
908 An ORC RPC based implementation of the ``TargetProcessControl`` API is
909 currently under development to enable easy out-of-process JITing via
910 file descriptors / sockets.
912 3. **Core State Machine Cleanup**
914 The core ORC state machine is currently implemented between JITDylib and
ExecutionSession. Methods are slowly being moved to ``ExecutionSession``. This
916 will tidy up the code base, and also allow us to support asynchronous removal
917 of JITDylibs (in practice deleting an associated state object in
918 ExecutionSession and leaving the JITDylib instance in a defunct state until
919 all references to it have been released).
924 1. **ORC JIT Runtime Libraries**
926 We need a runtime library for JIT'd code. This would include things like
927 TLS registration, reentry functions, registration code for language runtimes
(e.g. Objective-C and Swift) and other JIT-specific runtime code. This should
929 be built in a similar manner to compiler-rt (possibly even as part of it).
931 2. **Remote jit_dlopen / jit_dlclose**
933 To more fully mimic the environment that static programs operate in we would
934 like JIT'd code to be able to "dlopen" and "dlclose" JITDylibs, running all of
935 their initializers/deinitializers on the current thread. This would require
936 support from the runtime library described above.
938 3. **Debugging support**
940 ORC currently supports the GDBRegistrationListener API when using RuntimeDyld
as the underlying JIT linker. We will need a new solution for JITLink-based platforms.
947 1. **Speculative Compilation**
949 ORC's support for concurrent compilation allows us to easily enable
950 *speculative* JIT compilation: compilation of code that is not needed yet,
951 but which we have reason to believe will be needed in the future. This can be
952 used to hide compile latency and improve JIT throughput. A proof-of-concept
953 example of speculative compilation with ORC has already been developed (see
954 ``llvm/examples/SpeculativeJIT``). Future work on this is likely to focus on
955 re-using and improving existing profiling support (currently used by PGO) to
956 feed speculation decisions, as well as built-in tools to simplify use of
957 speculative compilation.
959 .. [1] Formats/architectures vary in terms of supported features. MachO and
960 ELF tend to have better support than COFF. Patches very welcome!
962 .. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
963 ``RemoteObjectServerLayer`` do not have counterparts in the new
964 system. In the case of ``LazyEmittingLayer`` it was simply no longer
965 needed: in ORCv2, deferring compilation until symbols are looked up is
966 the default. The removal of ``RemoteObjectClientLayer`` and
967 ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
across processes; however, this functionality appears not to have been used.
971 .. [3] Weak definitions are currently handled correctly within dylibs, but if
972 multiple dylibs provide a weak definition of a symbol then each will end
973 up with its own definition (similar to how weak definitions are handled
974 in Windows DLLs). This will be fixed in the future.