===============================
ORC Design and Implementation
===============================

This document aims to provide a high-level overview of the design and
implementation of the ORC JIT APIs. Except where otherwise stated, all
discussion applies to the design of the APIs as of LLVM Version 10 (ORCv2).
ORC provides a modular API for building JIT compilers. There is a wide range
of use cases for such an API. For example:

1. The LLVM tutorials use a simple ORC-based JIT class to execute expressions
compiled from a toy language: Kaleidoscope.

2. The LLVM debugger, LLDB, uses a cross-compiling JIT for expression
evaluation. In this use case, cross compilation allows expressions compiled
in the debugger process to be executed on the debug target process, which may
be on a different device/architecture.

3. High-performance JITs (e.g. JVMs, Julia) may use ORC to make use of LLVM's
optimizations within an existing JIT infrastructure.

4. Interpreters and REPLs, e.g. Cling (C++) and the Swift interpreter, may be
built on top of ORC.
By adopting a modular, library-based design we aim to make ORC useful in as many
of these contexts as possible.

ORC provides the following features:
*JIT linking*
  ORC provides APIs to link relocatable object files (COFF, ELF, MachO) [1]_
  into a target process at runtime. The target process may be the same process
  that contains the JIT session object and jit-linker, or may be another process
  (even one running on a different machine or architecture) that communicates
  with the JIT via RPC.

*LLVM IR compilation*
  ORC provides off-the-shelf components (IRCompileLayer, SimpleCompiler,
  ConcurrentIRCompiler) that make it easy to add LLVM IR to a JIT'd process.
*Eager and lazy compilation*
  By default, ORC will compile symbols as soon as they are looked up in the JIT
  session object (``ExecutionSession``). Compiling eagerly by default makes it
  easy to use ORC as a simple in-memory compiler within an existing JIT
  infrastructure. However, ORC also provides support for lazy compilation via
  lazy-reexports (see :ref:`Laziness`).

*Support for Custom Compilers and Program Representations*
  Clients can supply custom compilers for each symbol that they define in their
  JIT session. ORC will run the user-supplied compiler when a definition of
  a symbol is needed. ORC is actually fully language agnostic: LLVM IR is not
  treated specially, and is supported via the same wrapper mechanism (the
  ``MaterializationUnit`` class) that is used for custom compilers.
*Concurrent JIT'd code* and *Concurrent Compilation*
  JIT'd code may spawn multiple threads, and may re-enter the JIT (e.g. for lazy
  compilation) concurrently from multiple threads. The ORC APIs also support
  running multiple compilers concurrently. Built-in dependency tracking (via the
  JIT linker) ensures that ORC does not release code for execution until it is
  safe to call.
*Orthogonality* and *Composability*
  Each of the features above can be used (or not) independently. It is possible
  to put ORC components together to make a non-lazy, in-process, single-threaded
  JIT or a lazy, out-of-process, concurrent JIT, or anything in between.
LLJIT and LLLazyJIT
===================
ORC provides two basic JIT classes off-the-shelf. These are useful both as
examples of how to assemble ORC components to make a JIT, and as replacements
for earlier LLVM JIT APIs (e.g. MCJIT).
The LLJIT class uses an IRCompileLayer and RTDyldObjectLinkingLayer to support
compilation of LLVM IR and linking of relocatable object files. All operations
are performed eagerly on symbol lookup (i.e. a symbol's definition is compiled
as soon as you attempt to look up its address). LLJIT is a suitable replacement
for MCJIT in most cases (note: some more advanced features, e.g.
JITEventListeners, are not supported yet).
The LLLazyJIT extends LLJIT and adds a CompileOnDemandLayer to enable lazy
compilation of LLVM IR. When an LLVM IR module is added via the addLazyIRModule
method, function bodies in that module will not be compiled until they are first
called. LLLazyJIT aims to provide a replacement for LLVM's original (pre-MCJIT)
JIT API.
LLJIT and LLLazyJIT instances can be created using their respective builder
classes: LLJITBuilder and LLLazyJITBuilder. For example, assuming you have a
module ``M`` loaded on a ThreadSafeContext ``Ctx``:
// Try to detect the host arch and construct an LLJIT instance.
auto JIT = LLJITBuilder().create();

// If we could not construct an instance, return an error.
if (!JIT)
  return JIT.takeError();

// Add the module.
if (auto Err = JIT->addIRModule(ThreadSafeModule(std::move(M), Ctx)))
  return Err;

// Look up the JIT'd code entry point.
auto EntrySym = JIT->lookup("entry");
if (!EntrySym)
  return EntrySym.takeError();

// Cast the entry point address to a function pointer.
auto *Entry = (void(*)())EntrySym.getAddress();

// Call into JIT'd code.
Entry();
The builder classes provide a number of configuration options that can be
specified before the JIT instance is constructed. For example:
// Build an LLLazyJIT instance that uses four worker threads for compilation,
// and jumps to a specific error handler (rather than null) on lazy compile
// failures:

void handleLazyCompileFailure() {
  // JIT'd code will jump here if lazy compilation fails, giving us an
  // opportunity to exit or throw an exception into JIT'd code.
}

auto JIT = LLLazyJITBuilder()
             .setNumCompileThreads(4)
             .setLazyCompileFailureAddr(
                 toJITTargetAddress(&handleLazyCompileFailure))
             .create();
For users wanting to get started with LLJIT, a minimal example program can be
found at ``llvm/examples/HowToUseLLJIT``.
Design Overview
===============
ORC's JIT'd program model aims to emulate the linking and symbol resolution
rules used by the static and dynamic linkers. This allows ORC to JIT
arbitrary LLVM IR, including IR produced by an ordinary static compiler (e.g.
clang) that uses constructs like symbol linkage and visibility, and weak [3]_
and common symbol definitions.

To see how this works, imagine a program ``foo`` which links against a pair
of dynamic libraries: ``libA`` and ``libB``. On the command line, building this
program might look like:
$ clang++ -shared -o libA.dylib a1.cpp a2.cpp
$ clang++ -shared -o libB.dylib b1.cpp b2.cpp
$ clang++ -o myapp myapp.cpp -L. -lA -lB
In ORC, this would translate into API calls on a "CXXCompilingLayer" (with error
checking omitted for brevity) as:
ExecutionSession ES;
RTDyldObjectLinkingLayer ObjLinkingLayer(
    ES, []() { return std::make_unique<SectionMemoryManager>(); });
CXXCompileLayer CXXLayer(ES, ObjLinkingLayer);
// Create JITDylib "A" and add code to it using the CXX layer.
auto &LibA = ES.createJITDylib("A");
CXXLayer.add(LibA, MemoryBuffer::getFile("a1.cpp"));
CXXLayer.add(LibA, MemoryBuffer::getFile("a2.cpp"));

// Create JITDylib "B" and add code to it using the CXX layer.
auto &LibB = ES.createJITDylib("B");
CXXLayer.add(LibB, MemoryBuffer::getFile("b1.cpp"));
CXXLayer.add(LibB, MemoryBuffer::getFile("b2.cpp"));

// Specify the search order for the main JITDylib. This is equivalent to a
// "links against" relationship in a command-line link.
ES.getMainJITDylib().setSearchOrder({{&LibA, false}, {&LibB, false}});
CXXLayer.add(ES.getMainJITDylib(), MemoryBuffer::getFile("main.cpp"));

// Look up the JIT'd main, cast it to a function pointer, then call it.
auto MainSym = ExitOnErr(ES.lookup({&ES.getMainJITDylib()}, "main"));
auto *Main = (int(*)(int, char*[]))MainSym.getAddress();

int Result = Main(...);
This example tells us nothing about *how* or *when* compilation will happen.
That will depend on the implementation of the hypothetical CXXCompilingLayer.
The same linker-based symbol resolution rules will apply regardless of that
implementation, however. For example, if a1.cpp and a2.cpp both define a
function "foo" then ORCv2 will generate a duplicate definition error. On the
other hand, if a1.cpp and b1.cpp both define "foo" there is no error (different
dynamic libraries may define the same symbol). If main.cpp refers to "foo", it
should bind to the definition in LibA rather than the one in LibB, since
main.cpp is part of the "main" dylib, and the main dylib links against LibA
first.
Many JIT clients will have no need for this strict adherence to the usual
ahead-of-time linking rules, and should be able to get by just fine by putting
all of their code in a single JITDylib. However, clients who want to JIT code
for languages/projects that traditionally rely on ahead-of-time linking (e.g.
C++) will find that this feature makes life much easier.
Symbol lookup in ORC serves two other important functions, beyond providing
addresses for symbols: (1) It triggers compilation of the symbol(s) searched for
(if they have not been compiled already), and (2) it provides the
synchronization mechanism for concurrent compilation. The pseudo-code for the
lookup process is:

construct a query object from a query set and query handler

lodge query against requested symbols, collect required materializers (if any)

dispatch materializers (if any)
In this context a materializer is something that provides a working definition
of a symbol upon request. Usually materializers are just wrappers for compilers,
but they may also wrap a jit-linker directly (if the program representation
backing the definitions is an object file), or may even be a class that writes
bits directly into memory (for example, if the definitions are
stubs). Materialization is the blanket term for any actions (compiling, linking,
splatting bits, registering with runtimes, etc.) that are required to generate a
symbol definition that is safe to call or access.
As each materializer completes its work it notifies the JITDylib, which in turn
notifies any query objects that are waiting on the newly materialized
definitions. Each query object maintains a count of the number of symbols that
it is still waiting on, and once this count reaches zero the query object calls
the query handler with a *SymbolMap* (a map of symbol names to addresses)
describing the result. If any symbol fails to materialize the query immediately
calls the query handler with an error.
The collected materialization units are sent to the ExecutionSession to be
dispatched, and the dispatch behavior can be set by the client. By default each
materializer is run on the calling thread. Clients are free to create new
threads to run materializers, or to send the work to a work queue for a thread
pool (this is what LLJIT/LLLazyJIT do).
Many of ORC's top-level APIs are visible in the example above:

- *ExecutionSession* represents the JIT'd program and provides context for the
  JIT: It contains the JITDylibs, error reporting mechanisms, and dispatches the
  materializers.
- *JITDylibs* provide the symbol tables.

- *Layers* (ObjLinkingLayer and CXXLayer) are wrappers around compilers and
  allow clients to add uncompiled program representations supported by those
  compilers to JITDylibs.
Several other important APIs are used implicitly. JIT clients need not be aware
of them, but Layer authors will use them:
- *MaterializationUnit* - When XXXLayer::add is invoked it wraps the given
  program representation (in this example, C++ source) in a MaterializationUnit,
  which is then stored in the JITDylib. MaterializationUnits are responsible for
  describing the definitions they provide, and for unwrapping the program
  representation and passing it back to the layer when compilation is required
  (this ownership shuffle makes writing thread-safe layers easier, since the
  ownership of the program representation will be passed back on the stack,
  rather than having to be fished out of a Layer member, which would require
  locking).
- *MaterializationResponsibility* - When a MaterializationUnit hands a program
  representation back to the layer it comes with an associated
  MaterializationResponsibility object. This object tracks the definitions
  that must be materialized and provides a way to notify the JITDylib once they
  are either successfully materialized or a failure occurs.
Absolute Symbols, Aliases, and Reexports
========================================

ORC makes it easy to define symbols with absolute addresses, or symbols that
are simply aliases of other symbols:
Absolute Symbols
----------------
Absolute symbols are symbols that map directly to addresses without requiring
further materialization, for example: "foo" = 0x1234. One use case for
absolute symbols is allowing resolution of process symbols. E.g.

JD.define(absoluteSymbols(SymbolMap({
    { Mangle("printf"),
      { pointerToJITTargetAddress(&printf),
        JITSymbolFlags::Callable } }
  })));
With this mapping established, code added to the JIT can refer to printf
symbolically rather than requiring the address of printf to be "baked in".
This in turn allows cached versions of the JIT'd code (e.g. compiled objects)
to be re-used across JIT sessions as the JIT'd code no longer changes, only the
absolute symbol definition does.
For process and library symbols the DynamicLibrarySearchGenerator utility (See
:ref:`How to Add Process and Library Symbols to JITDylibs
<ProcessAndLibrarySymbols>`) can be used to automatically build absolute
symbol mappings for you. However, the absoluteSymbols function is still useful
for making non-global objects in your JIT visible to JIT'd code. For example,
imagine that your JIT standard library needs access to your JIT object to make
some calls. We could bake the address of your object into the library, but then
it would need to be recompiled for each session:
// From standard library for JIT'd code:

class MyJIT {
public:
  void log(const char *Msg);
};

void log(const char *Msg) { ((MyJIT*)0x1234)->log(Msg); }
We can turn this into a symbolic reference in the JIT standard library:

extern MyJIT *__MyJITInstance;

void log(const char *Msg) { __MyJITInstance->log(Msg); }

And then make our JIT object visible to the JIT standard library with an
absolute symbol definition when the JIT is started:
MyJIT J;
auto &JITStdLibJD = ... ;

JITStdLibJD.define(absoluteSymbols(SymbolMap({
    { Mangle("__MyJITInstance"),
      { pointerToJITTargetAddress(&J), JITSymbolFlags() } }
  })));
Aliases and Reexports
---------------------

Aliases and reexports allow you to define new symbols that map to existing
symbols. This can be useful for changing linkage relationships between symbols
across sessions without having to recompile code. For example, imagine that
JIT'd code has access to a log function, ``void log(const char*)``, for which
there are two implementations in the JIT standard library: ``log_fast`` and
``log_detailed``. Your JIT can choose which one of these definitions will be
used when the ``log`` symbol is referenced by setting up an alias at JIT startup
time:
auto &JITStdLibJD = ... ;

auto LogImplementationSymbol =
  Verbose ? Mangle("log_detailed") : Mangle("log_fast");

JITStdLibJD.define(
  symbolAliases(SymbolAliasMap({
      { Mangle("log"),
        { LogImplementationSymbol,
          JITSymbolFlags::Exported | JITSymbolFlags::Callable } }
    })));
The ``symbolAliases`` function allows you to define aliases within a single
JITDylib. The ``reexports`` function provides the same functionality, but
operates across JITDylib boundaries. E.g.
// Make 'bar' in JD2 an alias for 'foo' from JD1.
JD2.define(
  reexports(JD1, SymbolAliasMap({
      { Mangle("bar"), { Mangle("foo"), JITSymbolFlags::Exported } }
    })));
The reexports utility can be handy for composing a single JITDylib interface by
re-exporting symbols from several other JITDylibs.
.. _Laziness:

Laziness
========
Laziness in ORC is provided by a utility called "lazy reexports". A lazy
reexport is similar to a regular reexport or alias: It provides a new name for
an existing symbol. Unlike regular reexports however, lookups of lazy reexports
do not trigger immediate materialization of the reexported symbol. Instead, they
only trigger materialization of a function stub. This function stub is
initialized to point at a *lazy call-through*, which provides reentry into the
JIT. If the stub is called at runtime then the lazy call-through will look up
the reexported symbol (triggering materialization for it if necessary), update
the stub (to call directly to the reexported symbol on subsequent calls), and
then return via the reexported symbol. By re-using the existing symbol lookup
mechanism, lazy reexports inherit the same concurrency guarantees: calls to lazy
reexports can be made from multiple threads concurrently, the reexported
symbol can be in any state of compilation (uncompiled, already in the process of
being compiled, or already compiled), and the call will still succeed. This
allows laziness to be safely mixed with features like remote compilation,
concurrent compilation, concurrent JIT'd code, and speculative compilation.
There is one other key difference between regular reexports and lazy reexports
that some clients must be aware of: The address of a lazy reexport will be
*different* from the address of the reexported symbol (whereas a regular
reexport is guaranteed to have the same address as the reexported symbol).
Clients who care about pointer equality will generally want to use the address
of the reexport as the canonical address of the reexported symbol. This will
allow the address to be taken without forcing materialization of the reexport.
If JITDylib ``JD`` contains definitions for symbols ``foo_body`` and
``bar_body``, we can create lazy entry points ``Foo`` and ``Bar`` in JITDylib
``JD2`` by calling the ``lazyReexports`` function:

auto ReexportFlags = JITSymbolFlags::Exported | JITSymbolFlags::Callable;
JD2.define(
  lazyReexports(CallThroughMgr, StubsMgr, JD,
                SymbolAliasMap({
                  { Mangle("foo"), { Mangle("foo_body"), ReexportFlags } },
                  { Mangle("bar"), { Mangle("bar_body"), ReexportFlags } }
                })));
A full example of how to use lazyReexports with the LLJIT class can be found at
``llvm_project/llvm/examples/LLJITExamples/LLJITWithLazyReexports``.

Supporting Custom Compilers
===========================
Transitioning from ORCv1 to ORCv2
=================================

Since LLVM 7.0, new ORC development work has focused on adding support for
concurrent JIT compilation. The new APIs (including new layer interfaces and
implementations, and new utilities) that support concurrency are collectively
referred to as ORCv2, and the original, non-concurrent layers and utilities
are now referred to as ORCv1.

The majority of the ORCv1 layers and utilities were renamed with a 'Legacy'
prefix in LLVM 8.0, and have deprecation warnings attached in LLVM 9.0. In LLVM
10.0 ORCv1 will be removed entirely.
Transitioning from ORCv1 to ORCv2 should be easy for most clients. Most of the
ORCv1 layers and utilities have ORCv2 counterparts [2]_ that can be directly
substituted. However, there are some design differences between ORCv1 and ORCv2
to be aware of:
1. ORCv2 fully adopts the JIT-as-linker model that began with MCJIT. Modules
   (and other program representations, e.g. Object Files) are no longer added
   directly to JIT classes or layers. Instead, they are added to ``JITDylib``
   instances *by* layers. The ``JITDylib`` determines *where* the definitions
   reside, the layers determine *how* the definitions will be compiled.
   Linkage relationships between ``JITDylibs`` determine how inter-module
   references are resolved, and symbol resolvers are no longer used. See the
   section `Design Overview`_ for more details.

   Unless multiple JITDylibs are needed to model linkage relationships, ORCv1
   clients should place all code in the main JITDylib (returned by
   ``ExecutionSession::getMainJITDylib()``). MCJIT clients should use LLJIT
   (see `LLJIT and LLLazyJIT`_).
2. All JIT stacks now need an ``ExecutionSession`` instance. ExecutionSession
   manages the string pool, error reporting, synchronization, and symbol
   lookup.
3. ORCv2 uses uniqued strings (``SymbolStringPtr`` instances) rather than
   string values in order to reduce memory overhead and improve lookup
   performance. See the subsection `How to manage symbol strings`_.

4. IR layers require ThreadSafeModule instances, rather than
   std::unique_ptr<Module>s. ThreadSafeModule is a wrapper that ensures that
   Modules that use the same LLVMContext are not accessed concurrently.
   See `How to use ThreadSafeModule and ThreadSafeContext`_.
5. Symbol lookup is no longer handled by layers. Instead, there is a
   ``lookup`` method on ``ExecutionSession`` that takes a list of JITDylibs
   to scan:

   auto Sym = ES.lookup({&JD1, &JD2}, ES.intern("_main"));
6. Module removal is not yet supported. There is no equivalent of the
   layer concept removeModule/removeObject methods. Work on resource tracking
   and removal in ORCv2 is ongoing.

For code examples and suggestions of how to use the ORCv2 APIs, please see
the section `How-tos`_.
How-tos
=======
How to manage symbol strings
----------------------------

Symbol strings in ORC are uniqued to improve lookup performance, reduce memory
overhead, and allow symbol names to function as efficient keys. To get the
unique ``SymbolStringPtr`` for a string value, call the
``ExecutionSession::intern`` method:

auto MainSymbolName = ES.intern("main");
If you wish to perform lookup using the C/IR name of a symbol you will also
need to apply the platform linker-mangling before interning the string. On
Linux this mangling is a no-op, but on other platforms it usually involves
adding a prefix to the string (e.g. '_' on Darwin). The mangling scheme is
based on the DataLayout for the target. Given a DataLayout and an
ExecutionSession, you can create a MangleAndInterner function object that
will perform both jobs for you:

ExecutionSession ES;
const DataLayout &DL = ...;
MangleAndInterner Mangle(ES, DL);

// Portable IR-symbol-name lookup:
auto Sym = ES.lookup({&ES.getMainJITDylib()}, Mangle("main"));
How to create JITDylibs and set up linkage relationships
--------------------------------------------------------

In ORC, all symbol definitions reside in JITDylibs. JITDylibs are created by
calling the ``ExecutionSession::createJITDylib`` method with a unique name:

auto &JD = ES.createJITDylib("libFoo.dylib");
The JITDylib is owned by the ``ExecutionSession`` instance and will be freed
when it is destroyed.

A JITDylib representing the JIT main program is created by the ExecutionSession
by default. A reference to it can be obtained by calling
``ExecutionSession::getMainJITDylib()``:

auto &MainJD = ES.getMainJITDylib();
How to use ThreadSafeModule and ThreadSafeContext
-------------------------------------------------

ThreadSafeModule and ThreadSafeContext are wrappers around Modules and
LLVMContexts respectively. A ThreadSafeModule is a pair of a
std::unique_ptr<Module> and a (possibly shared) ThreadSafeContext value. A
ThreadSafeContext is a pair of a std::unique_ptr<LLVMContext> and a lock.
This design serves two purposes: providing a locking scheme and lifetime
management for LLVMContexts. The ThreadSafeContext may be locked to prevent
accidental concurrent access by two Modules that use the same LLVMContext.
The underlying LLVMContext is freed once all ThreadSafeContext values pointing
to it are destroyed, allowing the context memory to be reclaimed as soon as
the Modules referring to it are destroyed.
ThreadSafeContexts can be explicitly constructed from a
std::unique_ptr<LLVMContext>:

ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

ThreadSafeModules can be constructed from a pair of a std::unique_ptr<Module>
and a ThreadSafeContext value. ThreadSafeContext values may be shared between
multiple ThreadSafeModules:

ThreadSafeModule TSM1(
  std::make_unique<Module>("M1", *TSCtx.getContext()), TSCtx);

ThreadSafeModule TSM2(
  std::make_unique<Module>("M2", *TSCtx.getContext()), TSCtx);
Before using a ThreadSafeContext, clients should ensure that either the context
is only accessible on the current thread, or that the context is locked. In the
example above (where the context is never locked) we rely on the fact that
``TSM1``, ``TSM2``, and ``TSCtx`` are all created on one thread. If a context is
going to be shared between threads then it must be locked before accessing
or creating any Modules attached to it. E.g.
ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());

ThreadPool TP(NumThreads);
JITStack J;

for (auto &ModulePath : ModulePaths) {
  TP.async(
    [&]() {
      auto Lock = TSCtx.getLock();
      auto M = loadModuleOnContext(ModulePath, TSCtx.getContext());
      J.addModule(ThreadSafeModule(std::move(M), TSCtx));
    });
}

TP.wait();
To make exclusive access to Modules easier to manage the ThreadSafeModule class
provides a convenience function, ``withModuleDo``, that implicitly (1) locks the
associated context, (2) runs a given function object, (3) unlocks the context,
and (4) returns the result generated by the function object. E.g.
ThreadSafeModule TSM = getModule(...);

// Count the functions in the module:
size_t NumFunctionsInModule =
  TSM.withModuleDo(
    [](Module &M) { // <- Context locked before entering lambda.
      return M.size();
    } // <- Context unlocked after leaving.
  );
Clients wishing to maximize possibilities for concurrent compilation will want
to create every new ThreadSafeModule on a new ThreadSafeContext. For this
reason a convenience constructor for ThreadSafeModule is provided that
implicitly constructs a new ThreadSafeContext value from a
std::unique_ptr<LLVMContext>:
// Maximize concurrency opportunities by loading every module on a
// separate context.
for (const auto &IRPath : IRPaths) {
  auto Ctx = std::make_unique<LLVMContext>();
  auto M = std::make_unique<Module>("M", *Ctx);
  CompileLayer.add(ES.getMainJITDylib(),
                   ThreadSafeModule(std::move(M), std::move(Ctx)));
}
Clients who plan to run single-threaded may choose to save memory by loading
all modules on the same context:
// Save memory by using one context for all Modules:
ThreadSafeContext TSCtx(std::make_unique<LLVMContext>());
for (const auto &IRPath : IRPaths) {
  ThreadSafeModule TSM(parsePath(IRPath, *TSCtx.getContext()), TSCtx);
  CompileLayer.add(ES.getMainJITDylib(), std::move(TSM));
}
.. _ProcessAndLibrarySymbols:

How to Add Process and Library Symbols to the JITDylibs
=======================================================
JIT'd code typically needs access to symbols in the host program or in
supporting libraries. References to process symbols can be "baked in" to code
as it is compiled by turning external references into pre-resolved integer
constants, however this ties the JIT'd code to the current process's virtual
memory layout (meaning that it cannot be cached between runs) and makes
debugging lower level program representations difficult (as all external
references are opaque integer values). A better solution is to maintain symbolic
external references and let the jit-linker bind them for you at runtime. To
allow the JIT linker to find these external definitions their addresses must
be added to a JITDylib that the JIT'd definitions link against.
Adding definitions for external symbols could be done using the absoluteSymbols
function described above:

const DataLayout &DL = getDataLayout();
MangleAndInterner Mangle(ES, DL);

auto &JD = ES.getMainJITDylib();

JD.define(absoluteSymbols(SymbolMap({
    { Mangle("puts"), pointerToJITTargetAddress(&puts) },
    { Mangle("gets"), pointerToJITTargetAddress(&gets) }
  })));
Manually adding absolute symbols for a large or changing interface is
cumbersome, however, so ORC provides an alternative to generate new definitions
on demand: *definition generators*. If a definition generator is attached to a
JITDylib, then any unsuccessful lookup on that JITDylib will fall back to
calling the definition generator, and the definition generator may choose to
generate a new definition for the missing symbols. Of particular use here is the
``DynamicLibrarySearchGenerator`` utility. This can be used to reflect the whole
exported symbol set of the process or a specific dynamic library, or a subset
of either of these determined by a predicate.
For example, to load the whole interface of a runtime library:

const DataLayout &DL = getDataLayout();
auto &JD = ES.getMainJITDylib();

JD.setGenerator(DynamicLibrarySearchGenerator::Load("/path/to/lib",
                                                    DL.getGlobalPrefix()));

// IR added to JD can now link against all symbols exported by the library
// at '/path/to/lib'.
CompileLayer.add(JD, loadModule(...));
Or, to expose a whitelisted set of symbols from the main process:

const DataLayout &DL = getDataLayout();
MangleAndInterner Mangle(ES, DL);

auto &JD = ES.getMainJITDylib();

DenseSet<SymbolStringPtr> Whitelist({
    Mangle("puts"),
    Mangle("gets")
  });

// Use GetForCurrentProcess with a predicate function that checks the
// whitelist.
JD.setGenerator(
  DynamicLibrarySearchGenerator::GetForCurrentProcess(
    DL.getGlobalPrefix(),
    [&](const SymbolStringPtr &S) { return Whitelist.count(S); }));
// IR added to JD can now link against any symbols exported by the process
// and contained in the whitelist.
CompileLayer.add(JD, loadModule(...));
Future Features
===============
TBD: Speculative compilation. Object Caches.

.. [1] Formats/architectures vary in terms of supported features. MachO and
   ELF tend to have better support than COFF. Patches very welcome!
.. [2] The ``LazyEmittingLayer``, ``RemoteObjectClientLayer`` and
   ``RemoteObjectServerLayer`` do not have counterparts in the new
   system. In the case of ``LazyEmittingLayer`` it was simply no longer
   needed: in ORCv2, deferring compilation until symbols are looked up is
   the default. The removal of ``RemoteObjectClientLayer`` and
   ``RemoteObjectServerLayer`` means that JIT stacks can no longer be split
   across processes, however this functionality appears not to have been
   used.
.. [3] Weak definitions are currently handled correctly within dylibs, but if
   multiple dylibs provide a weak definition of a symbol then each will end
   up with its own definition (similar to how weak definitions are handled
   in Windows DLLs). This will be fixed in the future.