llvm/docs/JITLink.rst

   1 ====================================
   2 JITLink and ORC's ObjectLinkingLayer
   3 ====================================
   4
   5 .. contents::
   6    :local:
   7
   8 Introduction
   9 ============
  10
  11 This document aims to provide a high-level overview of the design and API
  12 of the JITLink library. It assumes some familiarity with linking and
  13 relocatable object files, but should not require deep expertise. If you know
  14 what sections, symbols, and relocations are, you should find this document
  15 accessible. If it is not, please submit a patch (:doc:`Contributing`) or file a
  16 bug (:doc:`HowToSubmitABug`).
  17
  18 JITLink is a library for :ref:`jit_linking`. It was built to support the :doc:`ORC JIT
  19 APIs<ORCv2>` and is most commonly accessed via ORC's ObjectLinkingLayer API. JITLink was
  20 developed with the aim of supporting the full set of features provided by each
  21 object format, including static initializers, exception handling, thread-local
  22 variables, and language runtime registration. Supporting these features enables
  23 ORC to execute code generated from source languages which rely on these features
  24 (e.g. C++ requires object format support for static initializers to support
  25 static constructors, eh-frame registration for exceptions, and TLV support for
  26 thread locals; Swift and Objective-C require language runtime registration for
  27 many features). For some object format features support is provided entirely
  28 within JITLink, and for others it is provided in cooperation with the
  29 (prototype) ORC runtime.
  30
  31 JITLink aims to support the following features, some of which are still under
  32 development:
  33
  34 1. Cross-process and cross-architecture linking of single relocatable objects
  35    into a target *executor* process.
  36
  37 2. Support for all object format features.
  38
  39 3. Open linker data structures (``LinkGraph``) and pass system.
  40
  41 JITLink and ObjectLinkingLayer
  42 ==============================
  43
  44 ``ObjectLinkingLayer`` is ORC's wrapper for JITLink. It is an ORC layer that
  45 allows objects to be added to a ``JITDylib``, or emitted from some higher level
  46 program representation. When an object is emitted, ``ObjectLinkingLayer`` uses
  47 JITLink to construct a ``LinkGraph`` (see :ref:`constructing_linkgraphs`) and
  48 calls JITLink's ``link`` function to link the graph into the executor process.
  49
  50 The ``ObjectLinkingLayer`` class provides a plugin API,
  51 ``ObjectLinkingLayer::Plugin``, which users can subclass in order to inspect and
  52 modify ``LinkGraph`` instances at link time, and react to important JIT events
  53 (such as an object being emitted into target memory). This enables many features
  54 and optimizations that were not possible under MCJIT or RuntimeDyld.
  55
  56 ObjectLinkingLayer Plugins
  57 --------------------------
  58
  59 The ``ObjectLinkingLayer::Plugin`` class provides the following methods:
  60
  61 * ``modifyPassConfig`` is called each time a LinkGraph is about to be linked. It
  62   can be overridden to install JITLink *Passes* to run during the link process.
  63
  64   .. code-block:: c++
  65
  66     void modifyPassConfig(MaterializationResponsibility &MR,
  67                           const Triple &TT,
  68                           jitlink::PassConfiguration &Config)
  69
  70 * ``notifyLoaded`` is called before the link begins, and can be overridden to
  71   set up any initial state for the given ``MaterializationResponsibility`` if
  72   needed.
  73
  74   .. code-block:: c++
  75
  76     void notifyLoaded(MaterializationResponsibility &MR)
  77
  78 * ``notifyEmitted`` is called after the link is complete and code has been
  79   emitted to the executor process. It can be overridden to finalize state
  80   for the ``MaterializationResponsibility`` if needed.
  81
  82   .. code-block:: c++
  83
  84     Error notifyEmitted(MaterializationResponsibility &MR)
  85
  86 * ``notifyFailed`` is called if the link fails at any point. It can be
  87   overridden to react to the failure (e.g. to deallocate any already allocated
  88   resources).
  89
  90   .. code-block:: c++
  91
  92     Error notifyFailed(MaterializationResponsibility &MR)
  93
  94 * ``notifyRemovingResources`` is called when a request is made to remove any
  95   resources associated with the ``ResourceKey`` *K* for the
  96   ``MaterializationResponsibility``.
  97
  98   .. code-block:: c++
  99
 100     Error notifyRemovingResources(ResourceKey K)
 101
 102 * ``notifyTransferringResources`` is called if/when a request is made to
 103   transfer tracking of any resources associated with ``ResourceKey``
 104   *SrcKey* to *DstKey*.
 105
 106   .. code-block:: c++
 107
 108     void notifyTransferringResources(ResourceKey DstKey,
 109                                      ResourceKey SrcKey)
 110
 111 Plugin authors are required to implement the ``notifyFailed``,
 112 ``notifyRemovingResources``, and ``notifyTransferringResources`` methods in
 113 order to safely manage resources in the case of resource removal or transfer,
 114 or link failure. If the plugin manages no resources, these methods can be
 115 no-ops (returning ``Error::success()`` where the return type is ``Error``).
 116
 117 Plugin instances are added to an ``ObjectLinkingLayer`` by
 118 calling the ``addPlugin`` method [1]_. E.g.
 119
 120 .. code-block:: c++
 121
 122   // Plugin class to print the set of defined symbols in an object when that
 123   // object is linked.
 124   class MyPlugin : public ObjectLinkingLayer::Plugin {
 125   public:
 126
 127     // Add passes to print the set of defined symbols after dead-stripping.
 128     void modifyPassConfig(MaterializationResponsibility &MR,
 129                           const Triple &TT,
 130                           jitlink::PassConfiguration &Config) override {
 131       Config.PostPrunePasses.push_back([this](jitlink::LinkGraph &G) {
 132         return printAllSymbols(G);
 133       });
 134     }
 135
 136     // Implement mandatory overrides:
 137     Error notifyFailed(MaterializationResponsibility &MR) override {
 138       return Error::success();
 139     }
 140     Error notifyRemovingResources(ResourceKey K) override {
 141       return Error::success();
 142     }
 143     void notifyTransferringResources(ResourceKey DstKey,
 144                                      ResourceKey SrcKey) override {}
 145
 146     // JITLink pass to print all defined symbols in G.
 147     Error printAllSymbols(jitlink::LinkGraph &G) {
 148       for (auto *Sym : G.defined_symbols())
 149         if (Sym->hasName())
 150           dbgs() << Sym->getName() << "\n";
 151       return Error::success();
 152     }
 153   };
 154
 155   // Create our LLJIT instance using a custom object linking layer setup.
 156   // This gives us a chance to install our plugin.
 157   auto J = ExitOnErr(LLJITBuilder()
 158              .setObjectLinkingLayerCreator(
 159                [](ExecutionSession &ES, const Triple &T) {
 160                  // Manually set up the ObjectLinkingLayer for our LLJIT
 161                  // instance.
 162                  auto OLL = std::make_unique<ObjectLinkingLayer>(
 163                      ES, std::make_unique<jitlink::InProcessMemoryManager>());
 164
 165                  // Install our plugin:
 166                  OLL->addPlugin(std::make_unique<MyPlugin>());
 167
 168                  return OLL;
 169                })
 170              .create());
 171
 172   // Add an object to the JIT. Nothing happens here: linking isn't triggered
 173   // until we look up some symbol in our object.
 174   ExitOnErr(J->addObject(loadFromDisk("main.o")));
 175
 176   // Plugin triggers here when our lookup of main triggers linking of main.o
 177   auto MainSym = ExitOnErr(J->lookup("main"));
 178
 179 LinkGraph
 180 =========
 181
 182 JITLink maps all relocatable object formats to a generic ``LinkGraph`` type
 183 that is designed to make linking fast and easy. ``LinkGraph`` instances can
 184 also be created manually (see :ref:`constructing_linkgraphs`).
 185
 186 Relocatable object formats (e.g. COFF, ELF, MachO) differ in their details,
 187 but share a common goal: to represent machine level code and data with
 188 annotations that allow them to be relocated in a virtual address space. To
 189 this end they usually contain names (symbols) for content defined inside the
 190 file or externally, chunks of content that must be moved as a unit (sections
 191 or subsections, depending on the format), and annotations describing how to
 192 patch content based on the final address of some target symbol/section
 193 (relocations).
 194
 195 At a high level, the ``LinkGraph`` type represents these concepts as a decorated
 196 graph. Nodes in the graph represent symbols and content, and edges represent
 197 relocations. Each of the elements of the graph is listed here:
 198
 199 * ``Addressable`` -- A node in the link graph that can be assigned an address
 200   in the executor process's virtual address space.
 201
 202   Absolute and external symbols are represented using plain ``Addressable``
 203   instances. Content defined inside the object file is represented using the
 204   ``Block`` subclass.
 205
 206 * ``Block`` -- An ``Addressable`` node that has ``Content`` (or is marked as
 207   zero-filled), a parent ``Section``, a ``Size``, an ``Alignment`` (and an
 208   ``AlignmentOffset``), and a list of ``Edge`` instances.
 209
 210   Blocks provide a container for binary content which must remain contiguous in
 211   the target address space (a *layout unit*). Many interesting low level
 212   operations on ``LinkGraph`` instances involve inspecting or mutating block
 213   content or edges.
 214
 215   * ``Content`` is represented as an ``llvm::StringRef``, and accessible via
 216     the ``getContent`` method. Content is only available for content blocks,
 217     and not for zero-fill blocks (use ``isZeroFill`` to check, and prefer
 218     ``getSize`` when only the block size is needed as it works for both
 219     zero-fill and content blocks).
 220
 221   * ``Section`` is represented as a ``Section&`` reference, and accessible via
 222     the ``getSection`` method. The ``Section`` class is described in more detail
 223     below.
 224
 225   * ``Size`` is represented as a ``size_t``, and is accessible via the
 226     ``getSize`` method for both content and zero-filled blocks.
 227
 228   * ``Alignment`` is represented as a ``uint64_t``, and available via the
 229     ``getAlignment`` method. It represents the minimum alignment requirement (in
 230     bytes) of the start of the block.
 231
 232   * ``AlignmentOffset`` is represented as a ``uint64_t``, and accessible via the
 233     ``getAlignmentOffset`` method. It represents the offset from the alignment
 234     required for the start of the block. This is required to support blocks
 235     whose minimum alignment requirement comes from data at some non-zero offset
 236     inside the block. E.g. if a block consists of a single byte (with byte
 237     alignment) followed by a uint64_t (with 8-byte alignment), then the block
 238     will have 8-byte alignment with an alignment offset of 7.
 239
 240   * list of ``Edge`` instances. An iterator range for this list is returned by
 241     the ``edges`` method. The ``Edge`` class is described in more detail below.
 242
 243 * ``Symbol`` -- An offset from an ``Addressable`` (often a ``Block``), with an
 244   optional ``Name``, a ``Linkage``, a ``Scope``, a ``Callable`` flag, and a
 245   ``Live`` flag.
 246
 247   Symbols make it possible to name content (blocks and addressables are
 248   anonymous), or target content with an ``Edge``.
 249
 250   * ``Name`` is represented as an ``llvm::StringRef`` (equal to
 251     ``llvm::StringRef()`` if the symbol has no name), and accessible via the
 252     ``getName`` method.
 253
 254   * ``Linkage`` is one of *Strong* or *Weak*, and is accessible via the
 255     ``getLinkage`` method. The ``JITLinkContext`` can use this flag to determine
 256     whether this symbol definition should be kept or dropped.
 257
 258   * ``Scope`` is one of *Default*, *Hidden*, or *Local*, and is accessible via
 259     the ``getScope`` method. The ``JITLinkContext`` can use this to determine
 260     who should be able to see the symbol. A symbol with default scope should be
 261     globally visible. A symbol with hidden scope should be visible to other
 262     definitions within the same simulated dylib (e.g. ORC ``JITDylib``) or
 263     executable, but not from elsewhere. A symbol with local scope should only be
 264     visible within the current ``LinkGraph``.
 265
 266   * ``Callable`` is a boolean which is set to true if this symbol can be called,
 267     and is accessible via the ``isCallable`` method. This can be used to
 268     automate the introduction of call-stubs for lazy compilation.
 269
 270   * ``Live`` is a boolean that can be set to mark this symbol as root for
 271     dead-stripping purposes (see :ref:`generic_link_algorithm`). JITLink's
 272     dead-stripping algorithm will propagate liveness flags through the graph to
 273     all reachable symbols before deleting any symbols (and blocks) that are not
 274     marked live.
 275
 276 * ``Edge`` -- A quad of an ``Offset`` (implicitly from the start of the
 277   containing ``Block``), a ``Kind`` (describing the relocation type), a
 278   ``Target``, and an ``Addend``.
 279
 280   Edges represent relocations, and occasionally other relationships, between
 281   blocks and symbols.
 282
 283   * ``Offset``, accessible via ``getOffset``, is an offset from the start of the
 284     ``Block`` containing the ``Edge``.
 285
 286   * ``Kind``, accessible via ``getKind``, is a relocation type -- it describes
 287     what kinds of changes (if any) should be made to block content at the given
 288     ``Offset`` based on the address of the ``Target``.
 289
 290   * ``Target``, accessible via ``getTarget``, is a pointer to the ``Symbol``
 291     whose address is relevant to the fixup calculation specified by the edge's
 292     ``Kind``.
 293
 294   * ``Addend``, accessible via ``getAddend``, is a constant whose interpretation
 295     is determined by the edge's ``Kind``.
 296
 297 * ``Section`` -- A set of ``Symbol`` instances, plus a set of ``Block``
 298   instances, with a ``Name``, a set of ``ProtectionFlags``, and an ``Ordinal``.
 299
 300   Sections make it easy to iterate over the symbols or blocks associated with
 301   a particular section in the source object file.
 302
 303   * ``blocks()`` returns an iterator over the set of blocks defined in the
 304     section (as ``Block*`` pointers).
 305
 306   * ``symbols()`` returns an iterator over the set of symbols defined in the
 307     section (as ``Symbol*`` pointers).
 308
 309   * ``Name`` is represented as an ``llvm::StringRef``, and is accessible via the
 310     ``getName`` method.
 311
 312   * ``ProtectionFlags`` are represented as a ``sys::Memory::ProtectionFlags``
 313     enum, and accessible via the ``getProtectionFlags`` method. These flags
 314     describe whether the section is readable, writable, executable, or some
 315     combination of these. The most common combinations are ``RW-`` for
 316     writable data, ``R--`` for constant data, and ``R-X`` for code.
 317
 318   * ``Ordinal``, accessible via ``getOrdinal``, is a number used to order
 319     the section relative to others. It is usually used to preserve section
 320     order within a segment (a set of sections with the same memory protections)
 321     when laying out memory.
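
The ``Alignment`` and ``AlignmentOffset`` rule described for ``Block`` above can be read as "a block may start at any address ``A`` where ``A % Alignment == AlignmentOffset``". The following is a minimal standalone sketch of that arithmetic. The helper name ``nextBlockAddress`` is hypothetical and is not part of the API described in this document; it only illustrates the rule under the assumption that alignments are powers of two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (illustration only, not a JITLink function): returns the
// lowest address >= Addr at which a block with the given Alignment and
// AlignmentOffset may start, i.e. the lowest A >= Addr such that
// A % Alignment == AlignmentOffset.
inline std::uint64_t nextBlockAddress(std::uint64_t Addr,
                                      std::uint64_t Alignment,
                                      std::uint64_t AlignmentOffset) {
  // Assumption: Alignment is a non-zero power of two.
  assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);
  assert(AlignmentOffset < Alignment);
  // Unsigned wrap-around in the subtraction is harmless because 2^64 is a
  // multiple of every power-of-two alignment.
  std::uint64_t Delta = (AlignmentOffset - Addr) % Alignment;
  return Addr + Delta;
}
```

For the example in the text (one byte followed by a ``uint64_t``), calling ``nextBlockAddress(0x1000, 8, 7)`` yields ``0x1007``, which places the ``uint64_t`` at the 8-byte-aligned address ``0x1008``.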
 322
 323 For the graph-theorists: The ``LinkGraph`` is bipartite, with one set of
 324 ``Symbol`` nodes and one set of ``Addressable`` nodes. Each ``Symbol`` node has
 325 one (implicit) edge to its target ``Addressable``. Each ``Block`` has a set of
 326 edges (possibly empty, represented as ``Edge`` instances) back to elements of
 327 the ``Symbol`` set. For convenience and performance of common algorithms,
 328 symbols and blocks are further grouped into ``Sections``.
 329
 330 The ``LinkGraph`` itself provides operations for constructing, removing, and
 331 iterating over sections, symbols, and blocks. It also provides metadata
 332 and utilities relevant to the linking process:
 333
 334 * Graph element operations
 335
 336   * ``sections`` returns an iterator over all sections in the graph.
 337
 338   * ``findSectionByName`` returns a pointer to the section with the given
 339     name (as a ``Section*``) if it exists, otherwise returns ``nullptr``.
 340
 341   * ``blocks`` returns an iterator over all blocks in the graph (across all
 342     sections).
 343
 344   * ``defined_symbols`` returns an iterator over all defined symbols in the
 345     graph (across all sections).
 346
 347   * ``external_symbols`` returns an iterator over all external symbols in the
 348     graph.
 349
 350   * ``absolute_symbols`` returns an iterator over all absolute symbols in the
 351     graph.
 352
 353   * ``createSection`` creates a section with a given name and protection flags.
 354
 355   * ``createContentBlock`` creates a block with the given initial content,
 356     parent section, address, alignment, and alignment offset.
 357
 358   * ``createZeroFillBlock`` creates a zero-fill block with the given size,
 359     parent section, address, alignment, and alignment offset.
 360
 361   * ``addExternalSymbol`` creates a new addressable and symbol with a given
 362     name, size, and linkage.
 363
 364   * ``addAbsoluteSymbol`` creates a new addressable and symbol with a given
 365     name, address, size, linkage, scope, and liveness.
 366
 367   * ``addCommonSymbol`` is a convenience function for creating a zero-filled
 368     block and weak symbol with a given name, scope, section, initial address,
 369     size, alignment, and liveness.
 370
 371   * ``addAnonymousSymbol`` creates a new anonymous symbol for a given block,
 372     offset, size, callable-ness, and liveness.
 373
 374   * ``addDefinedSymbol`` creates a new symbol for a given block with a name,
 375     offset, size, linkage, scope, callable-ness and liveness.
 376
 377   * ``makeExternal`` transforms a formerly defined symbol into an external one
 378     by creating a new addressable and pointing the symbol at it. The existing
 379     block is not deleted, but can be manually removed (if unreferenced) by
 380     calling ``removeBlock``. All edges to the symbol remain valid, but the
 381     symbol must now be defined outside this ``LinkGraph``.
 382
 383   * ``removeExternalSymbol`` removes an external symbol and its target
 384     addressable. The target addressable must not be referenced by any other
 385     symbols.
 386
 387   * ``removeAbsoluteSymbol`` removes an absolute symbol and its target
 388     addressable. The target addressable must not be referenced by any other
 389     symbols.
 390
 391   * ``removeDefinedSymbol`` removes a defined symbol, but *does not* remove
 392     its target block.
 393
 394   * ``removeBlock`` removes the given block.
 395
 396   * ``splitBlock`` splits a given block in two at a given index (useful where
 397     it is known that a block contains decomposable records, e.g. CFI records
 398     in an eh-frame section).
 399
 400 * Graph utility operations
 401
 402   * ``getName`` returns the name of this graph, which is usually based on the
 403     name of the input object file.
 404
 405   * ``getTargetTriple`` returns an ``llvm::Triple`` for the executor process.
 406
 407   * ``getPointerSize`` returns the size of a pointer (in bytes) in the executor
 408     process.
 409
 410   * ``getEndianness`` returns the endianness of the executor process.
 411
 412   * ``allocateString`` copies data from a given ``llvm::Twine`` into the
 413     link graph's internal allocator. This can be used to ensure that content
 414     created inside a pass outlives that pass's execution.
 415
 416 .. _generic_link_algorithm:
 417
 418 Generic Link Algorithm
 419 ======================
 420
 421 JITLink provides a generic link algorithm which can be extended / modified at
 422 certain points by the introduction of JITLink :ref:`passes`.
 423
 424 At the end of each phase the linker packages its state into a *continuation*
 425 and calls the ``JITLinkContext`` object to perform a (potentially high-latency)
 426 asynchronous operation: allocating memory, resolving external symbols, and
 427 finally transferring linked memory to the executing process.
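
The continuation-passing structure described above can be sketched with a toy model. Everything in this sketch (``ToyContext``, ``phase1`` through ``phase4``, ``link``) is a hypothetical stand-in written with the standard library only; it is not the JITLink implementation. It shows only how each phase ends by handing the next phase to the context as a callback, which the context may run later.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a JITLinkContext. Each asynchronous operation records
// itself in Log and defers the continuation it was given.
struct ToyContext {
  std::vector<std::function<void()>> Pending; // deferred continuations
  std::vector<std::string> Log;               // order of events, for inspection

  void defer(const char *What, std::function<void()> Next) {
    Log.push_back(What);
    Pending.push_back(std::move(Next));
  }
  void allocate(std::function<void()> Next) { defer("allocate", std::move(Next)); }
  void resolveExternals(std::function<void()> Next) { defer("resolve", std::move(Next)); }
  void finalize(std::function<void()> Next) { defer("finalize", std::move(Next)); }

  // Runs deferred continuations until none remain (simulates async completion).
  void runPending() {
    while (!Pending.empty()) {
      std::function<void()> K = std::move(Pending.back());
      Pending.pop_back();
      K();
    }
  }
};

// Each phase does its synchronous work, then packages the next phase as a
// continuation and hands it to the context.
inline void phase4(ToyContext &C) { C.Log.push_back("phase4"); }
inline void phase3(ToyContext &C) {
  C.Log.push_back("phase3");
  C.finalize([&C] { phase4(C); });
}
inline void phase2(ToyContext &C) {
  C.Log.push_back("phase2");
  C.resolveExternals([&C] { phase3(C); });
}
inline void phase1(ToyContext &C) {
  C.Log.push_back("phase1");
  C.allocate([&C] { phase2(C); });
}
inline void link(ToyContext &C) { phase1(C); }
```

In this model ``link`` returns as soon as phase 1 has requested allocation; the remaining phases run only when the context invokes the stored continuations (here, via ``runPending``).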
 428
 429 #. Phase 1
 430
 431    This phase is called immediately by the ``link`` function as soon as the
 432    initial configuration (including the pass pipeline setup) is complete.
 433
 434    #. Run pre-prune passes.
 435
 436       These passes are called on the graph before it is pruned. At this stage
 437       ``LinkGraph`` nodes still have their original vmaddrs. A mark-live pass
 438       (supplied by the ``JITLinkContext``) will be run at the end of this
 439       sequence to mark the initial set of live symbols.
 440
 441       Notable use cases: marking nodes live, accessing/copying graph data that
 442       will be pruned (e.g. metadata that's important for the JIT, but not needed
 443       for the link process).
 444
 445    #. Prune (dead-strip) the ``LinkGraph``.
 446
 447       Removes all symbols and blocks not reachable from the initial set of live
 448       symbols.
 449
 450       This allows JITLink to remove unreachable symbols / content, including
 451       overridden weak and redundant ODR definitions.
 452
 453    #. Run post-prune passes.
 454
 455       These passes are run on the graph after dead-stripping, but before memory
 456       is allocated or nodes assigned their final target vmaddrs.
 457
 458       Passes run at this stage benefit from pruning, as dead functions and data
 459       have been stripped from the graph. However new content can still be added
 460       to the graph, as target and working memory have not been allocated yet.
 461
 462       Notable use cases: Building Global Offset Table (GOT), Procedure Linkage
 463       Table (PLT), and Thread Local Variable (TLV) entries.
 464
 465    #. Asynchronously allocate memory.
 466
 467       Calls the ``JITLinkContext``'s ``JITLinkMemoryManager`` to allocate both
 468       working and target memory for the graph. As part of this process the
 469       ``JITLinkMemoryManager`` will update the addresses of all nodes
 470       defined in the graph to their assigned target address.
 471
 472       Note: This step only updates the addresses of nodes defined in this graph.
 473       External symbols will still have null addresses.
 474
 475 #. Phase 2
 476
 477    #. Run post-allocation passes.
 478
 479       These passes are run on the graph after working and target memory have
 480       been allocated, but before the ``JITLinkContext`` is notified of the
 481       final addresses of the symbols in the graph. This gives these passes a
 482       chance to set up data structures associated with target addresses before
 483       any JITLink clients (especially ORC queries for symbol resolution) can
 484       attempt to access them.
 485
 486       Notable use cases: Setting up mappings between target addresses and
 487       JIT data structures, such as a mapping between ``__dso_handle`` and
 488       ``JITDylib*``.
 489
 490    #. Notify the ``JITLinkContext`` of the assigned symbol addresses.
 491
 492       Calls ``JITLinkContext::notifyResolved`` on the link graph, allowing
 493       clients to react to the symbol address assignments made for this graph.
 494       In ORC this is used to notify any pending queries for *resolved* symbols,
 495       including pending queries from concurrently running JITLink instances that
 496       have reached the next step and are waiting on the address of a symbol in
 497       this graph to proceed with their link.
 498
 499    #. Identify external symbols and resolve their addresses asynchronously.
 500
 501       Calls the ``JITLinkContext`` to resolve the target address of any external
 502       symbols in the graph.
 503
 504 #. Phase 3
 505
 506    #. Apply external symbol resolution results.
 507
 508       This updates the addresses of all external symbols. At this point all
 509       nodes in the graph have their final target addresses, however node
 510       content still points back to the original data in the object file.
 511
 512    #. Run pre-fixup passes.
 513
 514       These passes are called on the graph after all nodes have been assigned
 515       their final target addresses, but before node content is copied into
 516       working memory and fixed up. Passes run at this stage can make late
 517       optimizations to the graph and content based on address layout.
 518
 519       Notable use cases: GOT and PLT relaxation, where GOT and PLT accesses are
 520       bypassed for fixup targets that are directly accessible under the assigned
 521       memory layout.
 522
 523    #. Copy block content to working memory and apply fixups.
 524
 525       Copies all block content into allocated working memory (following the
 526       target layout) and applies fixups. Graph blocks are updated to point at
 527       the fixed up content.
 528
 529    #. Run post-fixup passes.
 530
 531       These passes are called on the graph after fixups have been applied and
 532       blocks updated to point to the fixed up content.
 533
 534       Post-fixup passes can inspect block contents to see the exact bytes that
 535       will be copied to the assigned target addresses.
 536
 537    #. Finalize memory asynchronously.
 538
 539       Calls the ``JITLinkMemoryManager`` to copy working memory to the executor
 540       process and apply the requested permissions.
 541
 542 #. Phase 4
 543
 544    #. Notify the context that the graph has been emitted.
 545
 546       Calls ``JITLinkContext::notifyFinalized`` and hands off the
 547       ``JITLinkMemoryManager::FinalizedAlloc`` object for this graph's memory
 548       allocation. This allows the context to track/hold memory allocations and
 549       react to the newly emitted definitions. In ORC this is used to update the
 550       ``ExecutionSession`` instance's dependence graph, which may result in
 551       these symbols (and possibly others) becoming *Ready* if all of their
 552       dependencies have also been emitted.
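
The prune step in Phase 1 propagates liveness from the initial set of live symbols and removes everything left unmarked. The following is a minimal sketch of that propagation on a toy graph. The types ``ToySymbol``, ``ToyBlock``, and ``ToyGraph`` and the function ``markLive`` are hypothetical simplifications, not JITLink classes: symbols and blocks are referred to by index, and every symbol is assumed to point at exactly one block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (not the JITLink types): a symbol names one block, and a block
// holds edges to the symbols it references.
struct ToySymbol {
  std::size_t Block; // index of the block this symbol points into
  bool Live;         // set up front for roots, propagated to others
};
struct ToyBlock {
  std::vector<std::size_t> EdgeTargets; // indices of target symbols
  bool Live;
};
struct ToyGraph {
  std::vector<ToySymbol> Symbols;
  std::vector<ToyBlock> Blocks;
};

// Propagates liveness from the symbols already marked live to every reachable
// block and symbol. Returns the number of blocks that remain dead, i.e. the
// blocks a prune step would remove.
inline std::size_t markLive(ToyGraph &G) {
  std::vector<std::size_t> Worklist;
  for (std::size_t I = 0; I != G.Symbols.size(); ++I)
    if (G.Symbols[I].Live)
      Worklist.push_back(I);

  while (!Worklist.empty()) {
    std::size_t S = Worklist.back();
    Worklist.pop_back();
    ToyBlock &B = G.Blocks[G.Symbols[S].Block];
    if (B.Live)
      continue; // block already visited
    B.Live = true;
    for (std::size_t T : B.EdgeTargets)
      if (!G.Symbols[T].Live) {
        G.Symbols[T].Live = true;
        Worklist.push_back(T);
      }
  }

  std::size_t Dead = 0;
  for (const ToyBlock &B : G.Blocks)
    if (!B.Live)
      ++Dead;
  return Dead;
}
```

A pre-prune pass corresponds to setting ``Live`` on additional root symbols before ``markLive`` runs; anything not reachable from a root is then a candidate for removal.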
 553
 554 .. _passes:
 555
 556 Passes
 557 ------
 558
 559 JITLink passes are ``std::function<Error(LinkGraph&)>`` instances. They are free
 560 to inspect and modify the given ``LinkGraph`` subject to the constraints of
 561 whatever phase they are running in (see :ref:`generic_link_algorithm`). If a
 562 pass returns ``Error::success()`` then linking continues. If a pass returns
 563 a failure value then linking is stopped and the ``JITLinkContext`` is notified
 564 that the link failed.
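
The success/failure contract for passes can be sketched without LLVM. In this illustration ``MiniGraph``, ``ToyError``, ``ToyPass``, and ``runPasses`` are all hypothetical names; ``ToyError`` is a plain ``std::string`` standing in for ``llvm::Error`` (an empty string means success). The sketch shows only the "stop at the first failing pass" behavior described above.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-ins (not LLVM types).
struct MiniGraph {
  std::vector<std::string> SymbolNames;
};
using ToyError = std::string; // empty string == success
using ToyPass = std::function<ToyError(MiniGraph &)>;

// Runs passes in order. Stops at the first pass that reports a failure and
// returns that failure; later passes are not run.
inline ToyError runPasses(MiniGraph &G, const std::vector<ToyPass> &Passes) {
  for (const ToyPass &P : Passes) {
    ToyError Err = P(G);
    if (!Err.empty())
      return Err;
  }
  return ToyError();
}
```

In the real library the equivalent of a non-empty ``ToyError`` would be a failure ``Error`` value, at which point the ``JITLinkContext`` is notified that the link failed.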
 565
 566 Passes may be used by both JITLink backends (e.g. MachO/x86-64 implements GOT
 567 and PLT construction as a pass), and external clients like
 568 ``ObjectLinkingLayer::Plugin``.
 569
 570 In combination with the open ``LinkGraph`` API, JITLink passes enable the
 571 implementation of powerful new features. For example:
 572
 573 * Relaxation optimizations -- A pre-fixup pass can inspect GOT accesses and PLT
 574   calls and identify situations where the addresses of the entry target and the
 575   access are close enough to be accessed directly. In this case the pass can
 576   rewrite the instruction stream of the containing block and update the fixup
 577   edges to make the access direct.
 578
 579   Code for this looks like:
 580
 581 .. code-block:: c++
 582
 583   Error relaxGOTEdges(LinkGraph &G) {
 584     for (auto *B : G.blocks())
 585       for (auto &E : B->edges())
 586         if (E.getKind() == x86_64::GOTLoad) {
 587           auto &GOTTarget = getGOTEntryTarget(E.getTarget());
 588           if (isInRange(B->getFixupAddress(E), GOTTarget)) {
 589             // Rewrite B->getContent() at fixup address from
 590             // MOVQ to LEAQ
 591
 592             // Update edge target and kind.
 593             E.setTarget(GOTTarget);
 594             E.setKind(x86_64::PCRel32);
 595           }
 596         }
 597
 598     return Error::success();
 599   }
 600
 601 * Metadata registration -- Post-allocation passes can be used to record the
 602   address range of sections in the target. This can be used to register the
 603   metadata (e.g. exception handling frames, language metadata) in the target
 604   once memory has been finalized.
 605
 606 .. code-block:: c++
 607
 608   Error registerEHFrameSection(LinkGraph &G) {
 609     if (auto *Sec = G.findSectionByName("__eh_frame")) {
 610       SectionRange SR(*Sec);
 611       registerEHFrameSection(SR.getStart(), SR.getEnd());
 612     }
 613
 614     return Error::success();
 615   }
 616
 617 * Record call sites for later mutation -- A post-allocation pass can record
 618   the call sites of all calls to a particular function, allowing those call
 619   sites to be updated later at runtime (e.g. for instrumentation, or to
 620   enable the function to be lazily compiled but still called directly after
 621   compilation).
 622
 623 .. code-block:: c++
 624
 625   StringRef FunctionName = "foo";
 626   std::vector<ExecutorAddr> CallSitesForFunction;
 627
 628   auto RecordCallSites =
 629     [&](LinkGraph &G) -> Error {
 630       for (auto *B : G.blocks())
 631         for (auto &E : B->edges())
 632           if (E.getKind() == CallEdgeKind &&
 633               E.getTarget().hasName() &&
 634               E.getTarget().getName() == FunctionName)
 635             CallSitesForFunction.push_back(B->getFixupAddress(E));
 636       return Error::success();
 637     };
 638
 639 Memory Management with JITLinkMemoryManager
 640 -------------------------------------------
 641
 642 JIT linking requires allocation of two kinds of memory: working memory in the
 643 JIT process and target memory in the executor process (these processes and
 644 memory allocations may be one and the same, depending on how the user wants
 645 to build their JIT). It also requires that these allocations conform to the
 646 requested code model in the target process (e.g. MachO/x86-64's Small code
 647 model requires that all code and data for a simulated dylib is allocated within
 648 4 GiB). Finally, it is natural to make the memory manager responsible for
 649 transferring memory to the target address space and applying memory protections,
 650 since the memory manager must know how to communicate with the executor, and
 651 since sharing and protection assignment can often be efficiently managed (in
 652 the common case of running across processes on the same machine for security)
 653 via the host operating system's virtual memory management APIs.
 654
 655 To satisfy these requirements ``JITLinkMemoryManager`` adopts the following
 656 design: The memory manager itself has just two virtual methods for asynchronous
 657 operations (each with convenience overloads for calling synchronously):
 658
 659 .. code-block:: c++
 660
 661   /// Called when allocation has been completed.
 662   using OnAllocatedFunction =
 663     unique_function<void(Expected<std::unique_ptr<InFlightAlloc>>)>;
 664
 665   /// Called when deallocation has completed.
 666   using OnDeallocatedFunction = unique_function<void(Error)>;
 667
 668   /// Call to allocate memory.
 669   virtual void allocate(const JITLinkDylib *JD, LinkGraph &G,
 670                         OnAllocatedFunction OnAllocated) = 0;
 671
 672   /// Call to deallocate memory.
 673   virtual void deallocate(std::vector<FinalizedAlloc> Allocs,
 674                           OnDeallocatedFunction OnDeallocated) = 0;
 675
The ``allocate`` method takes a ``JITLinkDylib*`` representing the target
simulated dylib, a reference to the ``LinkGraph`` to allocate memory for,
and a callback to run once an ``InFlightAlloc`` has been constructed.
``JITLinkMemoryManager`` implementations can (optionally) use the ``JD``
argument to manage a per-simulated-dylib memory pool (since code model
constraints are typically imposed on a per-dylib basis, and not across
dylibs) [2]_. The ``LinkGraph`` describes the object file that we need to
allocate memory for. The allocator must allocate working memory for all of
the Blocks defined in the graph, assign address space for each Block within the
executing process's memory, and update the Blocks' addresses to reflect this
assignment. Block content should be copied to working memory, but does not need
to be transferred to executor memory yet (that will be done once the content is
fixed up). ``JITLinkMemoryManager`` implementations can take full
responsibility for these steps, or use the ``BasicLayout`` utility to reduce
the task to allocating working and executor memory for *segments*: chunks of
memory defined by permissions, alignments, content sizes, and zero-fill sizes.
Once the allocation step is complete the memory manager should construct an
``InFlightAlloc`` object to represent the allocation, and then pass this object
to the ``OnAllocated`` callback.
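
The segment-based view described above can be illustrated with a small,
self-contained sketch. It does not use the real ``BasicLayout`` API: the
``ToyBlock``, ``ToySegment``, ``alignUp``, and ``buildSegments`` names are
hypothetical, and the policy of packing all content blocks ahead of zero-fill
blocks is an assumption made for this example only.

.. code-block:: c++

  #include <algorithm>
  #include <cassert>
  #include <cstdint>
  #include <map>
  #include <vector>

  // Hypothetical stand-ins for illustration; these are not JITLink types.
  enum ToyProt : unsigned { Read = 1, Write = 2, Exec = 4 };

  struct ToyBlock {
    unsigned Prot;      // Permissions of the memory the block needs.
    uint64_t Alignment; // Power-of-two alignment.
    uint64_t Size;      // Size in bytes.
    bool IsZeroFill;    // True if the block has no content (e.g. bss).
  };

  struct ToySegment {
    uint64_t Alignment = 1;
    uint64_t ContentSize = 0;  // Bytes that must be copied to the executor.
    uint64_t ZeroFillSize = 0; // Trailing bytes that only need zeroing.
  };

  // Round Value up to the next multiple of Align (a power of two).
  inline uint64_t alignUp(uint64_t Value, uint64_t Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0 && "not a power of two");
    return (Value + Align - 1) & ~(Align - 1);
  }

  // Group blocks into one segment per permission set.
  inline std::map<unsigned, ToySegment>
  buildSegments(const std::vector<ToyBlock> &Blocks) {
    std::map<unsigned, ToySegment> Segs;
    // Pass 1: content blocks are packed from the start of the segment.
    for (const ToyBlock &B : Blocks) {
      if (B.IsZeroFill)
        continue;
      ToySegment &S = Segs[B.Prot];
      S.Alignment = std::max(S.Alignment, B.Alignment);
      S.ContentSize = alignUp(S.ContentSize, B.Alignment) + B.Size;
    }
    // Pass 2: zero-fill blocks follow the content. Padding needed to align a
    // zero-fill block is counted as zero-fill.
    for (const ToyBlock &B : Blocks) {
      if (!B.IsZeroFill)
        continue;
      ToySegment &S = Segs[B.Prot];
      S.Alignment = std::max(S.Alignment, B.Alignment);
      uint64_t End = S.ContentSize + S.ZeroFillSize;
      S.ZeroFillSize = alignUp(End, B.Alignment) + B.Size - S.ContentSize;
    }
    return Segs;
  }

With a layout like this, a memory manager only has to reserve one working
range and one executor range per permission set, rather than reasoning about
individual blocks.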
 695
 696 The ``InFlightAlloc`` object has two virtual methods:
 697
 698 .. code-block:: c++
 699
 700     using OnFinalizedFunction = unique_function<void(Expected<FinalizedAlloc>)>;
 701     using OnAbandonedFunction = unique_function<void(Error)>;
 702
 703     /// Called prior to finalization if the allocation should be abandoned.
 704     virtual void abandon(OnAbandonedFunction OnAbandoned) = 0;
 705
 706     /// Called to transfer working memory to the target and apply finalization.
 707     virtual void finalize(OnFinalizedFunction OnFinalized) = 0;
 708
The linking process will call the ``finalize`` method on the ``InFlightAlloc``
object if linking succeeds up to the finalization step, otherwise it will call
``abandon`` to indicate that some error occurred during linking. A call to the
``InFlightAlloc::finalize`` method should cause content for the allocation to be
transferred from working to executor memory, and memory permissions to be
applied. A call to ``abandon`` should result in both kinds of memory being
deallocated.
 715
On successful finalization, the ``InFlightAlloc::finalize`` method should
construct a ``FinalizedAlloc`` object (an opaque ``uint64_t`` id that the
``JITLinkMemoryManager`` can use to identify executor memory for deallocation)
and pass it to the ``OnFinalized`` callback.
 720
Finalized allocations (represented by ``FinalizedAlloc`` objects) can be
deallocated by calling the ``JITLinkMemoryManager::deallocate`` method. This
method takes a vector of ``FinalizedAlloc`` objects, since it is common to
deallocate multiple objects at the same time and this allows us to batch these
requests for transmission to the executing process.
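
The allocate / finalize-or-abandon / deallocate lifecycle can be sketched with
a toy, synchronous, in-process model. This is only an approximation of the
shape of the protocol: ``ToyMemoryManager`` and ``ToyFinalizedAlloc`` are
hypothetical names, the callbacks take plain values rather than ``Expected``
or ``Error``, and no real memory protection is applied.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>
  #include <functional>
  #include <memory>
  #include <set>
  #include <utility>
  #include <vector>

  // Hypothetical toy types that mirror the shape of the protocol only.
  using ToyFinalizedAlloc = uint64_t; // Opaque id for finalized memory.

  class ToyMemoryManager {
  public:
    class InFlight {
    public:
      InFlight(ToyMemoryManager &MM, std::vector<char> Working)
          : MM(MM), Working(std::move(Working)) {}

      // "Transfer" working memory to the target and hand back an id.
      void finalize(std::function<void(ToyFinalizedAlloc)> OnFinalized) {
        ToyFinalizedAlloc Id = MM.NextId++;
        MM.Live.insert(Id);
        Working.clear(); // Working memory is no longer needed.
        OnFinalized(Id);
      }

      // Release everything without producing a finalized allocation.
      void abandon(std::function<void(bool)> OnAbandoned) {
        Working.clear();
        OnAbandoned(true);
      }

    private:
      ToyMemoryManager &MM;
      std::vector<char> Working;
    };

    void allocate(std::size_t Size,
                  std::function<void(std::unique_ptr<InFlight>)> OnAllocated) {
      OnAllocated(std::make_unique<InFlight>(*this, std::vector<char>(Size)));
    }

    // Batched deallocation; returns false if any id was not live.
    bool deallocate(const std::vector<ToyFinalizedAlloc> &Allocs) {
      bool AllLive = true;
      for (ToyFinalizedAlloc Id : Allocs)
        if (Live.erase(Id) != 1)
          AllLive = false;
      return AllLive;
    }

    std::size_t numLive() const { return Live.size(); }

  private:
    std::set<ToyFinalizedAlloc> Live;
    ToyFinalizedAlloc NextId = 1;
  };

A client of this toy manager would call ``allocate``, then exactly one of
``finalize`` or ``abandon`` on the in-flight object, and later pass the ids of
several finalized allocations to a single ``deallocate`` call.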
 726
 727 JITLink provides a simple in-process implementation of this interface:
 728 ``InProcessMemoryManager``. It allocates pages once and re-uses them as both
 729 working and target memory.
 730
 731 ORC provides a cross-process-capable ``MapperJITLinkMemoryManager`` that can use
 732 shared memory or ORC-RPC-based communication to transfer content to the executing
 733 process.
 734
 735 JITLinkMemoryManager and Security
 736 ---------------------------------
 737
 738 JITLink's ability to link JIT'd code for a separate executor process can be
 739 used to improve the security of a JIT system: The executor process can be
 740 sandboxed, run within a VM, or even run on a fully separate machine.
 741
 742 JITLink's memory manager interface is flexible enough to allow for a range of
 743 trade-offs between performance and security. For example, on a system where code
 744 pages must be signed (preventing code from being updated), the memory manager
 745 can deallocate working memory pages after linking to free memory in the process
 746 running JITLink. Alternatively, on a system that allows RWX pages, the memory
 747 manager may use the same pages for both working and target memory by marking
 748 them as RWX, allowing code to be modified in place without further overhead.
 749 Finally, if RWX pages are not permitted but dual-virtual-mappings of
 750 physical memory pages are, then the memory manager can dual map physical pages
 751 as RW- in the JITLink process and R-X in the executor process, allowing
 752 modification from the JITLink process but not from the executor (at the cost of
 753 extra administrative overhead for the dual mapping).
 754
 755 Error Handling
 756 --------------
 757
 758 JITLink makes extensive use of the ``llvm::Error`` type (see the error handling
 759 section of :doc:`ProgrammersManual` for details). The link process itself, all
 760 passes, the memory manager interface, and operations on the ``JITLinkContext``
 761 are all permitted to fail. Link graph construction utilities (especially parsers
 762 for object formats) are encouraged to validate input, and validate fixups
 763 (e.g. with range checks) before application.
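
As an illustration of validating a fixup before applying it, the sketch below
range-checks a 32-bit PC-relative displacement. The ``computePCRel32`` helper
is hypothetical and uses ``std::optional`` in place of ``llvm::Expected`` so
that it stays self-contained; a real backend would return an ``llvm::Error``
describing the out-of-range edge.

.. code-block:: c++

  #include <cstdint>
  #include <limits>
  #include <optional>

  // Compute the value for a hypothetical 32-bit PC-relative fixup. Returns
  // std::nullopt if the displacement does not fit in a signed 32-bit field.
  inline std::optional<int32_t> computePCRel32(uint64_t FixupAddr,
                                               uint64_t TargetAddr,
                                               int64_t Addend) {
    // Unsigned subtraction wraps; reinterpret the result as a signed delta.
    int64_t Delta = static_cast<int64_t>(TargetAddr - FixupAddr) + Addend;
    if (Delta < std::numeric_limits<int32_t>::min() ||
        Delta > std::numeric_limits<int32_t>::max())
      return std::nullopt; // Out of range: report rather than truncate.
    return static_cast<int32_t>(Delta);
  }

Checking the range first means a bad layout surfaces as a link error instead
of a silently truncated displacement in the JIT'd code.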
 764
 765 Any error will halt the link process and notify the context of failure. In ORC,
 766 reported failures are propagated to queries pending on definitions provided by
 767 the failing link, and also through edges of the dependence graph to any queries
 768 waiting on dependent symbols.
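
The propagation of a failure through the dependence graph can be pictured as a
worklist traversal. The following is a simplified model rather than ORC's
implementation: the ``DependentsMap`` type and ``propagateFailure`` function
are hypothetical, and symbols are represented by plain strings.

.. code-block:: c++

  #include <map>
  #include <queue>
  #include <set>
  #include <string>
  #include <vector>

  // Dependents[S] lists the symbols that depend on S (a toy graph).
  using DependentsMap = std::map<std::string, std::vector<std::string>>;

  // Return every symbol that must be failed when the symbols in Failed fail:
  // the failed symbols plus everything that transitively depends on them.
  inline std::set<std::string>
  propagateFailure(const DependentsMap &Dependents,
                   const std::vector<std::string> &Failed) {
    std::set<std::string> Result(Failed.begin(), Failed.end());
    std::queue<std::string> Worklist;
    for (const std::string &S : Failed)
      Worklist.push(S);
    while (!Worklist.empty()) {
      std::string S = Worklist.front();
      Worklist.pop();
      auto It = Dependents.find(S);
      if (It == Dependents.end())
        continue;
      for (const std::string &D : It->second)
        if (Result.insert(D).second) // Visit each dependent only once.
          Worklist.push(D);
    }
    return Result;
  }

Any query waiting on a symbol in the returned set would be notified of the
failure; symbols outside the set are unaffected.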
 769
 770 .. _connection_to_orc_runtime:
 771
 772 Connection to the ORC Runtime
 773 =============================
 774
 775 The ORC Runtime (currently under development) aims to provide runtime support
 776 for advanced JIT features, including object format features that require
 777 non-trivial action in the executor (e.g. running initializers, managing thread
 778 local storage, registering with language runtimes, etc.).
 779
 780 ORC Runtime support for object format features typically requires cooperation
 781 between the runtime (which executes in the executor process) and JITLink (which
 782 runs in the JIT process and can inspect LinkGraphs to determine what actions
 783 must be taken in the executor). For example: Execution of MachO static
 784 initializers in the ORC runtime is performed by the ``jit_dlopen`` function,
 785 which calls back to the JIT process to ask for the list of address ranges of
 786 ``__mod_init`` sections to walk. This list is collated by the
 787 ``MachOPlatformPlugin``, which installs a pass to record this information for
 788 each object as it is linked into the target.
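
The recording step can be sketched as a function that runs once per linked
object and accumulates address ranges. This is a toy model, not the
``MachOPlatformPlugin`` code: ``ToySection``, ``AddrRange``, and
``recordInitRanges`` are hypothetical, and the section name is passed in as a
parameter rather than hard-coded.

.. code-block:: c++

  #include <cstdint>
  #include <string>
  #include <utility>
  #include <vector>

  // Hypothetical, simplified view of a section in a linked graph.
  struct ToySection {
    std::string Name;
    uint64_t Start; // Address assigned in the executor.
    uint64_t Size;  // Size in bytes.
  };

  using AddrRange = std::pair<uint64_t, uint64_t>; // [start, end)

  // Toy "pass": append the address range of every non-empty section called
  // InitSectionName to Ranges, which accumulates across linked objects.
  inline void recordInitRanges(const std::vector<ToySection> &Graph,
                               const std::string &InitSectionName,
                               std::vector<AddrRange> &Ranges) {
    for (const ToySection &S : Graph)
      if (S.Name == InitSectionName && S.Size != 0)
        Ranges.emplace_back(S.Start, S.Start + S.Size);
  }

The accumulated list plays the role of the data that the runtime asks the JIT
process for when it needs to walk initializer sections.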
 789
 790 .. _constructing_linkgraphs:
 791
 792 Constructing LinkGraphs
 793 =======================
 794
 795 Clients usually access and manipulate ``LinkGraph`` instances that were created
 796 for them by an ``ObjectLinkingLayer`` instance, but they can be created manually:
 797
#. By directly constructing and populating a ``LinkGraph`` instance.

#. By using the ``createLinkGraph`` family of functions to create a
   ``LinkGraph`` from an in-memory buffer containing an object file. This is how
   ``ObjectLinkingLayer`` usually creates ``LinkGraphs``.

   #. ``createLinkGraph_<Object-Format>_<Architecture>`` can be used when
      both the object format and architecture are known ahead of time.

   #. ``createLinkGraph_<Object-Format>`` can be used when the object format is
      known ahead of time, but the architecture is not. In this case the
      architecture will be determined by inspection of the object header.

   #. ``createLinkGraph`` can be used when neither the object format nor
      the architecture is known ahead of time. In this case the object header
      will be inspected to determine both the format and architecture.
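
To give a feel for what "inspecting the object header" involves, the sketch
below classifies a buffer by its leading bytes. It is not the logic used by
``createLinkGraph``: the ``ToyFormat`` enum and ``detectFormat`` function are
hypothetical, and only three little-endian cases are recognized.

.. code-block:: c++

  #include <cstdint>
  #include <vector>

  enum class ToyFormat { Unknown, ELF, MachO64, COFF_x86_64 };

  inline ToyFormat detectFormat(const std::vector<uint8_t> &Buf) {
    // ELF files start with 0x7f 'E' 'L' 'F'.
    if (Buf.size() >= 4 && Buf[0] == 0x7f && Buf[1] == 'E' && Buf[2] == 'L' &&
        Buf[3] == 'F')
      return ToyFormat::ELF;
    // 64-bit little-endian MachO: MH_MAGIC_64 (0xfeedfacf) is stored as
    // the bytes cf fa ed fe.
    if (Buf.size() >= 4 && Buf[0] == 0xcf && Buf[1] == 0xfa &&
        Buf[2] == 0xed && Buf[3] == 0xfe)
      return ToyFormat::MachO64;
    // COFF objects have no magic number; the header begins with the machine
    // type. 0x8664 (stored little-endian) identifies x86-64.
    if (Buf.size() >= 2 && Buf[0] == 0x64 && Buf[1] == 0x86)
      return ToyFormat::COFF_x86_64;
    return ToyFormat::Unknown;
  }

A dispatcher built on a check like this would then forward the buffer to the
matching format-specific (and, if needed, architecture-specific) function.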
 814
 815 .. _jit_linking:
 816
 817 JIT Linking
 818 ===========
 819
 820 The JIT linker concept was introduced in LLVM's earlier generation of JIT APIs,
 821 MCJIT. In MCJIT the *RuntimeDyld* component enabled re-use of LLVM as an
 822 in-memory compiler by adding an in-memory link step to the end of the usual
 823 compiler pipeline. Rather than dumping relocatable objects to disk as a compiler
 824 usually would, MCJIT passed them to RuntimeDyld to be linked into a target
 825 process.
 826
 827 This approach to linking differs from standard *static* or *dynamic* linking:
 828
 829 A *static linker* takes one or more relocatable object files as input and links
 830 them into an executable or dynamic library on disk.
 831
 832 A *dynamic linker* applies relocations to executables and dynamic libraries that
 833 have been loaded into memory.
 834
 835 A *JIT linker* takes a single relocatable object file at a time and links it
 836 into a target process, usually using a context object to allow the linked code
 837 to resolve symbols in the target.
 838
 839 RuntimeDyld
 840 -----------
 841
 842 In order to keep RuntimeDyld's implementation simple MCJIT imposed some
 843 restrictions on compiled code:
 844
 845 #. It had to use the Large code model, and often restricted available relocation
 846    models in order to limit the kinds of relocations that had to be supported.
 847
 848 #. It required strong linkage and default visibility on all symbols -- behavior
 849    for other linkages/visibilities was not well defined.
 850
 851 #. It constrained and/or prohibited the use of features requiring runtime
 852    support, e.g. static initializers or thread local storage.
 853
 854 As a result of these restrictions not all language features supported by LLVM
 855 worked under MCJIT, and objects to be loaded under the JIT had to be compiled to
 856 target it (precluding the use of precompiled code from other sources under the
 857 JIT).
 858
 859 RuntimeDyld also provided very limited visibility into the linking process
 860 itself: Clients could access conservative estimates of section size
 861 (RuntimeDyld bundled stub size and padding estimates into the section size
 862 value) and the final relocated bytes, but could not access RuntimeDyld's
 863 internal object representations.
 864
 865 Eliminating these restrictions and limitations was one of the primary motivations
 866 for the development of JITLink.
 867
 868 The llvm-jitlink tool
 869 =====================
 870
 871 The ``llvm-jitlink`` tool is a command line wrapper for the JITLink library.
 872 It loads some set of relocatable object files and then links them using
 873 JITLink. Depending on the options used it will then execute them, or validate
 874 the linked memory.
 875
 876 The ``llvm-jitlink`` tool was originally designed to aid JITLink development by
 877 providing a simple environment for testing.
 878
 879 Basic usage
 880 -----------
 881
 882 By default, ``llvm-jitlink`` will link the set of objects passed on the command
 883 line, then search for a "main" function and execute it:
 884
.. code-block:: sh

  % cat hello-world.c
  #include <stdio.h>

  int main(int argc, char *argv[]) {
    printf("hello, world!\n");
    return 0;
  }

  % clang -c -o hello-world.o hello-world.c
  % llvm-jitlink hello-world.o
  hello, world!
 898
Multiple objects may be specified, and arguments may be provided to the JIT'd
main function using the ``-args`` option:
 901
 902 .. code-block:: sh
 903
 904   % cat print-args.c
 905   #include <stdio.h>
 906
 907   void print_args(int argc, char *argv[]) {
 908     for (int i = 0; i != argc; ++i)
 909       printf("arg %i is \"%s\"\n", i, argv[i]);
 910   }
 911
 912   % cat print-args-main.c
 913   void print_args(int argc, char *argv[]);
 914
 915   int main(int argc, char *argv[]) {
 916     print_args(argc, argv);
 917     return 0;
 918   }
 919
 920   % clang -c -o print-args.o print-args.c
 921   % clang -c -o print-args-main.o print-args-main.c
 922   % llvm-jitlink print-args.o print-args-main.o -args a b c
 923   arg 0 is "a"
 924   arg 1 is "b"
 925   arg 2 is "c"
 926
 927 Alternative entry points may be specified using the ``-entry <entry point
 928 name>`` option.
 929
 930 Other options can be found by calling ``llvm-jitlink -help``.
 931
 932 llvm-jitlink as a regression testing utility
 933 --------------------------------------------
 934
 935 One of the primary aims of ``llvm-jitlink`` was to enable readable regression
 936 tests for JITLink. To do this it supports two options:
 937
 938 The ``-noexec`` option tells llvm-jitlink to stop after looking up the entry
 939 point, and before attempting to execute it. Since the linked code is not
 940 executed, this can be used to link for other targets even if you do not have
 941 access to the target being linked (the ``-define-abs`` or ``-phony-externals``
 942 options can be used to supply any missing definitions in this case).
 943
 944 The ``-check <check-file>`` option can be used to run a set of ``jitlink-check``
 945 expressions against working memory. It is typically used in conjunction with
 946 ``-noexec``, since the aim is to validate JIT'd memory rather than to run the
 947 code and ``-noexec`` allows us to link for any supported target architecture
 948 from the current process. In ``-check`` mode, ``llvm-jitlink`` will scan the
 949 given check-file for lines of the form ``# jitlink-check: <expr>``. See
 950 examples of this usage in ``llvm/test/ExecutionEngine/JITLink``.
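
The scanning step can be approximated with a few lines of string handling.
The ``extractCheckExprs`` function below is a hypothetical sketch, not the
tool's implementation: it assumes the marker may appear anywhere in a line,
ignores markers with no expression after them, and does not evaluate the
expressions it collects.

.. code-block:: c++

  #include <cstddef>
  #include <sstream>
  #include <string>
  #include <vector>

  // Collect the <expr> part of every "# jitlink-check: <expr>" line.
  inline std::vector<std::string>
  extractCheckExprs(const std::string &FileText) {
    static const std::string Prefix = "# jitlink-check:";
    std::vector<std::string> Exprs;
    std::istringstream In(FileText);
    std::string Line;
    while (std::getline(In, Line)) {
      std::size_t Pos = Line.find(Prefix);
      if (Pos == std::string::npos)
        continue;
      std::string Expr = Line.substr(Pos + Prefix.size());
      // Trim surrounding whitespace.
      std::size_t B = Expr.find_first_not_of(" \t\r");
      if (B == std::string::npos)
        continue; // Nothing after the prefix.
      std::size_t E = Expr.find_last_not_of(" \t\r");
      Exprs.push_back(Expr.substr(B, E - B + 1));
    }
    return Exprs;
  }

Because the marker lives in a comment, the same file can serve as both the
assembly input and the list of checks to run against the linked memory.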
 951
 952 Remote execution via llvm-jitlink-executor
 953 ------------------------------------------
 954
 955 By default ``llvm-jitlink`` will link the given objects into its own process,
 956 but this can be overridden by two options:
 957
 958 The ``-oop-executor[=/path/to/executor]`` option tells ``llvm-jitlink`` to
 959 execute the given executor (which defaults to ``llvm-jitlink-executor``) and
 960 communicate with it via file descriptors which it passes to the executor
 961 as the first argument with the format ``filedescs=<in-fd>,<out-fd>``.
 962
 963 The ``-oop-executor-connect=<host>:<port>`` option tells ``llvm-jitlink`` to
 964 connect to an already running executor via TCP on the given host and port. To
 965 use this option you will need to start ``llvm-jitlink-executor`` manually with
 966 ``listen=<host>:<port>`` as the first argument.
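
For readers writing their own executor, the first form of argument could be
parsed along the following lines. ``parseFileDescs`` is a hypothetical helper
written for this document, not code taken from ``llvm-jitlink-executor``, and
it accepts only non-negative decimal descriptors.

.. code-block:: c++

  #include <cstddef>
  #include <optional>
  #include <string>
  #include <utility>

  // Parse "filedescs=<in-fd>,<out-fd>" into a pair of descriptors.
  inline std::optional<std::pair<int, int>>
  parseFileDescs(const std::string &Arg) {
    static const std::string Prefix = "filedescs=";
    if (Arg.compare(0, Prefix.size(), Prefix) != 0)
      return std::nullopt;
    std::string Rest = Arg.substr(Prefix.size());
    std::size_t Comma = Rest.find(',');
    if (Comma == std::string::npos)
      return std::nullopt;
    auto ParseFD = [](const std::string &S) -> std::optional<int> {
      if (S.empty() || S.size() > 9) // Reject empty or overlong input.
        return std::nullopt;
      int V = 0;
      for (char C : S) {
        if (C < '0' || C > '9')
          return std::nullopt;
        V = V * 10 + (C - '0');
      }
      return V;
    };
    auto In = ParseFD(Rest.substr(0, Comma));
    auto Out = ParseFD(Rest.substr(Comma + 1));
    if (!In || !Out)
      return std::nullopt;
    return std::make_pair(*In, *Out);
  }

An argument that does not start with ``filedescs=`` yields no result, which
leaves room for the executor to try the ``listen=<host>:<port>`` form instead.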
 967
 968 Harness mode
 969 ------------
 970
 971 The ``-harness`` option allows a set of input objects to be designated as a test
 972 harness, with the regular object files implicitly treated as objects to be
 973 tested. Definitions of symbols in the harness set override definitions in the
 974 test set, and external references from the harness cause automatic scope
 975 promotion of local symbols in the test set (these modifications to the usual
 976 linker rules are accomplished via an ``ObjectLinkingLayer::Plugin`` installed by
 977 ``llvm-jitlink`` when it sees the ``-harness`` option).
 978
With these modifications in place we can selectively test functions in an object
file by mocking those functions' callees. For example, suppose we have an object
file, ``test_code.o``, compiled from the following C source (which we need not
have access to):
 983
.. code-block:: c

  void irrelevant_function() { irrelevant_external(); }

  int function_to_mock(int X) {
    return /* some function of X */;
  }

  static void function_to_test() {
    ...
    int Y = function_to_mock(/* some argument */);
    printf("Y is %i\n", Y);
  }
 997
 998 If we want to know how ``function_to_test`` behaves when we change the behavior
 999 of ``function_to_mock`` we can test it by writing a test harness:
1000
.. code-block:: c

  #include <stdio.h>

  void function_to_test();

  int function_to_mock(int X) {
    printf("used mock utility function\n");
    return 42;
  }

  int main(int argc, char *argv[]) {
    function_to_test();
    return 0;
  }
1014
1015 Under normal circumstances these objects could not be linked together:
1016 ``function_to_test`` is static and could not be resolved outside
1017 ``test_code.o``, the two ``function_to_mock`` functions would result in a
1018 duplicate definition error, and ``irrelevant_external`` is undefined.
1019 However, using ``-harness`` and ``-phony-externals`` we can run this code
1020 with:
1021
1022 .. code-block:: sh
1023
1024   % clang -c -o test_code_harness.o test_code_harness.c
1025   % llvm-jitlink -phony-externals test_code.o -harness test_code_harness.o
1026   used mock utility function
1027   Y is 42
1028
1029 The ``-harness`` option may be of interest to people who want to perform some
1030 very late testing on build products to verify that compiled code behaves as
1031 expected. On basic C test cases this is relatively straightforward. Mocks for
1032 more complicated languages (e.g. C++) are much trickier: Any code involving
1033 classes tends to have a lot of non-trivial surface area (e.g. vtables) that
1034 would require great care to mock.
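
The two harness rules (harness definitions override test definitions, and
harness references promote local test symbols) can be modeled in a few lines.
This is a toy model of the outcome only: ``resolveWithHarness``, ``Origin``,
and ``TestDef`` are hypothetical names, and the real behavior is implemented by
the plugin that ``llvm-jitlink`` installs, not by code like this.

.. code-block:: c++

  #include <map>
  #include <set>
  #include <string>

  enum class Origin { Harness, Test };

  struct TestDef {
    bool IsLocal; // True for symbols with local scope (e.g. C statics).
  };

  // Decide which object provides each visible symbol under the toy rules:
  //  1. A harness definition wins over a test definition of the same name.
  //  2. A local test symbol becomes visible only if the harness refers to it.
  inline std::map<std::string, Origin>
  resolveWithHarness(const std::set<std::string> &HarnessDefs,
                     const std::set<std::string> &HarnessExternals,
                     const std::map<std::string, TestDef> &TestDefs) {
    std::map<std::string, Origin> Visible;
    for (const auto &Name : HarnessDefs)
      Visible[Name] = Origin::Harness;
    for (const auto &[Name, Def] : TestDefs) {
      if (Visible.count(Name))
        continue; // Rule 1: overridden by the harness.
      if (Def.IsLocal && !HarnessExternals.count(Name))
        continue; // Not referenced by the harness: stays local.
      Visible[Name] = Origin::Test; // Global, or promoted by rule 2.
    }
    return Visible;
  }

Applied to the example above, ``function_to_mock`` resolves to the harness
definition and the static ``function_to_test`` becomes visible because the
harness refers to it.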
1035
1036 Tips for JITLink backend developers
1037 -----------------------------------
1038
#. Make liberal use of ``assert`` and ``llvm::Error``. Do *not* assume that the
   input object is well formed: Return any errors produced by libObject (or your
   own object parsing code) and validate as you construct. Think carefully about
   the distinction between contract errors (which should be caught with asserts
   and ``llvm_unreachable``) and environmental errors (which should generate
   ``llvm::Error`` instances).
1045
1046 #. Don't assume you're linking in-process. Use libSupport's sized,
1047    endian-specific types when reading/writing content in the ``LinkGraph``.
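
The second tip can be made concrete with byte-order-explicit accessors. The
helpers below are hypothetical, standard-library-only stand-ins written for
this document; in LLVM itself the sized, endian-specific types from
``llvm/Support/Endian.h`` would normally be used instead.

.. code-block:: c++

  #include <cstdint>

  // Read and write 32-bit values with an explicit byte order, so the result
  // does not depend on the endianness of the host running the linker.
  inline void toyWrite32LE(uint8_t *P, uint32_t V) {
    for (int I = 0; I != 4; ++I)
      P[I] = static_cast<uint8_t>(V >> (8 * I));
  }

  inline uint32_t toyRead32LE(const uint8_t *P) {
    uint32_t V = 0;
    for (int I = 0; I != 4; ++I)
      V |= static_cast<uint32_t>(P[I]) << (8 * I);
    return V;
  }

  inline void toyWrite32BE(uint8_t *P, uint32_t V) {
    for (int I = 0; I != 4; ++I)
      P[I] = static_cast<uint8_t>(V >> (8 * (3 - I)));
  }

  inline uint32_t toyRead32BE(const uint8_t *P) {
    uint32_t V = 0;
    for (int I = 0; I != 4; ++I)
      V |= static_cast<uint32_t>(P[I]) << (8 * (3 - I));
    return V;
  }

Casting a block's content pointer to ``uint32_t*`` and dereferencing it would
only be correct when the host and target happen to share a byte order, which
is exactly the assumption this tip warns against.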
1048
1049 As a "minimum viable" JITLink wrapper, the ``llvm-jitlink`` tool is an
1050 invaluable resource for developers bringing in a new JITLink backend. A standard
1051 workflow is to start by throwing an unsupported object at the tool and seeing
1052 what error is returned, then fixing that (you can often make a reasonable guess
1053 at what should be done based on existing code for other formats or
1054 architectures).
1055
1056 In debug builds of LLVM, the ``-debug-only=jitlink`` option dumps logs from the
1057 JITLink library during the link process. These can be useful for spotting some bugs at
1058 a glance. The ``-debug-only=llvm_jitlink`` option dumps logs from the ``llvm-jitlink``
1059 tool, which can be useful for debugging both testcases (it is often less verbose than
1060 ``-debug-only=jitlink``) and the tool itself.
1061
1062 The ``-oop-executor`` and ``-oop-executor-connect`` options are helpful for testing
1063 handling of cross-process and cross-architecture use cases.
1064
1065 Roadmap
1066 =======
1067
1068 JITLink is under active development. Work so far has focused on the MachO
1069 implementation. In LLVM 12 there is limited support for ELF on x86-64.
1070
1071 Major outstanding projects include:
1072
1073 * Refactor architecture support to maximize sharing across formats.
1074
1075   All formats should be able to share the bulk of the architecture specific
1076   code (especially relocations) for each supported architecture.
1077
* Refactor ELF link graph construction.

  ELF's link graph construction is currently implemented in the
  ``ELF_x86_64.cpp`` file, and tied to the x86-64 relocation parsing code. The
  bulk of the code is generic and should be split into an
  ``ELFLinkGraphBuilder`` base class along the same lines as the existing
  generic ``MachOLinkGraphBuilder``.
1084
1085 * Implement support for arm32.
1086
1087 * Implement support for other new architectures.
1088
1089 JITLink Availability and Feature Status
1090 ---------------------------------------
1091
The following table describes the status of the JITLink backends for various
format / architecture combinations (as of July 2023).
1094
1095 Support levels:
1096
1097 * None: No backend. JITLink will return an "architecture not supported" error.
1098   Represented by empty cells in the table below.
1099 * Skeleton: A backend exists, but does not support commonly used relocations.
1100   Even simple programs are likely to trigger an "unsupported relocation" error.
1101   Backends in this state may be easy to improve by implementing new relocations.
1102   Consider getting involved!
* Basic: The backend supports simple programs, but isn't ready for general use
  yet.
* Usable: The backend is suitable for general use for at least one code and
  relocation model.
1106 * Good: The backend supports almost all relocations. Advanced features like
1107   native thread local storage may not be available yet.
1108 * Complete: The backend supports all relocations and object format features.
1109
1110 .. list-table:: Availability and Status
1111    :widths: 10 30 30 30
1112    :header-rows: 1
1113    :stub-columns: 1
1114
1115    * - Architecture
1116      - ELF
1117      - COFF
1118      - MachO
1119    * - arm32
1120      - Skeleton
1121      -
1122      -
1123    * - arm64
1124      - Usable
1125      -
1126      - Good
1127    * - LoongArch
1128      - Good
1129      -
1130      -
1131    * - PowerPC 64
1132      - Usable
1133      -
1134      -
1135    * - RISC-V
1136      - Good
1137      -
1138      -
1139    * - x86-32
1140      - Basic
1141      -
1142      -
1143    * - x86-64
1144      - Good
1145      - Usable
1146      - Good
1147
1148 .. [1] See ``llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin`` for
1149        a full worked example.
1150
1151 .. [2] If not for *hidden* scoped symbols we could eliminate the
1152        ``JITLinkDylib*`` argument to ``JITLinkMemoryManager::allocate`` and
1153        treat every object as a separate simulated dylib for the purposes of
1154        memory layout. Hidden symbols break this by generating in-range accesses
1155        to external symbols, requiring the access and symbol to be allocated
1156        within range of one another. That said, providing a pre-reserved address
1157        range pool for each simulated dylib guarantees that the relaxation
1158        optimizations will kick in for all intra-dylib references, which is good
1159        for performance (at the cost of whatever overhead is introduced by
1160        reserving the address-range up-front).