.. _clang-offload-bundler:

Introduction
============

For heterogeneous single source programming languages, use one or more
``--offload-arch=<target-id>`` Clang options to specify the target IDs of the
code to generate for the offload code regions.

The tool chain may perform multiple compilations of a translation unit to
produce separate code objects for the host and potentially multiple offloaded
devices. The ``clang-offload-bundler`` tool may be used as part of the tool
chain to combine these multiple code objects into a single bundled code object.

The tool chain may use a bundled code object as an intermediate step so that
each tool chain step consumes and produces a single file as in traditional
non-heterogeneous tool chains. The bundled code object contains the code objects
for the host and all the offload devices.

A bundled code object may also be used to bundle just the offloaded code
objects, which can then be embedded as data into the host code object. The host
compilation includes an ``init`` function that uses the runtime corresponding
to the offload kind (see :ref:`clang-offload-kind-table`) to load the offload
code objects appropriate to the devices present when the host program is
executed.

.. _clang-bundled-code-object-layout:

Bundled Code Object Layout
==========================

The layout of a bundled code object is defined by the following table:

.. table:: Bundled Code Object Layout
  :name: bundled-code-object-layout-table

  =================================== ======= ================ ===============================
  Field                               Type    Size in Bytes    Description
  =================================== ======= ================ ===============================
  Magic String                        string  24               ``__CLANG_OFFLOAD_BUNDLE__``
  Number Of Bundle Entries            integer 8                Number of bundle entries.
  1st Bundle Entry Code Object Offset integer 8                Byte offset from beginning of
                                                               bundled code object to 1st code
                                                               object.
  1st Bundle Entry Code Object Size   integer 8                Byte size of 1st code object.
  1st Bundle Entry ID Length          integer 8                Character length of bundle
                                                               entry ID of 1st code object.
  1st Bundle Entry ID                 string  1st Bundle Entry Bundle entry ID of 1st code
                                              ID Length        object. This is not NUL
                                                               terminated. See
                                                               :ref:`clang-bundle-entry-id`.
  \...
  Nth Bundle Entry Code Object Offset integer 8
  Nth Bundle Entry Code Object Size   integer 8
  Nth Bundle Entry ID Length          integer 8
  Nth Bundle Entry ID                 string  Nth Bundle Entry
                                              ID Length
  1st Bundle Entry Code Object        bytes   1st Bundle Entry
                                              Code Object Size
  \...
  Nth Bundle Entry Code Object        bytes   Nth Bundle Entry
                                              Code Object Size
  =================================== ======= ================ ===============================

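The layout above can be read with a few fixed-size unpack operations. The
following Python sketch is illustrative only and is not part of the tool chain:
the function names are hypothetical, and it assumes the 8-byte integers are
unsigned little-endian, which the table does not state. The writer packs the
code objects back to back with no alignment padding, which the table permits
but a real bundler need not do.

.. code-block:: python

  import struct

  MAGIC = b"__CLANG_OFFLOAD_BUNDLE__"  # 24-byte magic string


  def read_bundle_entries(data: bytes) -> dict:
      """Return {bundle entry ID: code object bytes} for a bundled code object."""
      if data[:len(MAGIC)] != MAGIC:
          raise ValueError("missing magic string")
      pos = len(MAGIC)
      # Assumption: integers are unsigned 64-bit little-endian.
      (count,) = struct.unpack_from("<Q", data, pos)
      pos += 8
      entries = {}
      for _ in range(count):
          # Per entry: code object offset, code object size, ID length.
          offset, size, id_len = struct.unpack_from("<QQQ", data, pos)
          pos += 24
          # The ID is not NUL terminated; its length comes from the header.
          entry_id = data[pos:pos + id_len].decode("ascii")
          pos += id_len
          entries[entry_id] = data[offset:offset + size]
      return entries


  def write_bundle(objects: dict) -> bytes:
      """Build a bundled code object from {bundle entry ID: code object bytes}."""
      ids = [entry_id.encode("ascii") for entry_id in objects]
      header_size = len(MAGIC) + 8 + sum(24 + len(raw_id) for raw_id in ids)
      header = bytearray(MAGIC + struct.pack("<Q", len(ids)))
      payload = bytearray()
      for raw_id, code in zip(ids, objects.values()):
          # Offsets are measured from the beginning of the bundled code object.
          offset = header_size + len(payload)
          header += struct.pack("<QQQ", offset, len(code), len(raw_id)) + raw_id
          payload += code
      return bytes(header + payload)

Round-tripping a dictionary through ``write_bundle`` and ``read_bundle_entries``
returns the same entries, which makes the pair convenient for experimenting
with the format.
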
.. _clang-bundle-entry-id:

Bundle Entry ID
===============

Each entry in a bundled code object (see
:ref:`clang-bundled-code-object-layout`) has a bundle entry ID that indicates
the kind of the entry's code object and the runtime that manages it.

The bundle entry ID syntax is defined by the following BNF grammar:

.. code::

  <bundle-entry-id> ::= <offload-kind> "-" <target-triple> [ "-" <target-id> ]

Where:

**offload-kind**
  The runtime responsible for managing the bundled entry code object. See
  :ref:`clang-offload-kind-table`.

  .. table:: Bundled Code Object Offload Kind
      :name: clang-offload-kind-table

      ============= ==============================================================
      Offload Kind  Description
      ============= ==============================================================
      host          Host code object. ``clang-offload-bundler`` always includes
                    this entry as the first bundled code object entry. For an
                    embedded bundled code object this entry is not used by the
                    runtime and so is generally an empty code object.

      hip           Offload code object for the HIP language. Used for all
                    HIP language offload code objects when the
                    ``clang-offload-bundler`` is used to bundle code objects as
                    intermediate steps of the tool chain. Also used for AMD GPU
                    code objects before ABI version V4 when the
                    ``clang-offload-bundler`` is used to create a *fat binary*
                    to be loaded by the HIP runtime. The fat binary can be
                    loaded directly from a file, or be embedded in the host code
                    object as a data section with the name ``.hip_fatbin``.

      hipv4         Offload code object for the HIP language. Used for AMD GPU
                    code objects with at least ABI version V4 when the
                    ``clang-offload-bundler`` is used to create a *fat binary*
                    to be loaded by the HIP runtime. The fat binary can be
                    loaded directly from a file, or be embedded in the host code
                    object as a data section with the name ``.hip_fatbin``.

      openmp        Offload code object for the OpenMP language extension.
      ============= ==============================================================

**target-triple**
  The target triple of the code object:

  .. code::

    <Architecture>-<Vendor>-<OS>-<Environment>

  All four components must be present if target-id is present. Components are
  hyphen separated. If a component is not specified, the empty string must be
  used in its place.

**target-id**
  The canonical target ID of the code object. Present only if the target
  supports a target ID. See :ref:`clang-target-id`.

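As an illustration of the syntax, a bundle entry ID can be split into its three
parts as sketched below in Python. The ``BundleEntryID`` type and
``parse_bundle_entry_id`` function are hypothetical names, not part of any
Clang tool. The sketch relies on the rule that the triple has all four
components whenever a target ID is present, and splits the target ID off last
because it may itself contain ``-``.

.. code-block:: python

  from typing import NamedTuple, Optional


  class BundleEntryID(NamedTuple):
      offload_kind: str
      target_triple: str
      target_id: Optional[str]


  def parse_bundle_entry_id(entry_id: str) -> BundleEntryID:
      # At most six parts: the offload kind, four triple components, and the
      # target ID. Limiting the split keeps a trailing "-" in the target ID
      # (for example "xnack-") intact.
      parts = entry_id.split("-", 5)
      if len(parts) < 2:
          raise ValueError(f"malformed bundle entry ID: {entry_id!r}")
      kind, rest = parts[0], parts[1:]
      if len(rest) == 5:
          return BundleEntryID(kind, "-".join(rest[:4]), rest[4])
      # No target ID: everything after the offload kind is the triple.
      return BundleEntryID(kind, "-".join(rest), None)

For example, ``hip-amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-`` has an empty
fourth triple component, so the triple is ``amdgcn-amd-amdhsa-`` and the target
ID is ``gfx906:sramecc+:xnack-``.
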
Each entry of a bundled code object must have a different bundle entry ID. There
can be multiple entries for the same processor provided they differ in target
feature settings. If there is an entry with a target feature specified as *Any*,
then all entries must specify that target feature as *Any* for the same
processor. There may be additional target specific restrictions.

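The two rules above can be checked mechanically. The following Python sketch is
one possible reading of them, with hypothetical function names: it assumes
every bundle entry ID carries a four-component triple followed by a target ID,
and it groups entries by offload kind, triple, and processor before applying
the *Any* rule. Target specific restrictions are not modeled.

.. code-block:: python

  from collections import defaultdict


  def split_target_id(target_id):
      """Return (processor, {feature: "+" or "-"}); omitted features are Any."""
      processor, *settings = target_id.split(":")
      return processor, {s[:-1]: s[-1] for s in settings}


  def check_entry_ids(entry_ids):
      """Return a list of problems found in a list of bundle entry IDs."""
      problems = []
      if len(set(entry_ids)) != len(entry_ids):
          problems.append("duplicate bundle entry ID")
      groups = defaultdict(list)  # (kind, triple parts, processor) -> features
      for entry_id in entry_ids:
          # Assumption: kind, four triple components, then the target ID.
          *prefix, target_id = entry_id.split("-", 5)
          processor, features = split_target_id(target_id)
          groups[(*prefix, processor)].append(features)
      for key, feature_maps in groups.items():
          named = set().union(*feature_maps)
          for feature in sorted(named):
              # A feature named by one entry but omitted by another means the
              # omitting entry uses Any while the other uses on or off.
              if any(feature not in fm for fm in feature_maps):
                  problems.append(f"{key[-1]}: {feature} mixes Any with on/off")
      return problems

Under this reading, bundling ``gfx908`` together with ``gfx908:xnack+`` is
reported as a conflict, while bundling ``gfx908:xnack+`` with ``gfx908:xnack-``
is accepted.
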
.. _clang-target-id:

Target ID
=========

A target ID is used to indicate the processor and optionally its configuration,
expressed by a set of target features, that affect ISA generation. Whether a
target ID is supported, or whether the target triple alone is sufficient to
specify the ISA generation, is target specific.

A target ID is used with the ``-mcpu=<target-id>`` and
``--offload-arch=<target-id>`` Clang compilation options to specify the kind of
code to generate.

It is also used as part of the bundle entry ID to identify the code object. See
:ref:`clang-bundle-entry-id`.

The target ID syntax is defined by the following BNF grammar:

.. code::

  <target-id> ::= <processor> ( ":" <target-feature> ( "+" | "-" ) )*

Where:

**processor**
  The target-specific processor name or any alternative processor name.

**target-feature**
  A target feature name that is supported by the processor. Each target
  feature must appear at most once in a target ID and can have one of three
  values:

  *Any*
    Specified by omitting the target feature from the target ID.
    A code object compiled with a target ID specifying the default
    value of a target feature can be loaded and executed on a processor
    configured with the target feature on or off.

  *On*
    Specified by ``+``, indicating the target feature is enabled. A code
    object compiled with a target ID specifying a target feature on
    can only be loaded on a processor configured with the target feature on.

  *Off*
    Specified by ``-``, indicating the target feature is disabled. A code
    object compiled with a target ID specifying a target feature off
    can only be loaded on a processor configured with the target feature off.

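The loading rules for the three feature values can be sketched as a small
compatibility check. This Python example is illustrative: ``parse_target_id``
and ``can_load`` are hypothetical names, the device configuration is modeled as
a plain dictionary, and the requirement that the processor names match exactly
is an assumption of the sketch.

.. code-block:: python

  def parse_target_id(target_id):
      """Split a target ID into (processor, {feature: True for on, False for off})."""
      processor, *settings = target_id.split(":")
      features = {}
      for setting in settings:
          name, sign = setting[:-1], setting[-1:]
          # Each feature needs a name, a "+" or "-" suffix, and may appear once.
          if not name or sign not in ("+", "-") or name in features:
              raise ValueError(f"bad target feature {setting!r} in {target_id!r}")
          features[name] = (sign == "+")
      return processor, features


  def can_load(code_object_id, device_processor, device_features):
      """device_features maps each feature name to True (on) or False (off)."""
      processor, features = parse_target_id(code_object_id)
      if processor != device_processor:
          return False
      # Features omitted from the target ID are Any and match either setting;
      # features given as on or off must match the device configuration.
      return all(device_features.get(name) == wanted
                 for name, wanted in features.items())

With a device configured as ``{"xnack": True}``, a code object compiled for
``gfx908`` or ``gfx908:xnack+`` is accepted and one compiled for
``gfx908:xnack-`` is rejected.
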
There are two forms of target ID:

*Non-canonical form*
  The non-canonical form is used as the input to user commands for the user's
  convenience. It allows either the primary or an alternative processor name to
  be used, and the target features may be specified in any order.

*Canonical form*
  The canonical form is used for all generated output for the convenience of
  tools that consume the information. It is also used for internal passing of
  information between tools. Only the primary processor name is used, never an
  alternative one, and the target features are specified in alphabetic order.
  Command line tools convert non-canonical form to canonical form.

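The conversion to canonical form amounts to replacing an alternative processor
name with the primary one and sorting the target features by name. A minimal
Python sketch, assuming a caller-supplied alias table (the ``aliases``
parameter and the ``altname`` entry in the usage note are made up for
illustration; the real names are target specific):

.. code-block:: python

  def canonicalize_target_id(target_id, aliases=None):
      """Return the canonical form of a target ID.

      aliases maps alternative processor names to primary names.
      """
      aliases = aliases or {}
      processor, *settings = target_id.split(":")
      processor = aliases.get(processor, processor)
      names = [setting[:-1] for setting in settings]
      if len(set(names)) != len(names):
          raise ValueError(f"target feature repeated in {target_id!r}")
      # Sort by feature name, ignoring the trailing "+" or "-".
      ordered = sorted(settings, key=lambda setting: setting[:-1])
      return ":".join([processor] + ordered)

For instance, ``canonicalize_target_id("gfx908:xnack+:sramecc-")`` yields
``gfx908:sramecc-:xnack+``, and with ``aliases={"altname": "gfx908"}`` the
input ``altname:xnack-`` becomes ``gfx908:xnack-``.
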
Target Specific Information
===========================

Target specific information is available for the following:

*AMD GPU*
  AMD GPU supports target ID and target features. See `User Guide for AMDGPU Backend
  <https://llvm.org/docs/AMDGPUUsage.html>`_ which defines the supported `processors
  <https://llvm.org/docs/AMDGPUUsage.html#amdgpu-processors>`_ and `target
  features <https://llvm.org/docs/AMDGPUUsage.html#amdgpu-target-features>`_.

Most other targets do not support target IDs.