1 .. _transformation-metadata:
3 ============================
4 Code Transformation Metadata
5 ============================
13 LLVM transformation passes can be controlled by attaching metadata to
14 the code to transform. By default, transformation passes use heuristics
15 to determine whether or not to perform transformations, and when doing
16 so, other details of how the transformations are applied (e.g., which
17 vectorization factor to select).
18 Unless the optimizer is otherwise directed, transformations are applied
19 conservatively. This conservatism generally allows the optimizer to
20 avoid unprofitable transformations, but in practice, this results in the
21 optimizer not applying transformations that would be highly profitable.
23 Frontends can give additional hints to LLVM passes on which
24 transformations they should apply. This can be additional knowledge that
25 cannot be derived from the emitted IR, or directives passed from the
26 user/programmer. OpenMP pragmas are an example of the latter.
28 If any such metadata is dropped from the program, the code's semantics
34 Attributes can be attached to loops as described in :ref:`llvm.loop`.
35 Attributes can describe properties of the loop, disable transformations,
36 force specific transformations and set transformation options.
38 Because metadata nodes are immutable (with the exception of
39 ``MDNode::replaceOperandWith`` which is dangerous to use on uniqued
40 metadata), in order to add or remove a loop attributes, a new ``MDNode``
41 must be created and assigned as the new ``llvm.loop`` metadata. Any
42 connection between the old ``MDNode`` and the loop is lost. The
43 ``llvm.loop`` node is also used as LoopID (``Loop::getLoopID()``), i.e.
44 the loop effectively gets a new identifier. For instance,
45 ``llvm.mem.parallel_loop_access`` references the LoopID. Therefore, if
46 the parallel access property is to be preserved after adding/removing
47 loop attributes, any ``llvm.mem.parallel_loop_access`` reference must be
48 updated to the new LoopID.
50 Transformation Metadata Structure
51 =================================
53 Some attributes describe code transformations (unrolling, vectorizing,
54 loop distribution, etc.). They can either be a hint to the optimizer
55 that a transformation might be beneficial, instruction to use a specific
56 option, , or convey a specific request from the user (such as
57 ``#pragma clang loop`` or ``#pragma omp simd``).
59 If a transformation is forced but cannot be carried-out for any reason,
60 an optimization-missed warning must be emitted. Semantic information
61 such as a transformation being safe (e.g.
62 ``llvm.mem.parallel_loop_access``) can be unused by the optimizer
63 without generating a warning.
65 Unless explicitly disabled, any optimization pass may heuristically
66 determine whether a transformation is beneficial and apply it. If
67 metadata for another transformation was specified, applying a different
68 transformation before it might be inadvertent due to being applied on a
69 different loop or the loop not existing anymore. To avoid having to
70 explicitly disable an unknown number of passes, the attribute
71 ``llvm.loop.disable_nonforced`` disables all optional, high-level,
72 restructuring transformations.
74 The following example avoids the loop being altered before being
75 vectorized, for instance being unrolled.
79 br i1 %exitcond, label %for.exit, label %for.header, !llvm.loop !0
81 !0 = distinct !{!0, !1, !2}
82 !1 = !{!"llvm.loop.vectorize.enable", i1 true}
83 !2 = !{!"llvm.loop.disable_nonforced"}
85 After a transformation is applied, follow-up attributes are set on the
86 transformed and/or new loop(s). This allows additional attributes
87 including followup-transformations to be specified. Specifying multiple
88 transformations in the same metadata node is possible for compatibility
89 reasons, but their execution order is undefined. For instance, when
90 ``llvm.loop.vectorize.enable`` and ``llvm.loop.unroll.enable`` are
91 specified at the same time, unrolling may occur either before or after
94 As an example, the following instructs a loop to be vectorized and only
99 !0 = distinct !{!0, !1, !2, !3}
100 !1 = !{!"llvm.loop.vectorize.enable", i1 true}
101 !2 = !{!"llvm.loop.disable_nonforced"}
102 !3 = !{!"llvm.loop.vectorize.followup_vectorized", !{"llvm.loop.unroll.enable"}}
104 If, and only if, no followup is specified, the pass may add attributes itself.
105 For instance, the vectorizer adds a ``llvm.loop.isvectorized`` attribute and
106 all attributes from the original loop excluding its loop vectorizer
107 attributes. To avoid this, an empty followup attribute can be used, e.g.
111 !3 = !{!"llvm.loop.vectorize.followup_vectorized"}
113 The followup attributes of a transformation that cannot be applied will
114 never be added to a loop and are therefore effectively ignored. This means
115 that any followup-transformation in such attributes requires that its
116 prior transformations are applied before the followup-transformation.
117 The user should receive a warning about the first transformation in the
118 transformation chain that could not be applied if it a forced
119 transformation. All following transformations are skipped.
121 Pass-Specific Transformation Metadata
122 =====================================
124 Transformation options are specific to each transformation. In the
125 following, we present the model for each LLVM loop optimization pass and
126 the metadata to influence them.
128 Loop Vectorization and Interleaving
129 -----------------------------------
131 Loop vectorization and interleaving is interpreted as a single
132 transformation. It is interpreted as forced if
133 ``!{"llvm.loop.vectorize.enable", i1 true}`` is set.
135 Assuming the pre-vectorization loop is
139 for (int i = 0; i < n; i+=1) // original loop
142 then the code after vectorization will be approximately (assuming an
149 for (; i + 3 < n; i+=4) // vectorized/interleaved loop
152 for (; i < n; i+=1) // epilogue loop
155 where ``rtc`` is a generated runtime check.
157 ``llvm.loop.vectorize.followup_vectorized`` will set the attributes for
158 the vectorized loop. If not specified, ``llvm.loop.isvectorized`` is
159 combined with the original loop's attributes to avoid it being
160 vectorized multiple times.
162 ``llvm.loop.vectorize.followup_epilogue`` will set the attributes for
163 the remainder loop. If not specified, it will have the original loop's
164 attributes combined with ``llvm.loop.isvectorized`` and
165 ``llvm.loop.unroll.runtime.disable`` (unless the original loop already
166 has unroll metadata).
168 The attributes specified by ``llvm.loop.vectorize.followup_all`` are
171 When using a follow-up attribute, it replaces any automatically deduced
172 attributes for the generated loop in question. Therefore it is
173 recommended to add ``llvm.loop.isvectorized`` to
174 ``llvm.loop.vectorize.followup_all`` which avoids that the loop
175 vectorizer tries to optimize the loops again.
180 Unrolling is interpreted as forced any ``!{!"llvm.loop.unroll.enable"}``
181 metadata or option (``llvm.loop.unroll.count``, ``llvm.loop.unroll.full``)
182 is present. Unrolling can be full unrolling, partial unrolling of a loop
183 with constant trip count or runtime unrolling of a loop with a trip
184 count unknown at compile-time.
186 If the loop has been unrolled fully, there is no followup-loop. For
187 partial/runtime unrolling, the original loop of
191 for (int i = 0; i < n; i+=1) // original loop
194 is transformed into (using an unroll factor of 4):
199 for (; i + 3 < n; i+=4) // unrolled loop
205 for (; i < n; i+=1) // remainder loop
208 ``llvm.loop.unroll.followup_unrolled`` will set the loop attributes of
209 the unrolled loop. If not specified, the attributes of the original loop
210 without the ``llvm.loop.unroll.*`` attributes are copied and
211 ``llvm.loop.unroll.disable`` added to it.
213 ``llvm.loop.unroll.followup_remainder`` defines the attributes of the
214 remainder loop. If not specified the remainder loop will have no
215 attributes. The remainder loop might not be present due to being fully
216 unrolled in which case this attribute has no effect.
218 Attributes defined in ``llvm.loop.unroll.followup_all`` are added to the
219 unrolled and remainder loops.
221 To avoid that the partially unrolled loop is unrolled again, it is
222 recommended to add ``llvm.loop.unroll.disable`` to
223 ``llvm.loop.unroll.followup_all``. If no follow-up attribute specified
224 for a generated loop, it is added automatically.
229 Unroll-and-jam uses the following transformation model (here with an
230 unroll factor if 2). Currently, it does not support a fallback version
231 when the transformation is unsafe.
235 for (int i = 0; i < n; i+=1) { // original outer loop
237 for (int j = 0; j < m; j+=1) // original inner loop
245 for (; i + 1 < n; i+=2) { // unrolled outer loop
248 for (int j = 0; j < m; j+=1) { // unrolled inner loop
255 for (; i < n; i+=1) { // remainder outer loop
257 for (int j = 0; j < m; j+=1) // remainder inner loop
262 ``llvm.loop.unroll_and_jam.followup_outer`` will set the loop attributes
263 of the unrolled outer loop. If not specified, the attributes of the
264 original outer loop without the ``llvm.loop.unroll.*`` attributes are
265 copied and ``llvm.loop.unroll.disable`` added to it.
267 ``llvm.loop.unroll_and_jam.followup_inner`` will set the loop attributes
268 of the unrolled inner loop. If not specified, the attributes of the
269 original inner loop are used unchanged.
271 ``llvm.loop.unroll_and_jam.followup_remainder_outer`` sets the loop
272 attributes of the outer remainder loop. If not specified it will not
273 have any attributes. The remainder loop might not be present due to
274 being fully unrolled.
276 ``llvm.loop.unroll_and_jam.followup_remainder_inner`` sets the loop
277 attributes of the inner remainder loop. If not specified it will have
278 the attributes of the original inner loop. It the outer remainder loop
279 is unrolled, the inner remainder loop might be present multiple times.
281 Attributes defined in ``llvm.loop.unroll_and_jam.followup_all`` are
282 added to all of the aforementioned output loops.
284 To avoid that the unrolled loop is unrolled again, it is
285 recommended to add ``llvm.loop.unroll.disable`` to
286 ``llvm.loop.unroll_and_jam.followup_all``. It suppresses unroll-and-jam
287 as well as an additional inner loop unrolling. If no follow-up
288 attribute specified for a generated loop, it is added automatically.
293 The LoopDistribution pass tries to separate vectorizable parts of a loop
294 from the non-vectorizable part (which otherwise would make the entire
295 loop non-vectorizable). Conceptually, it transforms a loop such as
299 for (int i = 1; i < n; i+=1) { // original loop
305 into the following code:
310 for (int i = 1; i < n; i+=1) // coincident loop
312 for (int i = 1; i < n; i+=1) // coincident loop
314 for (int i = 1; i < n; i+=1) // sequential loop
317 for (int i = 1; i < n; i+=1) { // fallback loop
324 where ``rtc`` is a generated runtime check.
326 ``llvm.loop.distribute.followup_coincident`` sets the loop attributes of
327 all loops without loop-carried dependencies (i.e. vectorizable loops).
328 There might be more than one such loops. If not defined, the loops will
329 inherit the original loop's attributes.
331 ``llvm.loop.distribute.followup_sequential`` sets the loop attributes of the
332 loop with potentially unsafe dependencies. There should be at most one
333 such loop. If not defined, the loop will inherit the original loop's
336 ``llvm.loop.distribute.followup_fallback`` defines the loop attributes
337 for the fallback loop, which is a copy of the original loop for when
338 loop versioning is required. If undefined, the fallback loop inherits
339 all attributes from the original loop.
341 Attributes defined in ``llvm.loop.distribute.followup_all`` are added to
342 all of the aforementioned output loops.
344 It is recommended to add ``llvm.loop.disable_nonforced`` to
345 ``llvm.loop.distribute.followup_fallback``. This avoids that the
346 fallback version (which is likely never executed) is further optimzed
347 which would increase the code size.
352 The pass hoists code out of loops that are only loop-invariant when
353 dynamic conditions apply. For instance, it transforms the loop
357 for (int i = 0; i < n; i+=1) // original loop
366 for (int i = 0; i < n; i+=1) // versioned loop
369 for (int i = 0; i < n; i+=1) // unversioned loop
373 The runtime condition (``rtc``) checks that the array ``A`` and the
374 element `B[0]` do not alias.
376 Currently, this transformation does not support followup-attributes.
381 Currently, the ``LoopInterchange`` pass does not use any metadata.
383 Ambiguous Transformation Order
384 ==============================
386 If there multiple transformations defined, the order in which they are
387 executed depends on the order in LLVM's pass pipeline, which is subject
388 to change. The default optimization pipeline (anything higher than
389 ``-O0``) has the following order.
391 When using the legacy pass manager:
393 - LoopInterchange (if enabled)
394 - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
395 - VersioningLICM (if enabled)
398 - LoopUnrollAndJam (if enabled)
399 - LoopUnroll (partial and runtime unrolling)
401 When using the legacy pass manager with LTO:
403 - LoopInterchange (if enabled)
404 - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
406 - LoopUnroll (partial and runtime unrolling)
408 When using the new pass manager:
410 - SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
413 - LoopUnrollAndJam (if enabled)
414 - LoopUnroll (partial and runtime unrolling)
416 Leftover Transformations
417 ========================
419 Forced transformations that have not been applied after the last
420 transformation pass should be reported to the user. The transformation
421 passes themselves cannot be responsible for this reporting because they
422 might not be in the pipeline, there might be multiple passes able to
423 apply a transformation (e.g. ``LoopInterchange`` and Polly) or a
424 transformation attribute may be 'hidden' inside another passes' followup
427 The pass ``-transform-warning`` (``WarnMissedTransformationsPass``)
428 emits such warnings. It should be placed after the last transformation
431 The current pass pipeline has a fixed order in which transformations
432 passes are executed. A transformation can be in the followup of a pass
433 that is executed later and thus leftover. For instance, a loop nest
434 cannot be distributed and then interchanged with the current pass
435 pipeline. The loop distribution will execute, but there is no loop
436 interchange pass following such that any loop interchange metadata will
437 be ignored. The ``-transform-warning`` should emit a warning in this
440 Future versions of LLVM may fix this by executing transformations using