==============================
LLVM Language Reference Manual
==============================

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the instruction
``%x = add i32 1, %x`` is syntactically okay, but not well formed,
because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
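
The identifier formats in the list above can be sketched with a few global
definitions. This is only an illustration: the names are hypothetical and the
initializers are arbitrary.

.. code-block:: llvm

    ; Named global, matching [%@][-a-zA-Z$._][-a-zA-Z$._0-9]*
    @DivisionByZero = global i32 0

    ; Quoted name, needed here because the name contains spaces
    @"name with spaces" = global i32 1

    ; \22 is the hexadecimal ASCII code for a double quote
    @"say \22hi\22" = global i32 2

    ; Unnamed global: the prefix followed by an unsigned number
    @0 = global i32 3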

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.
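
The numbering rule for unnamed temporaries is easier to see in complete
functions. The sketch below uses two hypothetical functions, ``@times8`` and
``@twice``, and relies only on the rule stated in the list above: unlabeled
basic blocks and unnamed parameters consume numbers from the same counter as
unnamed instruction results.

.. code-block:: llvm

    define i32 @times8(i32 %X) {
      ; The only parameter is named, so the unlabeled entry block takes
      ; number 0 and the first unnamed result is %1.
      %1 = add i32 %X, %X
      %2 = add i32 %1, %1
      %3 = add i32 %2, %2
      ret i32 %3
    }

    define i32 @twice(i32) {
      ; The unnamed parameter is %0 and the unlabeled entry block is %1,
      ; so the first unnamed result is %2.
      %2 = add i32 %0, %0
      ret i32 %2
    }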

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(ptr nocapture) nounwind

    ; Definition of main function
    define i32 @main() {
      ; Call puts function to write out the string to stdout.
      call i32 @puts(ptr @.str)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if it is not linked, the symbol becomes
    null instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule", or "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.
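
As a rough illustration of how these linkage keywords are written, the module
below attaches several of them to globals and functions. All names are
hypothetical, and the comments only restate the descriptions above; they do
not describe any particular target's object-file output.

.. code-block:: llvm

    @banner = private constant [3 x i8] c"ok\00"  ; not in any symbol table
    @counter = internal global i32 0              ; like a C 'static'
    @tentative = common global i32 0              ; like C 'int tentative;'
    @optional = extern_weak global i32            ; null if never defined
    @elsewhere = external global i32              ; plain declaration

    ; May be merged with equivalent definitions and discarded if unused.
    define linkonce_odr i32 @inline_helper() {
      ret i32 1
    }

    ; May be overridden by a stronger definition, but is never discarded.
    define weak i32 @overridable() {
      ret i32 0
    }

    ; No linkage keyword: the function is externally visible.
    define void @visible() {
      ret void
    }

    define internal void @init() {
      ret void
    }

    ; Appending linkage on a variable that LLVM interprets specially.
    @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }]
        [{ i32, ptr, ptr } { i32 65535, ptr @init, ptr null }]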

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#tail-call-optimization>`_ This calling
    convention does not support varargs and requires the prototype of all
    callees to exactly match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    - On *X86-32* only supports up to 4 bit type parameters. No
      floating-point types are supported.
    - On *X86-64* only supports up to 10 bit type parameters and 6
      floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11 and return registers, if any. R11 can be used as a scratch register.
      Floating-point registers (XMMs/YMMs) are not preserved and need to be
      saved by the caller.

    - On AArch64 the callee preserves all general purpose registers, except
      X0-X8 and X16-X18.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the ObjectiveC
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the ObjectiveC runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    - On AArch64 the callee preserves all general purpose registers, except
      X0-X8 and X16-X18. Furthermore it also preserves the lower 128 bits of
      the V8-V31 SIMD and floating-point registers.

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the ObjectiveC runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time the function is called. The entry and
    exit blocks can access a few TLS IR variables; each access will be
    lowered to a platform-specific sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#tail-call-optimization>`_
    This calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
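
The sketch below shows where a calling convention keyword appears on
declarations, definitions, and call sites. The function names are
hypothetical. It only illustrates the syntax and the matching requirement
stated at the start of this section; whether a given tail call is actually
optimized depends on the target.

.. code-block:: llvm

    declare fastcc i32 @fast_helper(i32)
    declare coldcc void @report_error(i32)

    define i32 @caller(i32 %x) {
    entry:
      ; The convention on the call site must match the callee's convention.
      %r = call fastcc i32 @fast_helper(i32 %x)
      %bad = icmp slt i32 %r, 0
      br i1 %bad, label %fail, label %done

    fail:
      ; A rarely executed call, marked with the cold convention.
      call coldcc void @report_error(i32 %r)
      br label %done

    done:
      ret i32 %r
    }

    define tailcc i32 @count_down(i32 %n) {
    entry:
      %is_zero = icmp eq i32 %n, 0
      br i1 %is_zero, label %exit, label %recurse

    recurse:
      %next = sub i32 %n, 1
      ; A call in tail position between two tailcc functions.
      %r = tail call tailcc i32 @count_down(i32 %next)
      ret i32 %r

    exit:
      ret i32 0
    }

    ; A convention given by number; 10 is the GHC convention described above.
    declare cc 10 void @ghc_style(i64)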

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. On XCOFF, default visibility means no explicit
    visibility bit will be set and whether the symbol is visible
    (i.e. "exported") to other modules depends primarily on export lists
    provided to the linker. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
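
A minimal sketch of the visibility keywords, using hypothetical names. The
visibility style follows the linkage type (if any) in a definition or
declaration.

.. code-block:: llvm

    @api_version = default global i32 1     ; visible to other modules
    @impl_detail = hidden global i32 0      ; kept out of the dynamic symbol table

    ; References from inside the defining module bind to this definition.
    define protected i32 @get_version() {
      %v = load i32, ptr @api_version
      ret i32 %v
    }

    declare hidden void @helper()

    ; Internal linkage: the visibility must remain default.
    @file_local = internal global i32 0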

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    On Microsoft Windows targets, "``dllexport``" causes the compiler to provide
    a global pointer to a pointer in a DLL, so that it can be referenced with the
    ``dllimport`` attribute. The pointer name is formed by combining ``__imp_``
    and the function or variable name. On XCOFF targets, ``dllexport`` indicates
    that the symbol will be made visible to other modules using "exported"
    visibility and thus placed by the linker in the loader section symbol table.
    Since this storage class exists for defining a dll interface, the compiler,
    assembler and linker know it is externally referenced and must refrain from
    removing it.

A symbol with ``internal`` or ``private`` linkage cannot have a DLL storage
class.
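
The following sketch shows both storage classes on hypothetical symbols. It
only illustrates where the keywords go; the ``__imp_`` pointers mentioned
above are produced when the module is compiled for a Windows target and do
not appear in the IR.

.. code-block:: llvm

    ; Symbols provided by another DLL.
    @imported_var = external dllimport global i32
    declare dllimport void @imported_fn()

    ; Symbols this module makes available to other modules.
    @exported_var = dllexport global i32 0

    define dllexport void @exported_fn() {
      call void @imported_fn()
      ret void
    }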

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models may
be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC compatible emulated TLS
code.
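
As a small sketch with hypothetical variable names, each TLS model is written
as an optional argument to ``thread_local``:

.. code-block:: llvm

    @per_thread = thread_local global i32 0                  ; general dynamic
    @lib_state = thread_local(localdynamic) global i32 0
    @startup_state = thread_local(initialexec) global i32 0
    @exe_state = thread_local(localexec) global i32 0

    ; Each thread reads its own copy of the variable.
    define i32 @read_exe_state() {
      %v = load i32, ptr @exe_state
      ret i32 %v
    }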

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.
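
A short sketch of both specifiers on hypothetical symbols. The specifier is
written after the linkage type, if one is present.

.. code-block:: llvm

    @interposable = dso_preemptable global i32 0
    @same_unit = dso_local global i32 0

    ; Assumed to resolve within this linkage unit, even though the
    ; definition is in another compilation unit.
    declare dso_local i32 @defined_in_this_dso()

    ; No specifier given: treated as dso_preemptable.
    declare i32 @maybe_preempted()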

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.
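
For comparison, the sketch below places an identified type, an opaque forward
declaration, and a literal type side by side. The type and variable names are
hypothetical, and the example is written with the ``ptr`` type used by the
"hello world" module earlier in this document.

.. code-block:: llvm

    %node = type { ptr, i32 }       ; identified structure type
    %handle = type opaque           ; opaque type, body not yet available

    ; A literal structure type is spelled out wherever it is used.
    @origin = global { i32, i32 } { i32 0, i32 0 }

    ; An identified type is referred to by name.
    @head = global %node { ptr null, i32 0 }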

Non-Integral Pointer Type
-------------------------

Note: non-integral pointer types are a work in progress, and they should be
considered experimental at this time.

LLVM IR optionally allows the frontend to denote pointers in certain address
spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
Non-integral pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target dependent or
unstable (not backed by a fixed integer).

``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
integral (i.e. normal) pointers in that they convert integers to and from
corresponding pointer types, but there are additional implications to be
aware of. Because the bit-representation of a non-integral pointer may
not be stable, two identical casts of the same operand may or may not
return the same value. Said differently, the conversion to or from the
non-integral type depends on environmental state in an implementation
defined manner.

If the frontend wishes to observe a *particular* value following a cast, the
generated IR must fence with the underlying environment in an implementation
defined manner. (In practice, this tends to require ``noinline`` routines for
such operations.)

From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
non-integral types are analogous to ones on integral types with one
key exception: the optimizer may not, in general, insert new dynamic
occurrences of such casts. If a new cast is inserted, the optimizer would
need to either ensure that a) all possible values are valid, or b)
appropriate fencing is inserted. Since the appropriate fencing is
implementation defined, the optimizer can't do the latter. The former is
challenging as many commonly expected properties, such as
``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
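
The property mentioned above can be written out as IR. This is a sketch under
two assumptions: address space 1 is chosen arbitrarily, and it is marked
non-integral with the ``ni`` component of the datalayout string. The function
name is hypothetical.

.. code-block:: llvm

    target datalayout = "ni:1"

    define i64 @difference_of_same_pointer(ptr addrspace(1) %v) {
      ; Two identical casts of the same operand.
      %a = ptrtoint ptr addrspace(1) %v to i64
      %b = ptrtoint ptr addrspace(1) %v to i64
      ; For a non-integral pointer this difference need not be zero.
      %d = sub i64 %a, %b
      ret i64 %d
    }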

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Global variables can optionally specify a :ref:`linkage type <linkage>`.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified. If there
is a mismatch between the explicit or inferred section information for the
variable declaration and its definition the resulting behavior is undefined.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with a ``unnamed_addr`` constant, the result being a constant
whose address is significant.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.
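
The sketch below illustrates the points made so far with hypothetical
globals: a constant definition, a constant whose address is not significant,
a ``local_unnamed_addr`` variable, a declaration without an initializer, and
a function that reaches a global through the pointer it defines.

.. code-block:: llvm

    @primes = constant [3 x i32] [i32 2, i32 3, i32 5]
    @greeting = private unnamed_addr constant [3 x i8] c"hi\00"
    @module_flag = local_unnamed_addr global i32 0

    ; Declaration of a global defined in another translation unit.
    @limit = external constant i32

    define i32 @first_prime() {
      ; @primes is a pointer to the array, so it is accessed with a load.
      %p = getelementptr [3 x i32], ptr @primes, i64 0, i64 0
      %v = load i32, ptr %p
      ret i32 %v
    }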
692 A global variable may be declared to reside in a target-specific
693 numbered address space. For targets that support them, address spaces
694 may affect how optimizations are performed and/or what target
695 instructions are used to access the variable. The default address space
696 is zero. The address space qualifier must precede any other attributes.
698 LLVM allows an explicit section to be specified for globals. If the
699 target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the necessary
support.
703 External declarations may have an explicit section specified. Section
704 information is retained in LLVM IR for targets that make use of this
705 information. Attaching section information to an external declaration is an
706 assertion that its definition is located in the specified section. If the
707 definition is located in a different section, the behavior is undefined.
709 By default, global initializers are optimized by assuming that global
710 variables defined within the module are not modified from their
711 initial values before the start of the global initializer. This is
712 true even for variables potentially accessible from outside the
713 module, including those with external linkage or appearing in
714 ``@llvm.used`` or dllexported variables. This assumption may be suppressed
715 by marking the variable with ``externally_initialized``.
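
For illustration, a global that some external agent (for example a loader or
runtime, which is an assumption of this sketch) writes before the program's
own initializers run could be declared as follows. The name ``@slot`` is
hypothetical.

.. code-block:: llvm

    ; Without externally_initialized, optimizers may assume @slot still holds
    ; its initial value 0 when global initialization starts.
    @slot = externally_initialized global i32 0
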
717 An explicit alignment may be specified for a global, which must be a
718 power of 2. If not present, or if the alignment is set to zero, the
719 alignment of the global is set by the target to whatever it feels
720 convenient. If an explicit alignment is specified, the global is forced
721 to have exactly that alignment. Targets and optimizers are not allowed
722 to over-align the global if the global has an assigned section. In this
723 case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
726 iteration. For TLS variables, the module flag ``MaxTLSAlign``, if present,
727 limits the alignment to the given value. Optimizers are not allowed to
728 impose a stronger alignment on these variables. The maximum alignment
731 For global variable declarations, as well as definitions that may be
732 replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
733 linkage types), the allocation size and alignment of the definition it resolves
734 to must be greater than or equal to that of the declaration or replaceable
735 definition, otherwise the behavior is undefined.
737 Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
738 an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
740 an optional list of attached :ref:`metadata <metadata>`.
742 Variables and aliases can have a
743 :ref:`Thread Local Storage Model <tls_model>`.
745 :ref:`Scalable vectors <t_vector>` cannot be global variables or members of
746 arrays because their size is unknown at compile time. They are allowed in
747 structs to facilitate intrinsics returning multiple values. Structs containing
748 scalable vectors cannot be used in loads, stores, allocas, or GEPs.
752 @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
753 [DLLStorageClass] [ThreadLocal]
754 [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
755 [ExternallyInitialized]
756 <global | constant> <Type> [<InitializerConstant>]
757 [, section "name"] [, partition "name"]
758 [, comdat [($name)]] [, align <Alignment>]
759 [, no_sanitize_address] [, no_sanitize_hwaddress]
760 [, sanitize_address_dyninit] [, sanitize_memtag]
763 For example, the following defines a global in a numbered address space
764 with an initializer, section, and alignment:
768 @G = addrspace(5) constant float 1.0, section "foo", align 4
The following example just declares a global variable:
774 @G = external global i32
776 The following example defines a thread-local global with the
777 ``initialexec`` TLS model:
781 @G = thread_local(initialexec) global i32 0, align 4
783 .. _functionstructure:
788 LLVM function definitions consist of the "``define``" keyword, an
789 optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
790 specifier <runtime_preemption_model>`, an optional :ref:`visibility
791 style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
792 an optional :ref:`calling convention <callingconv>`,
793 an optional ``unnamed_addr`` attribute, a return type, an optional
794 :ref:`parameter attribute <paramattrs>` for the return type, a function
795 name, a (possibly empty) argument list (each with optional :ref:`parameter
796 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
797 an optional address space, an optional section, an optional partition,
798 an optional alignment, an optional :ref:`comdat <langref_comdats>`,
799 an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
800 an optional :ref:`prologue <prologuedata>`,
801 an optional :ref:`personality <personalityfn>`,
802 an optional list of attached :ref:`metadata <metadata>`,
803 an opening curly brace, a list of basic blocks, and a closing curly brace.
807 define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
809 <ResultType> @<FunctionName> ([argument list])
810 [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
811 [section "name"] [partition "name"] [comdat [($name)]] [align N]
812 [gc] [prefix Constant] [prologue Constant] [personality Constant]
815 The argument list is a comma separated sequence of arguments where each
816 argument is of the following form:
820 <type> [parameter Attrs] [name]
822 LLVM function declarations consist of the "``declare``" keyword, an
823 optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
824 <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
825 optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
826 or ``local_unnamed_addr`` attribute, an optional address space, a return type,
827 an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
828 empty list of arguments, an optional alignment, an optional :ref:`garbage
829 collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
830 :ref:`prologue <prologuedata>`.
834 declare [linkage] [visibility] [DLLStorageClass]
836 <ResultType> @<FunctionName> ([argument list])
837 [(unnamed_addr|local_unnamed_addr)] [align N] [gc]
838 [prefix Constant] [prologue Constant]
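
The two forms can be sketched as follows. The function names, the section
name ``.text.hot`` and the chosen attributes are assumptions made for this
example only.

.. code-block:: llvm

    ; Declaration: no body; parameter attributes follow the parameter type.
    declare i32 @puts(ptr nocapture)

    ; Definition: preemption specifier, return type, name and argument list,
    ; followed by unnamed_addr, a function attribute, a section and an alignment.
    define dso_local i32 @add(i32 %a, i32 %b) unnamed_addr nounwind section ".text.hot" align 16 {
    entry:
      %sum = add i32 %a, %b
      ret i32 %sum
    }
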
840 A function definition contains a list of basic blocks, forming the CFG (Control
841 Flow Graph) for the function. Each basic block may optionally start with a label
842 (giving the basic block a symbol table entry), contains a list of instructions,
843 and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
844 function return). If an explicit label name is not provided, a block is assigned
845 an implicit numbered label, using the next value from the same counter as used
846 for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
847 function entry block does not have an explicit label, it will be assigned label
848 "%0", then the first unnamed temporary in that block will be "%1", etc. If a
849 numeric label is explicitly specified, it must match the numeric label that
850 would be used implicitly.
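
The implicit numbering can be seen in the following sketch (the function name
is hypothetical). The arguments are named, so they do not consume numbers;
the unlabeled entry block takes ``%0``, the first unnamed temporary is ``%1``,
and the following blocks must be numbered ``2`` and ``3``.

.. code-block:: llvm

    define i32 @max(i32 %a, i32 %b) {
      ; The entry block has no explicit label, so it is implicitly %0.
      %1 = icmp sgt i32 %a, %b
      br i1 %1, label %2, label %3

    2:
      ret i32 %a

    3:
      ret i32 %b
    }
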
852 The first basic block in a function is special in two ways: it is
853 immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
855 the entry block of a function). Because the block can have no
856 predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
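
One practical consequence is that a loop cannot use the entry block as its
header. A sketch of the usual pattern, with hypothetical names, branches from
the entry block to a separate header block that holds the PHI node:

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      ; No PHI nodes here: the entry block has no predecessors.
      br label %loop

    loop:
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp eq i32 %next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %next
    }
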
858 LLVM allows an explicit section to be specified for functions. If the
859 target supports it, it will emit functions to the section specified.
860 Additionally, the function can be placed in a COMDAT.
862 An explicit alignment may be specified for a function. If not present,
863 or if the alignment is set to zero, the alignment of the function is set
864 by the target to whatever it feels convenient. If an explicit alignment
865 is specified, the function is forced to have at least that much
866 alignment. All alignments must be a power of 2.
868 If the ``unnamed_addr`` attribute is given, the address is known to not
869 be significant and two identical functions can be merged.
871 If the ``local_unnamed_addr`` attribute is given, the address is known to
872 not be significant within the module.
874 If an explicit address space is not given, it will default to the program
875 address space from the :ref:`datalayout string<langref_datalayout>`.
Aliases, unlike functions or variables, don't create any new data. They
883 are just a new symbol and metadata for an existing position.
Aliases have a name and an aliasee that is either a global value or a
constant expression.
888 Aliases may have an optional :ref:`linkage type <linkage>`, an optional
889 :ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
890 :ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
891 <dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
895 @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
898 The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
899 ``linkonce_odr``, ``weak_odr``, ``external``, ``available_externally``. Note
900 that some system linkers might not correctly handle dropping a weak symbol that
903 Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
904 the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
907 If the ``local_unnamed_addr`` attribute is given, the address is known to
908 not be significant within the module.
910 Since aliases are only a second name, some restrictions apply, of which
911 some can only be checked when producing an object file:
913 * The expression defining the aliasee must be computable at assembly
914 time. Since it is just a name, no relocations can be used.
916 * No alias in the expression can be weak as the possibility of the
917 intermediate alias being overridden cannot be represented in an
920 * If the alias has the ``available_externally`` linkage, the aliasee must be an
921 ``available_externally`` global value; otherwise the aliasee can be an
922 expression but no global value in the expression can be a declaration, since
923 that would require a relocation, which is not possible.
925 * If either the alias or the aliasee may be replaced by a symbol outside the
926 module at link time or runtime, any optimization cannot replace the alias with
927 the aliasee, since the behavior may be different. The alias may be used as a
928 name guaranteed to point to the content in the current module.
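
A minimal sketch of aliases to a variable and to a function follows. It
assumes the opaque-pointer spelling ``ptr`` used elsewhere in this document,
and all names are hypothetical.

.. code-block:: llvm

    @data = global i32 42

    define void @impl() {
      ret void
    }

    ; A second name for @data. It is not unnamed_addr, so it has the same
    ; address as the aliasee.
    @data_alias = alias i32, ptr @data

    ; A weak alias for a function defined in this module.
    @entry_point = weak alias void (), ptr @impl
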
IFuncs, like aliases, don't create any new data or functions. They are just a new
symbol that the dynamic linker resolves at runtime by calling a resolver function.

IFuncs have a name and a resolver, which is a function called by the dynamic linker
that returns the address of another function associated with the name.

An IFunc may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.
946 @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
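
As a sketch, an IFunc with a trivial resolver might look like the following.
It assumes the opaque-pointer spelling ``ptr``; the names are hypothetical and
a real resolver would typically choose between several implementations.

.. code-block:: llvm

    define i32 @foo_generic(i32 %x) {
      ret i32 %x
    }

    ; Resolver: called by the dynamic linker, returns the implementation to use.
    define internal ptr @resolve_foo() {
      ret ptr @foo_generic
    }

    @foo = ifunc i32 (i32), ptr @resolve_foo
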
955 Comdat IR provides access to object file COMDAT/section group functionality
956 which represents interrelated sections.
958 Comdats have a name which represents the COMDAT key and a selection kind to
959 provide input on how the linker deduplicates comdats with the same key in two
960 different object files. A comdat must be included or omitted as a unit.
961 Discarding the whole comdat is allowed but discarding a subset is not.
963 A global object may be a member of at most one comdat. Aliases are placed in the
964 same COMDAT that their aliasee computes to, if any.
968 $<Name> = comdat SelectionKind
970 For selection kinds other than ``nodeduplicate``, only one of the duplicate
971 comdats may be retained by the linker and the members of the remaining comdats
972 must be discarded. The following selection kinds are supported:
``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``nodeduplicate``
    No deduplication is performed.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

987 - XCOFF and Mach-O don't support COMDATs.
988 - COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
989 a non-local linkage COMDAT symbol.
990 - ELF supports ``any`` and ``nodeduplicate``.
991 - WebAssembly only supports ``any``.
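
Since ``any`` is the selection kind supported by every object format listed
above that has COMDATs, a portable sketch could look like this (the name
``get_seven`` is hypothetical):

.. code-block:: llvm

    $get_seven = comdat any

    ; If several object files define this comdat, the linker keeps one copy.
    define linkonce_odr i32 @get_seven() comdat($get_seven) {
      ret i32 7
    }
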
993 Here is an example of a COFF COMDAT where a function will only be selected if
994 the COMDAT key's section is the largest:
998 $foo = comdat largest
999 @foo = global i32 2, comdat($foo)
1001 define void @bar() comdat($foo) {
1005 In a COFF object file, this will create a COMDAT section with selection kind
1006 ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
1007 and another COMDAT section with selection kind
1008 ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
1009 section and contains the contents of the ``@bar`` symbol.
As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:
1014 .. code-block:: llvm
1017 @foo = global i32 2, comdat
1018 @bar = global i32 3, comdat($foo)
1020 There are some restrictions on the properties of the global object.
1021 It, or an alias to it, must have the same name as the COMDAT group when
1023 The contents and size of this object may be used during link-time to determine
1024 which COMDAT groups get selected depending on the selection kind.
1025 Because the name of the object must match the name of the COMDAT group, the
1026 linkage of the global object must not be local; local symbols can get renamed
1027 if a collision occurs in the symbol table.
The combined use of COMDATs and section attributes may yield surprising results.
1032 .. code-block:: llvm
1036 @g1 = global i32 42, section "sec", comdat($foo)
1037 @g2 = global i32 42, section "sec", comdat($bar)
1039 From the object file perspective, this requires the creation of two sections
1040 with the same name. This is necessary because both globals belong to different
1041 COMDAT groups and COMDATs, at the object file level, are represented by
1044 Note that certain IR constructs like global variables and functions may
1045 create COMDATs in the object file in addition to any which are specified using
1046 COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).
1050 .. _namedmetadatastructure:
1055 Named metadata is a collection of metadata. :ref:`Metadata
1056 nodes <metadata>` (but not metadata strings) are the only valid
1057 operands for a named metadata.
1059 #. Named metadata are represented as a string of characters with the
1060 metadata prefix. The rules for metadata names are the same as for
1061 identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1062 are still valid, which allows any character to be part of a name.
1066 ; Some unnamed metadata nodes, which are referenced by the named metadata.
1071 !name = !{!0, !1, !2}
1075 Parameter Attributes
1076 --------------------
1078 The return type and each parameter of a function type may have a set of
1079 *parameter attributes* associated with them. Parameter attributes are
1080 used to communicate additional information about the result or
1081 parameters of a function. Parameter attributes are considered to be part
1082 of the function, not of the function type, so functions with different
1083 parameter attributes can have the same function type.
1085 Parameter attributes are simple keywords that follow the type specified.
1086 If multiple parameter attributes are needed, they are space separated.
1089 .. code-block:: llvm
1091 declare i32 @printf(ptr noalias nocapture, ...)
1092 declare i32 @atoi(i8 zeroext)
1093 declare signext i8 @returns_signed_char()
Note that function attributes (``nounwind``, ``readonly``) come immediately
after the argument list, whereas attributes for the function result (such as
``signext`` above) come before the return type.
1098 Currently, only the following parameter attributes are defined:
``zeroext``
This indicates to the code generator that the parameter or return
1102 value should be zero-extended to the extent required by the target's
1103 ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
This indicates to the code generator that the parameter or return
value should be sign-extended to the extent required by the target's
ABI (which is usually 32 bits) by the caller (for a parameter) or
the callee (for a return value).
``inreg``
This indicates that this parameter or return value should be treated
1111 in a special target-dependent fashion while emitting code for
1112 a function call or return (usually, by putting it in a register as
1113 opposed to memory, though some targets use it to distinguish between
two different kinds of registers). Use of this attribute is
target-specific.

``byval(<ty>)``
This indicates that the pointer parameter should really be passed by
1118 value to the function. The attribute implies that a hidden copy of
1119 the pointee is made between the caller and the callee, so the callee
1120 is unable to modify the value in the caller. This attribute is only
1121 valid on LLVM pointer arguments. It is generally used to pass
1122 structs and arrays by value, but is also valid on pointers to
1123 scalars. The copy is considered to belong to the caller not the
1124 callee (for example, ``readonly`` functions should not write to
``byval`` parameters). This is not a valid attribute for return
values.
1128 The byval type argument indicates the in-memory value type, and
1129 must be the same as the pointee type of the argument.
1131 The byval attribute also supports specifying an alignment with the
1132 align attribute. It indicates the alignment of the stack slot to
1133 form and the known alignment of the pointer specified to the call
1134 site. If the alignment is not specified, then the code generator
1135 makes a target-specific assumption.
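
A minimal sketch of a ``byval`` argument with an explicit alignment follows.
The struct and function names are hypothetical.

.. code-block:: llvm

    %struct.Pair = type { i32, i32 }

    ; The callee receives its own copy of the pointee in a 4-byte aligned slot.
    declare void @consume(ptr byval(%struct.Pair) align 4)

    define void @caller(ptr %p) {
      ; The attribute and its type are repeated at the call site.
      call void @consume(ptr byval(%struct.Pair) align 4 %p)
      ret void
    }
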
``byref(<ty>)``
The ``byref`` argument attribute allows specifying the pointee
1142 memory type of an argument. This is similar to ``byval``, but does
1143 not imply a copy is made anywhere, or that the argument is passed
1144 on the stack. This implies the pointer is dereferenceable up to
1145 the storage size of the type.
It is not generally permissible to introduce a write to a
``byref`` pointer. The pointer may have any address space and may
be read only.
1151 This is not a valid attribute for return values.
The alignment for a ``byref`` parameter can be explicitly
1154 specified by combining it with the ``align`` attribute, similar to
1155 ``byval``. If the alignment is not specified, then the code generator
1156 makes a target-specific assumption.
1158 This is intended for representing ABI constraints, and is not
1159 intended to be inferred for optimization use.
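
For illustration, a declaration using ``byref`` together with ``align`` might
look like this (the function name is hypothetical):

.. code-block:: llvm

    ; %0 points to memory of type i64 with 8-byte alignment. No copy is made,
    ; and the pointer is dereferenceable up to the storage size of i64.
    declare void @kernel(ptr byref(i64) align 8)
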
1161 .. _attr_preallocated:
1163 ``preallocated(<ty>)``
1164 This indicates that the pointer parameter should really be passed by
1165 value to the function, and that the pointer parameter's pointee has
1166 already been initialized before the call instruction. This attribute
1167 is only valid on LLVM pointer arguments. The argument must be the value
1168 returned by the appropriate
1169 :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1170 ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1171 calls, although it is ignored during codegen.
1173 A non ``musttail`` function call with a ``preallocated`` attribute in
1174 any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1175 function call cannot have a ``"preallocated"`` operand bundle.
1177 The preallocated attribute requires a type argument, which must be
1178 the same as the pointee type of the argument.
1180 The preallocated attribute also supports specifying an alignment with the
1181 align attribute. It indicates the alignment of the stack slot to
1182 form and the known alignment of the pointer specified to the call
1183 site. If the alignment is not specified, then the code generator
1184 makes a target-specific assumption.
``inalloca(<ty>)``
The ``inalloca`` argument attribute allows the caller to take the
1191 address of outgoing stack arguments. An ``inalloca`` argument must
1192 be a pointer to stack memory produced by an ``alloca`` instruction.
1193 The alloca, or argument allocation, must also be tagged with the
1194 inalloca keyword. Only the last argument may have the ``inalloca``
1195 attribute, and that argument is guaranteed to be passed in memory.
1197 An argument allocation may be used by a call at most once because
1198 the call may deallocate it. The ``inalloca`` attribute cannot be
1199 used in conjunction with other attributes that affect argument
1200 storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1201 ``inalloca`` attribute also disables LLVM's implicit lowering of
1202 large aggregate return values, which means that frontend authors
1203 must lower them with ``sret`` pointers.
1205 When the call site is reached, the argument allocation must have
1206 been the most recent stack allocation that is still live, or the
1207 behavior is undefined. It is possible to allocate additional stack
1208 space after an argument allocation and before its call site, but it
1209 must be cleared off with :ref:`llvm.stackrestore
1210 <int_stackrestore>`.
1212 The inalloca attribute requires a type argument, which must be the
1213 same as the pointee type of the argument.
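
The rules above can be sketched as follows, with hypothetical names. The
argument memory is created by an ``alloca`` tagged ``inalloca``, filled in,
and then passed as the last argument of exactly one call.

.. code-block:: llvm

    %struct.Args = type { i32, i32 }

    declare void @callee(ptr inalloca(%struct.Args))

    define void @caller() {
      ; Argument allocation: the most recent live stack allocation at the call.
      %args = alloca inalloca %struct.Args
      %f0 = getelementptr inbounds %struct.Args, ptr %args, i32 0, i32 0
      store i32 1, ptr %f0
      %f1 = getelementptr inbounds %struct.Args, ptr %args, i32 0, i32 1
      store i32 2, ptr %f1
      call void @callee(ptr inalloca(%struct.Args) %args)
      ret void
    }
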
See :doc:`InAlloca` for more information on how to use this
attribute.

``sret(<ty>)``
This indicates that the pointer parameter specifies the address of a
1220 structure that is the return value of the function in the source
1221 program. This pointer must be guaranteed by the caller to be valid:
1222 loads and stores to the structure may be assumed by the callee not
to trap and to be properly aligned. This is not a valid attribute
for return values.
1226 The sret type argument specifies the in memory type, which must be
1227 the same as the pointee type of the argument.
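
A sketch of a structure returned through an ``sret`` pointer follows; the
type and function names are hypothetical. The caller allocates the storage
and guarantees that the pointer is valid.

.. code-block:: llvm

    %struct.Triple = type { i64, i64, i64 }

    define void @make_triple(ptr sret(%struct.Triple) align 8 %out) {
      ; The callee may assume this store does not trap.
      store i64 1, ptr %out, align 8
      ret void
    }

    define void @caller() {
      %tmp = alloca %struct.Triple, align 8
      call void @make_triple(ptr sret(%struct.Triple) align 8 %tmp)
      ret void
    }
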
1229 .. _attr_elementtype:
1231 ``elementtype(<ty>)``
1233 The ``elementtype`` argument attribute can be used to specify a pointer
1234 element type in a way that is compatible with `opaque pointers
1235 <OpaquePointers.html>`__.
1237 The ``elementtype`` attribute by itself does not carry any specific
1238 semantics. However, certain intrinsics may require this attribute to be
1239 present and assign it particular semantics. This will be documented on
1240 individual intrinsics.
1242 The attribute may only be applied to pointer typed arguments of intrinsic
1243 calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1244 to parameters on function declarations. For non-opaque pointers, the type
1245 passed to ``elementtype`` must match the pointer element type.
1249 ``align <n>`` or ``align(<n>)``
1250 This indicates that the pointer value or vector of pointers has the
1251 specified alignment. If applied to a vector of pointers, *all* pointers
1252 (elements) have the specified alignment. If the pointer value does not have
1253 the specified alignment, :ref:`poison value <poisonvalues>` is returned or
1254 passed instead. The ``align`` attribute should be combined with the
1255 ``noundef`` attribute to ensure a pointer is aligned, or otherwise the
1256 behavior is undefined. Note that ``align 1`` has no effect on non-byval,
1257 non-preallocated arguments.
1259 Note that this attribute has additional semantics when combined with the
1260 ``byval`` or ``preallocated`` attribute, which are documented there.
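
For example (function names hypothetical), ``align`` can be placed on a
parameter or on a return value. Combining it with ``noundef`` turns a
misaligned pointer into undefined behavior instead of a poison value.

.. code-block:: llvm

    ; A misaligned argument is immediate undefined behavior because of noundef.
    declare void @use_vec(ptr noundef align 16)

    ; The returned pointer is 8-byte aligned, or else it is poison.
    declare align 8 ptr @get_buffer()
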
``noalias``
This indicates that memory locations accessed via pointer values
1266 :ref:`based <pointeraliasing>` on the argument or return value are not also
1267 accessed, during the execution of the function, via pointer values not
1268 *based* on the argument or return value. This guarantee only holds for
1269 memory locations that are *modified*, by any means, during the execution of
1270 the function. The attribute on a return value also has additional semantics
1271 described below. The caller shares the responsibility with the callee for
1272 ensuring that these requirements are met. For further details, please see
the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
or No>`.
1276 Note that this definition of ``noalias`` is intentionally similar
1277 to the definition of ``restrict`` in C99 for function arguments.
1279 For function return values, C99's ``restrict`` is not meaningful,
1280 while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1281 attribute on return values are stronger than the semantics of the attribute
1282 when used on function arguments. On function return values, the ``noalias``
1283 attribute indicates that the function acts like a system memory allocation
1284 function, returning a pointer to allocated storage disjoint from the
1285 storage for any other object accessible to the caller.
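
Both uses can be sketched as follows; the function names are hypothetical.

.. code-block:: llvm

    ; Allocator-like function: the result is disjoint from the storage of any
    ; other object accessible to the caller.
    declare noalias ptr @my_alloc(i64)

    ; Memory modified through %dst is not also accessed through %src during
    ; the execution of the function.
    define void @copy_word(ptr noalias %dst, ptr noalias %src) {
      %v = load i32, ptr %src
      store i32 %v, ptr %dst
      ret void
    }
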
``nocapture``
This indicates that the callee does not :ref:`capture <pointercapture>` the
1291 pointer. This is not a valid attribute for return values.
1292 This attribute applies only to the particular copy of the pointer passed in
1293 this argument. A caller could pass two copies of the same pointer with one
1294 being annotated nocapture and the other not, and the callee could validly
1295 capture through the non annotated parameter.
1297 .. code-block:: llvm

    define void @f(ptr nocapture %a, ptr %b) {
      ; (capture %b)
    }

    call void @f(ptr @glb, ptr @glb) ; well-defined

``nofree``
This indicates that the callee does not free the pointer argument. This is not
1307 a valid attribute for return values.
``nest``
This indicates that the pointer parameter can be excised using the
1313 :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1314 attribute for return values and can only be applied to one parameter.
``returned``
This indicates that the function always returns the argument as its return
1318 value. This is a hint to the optimizer and code generator used when
1319 generating the caller, allowing value propagation, tail call optimization,
1320 and omission of register saves and restores in some cases; it is not
1321 checked or enforced when generating the callee. The parameter and the
1322 function return type must be valid operands for the
1323 :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1324 return values and can only be applied to one parameter.
``nonnull``
This indicates that the parameter or return pointer is not null. This
1328 attribute may only be applied to pointer typed parameters. This is not
1329 checked or enforced by LLVM; if the parameter or return pointer is null,
1330 :ref:`poison value <poisonvalues>` is returned or passed instead.
1331 The ``nonnull`` attribute should be combined with the ``noundef`` attribute
1332 to ensure a pointer is not null or otherwise the behavior is undefined.
1334 ``dereferenceable(<n>)``
1335 This indicates that the parameter or return pointer is dereferenceable. This
1336 attribute may only be applied to pointer typed parameters. A pointer that
1337 is dereferenceable can be loaded from speculatively without a risk of
1338 trapping. The number of bytes known to be dereferenceable must be provided
1339 in parentheses. It is legal for the number of bytes to be less than the
1340 size of the pointee type. The ``nonnull`` attribute does not imply
1341 dereferenceability (consider a pointer to one element past the end of an
1342 array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1343 ``addrspace(0)`` (which is the default address space), except if the
1344 ``null_pointer_is_valid`` function attribute is present.
1345 ``n`` should be a positive number. The pointer should be well defined,
1346 otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
1347 implies ``noundef``.
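
A short sketch of ``dereferenceable`` and ``nonnull`` in use (names
hypothetical):

.. code-block:: llvm

    ; %p is known to be dereferenceable for 8 bytes, so the load may be
    ; speculated without risk of trapping.
    define i64 @first_word(ptr dereferenceable(8) %p) {
      %v = load i64, ptr %p
      ret i64 %v
    }

    ; nonnull alone yields poison for a null argument; adding noundef makes
    ; passing null undefined behavior.
    declare void @use(ptr noundef nonnull)
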
1349 ``dereferenceable_or_null(<n>)``
1350 This indicates that the parameter or return value isn't both
1351 non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1352 time. All non-null pointers tagged with
1353 ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1354 For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1355 a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1356 and in other address spaces ``dereferenceable_or_null(<n>)``
1357 implies that a pointer is at least one of ``dereferenceable(<n>)``
1358 or ``null`` (i.e. it may be both ``null`` and
1359 ``dereferenceable(<n>)``). This attribute may only be applied to
1360 pointer typed parameters.
``swiftself``
This indicates that the parameter is the self/context parameter. This is not
a valid attribute for return values and can only be applied to one
parameter.
``swiftasync``
This indicates that the parameter is the asynchronous context parameter and
1371 triggers the creation of a target-specific extended frame record to store
1372 this pointer. This is not a valid attribute for return values and can only
1373 be applied to one parameter.
``swifterror``
This attribute exists to model and optimize Swift error handling. It
1377 can be applied to a parameter with pointer to pointer type or a
1378 pointer-sized alloca. At the call site, the actual argument that corresponds
1379 to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
1380 the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
1381 the parameter or the alloca) can only be loaded and stored from, or used as
1382 a ``swifterror`` argument. This is not a valid attribute for return values
1383 and can only be applied to one parameter.
1385 These constraints allow the calling convention to optimize access to
1386 ``swifterror`` variables by associating them with a specific register at
1387 call boundaries rather than placing them in memory. Since this does change
1388 the calling convention, a function which uses the ``swifterror`` attribute
1389 on a parameter is not ABI-compatible with one which does not.
1391 These constraints also allow LLVM to assume that a ``swifterror`` argument
1392 does not alias any other memory visible within a function and that a
1393 ``swifterror`` alloca passed as an argument does not escape.
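
The constraints can be sketched as follows. The function names and the use
of the ``swiftcc`` calling convention are assumptions of this example. The
``swifterror`` value is only stored to, loaded from, and passed as a
``swifterror`` argument.

.. code-block:: llvm

    declare swiftcc void @may_fail(ptr swifterror)

    define swiftcc void @caller() {
      ; A pointer-sized alloca tagged swifterror.
      %err = alloca swifterror ptr
      store ptr null, ptr %err
      call swiftcc void @may_fail(ptr swifterror %err)
      %e = load ptr, ptr %err
      ret void
    }
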
``immarg``
This indicates the parameter is required to be an immediate
1397 value. This must be a trivial immediate integer or floating-point
1398 constant. Undef or constant expressions are not valid. This is
1399 only valid on intrinsic declarations and cannot be applied to a
1400 call site or arbitrary function.
``noundef``
This attribute applies to parameters and return values. If the value
1404 representation contains any undefined or poison bits, the behavior is
1405 undefined. Note that this does not refer to padding introduced by the
1406 type's storage representation.
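
For example, a hypothetical function whose argument and result must both be
fully defined could be declared as:

.. code-block:: llvm

    ; Passing or returning a value with undefined or poison bits is
    ; undefined behavior.
    declare noundef i32 @checked_index(i32 noundef)
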
1410 ``nofpclass(<test mask>)``
1411 This attribute applies to parameters and return values with
1412 floating-point and vector of floating-point types, as well as
1413 arrays of such types. The test mask has the same format as the
1414 second argument to the :ref:`llvm.is.fpclass <llvm.is.fpclass>`,
1415 and indicates which classes of floating-point values are not
1416 permitted for the value. For example a bitmask of 3 indicates
1417 the parameter may not be a NaN.
1419 If the value is a floating-point class indicated by the
1420 ``nofpclass`` test mask, a :ref:`poison value <poisonvalues>` is
1421 passed or returned instead.
1423 .. code-block:: text
1424 :caption: The following invariants hold
1426 @llvm.is.fpclass(nofpclass(test_mask) %x, test_mask) => false
1427 @llvm.is.fpclass(nofpclass(test_mask) %x, ~test_mask) => true
1428 nofpclass(all) => poison
1431 In textual IR, various string names are supported for readability
1432 and can be combined. For example ``nofpclass(nan pinf nzero)``
1433 evaluates to a mask of 547.
1435 This does not depend on the floating-point environment. For
1436 example, a function parameter marked ``nofpclass(zero)`` indicates
1437 no zero inputs. If this is applied to an argument in a function
1438 marked with :ref:`\"denormal-fp-math\" <denormal_fp_math>`
1439 indicating zero treatment of input denormals, it does not imply the
1440 value cannot be a denormal value which would compare equal to 0.
1442 .. table:: Recognized test mask names
    +-------+----------------------+---------------+
    | Name  | floating-point class | Bitmask value |
    +=======+======================+===============+
    | nan   | Any NaN              | 3             |
    +-------+----------------------+---------------+
    | inf   | +/- infinity         | 516           |
    +-------+----------------------+---------------+
    | norm  | +/- normal           | 264           |
    +-------+----------------------+---------------+
    | sub   | +/- subnormal        | 144           |
    +-------+----------------------+---------------+
    | zero  | +/- 0                | 96            |
    +-------+----------------------+---------------+
    | all   | All values           | 1023          |
    +-------+----------------------+---------------+
    | snan  | Signaling NaN        | 1             |
    +-------+----------------------+---------------+
    | qnan  | Quiet NaN            | 2             |
    +-------+----------------------+---------------+
    | ninf  | Negative infinity    | 4             |
    +-------+----------------------+---------------+
    | nnorm | Negative normal      | 8             |
    +-------+----------------------+---------------+
    | nsub  | Negative subnormal   | 16            |
    +-------+----------------------+---------------+
    | nzero | Negative zero        | 32            |
    +-------+----------------------+---------------+
    | pzero | Positive zero        | 64            |
    +-------+----------------------+---------------+
    | psub  | Positive subnormal   | 128           |
    +-------+----------------------+---------------+
    | pnorm | Positive normal      | 256           |
    +-------+----------------------+---------------+
    | pinf  | Positive infinity    | 512           |
    +-------+----------------------+---------------+
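As a small illustration of combining mask names, the sketch below marks a
parameter as never being a NaN or an infinity (3 | 516 = 519) and the return
value as never being a NaN. The function ``@scale`` is hypothetical.

.. code-block:: llvm

    ; If %x were a NaN or an infinity, a poison value would be passed instead.
    define nofpclass(nan) float @scale(float nofpclass(nan inf) %x) {
      %r = fmul float %x, 2.0
      ret float %r
    }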
1482 This indicates the alignment that should be considered by the backend when
1483 assigning this parameter to a stack slot during calling convention
1484 lowering. The enforcement of the specified alignment is target-dependent,
1485 as target-specific calling convention rules may override this value. This
1486 attribute serves the purpose of carrying language specific alignment
1487 information that is not mapped to base types in the backend (for example,
1488 over-alignment specification through language attributes).
The function parameter marked with this attribute is the alignment in bytes of the
newly allocated block returned by this function. The returned value must either have
the specified alignment or be the null pointer. The return value MAY be more aligned
than the requested alignment, but not less aligned. Invalid (e.g. non-power-of-2)
alignments are permitted for the allocalign parameter, so long as the returned pointer
is null. This attribute may only be applied to integer parameters.
1499 The function parameter marked with this attribute is the pointer
1500 that will be manipulated by the allocator. For a realloc-like
1501 function the pointer will be invalidated upon success (but the
1502 same address may be returned), for a free-like function the
1503 pointer will always be invalidated.
1506 This attribute indicates that the function does not dereference that
1507 pointer argument, even though it may read or write the memory that the
1508 pointer points to if accessed through other pointers.
1510 If a function reads from or writes to a readnone pointer argument, the
1511 behavior is undefined.
This attribute indicates that the function does not write through this
pointer argument, even though it may write to the memory that the pointer
points to.

If a function writes to a readonly pointer argument, the behavior is
undefined.
1522 This attribute indicates that the function may write to, but does not read
1523 through this pointer argument (even though it may read from the memory that
1524 the pointer points to).
If a function reads from a writeonly pointer argument, the behavior is
undefined.
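A minimal sketch of the two pointer-argument attributes used together, under
the assumption of a hypothetical helper ``@copy_word``: the function only
reads through ``%src`` and only writes through ``%dst``.

.. code-block:: llvm

    define void @copy_word(ptr readonly %src, ptr writeonly %dst) {
      ; Reading through the readonly argument is permitted.
      %v = load i32, ptr %src
      ; Writing through the writeonly argument is permitted.
      store i32 %v, ptr %dst
      ret void
    }

Storing through ``%src`` or loading through ``%dst`` in this function would be
undefined behavior.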
1531 Garbage Collector Strategy Names
1532 --------------------------------
Each function may specify a garbage collector strategy name, which is simply a
string:
1537 .. code-block:: llvm
1539 define void @f() gc "name" { ... }
The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.
1553 Prefix data is data associated with a function which the code
1554 generator will emit immediately before the function's entrypoint.
1555 The purpose of this feature is to allow frontends to associate
1556 language-specific runtime metadata with specific functions and make it
1557 available through the function pointer while still allowing the
1558 function pointer to be called.
1560 To access the data for a given function, a program may bitcast the
1561 function pointer to a pointer to the constant's type and dereference
1562 index -1. This implies that the IR symbol points just past the end of
1563 the prefix data. For instance, take the example of a function annotated
1564 with a single ``i32``,
1566 .. code-block:: llvm
1568 define void @f() prefix i32 123 { ... }
1570 The prefix data can be referenced as,
1572 .. code-block:: llvm
1574 %a = getelementptr inbounds i32, ptr @f, i32 -1
1575 %b = load i32, ptr %a
Prefix data is laid out as if it were an initializer for a global variable
of the prefix data's type. The function will be placed such that the
beginning of the prefix data is aligned. This means that if the size
of the prefix data is not a multiple of the alignment size, the
function's entrypoint will not be aligned. If alignment of the
function's entrypoint is desired, padding must be added to the prefix
data.
1585 A function may have prefix data but no body. This has similar semantics
1586 to the ``available_externally`` linkage in that the data may be used by the
1587 optimizers but will not be emitted in the object file.
1594 The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1595 be inserted prior to the function body. This can be used for enabling
1596 function hot-patching and instrumentation.
1598 To maintain the semantics of ordinary function calls, the prologue data must
1599 have a particular format. Specifically, it must begin with a sequence of
1600 bytes which decode to a sequence of machine instructions, valid for the
1601 module's target, which transfer control to the point immediately succeeding
1602 the prologue data, without performing any other visible action. This allows
1603 the inliner and other passes to reason about the semantics of the function
1604 definition without needing to reason about the prologue data. Obviously this
1605 makes the format of the prologue data highly target dependent.
1607 A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1608 which encodes the ``nop`` instruction:
1610 .. code-block:: text
1612 define void @f() prologue i8 144 { ... }
1614 Generally prologue data can be formed by encoding a relative branch instruction
1615 which skips the metadata, as in this example of valid prologue data for the
1616 x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1618 .. code-block:: text
1620 %0 = type <{ i8, i8, ptr }>
1622 define void @f() prologue %0 <{ i8 235, i8 8, ptr @md}> { ... }
1624 A function may have prologue data but no body. This has similar semantics
1625 to the ``available_externally`` linkage in that the data may be used by the
1626 optimizers but will not be emitted in the object file.
1630 Personality Function
1631 --------------------
1633 The ``personality`` attribute permits functions to specify what function
1634 to use for exception handling.
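As a hedged sketch, a function that uses an ``invoke`` with a landing pad
might name its personality function as follows. The Itanium C++ personality
routine ``@__gxx_personality_v0`` is used here as an assumed example, and
``@may_throw`` is a hypothetical callee.

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()

    define void @f() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; Run cleanups, then continue unwinding.
      %lp = landingpad { ptr, i32 }
              cleanup
      resume { ptr, i32 } %lp
    }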
Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.
1647 An attribute group is a module-level object. To use an attribute group, an
1648 object references the attribute group's ID (e.g. ``#37``). An object may refer
1649 to more than one attribute group. In that situation, the attributes from the
1650 different groups are merged.
1652 Here is an example of attribute groups for a function that should always be
1653 inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1655 .. code-block:: llvm
1657 ; Target-independent attributes:
1658 attributes #0 = { alwaysinline alignstack=4 }
1660 ; Target-dependent attributes:
1661 attributes #1 = { "no-sse" }
1663 ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1664 define void @f() #0 #1 { ... }
1671 Function attributes are set to communicate additional information about
1672 a function. Function attributes are considered to be part of the
1673 function, not of the function type, so functions with different function
1674 attributes can have the same function type.
Function attributes are simple keywords that follow the type specified.
If multiple attributes are needed, they are space separated. For
example:
1680 .. code-block:: llvm
1682 define void @f() noinline { ... }
1683 define void @f() alwaysinline { ... }
1684 define void @f() alwaysinline optsize { ... }
1685 define void @f() optsize { ... }
This attribute indicates that, when emitting the prologue and
epilogue, the backend should forcibly align the stack pointer.
Specify the desired alignment, which must be a power of two, in
parentheses.
1692 ``"alloc-family"="FAMILY"``
This indicates which "family" an allocator function is part of. To avoid
collisions, the family name should match the mangled name of the primary
allocator function, that is "malloc" for malloc/calloc/realloc/free,
"_Znwm" for ``::operator new`` and ``::operator delete``, and
"_ZnwmSt11align_val_t" for aligned ``::operator new`` and
``::operator delete``. Matching malloc/realloc/free calls within a family
can be optimized, but mismatched ones will be left alone.
1700 ``allockind("KIND")``
1701 Describes the behavior of an allocation function. The KIND string contains comma
1702 separated entries from the following options:
1704 * "alloc": the function returns a new block of memory or null.
1705 * "realloc": the function returns a new block of memory or null. If the
1706 result is non-null the memory contents from the start of the block up to
1707 the smaller of the original allocation size and the new allocation size
1708 will match that of the ``allocptr`` argument and the ``allocptr``
1709 argument is invalidated, even if the function returns the same address.
1710 * "free": the function frees the block of memory specified by ``allocptr``.
1711 Functions marked as "free" ``allockind`` must return void.
* "uninitialized": Any newly-allocated memory (either a new block from
  an "alloc" function or the enlarged capacity from a "realloc" function)
  will be uninitialized.
* "zeroed": Any newly-allocated memory (either a new block from an "alloc"
  function or the enlarged capacity from a "realloc" function) will be
  zeroed.
1718 * "aligned": the function returns memory aligned according to the
1719 ``allocalign`` parameter.
1721 The first three options are mutually exclusive, and the remaining options
1722 describe more details of how the function behaves. The remaining options
1723 are invalid for "free"-type functions.
1724 ``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1725 This attribute indicates that the annotated function will always return at
1726 least a given number of bytes (or null). Its arguments are zero-indexed
1727 parameter numbers; if one argument is provided, then it's assumed that at
1728 least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1729 returned pointer. If two are provided, then it's assumed that
1730 ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1731 available. The referenced parameters must be integer types. No assumptions
1732 are made about the contents of the returned block of memory.
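The allocator attributes above are usually applied together. The following
declarations are a sketch for a hypothetical allocator family named
``my_alloc``; the function names and the family name are assumptions made for
illustration and do not refer to any real library.

.. code-block:: llvm

    ; Returns at least %size zeroed bytes, or null.
    declare noalias ptr @my_alloc(i64) allockind("alloc,zeroed") allocsize(0) "alloc-family"="my_alloc"

    ; First parameter is the alignment, second is the size in bytes.
    declare noalias ptr @my_alloc_aligned(i64 allocalign, i64) allockind("alloc,uninitialized,aligned") allocsize(1) "alloc-family"="my_alloc"

    ; The allocptr argument is invalidated when the call succeeds.
    declare ptr @my_realloc(ptr allocptr, i64) allockind("realloc") allocsize(1) "alloc-family"="my_alloc"

    ; Functions of kind "free" must return void.
    declare void @my_free(ptr allocptr) allockind("free") "alloc-family"="my_alloc"

Because all four declarations share one family name, matching calls among them
are candidates for optimization.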
1734 This attribute indicates that the inliner should attempt to inline
1735 this function into callers whenever possible, ignoring any active
1736 inlining size threshold for this caller.
This indicates that the callee function at a call site should be
recognized as a built-in function, even though the function's declaration
uses the ``nobuiltin`` attribute. This is only valid at call sites for
direct calls to functions that are declared with the ``nobuiltin``
attribute.
This attribute indicates that this function is rarely called. When
computing edge weights, basic blocks post-dominated by a cold
function call are also considered to be cold and are thus given low
weight.
1749 In some parallel execution models, there exist operations that cannot be
1750 made control-dependent on any additional values. We call such operations
1751 ``convergent``, and mark them with this attribute.
The ``convergent`` attribute may appear on functions or call/invoke
instructions. When it appears on a function, it indicates that calls to
this function should not be made control-dependent on additional values.
For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
calls to this intrinsic cannot be made control-dependent on additional
values.
1760 When it appears on a call/invoke, the ``convergent`` attribute indicates
1761 that we should treat the call as though we're calling a convergent
1762 function. This is particularly useful on indirect calls; without this we
1763 may treat such calls as though the target is non-convergent.
1765 The optimizer may remove the ``convergent`` attribute on functions when it
1766 can prove that the function does not execute any convergent operations.
1767 Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
1768 can prove that the call/invoke cannot call a convergent function.
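The two placements of ``convergent`` can be sketched as follows. The
functions ``@barrier`` and ``@kernel`` are hypothetical.

.. code-block:: llvm

    ; On a declaration: every call to @barrier is treated as convergent.
    declare void @barrier() convergent

    define void @kernel(ptr %fp) {
      call void @barrier()
      ; On an indirect call: the callee is unknown, so the call site itself
      ; is marked convergent.
      call void %fp() convergent
      ret void
    }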
1769 ``disable_sanitizer_instrumentation``
1770 When instrumenting code with sanitizers, it can be important to skip certain
1771 functions to ensure no instrumentation is applied to them.
This attribute is not always equivalent to absent ``sanitize_<name>``
attributes: depending on the specific sanitizer, code can be inserted into
functions regardless of the ``sanitize_<name>`` attribute to prevent false
positive reports.

``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
taking precedence over the ``sanitize_<name>`` attributes and other compiler
flags.
1781 ``"dontcall-error"``
1782 This attribute denotes that an error diagnostic should be emitted when a
1783 call of a function with this attribute is not eliminated via optimization.
1784 Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1785 such callees to attach information about where in the source language such a
1786 call came from. A string value can be provided as a note.
1788 This attribute denotes that a warning diagnostic should be emitted when a
1789 call of a function with this attribute is not eliminated via optimization.
1790 Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1791 such callees to attach information about where in the source language such a
1792 call came from. A string value can be provided as a note.
1793 ``fn_ret_thunk_extern``
1794 This attribute tells the code generator that returns from functions should
1795 be replaced with jumps to externally-defined architecture-specific symbols.
1796 For X86, this symbol's identifier is ``__x86_return_thunk``.
1798 This attribute tells the code generator whether the function
1799 should keep the frame pointer. The code generator may emit the frame pointer
1800 even if this attribute says the frame pointer can be eliminated.
1801 The allowed string values are:
1803 * ``"none"`` (default) - the frame pointer can be eliminated.
* ``"non-leaf"`` - the frame pointer should be kept if the function calls
  other functions.
1806 * ``"all"`` - the frame pointer should be kept.
This attribute indicates that this function is a hot spot of the program
execution. The function will be optimized more aggressively and will be
placed into a special subsection of the text section to improve locality.

When profile feedback is enabled, this attribute takes precedence over
the profile information. By marking a function ``hot``, users can work
around the cases where the training input does not have good coverage
of all the hot functions.
This attribute indicates that the source code contained a hint that
inlining this function is desirable (such as the "inline" keyword in
C/C++). It is just a hint; it imposes no requirements on the
inliner.
1822 This attribute indicates that the function should be added to a
1823 jump-instruction table at code-generation time, and that all address-taken
1824 references to this function should be replaced with a reference to the
1825 appropriate jump-instruction-table function pointer. Note that this creates
1826 a new pointer for the original function, which means that code that depends
1827 on function-pointer identity can break. So, any function annotated with
1828 ``jumptable`` must also be ``unnamed_addr``.
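For instance, a minimal definition that satisfies this requirement might look
like the sketch below, where ``@handler`` is a hypothetical function name.

.. code-block:: llvm

    ; jumptable requires unnamed_addr, since the function's address may change.
    define void @handler() unnamed_addr jumptable {
      ret void
    }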
1830 This attribute specifies the possible memory effects of the call-site or
1831 function. It allows specifying the possible access kinds (``none``,
1832 ``read``, ``write``, or ``readwrite``) for the possible memory location
1833 kinds (``argmem``, ``inaccessiblemem``, as well as a default). It is best
1834 understood by example:
1836 - ``memory(none)``: Does not access any memory.
1837 - ``memory(read)``: May read (but not write) any memory.
1838 - ``memory(write)``: May write (but not read) any memory.
1839 - ``memory(readwrite)``: May read or write any memory.
1840 - ``memory(argmem: read)``: May only read argument memory.
1841 - ``memory(argmem: read, inaccessiblemem: write)``: May only read argument
1842 memory and only write inaccessible memory.
1843 - ``memory(read, argmem: readwrite)``: May read any memory (default mode)
1844 and additionally write argument memory.
- ``memory(readwrite, argmem: none)``: May access any memory apart from
  argument memory.
1848 The supported memory location kinds are:
- ``argmem``: This refers to accesses that are based on pointer arguments
  to the function.
1852 - ``inaccessiblemem``: This refers to accesses to memory which is not
1853 accessible by the current module (before return from the function -- an
1854 allocator function may return newly accessible memory while only
1855 accessing inaccessible memory itself). Inaccessible memory is often used
1856 to model control dependencies of intrinsics.
1857 - The default access kind (specified without a location prefix) applies to
1858 all locations that haven't been specified explicitly, including those that
1859 don't currently have a dedicated location kind (e.g. accesses to globals
1860 or captured pointers).
1862 If the ``memory`` attribute is not specified, then ``memory(readwrite)``
1863 is implied (all memory effects are possible).
1865 The memory effects of a call can be computed as
1866 ``CallSiteEffects & (FunctionEffects | OperandBundleEffects)``. Thus, the
1867 call-site annotation takes precedence over the potential effects described
1868 by either the function annotation or the operand bundles.
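The intersection rule can be illustrated with a short sketch. The functions
``@callee`` and ``@caller`` are hypothetical; the comments work through the
formula above with no operand bundles present.

.. code-block:: llvm

    ; FunctionEffects: may read (but not write) any memory.
    declare void @callee(ptr) memory(read)

    define void @caller(ptr %p) {
      ; CallSiteEffects: may read or write argument memory, nothing else.
      ; Intersecting the two leaves only reads of argument memory, so this
      ; call behaves as if annotated with memory(argmem: read).
      call void @callee(ptr %p) memory(argmem: readwrite)
      ret void
    }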
1870 This attribute suggests that optimization passes and code generator
1871 passes make choices that keep the code size of this function as small
1872 as possible and perform optimizations that may sacrifice runtime
1873 performance in order to minimize the size of the generated code.
1875 This attribute disables prologue / epilogue emission for the
1876 function. This can have very system-specific consequences.
1877 ``"no-inline-line-tables"``
1878 When this attribute is set to true, the inliner discards source locations
1879 when inlining code and instead uses the source location of the call site.
1880 Breakpoints set on code that was inlined into the current function will
1881 not fire during the execution of the inlined call sites. If the debugger
1882 stops inside an inlined call site, it will appear to be stopped at the
1883 outermost inlined call site.
1885 When this attribute is set to true, the jump tables and lookup tables that
1886 can be generated from a switch case lowering are disabled.
1888 This indicates that the callee function at a call site is not recognized as
1889 a built-in function. LLVM will retain the original call and not replace it
1890 with equivalent code based on the semantics of the built-in function, unless
1891 the call site uses the ``builtin`` attribute. This is valid at call sites
1892 and on function declarations and definitions.
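A sketch of how ``nobuiltin`` and ``builtin`` interact, assuming a module
that declares ``malloc`` itself; the functions ``@f`` and ``@g`` are
hypothetical.

.. code-block:: llvm

    ; The declaration opts out of built-in treatment by default.
    declare ptr @malloc(i64) nobuiltin

    define ptr @f() {
      ; This call is kept as an ordinary call.
      %p = call ptr @malloc(i64 16)
      ret ptr %p
    }

    define ptr @g() {
      ; This direct call opts back in and may be treated as the built-in.
      %p = call ptr @malloc(i64 16) builtin
      ret ptr %p
    }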
1894 This attribute indicates that the function is only allowed to jump back into
1895 caller's module by a return or an exception, and is not allowed to jump back
1896 by invoking a callback function, a direct, possibly transitive, external
1897 function call, use of ``longjmp``, or other means. It is a compiler hint that
1898 is used at module level to improve dataflow analysis, dropped during linking,
1899 and has no effect on functions defined in the current module.
1901 This attribute indicates that calls to the function cannot be
1902 duplicated. A call to a ``noduplicate`` function may be moved
1903 within its parent function, but may not be duplicated within
1904 its parent function.
1906 A function containing a ``noduplicate`` call may still
1907 be an inlining candidate, provided that the call is not
1908 duplicated by inlining. That implies that the function has
1909 internal linkage and only has one call site, so the original
1910 call is dead after inlining.
1912 This function attribute indicates that the function does not, directly or
1913 transitively, call a memory-deallocation function (``free``, for example)
1914 on a memory allocation which existed before the call.
1916 As a result, uncaptured pointers that are known to be dereferenceable
1917 prior to a call to a function with the ``nofree`` attribute are still
1918 known to be dereferenceable after the call. The capturing condition is
1919 necessary in environments where the function might communicate the
1920 pointer to another thread which then deallocates the memory. Alternatively,
1921 ``nosync`` would ensure such communication cannot happen and even captured
1922 pointers cannot be freed by the function.
A ``nofree`` function is explicitly allowed to free memory which it
allocated or (if not ``nosync``) arrange for another thread to free
memory on its behalf. As a result, perhaps surprisingly, a ``nofree``
function can return a pointer to a previously deallocated memory object.
1929 Disallows implicit floating-point code. This inhibits optimizations that
1930 use floating-point code and floating-point registers for operations that are
1931 not nominally floating-point. LLVM instructions that perform floating-point
1932 operations or require access to floating-point registers may still cause
1933 floating-point code to be generated.
1935 Also inhibits optimizations that create SIMD/vector code and registers from
1936 scalar code such as vectorization or memcpy/memset optimization. This
1937 includes integer vectors. Vector instructions present in IR may still cause
1938 vector code to be generated.
1940 This attribute indicates that the inliner should never inline this
1941 function in any situation. This attribute may not be used together
1942 with the ``alwaysinline`` attribute.
This attribute indicates that calls to this function should never be merged
during optimization. For example, it will prevent tail merging of otherwise
identical code sequences that raise an exception or terminate the program.
Tail merging normally reduces the precision of source location information,
making stack traces less useful for debugging. This attribute gives the
user control over the tradeoff between code size and debug information
precision.
1952 This attribute suppresses lazy symbol binding for the function. This
1953 may make calls to the function faster, at the cost of extra program
1954 startup time if the function is not called during program startup.
This function attribute prevents instrumentation based profiling, used for
coverage or profile based optimization, from being added to a function. It
also blocks inlining if the caller and callee have different values of this
attribute.
This function attribute prevents instrumentation based profiling, used for
coverage or profile based optimization, from being added to a function. This
attribute does not restrict inlining, so instrumented instructions could end
up in this function.
1966 This attribute indicates that the code generator should not use a
1967 red zone, even if the target-specific ABI normally permits it.
1968 ``indirect-tls-seg-refs``
1969 This attribute indicates that the code generator should not use
1970 direct TLS access through segment registers, even if the
1971 target-specific ABI normally permits it.
This function attribute indicates that the function never returns
normally, that is, through a return instruction. This produces undefined
behavior at runtime if the function ever does dynamically return. Annotated
functions may still raise an exception, that is, ``nounwind`` is not implied.
1978 This function attribute indicates that the function does not call itself
1979 either directly or indirectly down any possible call path. This produces
1980 undefined behavior at runtime if the function ever does recurse.
1982 .. _langref_willreturn:
This function attribute indicates that a call of this function will
either exhibit undefined behavior or come back and continue execution
at a point in the existing call stack that includes the current invocation.
Annotated functions may still raise an exception, that is, ``nounwind`` is
not implied. If an invocation of an annotated function does not return
control back to a point in the call stack, the behavior is undefined.
This function attribute indicates that the function does not communicate
(synchronize) with another thread through memory or other well-defined means.
Synchronization is considered possible in the presence of ``atomic`` accesses
that enforce an order (that is, orderings other than "unordered" and
"monotonic"), ``volatile`` accesses, and ``convergent`` function calls. Note
that ``convergent`` function calls allow non-memory communication, e.g.,
cross-lane operations, which is also considered synchronization. However,
``convergent`` does not contradict ``nosync``. If an annotated function does
ever synchronize with another thread, the behavior is undefined.
This function attribute indicates that the function never raises an
exception. If the function does raise an exception, its runtime
behavior is undefined. However, functions marked ``nounwind`` may still
trap or generate asynchronous exceptions. Exception handling schemes
that are recognized by LLVM to handle asynchronous exceptions, such
as SEH, will still provide their implementation-defined semantics.
2008 ``nosanitize_bounds``
2009 This attribute indicates that bounds checking sanitizer instrumentation
2010 is disabled for this function.
2011 ``nosanitize_coverage``
This attribute indicates that SanitizerCoverage instrumentation is disabled
for this function.
2014 ``null_pointer_is_valid``
2015 If ``null_pointer_is_valid`` is set, then the ``null`` address
2016 in address-space 0 is considered to be a valid address for memory loads and
2017 stores. Any analysis or optimization should not treat dereferencing a
2018 pointer to ``null`` as undefined behavior in this function.
2019 Note: Comparing address of a global variable to ``null`` may still
2020 evaluate to false because of a limitation in querying this attribute inside
2021 constant expressions.
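As a brief sketch, the load below is an ordinary memory access rather than
undefined behavior because of the attribute. The function ``@read_zero`` is
hypothetical.

.. code-block:: llvm

    define i32 @read_zero() null_pointer_is_valid {
      ; Address 0 in address space 0 is treated as a valid address here.
      %v = load i32, ptr null
      ret i32 %v
    }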
2023 This attribute indicates that this function should be optimized
2024 for maximum fuzzing signal.
2026 This function attribute indicates that most optimization passes will skip
2027 this function, with the exception of interprocedural optimization passes.
2028 Code generation defaults to the "fast" instruction selector.
2029 This attribute cannot be used together with the ``alwaysinline``
2030 attribute; this attribute is also incompatible
2031 with the ``minsize`` attribute and the ``optsize`` attribute.
2033 This attribute requires the ``noinline`` attribute to be specified on
2034 the function as well, so the function is never inlined into any caller.
2035 Only functions with the ``alwaysinline`` attribute are valid
2036 candidates for inlining into the body of this function.
2038 This attribute suggests that optimization passes and code generator
2039 passes make choices that keep the code size of this function low,
2040 and otherwise do optimizations specifically to reduce code size as
2041 long as they do not significantly impact runtime performance.
2042 ``"patchable-function"``
2043 This attribute tells the code generator that the code
2044 generated for this function needs to follow certain conventions that
2045 make it possible for a runtime function to patch over it later.
2046 The exact effect of this attribute depends on its string value,
2047 for which there currently is one legal possibility:
2049 * ``"prologue-short-redirect"`` - This style of patchable
2050 function is intended to support patching a function prologue to
2051 redirect control away from the function in a thread safe
2052 manner. It guarantees that the first instruction of the
2053 function will be large enough to accommodate a short jump
2054 instruction, and will be sufficiently aligned to allow being
2055 fully changed via an atomic compare-and-swap instruction.
  While the first requirement can be satisfied by inserting a large
  enough NOP, LLVM can and will try to re-purpose an existing
  instruction (i.e. one that would have to be emitted anyway) as
  the patchable instruction larger than a short jump.
2061 ``"prologue-short-redirect"`` is currently only supported on
This attribute by itself does not imply restrictions on
inter-procedural optimizations. All of the semantic effects the
patching may have must be separately conveyed via the linkage type.
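A minimal sketch of the attribute in use, with a hypothetical function name
``@patch_me``:

.. code-block:: llvm

    define void @patch_me() "patchable-function"="prologue-short-redirect" {
      ret void
    }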
This attribute indicates that the function will trigger a guard region
at the end of the stack. It ensures that accesses to the stack must be
no further apart than the size of the guard region to a previous
access of the stack. It takes one required string value, the name of
the stack probing function that will be called.
2074 If a function that has a ``"probe-stack"`` attribute is inlined into
2075 a function with another ``"probe-stack"`` attribute, the resulting
2076 function has the ``"probe-stack"`` attribute of the caller. If a
2077 function that has a ``"probe-stack"`` attribute is inlined into a
2078 function that has no ``"probe-stack"`` attribute at all, the resulting
2079 function has the ``"probe-stack"`` attribute of the callee.
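The following sketch shows one way the attribute might appear on a function
with a large frame. The probing routine name ``__probestack`` and the function
name ``@big_frame`` are assumptions chosen for illustration; the actual routine
name depends on the runtime in use.

.. code-block:: llvm

    ; "__probestack" is an assumed name for the stack probing routine.
    define void @big_frame() "probe-stack"="__probestack" {
    entry:
      ; A frame large enough that it could skip past a guard region.
      %buf = alloca [16384 x i8], align 16
      store volatile i8 0, ptr %buf, align 16
      ret void
    }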
2080 ``"stack-probe-size"``
2081 This attribute controls the behavior of stack probes: either
2082 the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
It defines the size of the guard region. It ensures that if the function
may use more stack space than the size of the guard region, a stack probing
sequence will be emitted. It takes one required integer value, which
2088 If a function that has a ``"stack-probe-size"`` attribute is inlined into
2089 a function with another ``"stack-probe-size"`` attribute, the resulting
2090 function has the ``"stack-probe-size"`` attribute that has the lower
2091 numeric value. If a function that has a ``"stack-probe-size"`` attribute is
2092 inlined into a function that has no ``"stack-probe-size"`` attribute
2093 at all, the resulting function has the ``"stack-probe-size"`` attribute
2095 ``"no-stack-arg-probe"``
2096 This attribute disables ABI-required stack probes, if any.
2098 This attribute indicates that this function can return twice. The C
2099 ``setjmp`` is an example of such a function. The compiler disables
2100 some optimizations (like tail calls) in the caller of these
2103 This attribute indicates that
2104 `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
2105 protection is enabled for this function.
2107 If a function that has a ``safestack`` attribute is inlined into a
2108 function that doesn't have a ``safestack`` attribute or which has an
2109 ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
2110 function will have a ``safestack`` attribute.
2111 ``sanitize_address``
2112 This attribute indicates that AddressSanitizer checks
2113 (dynamic address safety analysis) are enabled for this function.
2115 This attribute indicates that MemorySanitizer checks (dynamic detection
2116 of accesses to uninitialized memory) are enabled for this function.
2118 This attribute indicates that ThreadSanitizer checks
2119 (dynamic thread safety analysis) are enabled for this function.
2120 ``sanitize_hwaddress``
2121 This attribute indicates that HWAddressSanitizer checks
2122 (dynamic address safety analysis based on tagged pointers) are enabled for
2125 This attribute indicates that MemTagSanitizer checks
2126 (dynamic address safety analysis based on Armv8 MTE) are enabled for
2128 ``speculative_load_hardening``
2129 This attribute indicates that
2130 `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
2131 should be enabled for the function body.
Speculative Load Hardening is a best-effort mitigation against
information leak attacks that make use of control flow
misspeculation - specifically misspeculation of whether a branch
is taken or not. Typically vulnerabilities enabling such attacks are
classified as "Spectre variant #1". Notably, this does not attempt to
mitigate against misspeculation of branch target, classified as
"Spectre variant #2" vulnerabilities.
2141 When inlining, the attribute is sticky. Inlining a function that carries
2142 this attribute will cause the caller to gain the attribute. This is intended
2143 to provide a maximally conservative model where the code in a function
2144 annotated with this attribute will always (even after inlining) end up
2147 This function attribute indicates that the function does not have any
2148 effects besides calculating its result and does not have undefined behavior.
2149 Note that ``speculatable`` is not enough to conclude that along any
2150 particular execution path the number of calls to this function will not be
2151 externally observable. This attribute is only valid on functions
2152 and declarations, not on individual call sites. If a function is
2153 incorrectly marked as speculatable and really does exhibit
2154 undefined behavior, the undefined behavior may be observed even
2155 if the call site is dead code.
2158 This attribute indicates that the function should emit a stack
2159 smashing protector. It is in the form of a "canary" --- a random value
2160 placed on the stack before the local variables that's checked upon
2161 return from the function to see if it has been overwritten. A
2162 heuristic is used to determine if a function needs stack protectors
2163 or not. The heuristic used will enable protectors for functions with:
2165 - Character arrays larger than ``ssp-buffer-size`` (default 8).
2166 - Aggregates containing character arrays larger than ``ssp-buffer-size``.
2167 - Calls to alloca() with variable sizes or constant sizes greater than
2168 ``ssp-buffer-size``.
2170 Variables that are identified as requiring a protector will be arranged
2171 on the stack such that they are adjacent to the stack protector guard.
2173 If a function with an ``ssp`` attribute is inlined into a calling function,
2174 the attribute is not carried over to the calling function.
2177 This attribute indicates that the function should emit a stack smashing
2178 protector. This attribute causes a strong heuristic to be used when
2179 determining if a function needs stack protectors. The strong heuristic
2180 will enable protectors for functions with:
2182 - Arrays of any size and type
2183 - Aggregates containing an array of any size and type.
2184 - Calls to alloca().
2185 - Local variables that have had their address taken.
2187 Variables that are identified as requiring a protector will be arranged
2188 on the stack such that they are adjacent to the stack protector guard.
2189 The specific layout rules are:
2191 #. Large arrays and structures containing large arrays
2192 (``>= ssp-buffer-size``) are closest to the stack protector.
2193 #. Small arrays and structures containing small arrays
2194 (``< ssp-buffer-size``) are 2nd closest to the protector.
2195 #. Variables that have had their address taken are 3rd closest to the
2198 This overrides the ``ssp`` function attribute.
2200 If a function with an ``sspstrong`` attribute is inlined into a calling
2201 function which has an ``ssp`` attribute, the calling function's attribute
2202 will be upgraded to ``sspstrong``.
2205 This attribute indicates that the function should *always* emit a stack
2206 smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2209 Variables that are identified as requiring a protector will be arranged
2210 on the stack such that they are adjacent to the stack protector guard.
2211 The specific layout rules are:
2213 #. Large arrays and structures containing large arrays
2214 (``>= ssp-buffer-size``) are closest to the stack protector.
2215 #. Small arrays and structures containing small arrays
2216 (``< ssp-buffer-size``) are 2nd closest to the protector.
2217 #. Variables that have had their address taken are 3rd closest to the
2220 If a function with an ``sspreq`` attribute is inlined into a calling
2221 function which has an ``ssp`` or ``sspstrong`` attribute, the calling
2222 function's attribute will be upgraded to ``sspreq``.
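The three stack protector attributes described above can be compared in a
short sketch. The function names are hypothetical; the comments restate the
heuristics given in the text and assume the default ``ssp-buffer-size`` of 8.

.. code-block:: llvm

    ; Protected only if the heuristic fires: the 16-byte character
    ; array is larger than the default ssp-buffer-size.
    define void @uses_ssp() ssp {
    entry:
      %buf = alloca [16 x i8], align 1
      store volatile i8 0, ptr %buf, align 1
      ret void
    }

    ; Protected under the strong heuristic: an array of any size
    ; and type qualifies.
    define void @uses_sspstrong() sspstrong {
    entry:
      %arr = alloca [2 x i32], align 4
      store volatile i32 0, ptr %arr, align 4
      ret void
    }

    ; Always protected, even though there are no arrays or allocas.
    define i32 @uses_sspreq(i32 %x) sspreq {
    entry:
      ret i32 %x
    }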
2225 This attribute indicates that the function was called from a scope that
2226 requires strict floating-point semantics. LLVM will not attempt any
2227 optimizations that require assumptions about the floating-point rounding
2228 mode or that might alter the state of floating-point status flags that
2229 might otherwise be set or cleared by calling this function. LLVM will
2230 not introduce any new floating-point instructions that may trap.
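As a hedged illustration, and assuming this entry describes the ``strictfp``
attribute, a function in a strict floating-point scope typically performs its
arithmetic through the constrained intrinsics rather than plain ``fadd``. The
function name ``@strict_add`` is hypothetical.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    ; The rounding mode is read from the dynamic environment and
    ; floating-point exceptions are treated as observable.
    define double @strict_add(double %a, double %b) strictfp {
    entry:
      %sum = call double @llvm.experimental.constrained.fadd.f64(
                 double %a, double %b,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") strictfp
      ret double %sum
    }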
2232 .. _denormal_fp_math:
2234 ``"denormal-fp-math"``
2235 This indicates the denormal (subnormal) handling that may be
2236 assumed for the default floating-point environment. This is a
2237 comma separated pair. The elements may be one of ``"ieee"``,
2238 ``"preserve-sign"``, ``"positive-zero"``, or ``"dynamic"``. The
2239 first entry indicates the flushing mode for the result of floating
2240 point operations. The second indicates the handling of denormal inputs
2241 to floating point instructions. For compatibility with older
2242 bitcode, if the second value is omitted, both input and output
2243 modes will assume the same mode.
If this attribute is not specified, the default is ``"ieee,ieee"``.
2247 If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2248 denormal outputs may be flushed to zero by standard floating-point
2249 operations. It is not mandated that flushing to zero occurs, but if
2250 a denormal output is flushed to zero, it must respect the sign
2251 mode. Not all targets support all modes.
2253 If the mode is ``"dynamic"``, the behavior is derived from the
2254 dynamic state of the floating-point environment. Transformations
2255 which depend on the behavior of denormal values should not be
2258 While this indicates the expected floating point mode the function
2259 will be executed with, this does not make any attempt to ensure
2260 the mode is consistent. User or platform code is expected to set
2261 the floating point mode appropriately before function entry.
2263 If the input mode is ``"preserve-sign"``, or ``"positive-zero"``,
2264 a floating-point operation must treat any input denormal value as
2265 zero. In some situations, if an instruction does not respect this
2266 mode, the input may need to be converted to 0 as if by
2267 ``@llvm.canonicalize`` during lowering for correctness.
2269 ``"denormal-fp-math-f32"``
Same as ``"denormal-fp-math"``, but only controls the behavior of
the 32-bit float type (or vectors of 32-bit floats). If both
are present, this overrides ``"denormal-fp-math"``. Not all targets
support separately setting the denormal mode per type, and no
attempt is made to diagnose unsupported uses. Currently this
attribute is respected by the AMDGPU and NVPTX backends.
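A brief sketch of how the two attributes might be written follows. The
function names are hypothetical, and whether a given target honors the
requested modes is outside the scope of the example.

.. code-block:: llvm

    ; Output mode and input mode are both "preserve-sign": denormal
    ; results may be flushed to a zero of the same sign, and denormal
    ; inputs are treated as zero.
    define float @halve(float %x) "denormal-fp-math"="preserve-sign,preserve-sign" {
    entry:
      %r = fmul float %x, 5.000000e-01
      ret float %r
    }

    ; IEEE handling in general, overridden for 32-bit floats only.
    define float @halve_f32(float %x) "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="preserve-sign,preserve-sign" {
    entry:
      %r = fmul float %x, 5.000000e-01
      ret float %r
    }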
2278 This attribute indicates that the function will delegate to some other
2279 function with a tail call. The prototype of a thunk should not be used for
2280 optimization purposes. The caller is expected to cast the thunk prototype to
2281 match the thunk target prototype.
2283 ``"tls-load-hoist"``
This attribute indicates that the function will try to reduce redundant
TLS address calculations by hoisting TLS variables.
2287 ``uwtable[(sync|async)]``
This attribute indicates that the ABI being targeted requires that
an unwind table entry be produced for this function even if we can
show that no exceptions pass through it. This is normally the case for
the ELF x86-64 ABI, but it can be disabled for some compilation
2292 units. The optional parameter describes what kind of unwind tables
2293 to generate: ``sync`` for normal unwind tables, ``async`` for asynchronous
2294 (instruction precise) unwind tables. Without the parameter, the attribute
2295 ``uwtable`` is equivalent to ``uwtable(async)``.
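The three spellings can be sketched side by side; the function names are
hypothetical.

.. code-block:: llvm

    ; Unwind tables that are only precise at call sites.
    define void @sync_tables() uwtable(sync) {
      ret void
    }

    ; Instruction-precise unwind tables.
    define void @async_tables() uwtable(async) {
      ret void
    }

    ; No parameter: equivalent to uwtable(async).
    define void @default_tables() uwtable {
      ret void
    }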
This attribute indicates that no control-flow check will be performed on
the attributed entity. It disables ``-fcf-protection=<>`` for a specific
entity, allowing fine-grained control over the hardware control-flow
protection mechanism. The flag is target independent and currently
appertains to a function or function
2303 This attribute indicates that the ShadowCallStack checks are enabled for
2304 the function. The instrumentation checks that the return address for the
2305 function has not changed between the function prolog and epilog. It is
2306 currently x86_64-specific.
2308 .. _langref_mustprogress:
2311 This attribute indicates that the function is required to return, unwind,
2312 or interact with the environment in an observable way e.g. via a volatile
2313 memory access, I/O, or other synchronization. The ``mustprogress``
2314 attribute is intended to model the requirements of the first section of
2315 [intro.progress] of the C++ Standard. As a consequence, a loop in a
function with the ``mustprogress`` attribute can be assumed to terminate if
it does not interact with the environment in an observable way, and
terminating loops without side-effects can be removed. If a ``mustprogress``
function does not satisfy this contract, the behavior is undefined. This
attribute does not apply transitively to callees, but does apply to call
sites within the function. Note that ``willreturn`` implies ``mustprogress``.
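The consequence for loops can be illustrated with a sketch. The function
``@walk`` below is hypothetical: it follows a chain of pointers and has no
observable effects, so under ``mustprogress`` the loop may be assumed to
terminate and, being free of side effects, may be removed. Without the
attribute, a cyclic chain would make the loop run forever and the loop would
have to be kept.

.. code-block:: llvm

    define void @walk(ptr %head) mustprogress {
    entry:
      br label %loop

    loop:
      ; Non-volatile loads only: nothing here is observable.
      %node = phi ptr [ %head, %entry ], [ %next, %loop ]
      %next = load ptr, ptr %node, align 8
      %end = icmp eq ptr %next, null
      br i1 %end, label %exit, label %loop

    exit:
      ret void
    }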
2322 ``"warn-stack-size"="<threshold>"``
2323 This attribute sets a threshold to emit diagnostics once the frame size is
2324 known should the frame size exceed the specified value. It takes one
required integer value, which should be a non-negative integer, and less
than ``UINT_MAX``. It's unspecified which threshold will be used when
2327 duplicate definitions are linked together with differing values.
2328 ``vscale_range(<min>[, <max>])``
2329 This attribute indicates the minimum and maximum vscale value for the given
2330 function. The min must be greater than 0. A maximum value of 0 means
2331 unbounded. If the optional max value is omitted then max is set to the
2332 value of min. If the attribute is not present, no assumptions are made
2333 about the range of vscale.
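As a small sketch with hypothetical function names, the two-argument form
bounds ``vscale`` from both sides and the one-argument form pins it to a single
value:

.. code-block:: llvm

    declare i32 @llvm.vscale.i32()

    ; vscale is known to lie in [1, 16], so the result lies in [4, 64].
    define i32 @lanes_bounded() vscale_range(1,16) {
    entry:
      %vs = call i32 @llvm.vscale.i32()
      %n = mul i32 %vs, 4
      ret i32 %n
    }

    ; max is omitted, so max = min and vscale is known to be exactly 4.
    define i32 @lanes_fixed() vscale_range(4) {
    entry:
      %vs = call i32 @llvm.vscale.i32()
      %n = mul i32 %vs, 4
      ret i32 %n
    }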
2335 This attribute indicates that outlining passes should not modify the
2338 Call Site Attributes
2339 ----------------------
2341 In addition to function attributes the following call site only
2342 attributes are supported:
2344 ``vector-function-abi-variant``
2345 This attribute can be attached to a :ref:`call <i_call>` to list
2346 the vector functions associated to the function. Notice that the
attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
2348 :ref:`callbr <i_callbr>` instruction. The attribute consists of a
2349 comma separated list of mangled names. The order of the list does
2350 not imply preference (it is logically a set). The compiler is free
2351 to pick any listed vector function of its choosing.
The syntax for the mangled names is as follows::
2355 _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2357 When present, the attribute informs the compiler that the function
2358 ``<scalar_name>`` has a corresponding vector variant that can be
2359 used to perform the concurrent invocation of ``<scalar_name>`` on
2360 vectors. The shape of the vector function is described by the
2361 tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2362 token. The standard name of the vector function is
2363 ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2364 the optional token ``(<vector_redirection>)`` informs the compiler
2365 that a custom name is provided in addition to the standard one
2366 (custom names can be provided for example via the use of ``declare
2367 variant`` in OpenMP 5.0). The declaration of the variant must be
2368 present in the IR Module. The signature of the vector variant is
2369 determined by the rules of the Vector Function ABI (VFABI)
2370 specifications of the target. For Arm and X86, the VFABI can be
2371 found at https://github.com/ARM-software/abi-aa and
2372 https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2375 For X86 and Arm targets, the values of the tokens in the standard
2376 name are those that are defined in the VFABI. LLVM has an internal
2377 ``<isa>`` token that can be used to create scalar-to-vector
2378 mappings for functions that are not directly associated to any of
2379 the target ISAs (for example, some of the mappings stored in the
TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2382 <isa>:= b | c | d | e -> X86 SSE, AVX, AVX2, AVX512
2383 | n | s -> Armv8 Advanced SIMD, SVE
2384 | __LLVM__ -> Internal LLVM Vector ISA
2386 For all targets currently supported (x86, Arm and Internal LLVM),
the remaining tokens can have the following values::
2389 <mask>:= M | N -> mask | no mask
2391 <vlen>:= number -> number of lanes
2392 | x -> VLA (Vector Length Agnostic)
2394 <parameters>:= v -> vector
2395 | l | l <number> -> linear
2396 | R | R <number> -> linear with ref modifier
2397 | L | L <number> -> linear with val modifier
2398 | U | U <number> -> linear with uval modifier
2399 | ls <pos> -> runtime linear
2400 | Rs <pos> -> runtime linear with ref modifier
2401 | Ls <pos> -> runtime linear with val modifier
2402 | Us <pos> -> runtime linear with uval modifier
2405 <scalar_name>:= name of the scalar function
2407 <vector_redirection>:= optional, custom name of the vector function
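To make the mangling scheme concrete, the sketch below attaches one variant to
a scalar call. It assumes an x86 SSE target; the function name ``@scalar_user``
is hypothetical. Following the grammar above, ``_ZGVbN2v_sin`` reads as
``b`` (SSE), ``N`` (no mask), ``2`` (two lanes), ``v`` (one vector parameter),
and the scalar name ``sin``.

.. code-block:: llvm

    declare double @sin(double)

    ; The declaration of the vector variant must be present in the module.
    declare <2 x double> @_ZGVbN2v_sin(<2 x double>)

    define double @scalar_user(double %x) {
    entry:
      %r = call double @sin(double %x) #0
      ret double %r
    }

    ; A single mangled name here; several names would be comma separated.
    attributes #0 = { "vector-function-abi-variant"="_ZGVbN2v_sin" }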
2409 ``preallocated(<ty>)``
2410 This attribute is required on calls to ``llvm.call.preallocated.arg``
2411 and cannot be used on any other call. See
2412 :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2420 Attributes may be set to communicate additional information about a global variable.
2421 Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2422 are grouped into a single :ref:`attribute group <attrgrp>`.
2424 ``no_sanitize_address``
2425 This attribute indicates that the global variable should not have
2426 AddressSanitizer instrumentation applied to it, because it was annotated
with ``__attribute__((no_sanitize("address")))``,
``__attribute__((disable_sanitizer_instrumentation))``, or included in the
``-fsanitize-ignorelist`` file.
2430 ``no_sanitize_hwaddress``
2431 This attribute indicates that the global variable should not have
2432 HWAddressSanitizer instrumentation applied to it, because it was annotated
with ``__attribute__((no_sanitize("hwaddress")))``,
``__attribute__((disable_sanitizer_instrumentation))``, or included in the
``-fsanitize-ignorelist`` file.
2437 This attribute indicates that the global variable should have AArch64 memory
2438 tags (MTE) instrumentation applied to it. This attribute causes the
2439 suppression of certain optimisations, like GlobalMerge, as well as ensuring
2440 extra directives are emitted in the assembly and extra bits of metadata are
2441 placed in the object file so that the linker can ensure the accesses are
2442 protected by MTE. This attribute is added by clang when
``-fsanitize=memtag-globals`` is provided, as long as the global is not marked
with ``__attribute__((no_sanitize("memtag")))``,
``__attribute__((disable_sanitizer_instrumentation))``, or included in the
``-fsanitize-ignorelist`` file. The AArch64 Globals Tagging pass may remove
2447 this attribute when it's not possible to tag the global (e.g. it's a TLS
2449 ``sanitize_address_dyninit``
2450 This attribute indicates that the global variable, when instrumented with
2451 AddressSanitizer, should be checked for ODR violations. This attribute is
2452 applied to global variables that are dynamically initialized according to
2460 Operand bundles are tagged sets of SSA values that can be associated
2461 with certain LLVM instructions (currently only ``call`` s and
2462 ``invoke`` s). In a way they are like metadata, but dropping them is
2463 incorrect and will change program semantics.
2467 operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2468 operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2469 bundle operand ::= SSA value
2470 tag ::= string constant
2472 Operand bundles are **not** part of a function's signature, and a
2473 given function may be called from multiple places with different kinds
2474 of operand bundles. This reflects the fact that the operand bundles
2475 are conceptually a part of the ``call`` (or ``invoke``), not the
2476 callee being dispatched to.
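A short sketch may help here: the same callee is called three times, each time
with a different set of bundles. The tags ``"my-tag"`` and ``"other-tag"`` and
the function names are hypothetical and carry no meaning known to LLVM.

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(ptr %p, i32 %x) {
    entry:
      ; No operand bundles.
      call void @callee(i32 %x)
      ; One bundle with two operands.
      call void @callee(i32 %x) [ "my-tag"(ptr %p, i32 7) ]
      ; Two bundles, the second with no operands.
      call void @callee(i32 %x) [ "my-tag"(ptr %p), "other-tag"() ]
      ret void
    }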
2478 Operand bundles are a generic mechanism intended to support
2479 runtime-introspection-like functionality for managed languages. While
2480 the exact semantics of an operand bundle depend on the bundle tag,
2481 there are certain limitations to how much the presence of an operand
2482 bundle can influence the semantics of a program. These restrictions
2483 are described as the semantics of an "unknown" operand bundle. As
2484 long as the behavior of an operand bundle is describable within these
2485 restrictions, LLVM does not need to have special knowledge of the
2486 operand bundle to not miscompile programs containing it.
2488 - The bundle operands for an unknown operand bundle escape in unknown
2489 ways before control is transferred to the callee or invokee.
2490 - Calls and invokes with operand bundles have unknown read / write
2491 effect on the heap on entry and exit (even if the call target specifies
2492 a ``memory`` attribute), unless they're overridden with
2493 callsite specific attributes.
2494 - An operand bundle at a call site cannot change the implementation
2495 of the called function. Inter-procedural optimizations work as
2496 usual as long as they take into account the first two properties.
2498 More specific types of operand bundles are described below.
2500 .. _deopt_opbundles:
2502 Deoptimization Operand Bundles
2503 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2505 Deoptimization operand bundles are characterized by the ``"deopt"``
2506 operand bundle tag. These operand bundles represent an alternate
2507 "safe" continuation for the call site they're attached to, and can be
2508 used by a suitable runtime to deoptimize the compiled frame at the
2509 specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. Exact details of deoptimization are
out of scope for the language reference, but it usually involves
2512 rewriting a compiled frame into a set of interpreted frames.
2514 From the compiler's perspective, deoptimization operand bundles make
2515 the call sites they're attached to at least ``readonly``. They read
2516 through all of their pointer typed operands (even if they're not
2517 otherwise escaped) and the entire visible heap. Deoptimization
2518 operand bundles do not capture their operands except during
2519 deoptimization, in which case control will not be returned to the
2522 The inliner knows how to inline through calls that have deoptimization
2523 operand bundles. Just like inlining through a normal call site
2524 involves composing the normal and exceptional continuations, inlining
2525 through a call site with a deoptimization operand bundle needs to
2526 appropriately compose the "safe" deoptimization continuation. The
2527 inliner does this by prepending the parent's deoptimization
2528 continuation to every deoptimization continuation in the inlined body.
2529 E.g. inlining ``@f`` into ``@g`` in the following example
2531 .. code-block:: llvm
2534 call void @x() ;; no deopt state
2535 call void @y() [ "deopt"(i32 10) ]
2536 call void @y() [ "deopt"(i32 10), "unknown"(ptr null) ]
2541 call void @f() [ "deopt"(i32 20) ]
2547 .. code-block:: llvm
2550 call void @x() ;; still no deopt state
2551 call void @y() [ "deopt"(i32 20, i32 10) ]
2552 call void @y() [ "deopt"(i32 20, i32 10), "unknown"(ptr null) ]
2556 It is the frontend's responsibility to structure or encode the
2557 deoptimization state in a way that syntactically prepending the
2558 caller's deoptimization state to the callee's deoptimization state is
2559 semantically equivalent to composing the caller's deoptimization
2560 continuation after the callee's deoptimization continuation.
2564 Funclet Operand Bundles
2565 ^^^^^^^^^^^^^^^^^^^^^^^
2567 Funclet operand bundles are characterized by the ``"funclet"``
2568 operand bundle tag. These operand bundles indicate that a call site
2569 is within a particular funclet. There can be at most one
2570 ``"funclet"`` operand bundle attached to a call site and it must have
2571 exactly one bundle operand.
2573 If any funclet EH pads have been "entered" but not "exited" (per the
2574 `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2575 it is undefined behavior to execute a ``call`` or ``invoke`` which:
2577 * does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2579 * has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2580 not-yet-exited funclet EH pad.
2582 Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2583 executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
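The rules above can be sketched with a single cleanup funclet. The example
assumes an MSVC-style personality routine; ``@may_throw`` and ``@cleanup`` are
hypothetical functions.

.. code-block:: llvm

    declare void @may_throw()
    declare void @cleanup()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      ; No funclet has been entered yet, so no "funclet" bundle here.
      invoke void @may_throw()
              to label %done unwind label %pad

    pad:
      %cp = cleanuppad within none []
      ; Inside the funclet, the call names the most recently
      ; entered, not yet exited funclet pad.
      call void @cleanup() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller

    done:
      ret void
    }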
2585 GC Transition Operand Bundles
2586 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2588 GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition from a function with one GC strategy to a
function with a different GC strategy. If coordinating the transition
2592 between GC strategies requires additional code generation at the call
2593 site, these bundles may contain any values that are needed by the
2594 generated code. For more details, see :ref:`GC Transitions
2595 <gc_transition_args>`.
The bundle contains an arbitrary list of Values which need to be passed
2598 to GC transition code. They will be lowered and passed as operands to
2599 the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2600 that these arguments must be available before and after (but not
2601 necessarily during) the execution of the callee.
2603 .. _assume_opbundles:
2605 Assume Operand Bundles
2606 ^^^^^^^^^^^^^^^^^^^^^^
Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2609 assumptions, such as that a :ref:`parameter attribute <paramattrs>` or a
2610 :ref:`function attribute <fnattrs>` holds for a certain value at a certain
2611 location. Operand bundles enable assumptions that are either hard or impossible
2612 to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2614 An assume operand bundle has the form:
      "<tag>"([ <arguments> ])
2620 In the case of function or parameter attributes, the operand bundle has the
2625 "<tag>"([ <holds for value> [, <attribute argument>] ])
* The tag of the operand bundle is usually the name of the attribute that can be
  assumed to hold. It can also be ``ignore``; this tag doesn't contain any
  information and should be ignored.
* The first argument, if present, is the value for which the attribute holds.
* The second argument, if present, is an argument of the attribute.
2633 If there are no arguments the attribute is a property of the call location.
2637 .. code-block:: llvm
2639 call void @llvm.assume(i1 true) ["align"(ptr %val, i32 8)]
allows the optimizer to assume that at the location of the call to
2642 :ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.
2644 .. code-block:: llvm
2646 call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(ptr %val)]
2648 allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
2649 call location is cold and that ``%val`` may not be null.
2651 Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2652 provided guarantees are violated at runtime the behavior is undefined.
2654 While attributes expect constant arguments, assume operand bundles may be
2655 provided a dynamic value, for example:
2657 .. code-block:: llvm
2659 call void @llvm.assume(i1 true) ["align"(ptr %val, i32 %align)]
2661 If the operand bundle value violates any requirements on the attribute value,
2662 the behavior is undefined, unless one of the following exceptions applies:
2664 * ``"align"`` operand bundles may specify a non-power-of-two alignment
2665 (including a zero alignment). If this is the case, then the pointer value
2666 must be a null pointer, otherwise the behavior is undefined.
2668 In addition to allowing operand bundles encoding function and parameter
attributes, an assume operand bundle may also encode a ``separate_storage``
2670 operand bundle. This has the form:
2672 .. code-block:: llvm
    separate_storage(<val1>, <val2>)
2676 This indicates that no pointer :ref:`based <pointeraliasing>` on one of its
2677 arguments can alias any pointer based on the other.
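As a sketch of how this might be used (the function name ``@no_overlap`` is
hypothetical), the assumption below states that nothing based on ``%a`` aliases
anything based on ``%b``, so the store through ``%b`` cannot change the value
later loaded through ``%a``:

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @no_overlap(ptr %a, ptr %b) {
    entry:
      call void @llvm.assume(i1 true) [ "separate_storage"(ptr %a, ptr %b) ]
      store i32 1, ptr %a, align 4
      store i32 2, ptr %b, align 4
      ; Under the assumption, this load must observe the value 1.
      %v = load i32, ptr %a, align 4
      ret i32 %v
    }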
2679 Even if the assumed property can be encoded as a boolean value, like
2680 ``nonnull``, using operand bundles to express the property can still have
2683 * Attributes that can be expressed via operand bundles are directly the
2684 property that the optimizer uses and cares about. Encoding attributes as
2685 operand bundles removes the need for an instruction sequence that represents
the property (e.g., ``icmp ne ptr %p, null`` for ``nonnull``) and for the
2687 optimizer to deduce the property from that instruction sequence.
2688 * Expressing the property using operand bundles makes it easy to identify the
2689 use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
simplifies and improves heuristics, e.g., for "use-sensitive"
2693 .. _ob_preallocated:
2695 Preallocated Operand Bundles
2696 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2698 Preallocated operand bundles are characterized by the ``"preallocated"``
2699 operand bundle tag. These operand bundles allow separation of the allocation
2700 of the call argument memory from the call site. This is necessary to pass
2701 non-trivially copyable objects by value in a way that is compatible with MSVC
2702 on some targets. There can be at most one ``"preallocated"`` operand bundle
2703 attached to a call site and it must have exactly one bundle operand, which is
2704 a token generated by ``@llvm.call.preallocated.setup``. A call with this
2705 operand bundle should not adjust the stack before entering the function, as
2706 that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2708 .. code-block:: llvm
2710 %foo = type { i64, i32 }
2714 %t = call token @llvm.call.preallocated.setup(i32 1)
2715 %a = call ptr @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
2717 call void @bar(i32 42, ptr preallocated(%foo) %a) ["preallocated"(token %t)]
2721 GC Live Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^
2724 A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2725 intrinsic. The operand bundle must contain every pointer to a garbage collected
2726 object which potentially needs to be updated by the garbage collector.
2728 When lowered, any relocated value will be recorded in the corresponding
2729 :ref:`stackmap entry <statepoint-stackmap-format>`. See the intrinsic description
2730 for further details.
2732 ObjC ARC Attached Call Operand Bundles
2733 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2735 A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2736 implicitly followed by a marker instruction and a call to an ObjC runtime
2737 function that uses the result of the call. The operand bundle takes a mandatory
2738 pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
2739 ``@objc_unsafeClaimAutoreleasedReturnValue``).
2740 The return value of a call with this bundle is used by a call to
2741 ``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
2742 void, in which case the operand bundle is ignored.
2744 .. code-block:: llvm
2746 ; The marker instruction and a runtime function call are inserted after the call
2748 call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_retainAutoreleasedReturnValue) ]
2749 call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_unsafeClaimAutoreleasedReturnValue) ]
2751 The operand bundle is needed to ensure the call is immediately followed by the
2752 marker instruction and the ObjC runtime call in the final output.
2756 Pointer Authentication Operand Bundles
2757 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2759 Pointer Authentication operand bundles are characterized by the
2760 ``"ptrauth"`` operand bundle tag. They are described in the
2761 `Pointer Authentication <PointerAuth.html#operand-bundle>`__ document.
2765 KCFI Operand Bundles
2766 ^^^^^^^^^^^^^^^^^^^^
2768 A ``"kcfi"`` operand bundle on an indirect call indicates that the call will
2769 be preceded by a runtime type check, which validates that the call target is
2770 prefixed with a :ref:`type identifier<md_kcfi_type>` that matches the operand
2771 bundle attribute. For example:
2773 .. code-block:: llvm
2775 call void %0() ["kcfi"(i32 1234)]
2777 Clang emits KCFI operand bundles and the necessary metadata with
2778 ``-fsanitize=kcfi``.
2782 Module-Level Inline Assembly
2783 ----------------------------
Modules may contain "module-level inline asm" blocks, which correspond
2786 to the GCC "file scope inline asm" blocks. These blocks are internally
2787 concatenated by LLVM and treated as a single unit, but may be separated
2788 in the ``.ll`` file if desired. The syntax is very simple:
2790 .. code-block:: llvm
2792 module asm "inline asm code goes here"
2793 module asm "more can go here"
2795 The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two-digit hex code for the character.
2799 Note that the assembly string *must* be parseable by LLVM's integrated assembler
2800 (unless it is disabled), even when emitting a ``.s`` file.
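As a small illustration of the escape rule, the sketch below embeds double
quotes (``\22``) and a newline (``\0A``) in module-level assembly. The section
name ``.note.example`` and the string contents are hypothetical, and the
directives assume a target whose integrated assembler accepts GNU-style ELF
syntax.

.. code-block:: llvm

    ; Emits: .pushsection .note.example,"a"
    module asm ".pushsection .note.example,\22a\22"
    ; Emits two lines: .asciz "hello" followed by .popsection
    module asm ".asciz \22hello\22\0A.popsection"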
2802 .. _langref_datalayout:
2807 A module may specify a target specific data layout string that specifies
2808 how data is to be laid out in memory. The syntax for the data layout is
2811 .. code-block:: llvm
2813 target datalayout = "layout specification"
2815 The *layout specification* consists of a list of specifications
2816 separated by the minus sign character ('-'). Each specification starts
2817 with a letter and may include other information after the letter to
2818 define some aspect of the data layout. The specifications accepted are
2822 Specifies that the target lays out data in big-endian form. That is,
2823 the bits with the most significance have the lowest address
2826 Specifies that the target lays out data in little-endian form. That
2827 is, the bits with the least significance have the lowest address
2830 Specifies the natural alignment of the stack in bits. Alignment
2831 promotion of stack variables is limited to the natural stack
2832 alignment to avoid dynamic stack realignment. The stack alignment
must be a multiple of 8 bits. If omitted, the natural stack
2834 alignment defaults to "unspecified", which does not prevent any
2835 alignment promotions.
2836 ``P<address space>``
2837 Specifies the address space that corresponds to program memory.
2838 Harvard architectures can use this to specify what space LLVM
2839 should place things such as functions into. If omitted, the
2840 program memory space defaults to the default address space of 0,
2841 which corresponds to a Von Neumann architecture that has code
2842 and data in the same space.
2843 ``G<address space>``
2844 Specifies the address space to be used by default when creating global
2845 variables. If omitted, the globals address space defaults to the default
2847 Note: variable declarations without an address space are always created in
2848 address space 0, this property only affects the default value to be used
2849 when creating globals without additional contextual information (e.g. in
2851 ``A<address space>``
2852 Specifies the address space of objects created by '``alloca``'.
2853 Defaults to the default address space of 0.
2854 ``p[n]:<size>:<abi>[:<pref>][:<idx>]``
2855 This specifies the *size* of a pointer and its ``<abi>`` and
2856 ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
2857 and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
index that is used for address calculation. If not
2859 specified, the default index size is equal to the pointer size. All sizes
2860 are in bits. The address space, ``n``, is optional, and if not specified,
2861 denotes the default address space 0. The value of ``n`` must be
2862 in the range [1,2^24).
2863 ``i<size>:<abi>[:<pref>]``
2864 This specifies the alignment for an integer type of a given bit
2865 ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
2866 ``<pref>`` is optional and defaults to ``<abi>``.
2867 For ``i8``, the ``<abi>`` value must equal 8,
2868 that is, ``i8`` must be naturally aligned.
2869 ``v<size>:<abi>[:<pref>]``
2870 This specifies the alignment for a vector type of a given bit
2871 ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
2872 ``<pref>`` is optional and defaults to ``<abi>``.
2873 ``f<size>:<abi>[:<pref>]``
2874 This specifies the alignment for a floating-point type of a given bit
2875 ``<size>``. Only values of ``<size>`` that are supported by the target
2876 will work. 32 (float) and 64 (double) are supported on all targets; 80
2877 or 128 (different flavors of long double) are also supported on some
2878 targets. The value of ``<size>`` must be in the range [1,2^24).
2879 ``<pref>`` is optional and defaults to ``<abi>``.
2880 ``a:<abi>[:<pref>]``
2881 This specifies the alignment for an object of aggregate type.
2882 ``<pref>`` is optional and defaults to ``<abi>``.
2884 This specifies the alignment for function pointers.
2885 The options for ``<type>`` are:
2887 * ``i``: The alignment of function pointers is independent of the alignment
2888 of functions, and is a multiple of ``<abi>``.
2889 * ``n``: The alignment of function pointers is a multiple of the explicit
2890 alignment specified on the function, and is a multiple of ``<abi>``.
If present, specifies that LLVM names are mangled in the output. Symbols
2893 prefixed with the mangling escape character ``\01`` are passed through
2894 directly to the assembler without the escape character. The mangling style
2897 * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
2898 * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
2899 * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
* ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
2901 symbols get a ``_`` prefix.
2902 * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
2903 Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
2904 ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
2905 ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
2906 starting with ``?`` are not mangled in any way.
2907 * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
2908 symbols do not receive a ``_`` prefix.
* ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix.
2910 ``n<size1>:<size2>:<size3>...``
2911 This specifies a set of native integer widths for the target CPU in
2912 bits. For example, it might contain ``n32`` for 32-bit PowerPC,
2913 ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
2914 this set are considered to support most general arithmetic operations
2916 ``ni:<address space0>:<address space1>:<address space2>...``
2917 This specifies pointer types with the specified address spaces
2918 as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
2919 address space cannot be specified as non-integral.
2921 On every specification that takes a ``<abi>:<pref>``, specifying the
2922 ``<pref>`` alignment is optional. If omitted, the preceding ``:``
2923 should be omitted too and ``<pref>`` will be equal to ``<abi>``.
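To make the individual specifications concrete, the following sketch decodes
one complete layout string. The string resembles what Clang has emitted for
x86-64 Linux, but the exact contents vary between LLVM versions, so treat it
as an illustration rather than as the definitive layout for any target.

.. code-block:: llvm

    ; e             little-endian
    ; m:e           ELF name mangling
    ; p270:32:32    pointers in address space 270 are 32 bits wide, 32-bit aligned
    ; p271:32:32    likewise for address space 271
    ; p272:64:64    pointers in address space 272 are 64 bits wide, 64-bit aligned
    ; i64:64        i64 has 64-bit ABI alignment (preferred defaults to the same)
    ; f80:128       the 80-bit floating-point type is 128-bit aligned
    ; n8:16:32:64   native integer widths
    ; S128          natural stack alignment of 128 bits
    target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"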
2925 When constructing the data layout for a given target, LLVM starts with a
2926 default set of specifications which are then (possibly) overridden by
2927 the specifications in the ``datalayout`` keyword. The default
2928 specifications are given in this list:
2930 - ``e`` - little endian
2931 - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
2932 - ``p[n]:64:64:64`` - Other address spaces are assumed to be the
2933 same as the default address space.
2934 - ``S0`` - natural stack alignment is unspecified
2935 - ``i1:8:8`` - i1 is 8-bit (byte) aligned
2936 - ``i8:8:8`` - i8 is 8-bit (byte) aligned as mandated
2937 - ``i16:16:16`` - i16 is 16-bit aligned
2938 - ``i32:32:32`` - i32 is 32-bit aligned
- ``i64:32:64`` - i64 has ABI alignment of 32 bits but preferred
2940 alignment of 64-bits
2941 - ``f16:16:16`` - half is 16-bit aligned
2942 - ``f32:32:32`` - float is 32-bit aligned
2943 - ``f64:64:64`` - double is 64-bit aligned
2944 - ``f128:128:128`` - quad is 128-bit aligned
2945 - ``v64:64:64`` - 64-bit vector is 64-bit aligned
2946 - ``v128:128:128`` - 128-bit vector is 128-bit aligned
2947 - ``a:0:64`` - aggregates are 64-bit aligned
2949 When LLVM is determining the alignment for a given type, it uses the
2952 #. If the type sought is an exact match for one of the specifications,
2953 that specification is used.
2954 #. If no match is found, and the type sought is an integer type, then
2955 the smallest integer type that is larger than the bitwidth of the
2956 sought type is used. If none of the specifications are larger than
2957 the bitwidth then the largest integer type is used. For example,
2958 given the default specifications above, the i7 type will use the
2959 alignment of i8 (next largest) while both i65 and i256 will use the
2960 alignment of i64 (largest specified).
2962 The function of the data layout string may not be what you expect.
2963 Notably, this is not a specification from the frontend of what alignment
2964 the code generator should use.
2966 Instead, if specified, the target data layout is required to match what
2967 the ultimate *code generator* expects. This string is used by the
2968 mid-level optimizers to improve code, and this only works if it matches
2969 what the ultimate code generator uses. There is no way to generate IR
2970 that does not embed this target-specific detail into the IR. If you
2971 don't specify the string, the default specifications will be used to
2972 generate a Data Layout and the optimization phases will operate
2973 accordingly and introduce target specificity into the IR with respect to
2974 these default specifications.
2981 A module may specify a target triple string that describes the target
2982 host. The syntax for the target triple is simply:
2984 .. code-block:: llvm
2986 target triple = "x86_64-apple-macosx10.7.0"
2988 The *target triple* string consists of a series of identifiers delimited
2989 by the minus sign character ('-'). The canonical forms are:
2993 ARCHITECTURE-VENDOR-OPERATING_SYSTEM
2994 ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
2996 This information is passed along to the backend so that it generates
2997 code for the proper architecture. It's possible to override this on the
2998 command line with the ``-mtriple`` command line option.
3003 ----------------------
3005 A memory object, or simply object, is a region of a memory space that is
3006 reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
3007 allocation calls, and global variable definitions.
3008 Once it is allocated, the bytes stored in the region can only be read or written
3009 through a pointer that is :ref:`based on <pointeraliasing>` the allocation
3011 If a pointer that is not based on the object tries to read or write to the
3012 object, it is undefined behavior.
The lifetime of a memory object is a property that decides its accessibility.
Unless stated otherwise, a memory object is alive from its allocation until
its deallocation, and dead afterwards.
3017 It is undefined behavior to access a memory object that isn't alive, but
3018 operations that don't dereference it such as
3019 :ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
3020 :ref:`icmp <i_icmp>` return a valid result.
3021 This explains code motion of these instructions across operations that
3022 impact the object's lifetime.
3023 A stack object's lifetime can be explicitly specified using
3024 :ref:`llvm.lifetime.start <int_lifestart>` and
3025 :ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
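The following sketch shows a stack object whose lifetime is narrowed with the
lifetime intrinsics. It assumes the two-argument form of the intrinsics (size
in bytes, then pointer); their exact signature has changed across LLVM
releases. The function name ``@scoped`` is hypothetical.

.. code-block:: llvm

    declare void @llvm.lifetime.start.p0(i64, ptr)
    declare void @llvm.lifetime.end.p0(i64, ptr)

    define i32 @scoped() {
    entry:
      %slot = alloca i32
      call void @llvm.lifetime.start.p0(i64 4, ptr %slot)
      store i32 7, ptr %slot          ; the object is alive: access is fine
      %v = load i32, ptr %slot
      call void @llvm.lifetime.end.p0(i64 4, ptr %slot)
      ; The object is no longer alive here, so it must not be accessed.
      ; Address computations that do not dereference it remain valid.
      %p = getelementptr i8, ptr %slot, i64 1
      ret i32 %v
    }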
3027 .. _pointeraliasing:
3029 Pointer Aliasing Rules
3030 ----------------------
3032 Any memory access must be done through a pointer value associated with
3033 an address range of the memory access, otherwise the behavior is
3034 undefined. Pointer values are associated with address ranges according
3035 to the following rules:
3037 - A pointer value is associated with the addresses associated with any
3038 value it is *based* on.
3039 - An address of a global variable is associated with the address range
3040 of the variable's storage.
3041 - The result value of an allocation instruction is associated with the
3042 address range of the allocated storage.
3043 - A null pointer in the default address-space is associated with no
3045 - An :ref:`undef value <undefvalues>` in *any* address-space is
3046 associated with no address.
3047 - An integer constant other than zero or a pointer value returned from
3048 a function not defined within LLVM may be associated with address
3049 ranges allocated through mechanisms other than those provided by
3050 LLVM. Such ranges shall not overlap with any ranges of addresses
3051 allocated by mechanisms provided by LLVM.
3053 A pointer value is *based* on another pointer value according to the
3056 - A pointer value formed from a scalar ``getelementptr`` operation is *based* on
3057 the pointer-typed operand of the ``getelementptr``.
3058 - The pointer in lane *l* of the result of a vector ``getelementptr`` operation
3059 is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
3060 of the ``getelementptr``.
3061 - The result value of a ``bitcast`` is *based* on the operand of the
3063 - A pointer value formed by an ``inttoptr`` is *based* on all pointer
3064 values that contribute (directly or indirectly) to the computation of
3065 the pointer's value.
3066 - The "*based* on" relationship is transitive.
3068 Note that this definition of *"based"* is intentionally similar to the
3069 definition of *"based"* in C99, though it is slightly weaker.
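The *based on* rules can be traced through a short function. This is only a
sketch; the function and value names are hypothetical.

.. code-block:: llvm

    define i32 @based(ptr %p, i64 %i) {
    entry:
      %q = getelementptr i32, ptr %p, i64 %i   ; %q is based on %p
      %n = ptrtoint ptr %q to i64
      %m = add i64 %n, 4
      %r = inttoptr i64 %m to ptr              ; %r is based on %q, which contributed to %m
      %v = load i32, ptr %r                    ; by transitivity, %r is also based on %p
      ret i32 %v
    }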
3071 LLVM IR does not associate types with memory. The result type of a
3072 ``load`` merely indicates the size and alignment of the memory from
3073 which to load, as well as the interpretation of the value. The first
3074 operand type of a ``store`` similarly only indicates the size and
3075 alignment of the store.
3077 Consequently, type-based alias analysis, aka TBAA, aka
3078 ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
3079 :ref:`Metadata <metadata>` may be used to encode additional information
3080 which specialized optimization passes may use to implement type-based
3088 Given a function call and a pointer that is passed as an argument or stored in
3089 the memory before the call, a pointer is *captured* by the call if it makes a
3090 copy of any part of the pointer that outlives the call.
3091 To be precise, a pointer is captured if one or more of the following conditions
3094 1. The call stores any bit of the pointer carrying information into a place,
3095 and the stored bits can be read from the place by the caller after this call
3098 .. code-block:: llvm
3100 @glb = global ptr null
3101 @glb2 = global ptr null
3102 @glb3 = global ptr null
3103 @glbi = global i32 0
3105 define ptr @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) {
3106 store ptr %a, ptr @glb ; %a is captured by this call
3108 store ptr %b, ptr @glb2 ; %b isn't captured because the stored value is overwritten by the store below
3109 store ptr null, ptr @glb2
3111 store ptr %c, ptr @glb3
3112 call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
3113 store ptr null, ptr @glb3
3115 %i = ptrtoint ptr %d to i64
3116 %j = trunc i64 %i to i32
3117 store i32 %j, ptr @glbi ; %d is captured
3119 ret ptr %e ; %e is captured
3122 2. The call stores any bit of the pointer carrying information into a place,
3123 and the stored bits can be safely read from the place by another thread via
3126 .. code-block:: llvm
3128 @lock = global i1 true
3130 define void @f(ptr %a) {
store ptr %a, ptr @glb
store atomic i1 false, ptr @lock release, align 1 ; %a is captured because another thread can safely read @glb
3133 store ptr null, ptr @glb
3137 3. The call's behavior depends on any bit of the pointer carrying information.
3139 .. code-block:: llvm
3143 define void @f(ptr %a) {
3144 %c = icmp eq ptr %a, @glb
3145 br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; escapes %a
3153 4. The pointer is used in a volatile access as its address.
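A minimal sketch of the fourth condition, using a hypothetical function
``@f``:

.. code-block:: llvm

    define void @f(ptr %a) {
      %v = load volatile i32, ptr %a   ; %a is captured: it is the address of a volatile access
      ret void
    }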
3158 Volatile Memory Accesses
3159 ------------------------
3161 Certain memory accesses, such as :ref:`load <i_load>`'s,
3162 :ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
3163 marked ``volatile``. The optimizers must not change the number of
3164 volatile operations or change their order of execution relative to other
3165 volatile operations. The optimizers *may* change the order of volatile
3166 operations relative to non-volatile operations. This is not Java's
3167 "volatile" and has no cross-thread synchronization behavior.
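As a sketch of these ordering rules, consider the hypothetical function below,
which reads a status location twice with an ordinary load in between.

.. code-block:: llvm

    define i32 @poll(ptr %status, ptr %data) {
    entry:
      %s1 = load volatile i32, ptr %status   ; must be performed, and before %s2
      %d = load i32, ptr %data               ; non-volatile: may move across the volatile loads
      %s2 = load volatile i32, ptr %status   ; must not be merged with %s1
      %sum = add i32 %s1, %s2
      %r = add i32 %sum, %d
      ret i32 %r
    }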
3169 A volatile load or store may have additional target-specific semantics.
3170 Any volatile operation can have side effects, and any volatile operation
3171 can read and/or modify state which is not accessible via a regular load
3172 or store in this module. Volatile operations may use addresses which do
3173 not point to memory (like MMIO registers). This means the compiler may
3174 not use a volatile operation to prove a non-volatile access to that
3175 address has defined behavior.
3177 The allowed side-effects for volatile accesses are limited. If a
3178 non-volatile store to a given address would be legal, a volatile
3179 operation may modify the memory at that address. A volatile operation
3180 may not modify any other memory accessible by the module being compiled.
3181 A volatile operation may not call any code in the current module.
3183 In general (without target specific context), the address space of a
3184 volatile operation may not be changed. Different address spaces may
3185 have different trapping behavior when dereferencing an invalid
3188 The compiler may assume execution will continue after a volatile operation,
3189 so operations which modify memory or may have undefined behavior can be
3190 hoisted past a volatile operation.
3192 As an exception to the preceding rule, the compiler may not assume execution
3193 will continue after a volatile store operation. This restriction is necessary
3194 to support the somewhat common pattern in C of intentionally storing to an
3195 invalid pointer to crash the program. In the future, it might make sense to
3196 allow frontends to control this behavior.
3198 IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
3199 or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
3200 Likewise, the backend should never split or merge target-legal volatile
3201 load/store instructions. Similarly, IR-level volatile loads and stores cannot
3202 change from integer to floating-point or vice versa.
3204 .. admonition:: Rationale
3206 Platforms may rely on volatile loads and stores of natively supported
data width to be executed as a single instruction. For example, in C
3208 this holds for an l-value of volatile primitive type with native
3209 hardware support, but not necessarily for aggregate types. The
3210 frontend upholds these expectations, which are intentionally
3211 unspecified in the IR. The rules above ensure that IR transformations
3212 do not violate the frontend's contract with the language.
3216 Memory Model for Concurrent Operations
3217 --------------------------------------
3219 The LLVM IR does not define any way to start parallel threads of
3220 execution or to register signal handlers. Nonetheless, there are
3221 platform-specific ways to create them, and we define LLVM IR's behavior
3222 in their presence. This model is inspired by the C++0x memory model.
3224 For a more informal introduction to this model, see the :doc:`Atomics`.
3226 We define a *happens-before* partial order as the least partial order
3229 - Is a superset of single-thread program order, and
3230 - When a *synchronizes-with* ``b``, includes an edge from ``a`` to
3231 ``b``. *Synchronizes-with* pairs are introduced by platform-specific
3232 techniques, like pthread locks, thread creation, thread joining,
3233 etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
3234 Constraints <ordering>`).
3236 Note that program order does not introduce *happens-before* edges
3237 between a thread and signals executing inside that thread.
3239 Every (defined) read operation (load instructions, memcpy, atomic
3240 loads/read-modify-writes, etc.) R reads a series of bytes written by
3241 (defined) write operations (store instructions, atomic
3242 stores/read-modify-writes, memcpy, etc.). For the purposes of this
3243 section, initialized globals are considered to have a write of the
3244 initializer which is atomic and happens before any other read or write
3245 of the memory in question. For each byte of a read R, R\ :sub:`byte`
3246 may see any write to the same byte, except:
3248 - If write\ :sub:`1` happens before write\ :sub:`2`, and
3249 write\ :sub:`2` happens before R\ :sub:`byte`, then
3250 R\ :sub:`byte` does not see write\ :sub:`1`.
3251 - If R\ :sub:`byte` happens before write\ :sub:`3`, then
3252 R\ :sub:`byte` does not see write\ :sub:`3`.
3254 Given that definition, R\ :sub:`byte` is defined as follows:
3256 - If R is volatile, the result is target-dependent. (Volatile is
3257 supposed to give guarantees which can support ``sig_atomic_t`` in
3258 C/C++, and may be used for accesses to addresses that do not behave
3259 like normal memory. It does not generally provide cross-thread
3261 - Otherwise, if there is no write to the same byte that happens before
3262 R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
3263 - Otherwise, if R\ :sub:`byte` may see exactly one write,
3264 R\ :sub:`byte` returns the value written by that write.
3265 - Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
3266 see are atomic, it chooses one of the values written. See the :ref:`Atomic
3267 Memory Ordering Constraints <ordering>` section for additional
3268 constraints on how the choice is made.
3269 - Otherwise R\ :sub:`byte` returns ``undef``.
3271 R returns the value composed of the series of bytes it read. This
3272 implies that some bytes within the value may be ``undef`` **without**
3273 the entire value being ``undef``. Note that this only defines the
3274 semantics of the operation; it doesn't mean that targets will emit more
3275 than one instruction to read the series of bytes.
3277 Note that in cases where none of the atomic intrinsics are used, this
3278 model places only one restriction on IR transformations on top of what
3279 is required for single-threaded execution: introducing a store to a byte
3280 which might not otherwise be stored is not allowed in general.
3281 (Specifically, in the case where another thread might write to and read
3282 from an address, introducing a store can change a load that may see
3283 exactly one write into a load that may see multiple writes.)
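The restriction on introducing stores can be illustrated with a pair of
hypothetical functions. The first stores only when ``%c`` is true; the second
is the kind of rewrite that is not allowed in general, because it writes to
``%p`` on every path.

.. code-block:: llvm

    define void @cond_store(ptr %p, i1 %c) {
    entry:
      br i1 %c, label %do, label %done
    do:
      store i32 1, ptr %p
      br label %done
    done:
      ret void
    }

    ; Not a valid replacement in general: the store now happens even when %c
    ; is false, so a load in another thread that could previously see exactly
    ; one write may now see several.
    define void @cond_store_bad(ptr %p, i1 %c) {
    entry:
      %old = load i32, ptr %p
      %new = select i1 %c, i32 1, i32 %old
      store i32 %new, ptr %p
      ret void
    }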
3287 Atomic Memory Ordering Constraints
3288 ----------------------------------
3290 Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
3291 :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
3292 :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
3293 ordering parameters that determine which other atomic instructions on
3294 the same address they *synchronize with*. These semantics are borrowed
3295 from Java and C++0x, but are somewhat more colloquial. If these
3296 descriptions aren't precise enough, check those specs (see spec
3297 references in the :doc:`atomics guide <Atomics>`).
3298 :ref:`fence <i_fence>` instructions treat these orderings somewhat
3299 differently since they don't take an address. See that instruction's
3300 documentation for details.
3302 For a simpler introduction to the ordering constraints, see the
3306 The set of values that can be read is governed by the happens-before
3307 partial order. A value cannot be read unless some operation wrote
3308 it. This is intended to provide a guarantee strong enough to model
3309 Java's non-volatile shared variables. This ordering cannot be
3310 specified for read-modify-write operations; it is not strong enough
3311 to make them atomic in any interesting way.
3313 In addition to the guarantees of ``unordered``, there is a single
3314 total order for modifications by ``monotonic`` operations on each
3315 address. All modification orders must be compatible with the
3316 happens-before order. There is no guarantee that the modification
3317 orders can be combined to a global total order for the whole program
3318 (and this often will not be possible). The read in an atomic
3319 read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
3320 :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
3321 order immediately before the value it writes. If one atomic read
3322 happens before another atomic read of the same address, the later
3323 read must see the same value or a later value in the address's
3324 modification order. This disallows reordering of ``monotonic`` (or
3325 stronger) operations on the same address. If an address is written
3326 ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
3327 read that address repeatedly, the other threads must eventually see
3328 the write. This corresponds to the C++0x/C1x
3329 ``memory_order_relaxed``.
3331 In addition to the guarantees of ``monotonic``, a
3332 *synchronizes-with* edge may be formed with a ``release`` operation.
3333 This is intended to model C++'s ``memory_order_acquire``.
3335 In addition to the guarantees of ``monotonic``, if this operation
3336 writes a value which is subsequently read by an ``acquire``
3337 operation, it *synchronizes-with* that operation. (This isn't a
3338 complete description; see the C++0x definition of a release
3339 sequence.) This corresponds to the C++0x/C1x
3340 ``memory_order_release``.
3341 ``acq_rel`` (acquire+release)
3342 Acts as both an ``acquire`` and ``release`` operation on its
3343 address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
3344 ``seq_cst`` (sequentially consistent)
3345 In addition to the guarantees of ``acq_rel`` (``acquire`` for an
3346 operation that only reads, ``release`` for an operation that only
3347 writes), there is a global total order on all
3348 sequentially-consistent operations on all addresses, which is
3349 consistent with the *happens-before* partial order and with the
3350 modification orders of all the affected addresses. Each
3351 sequentially-consistent read sees the last preceding write to the
3352 same address in this global order. This corresponds to the C++0x/C1x
3353 ``memory_order_seq_cst`` and Java volatile.
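The pairing of ``release`` and ``acquire`` can be sketched as a simple
message-passing pattern. The globals and function names are hypothetical, and
the two functions are assumed to run in different threads created by some
platform-specific mechanism.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
    entry:
      store i32 42, ptr @data                          ; plain store
      store atomic i32 1, ptr @flag release, align 4   ; publishes the store above
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      %f = load atomic i32, ptr @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %read, label %spin
    read:
      ; The acquire load read the value written by the release store, so the
      ; store to @data happens before this load.
      %v = load i32, ptr @data
      ret i32 %v
    }

Weakening either ordering to ``monotonic`` would remove the
*synchronizes-with* edge, and the load of ``@data`` would then race with the
store in ``@producer``.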
3357 If an atomic operation is marked ``syncscope("singlethread")``, it only
3358 *synchronizes with* and only participates in the seq\_cst total orderings of
3359 other operations running in the same thread (for example, in signal handlers).
3361 If an atomic operation is marked ``syncscope("<target-scope>")``, where
3362 ``<target-scope>`` is a target specific synchronization scope, then it is target
3363 dependent if it *synchronizes with* and participates in the seq\_cst total
3364 orderings of other operations.
3366 Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3367 or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3368 seq\_cst total orderings of other operations that are not marked
3369 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
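A brief sketch of the ``singlethread`` scope, as might be used to order memory
operations against a signal handler running in the same thread. The function
name is hypothetical.

.. code-block:: llvm

    define void @signal_safe(ptr %p) {
    entry:
      ; Orders only with respect to other code running in this thread.
      store atomic i32 1, ptr %p syncscope("singlethread") release, align 4
      fence syncscope("singlethread") seq_cst
      ret void
    }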
3373 Floating-Point Environment
3374 --------------------------
3376 The default LLVM floating-point environment assumes that traps are disabled and
3377 status flags are not observable. Therefore, floating-point math operations do
3378 not have side effects and may be speculated freely. Results assume the
3379 round-to-nearest rounding mode.
3381 Floating-point math operations are allowed to treat all NaNs as if they were
3382 quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0. This also
3383 means that SNaN may be passed through a math operation without quieting. For
3384 example, "fmul SNaN, 1.0" may be simplified to SNaN rather than QNaN. However,
3385 SNaN values are never created by math operations. They may only occur when
3386 provided as a program input value.
3388 Code that requires different behavior than this should use the
3389 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
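For instance, an addition that must honor the dynamic rounding mode and
preserve exception behavior could be written with a constrained intrinsic, as
in this sketch. The function name ``@add_dynamic`` is hypothetical; the
metadata strings are those documented for the constrained intrinsics.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    define double @add_dynamic(double %a, double %b) strictfp {
    entry:
      %r = call double @llvm.experimental.constrained.fadd.f64(
                         double %a, double %b,
                         metadata !"round.dynamic",
                         metadata !"fpexcept.strict") strictfp
      ret double %r
    }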
3396 LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3397 :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3398 :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
3399 :ref:`select <i_select>` and :ref:`call <i_call>`
3400 may use the following flags to enable otherwise unsafe
3401 floating-point transformations.
3404 No NaNs - Allow optimizations to assume the arguments and result are not
NaN. If an argument is a NaN, or the result would be a NaN, it produces
3406 a :ref:`poison value <poisonvalues>` instead.
3409 No Infs - Allow optimizations to assume the arguments and result are not
3410 +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
3411 produces a :ref:`poison value <poisonvalues>` instead.
3414 No Signed Zeros - Allow optimizations to treat the sign of a zero
3415 argument or zero result as insignificant. This does not imply that -0.0
3416 is poison and/or guaranteed to not exist in the operation.
3419 Allow Reciprocal - Allow optimizations to use the reciprocal of an
3420 argument rather than perform division.
3423 Allow floating-point contraction (e.g. fusing a multiply followed by an
3424 addition into a fused multiply-and-add). This does not enable reassociating
to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
3426 be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.
3429 Approximate functions - Allow substitution of approximate calculations for
3430 functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
3431 for places where this can apply to LLVM's intrinsic math functions.
3434 Allow reassociation transformations for floating-point instructions.
3435 This may dramatically change results in floating-point.
3438 This flag implies all of the others.
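The sketch below shows how such flags appear on individual instructions,
using the assembly keywords ``contract``, ``arcp``, ``nnan``, ``ninf`` and
``fast``. The function name is hypothetical.

.. code-block:: llvm

    define float @fmf(float %a, float %b, float %c) {
    entry:
      %m = fmul contract float %a, %b
      %s = fadd contract float %m, %c   ; %m and %s may be fused into one fma
      %r = fdiv arcp float %s, %b       ; may become a multiply by the reciprocal of %b
      %t = fadd nnan ninf float %r, %a  ; poison if an operand or the result is NaN or +/-Inf
      %u = fadd fast float %t, %c       ; all of the flags at once
      ret float %u
    }

Flags apply per instruction, so a frontend can relax only the operations
where the source language permits it.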
3442 Use-list Order Directives
3443 -------------------------
3445 Use-list directives encode the in-memory order of each use-list, allowing the
3446 order to be recreated. ``<order-indexes>`` is a comma-separated list of
3447 indexes that are assigned to the referenced value's uses. The referenced
3448 value's use-list is immediately sorted by these indexes.
3450 Use-list directives may appear at function scope or global scope. They are not
3451 instructions, and have no effect on the semantics of the IR. When they're at
3452 function scope, they must appear after the terminator of the final basic block.
3454 If basic blocks have their address taken via ``blockaddress()`` expressions,
3455 ``uselistorder_bb`` can be used to reorder their use-lists from outside their
3462 uselistorder <ty> <value>, { <order-indexes> }
uselistorder_bb @function, %block, { <order-indexes> }
3469 define void @foo(i32 %arg1, i32 %arg2) {
3471 ; ... instructions ...
3473 ; ... instructions ...
3475 ; At function scope.
3476 uselistorder i32 %arg1, { 1, 0, 2 }
3477 uselistorder label %bb, { 1, 0 }
3481 uselistorder ptr @global, { 1, 2, 0 }
3482 uselistorder i32 7, { 1, 0 }
3483 uselistorder i32 (i32) @bar, { 1, 0 }
3484 uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3486 .. _source_filename:
3491 The *source filename* string is set to the original module identifier,
3492 which will be the name of the compiled source file when compiling from
3493 source through the clang front end, for example. It is then preserved through
3496 This is currently necessary to generate a consistent unique global
3497 identifier for local functions used in profile data, which prepends the
3498 source file name to the local function name.
3500 The syntax for the source file name is simply:
3502 .. code-block:: text
3504 source_filename = "/path/to/source.c"
3511 The LLVM type system is one of the most important features of the
3512 intermediate representation. Being typed enables a number of
3513 optimizations to be performed on the intermediate representation
3514 directly, without having to do extra analyses on the side before the
3515 transformation. A strong type system makes it easier to read the
3516 generated code and enables novel analyses and transformations that are
3517 not feasible to perform on normal three address code representations.
3527 The void type does not represent any value and has no size.
3545 The function type can be thought of as a function signature. It consists of a
3546 return type and a list of formal parameter types. The return type of a function
3547 type is a void type or first class type --- except for :ref:`label <t_label>`
3548 and :ref:`metadata <t_metadata>` types.
3554 <returntype> (<parameter list>)
3556 ...where '``<parameter list>``' is a comma-separated list of type
3557 specifiers. Optionally, the parameter list may include a type ``...``, which
3558 indicates that the function takes a variable number of arguments. Variable
3559 argument functions can access their arguments with the :ref:`variable argument
3560 handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
3561 except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
3565 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3566 | ``i32 (i32)`` | function taking an ``i32``, returning an ``i32`` |
3567 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3568 | ``i32 (ptr, ...)`` | A vararg function that takes at least one :ref:`pointer <t_pointer>` argument and returns an integer. This is the signature for ``printf`` in LLVM. |
3569 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3570 | ``{i32, i32} (i32)`` | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values |
3571 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
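
As an illustrative sketch of the vararg signature in the table above, the
following module declares ``printf`` with the function type ``i32 (ptr, ...)``
and calls it. The names ``@fmt`` and ``@print_value`` are hypothetical.

.. code-block:: llvm

    declare i32 @printf(ptr, ...)

    @fmt = private constant [4 x i8] c"%d\0A\00"

    define i32 @print_value(i32 %v) {
      ; A call to a vararg function spells out the full function type.
      %r = call i32 (ptr, ...) @printf(ptr @fmt, i32 %v)
      ret i32 %r
    }
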
3578 The :ref:`first class <t_firstclass>` types are perhaps the most important.
3579 Values of these types are the only ones which can be produced by
3587 These are the types that are valid in registers from CodeGen's perspective.
3596 The integer type is a very simple type that simply specifies an
3597 arbitrary bit width for the integer type desired. Any bit width from 1
3598 bit to 2\ :sup:`23`\ (about 8 million) can be specified.
3606 The number of bits the integer will occupy is specified by the ``N``
3612 +----------------+------------------------------------------------+
3613 | ``i1`` | a single-bit integer. |
3614 +----------------+------------------------------------------------+
3615 | ``i32`` | a 32-bit integer. |
3616 +----------------+------------------------------------------------+
3617 | ``i1942652`` | a really big integer of over 1 million bits. |
3618 +----------------+------------------------------------------------+
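
A minimal sketch of an arbitrary bit width in use; the function name
``@is_odd`` and the width ``i17`` are chosen only for illustration.

.. code-block:: llvm

    define i1 @is_odd(i17 %x) {
      ; Truncation keeps the least significant bit of the 17-bit value.
      %bit = trunc i17 %x to i1
      ret i1 %bit
    }
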
3622 Floating-Point Types
3623 """"""""""""""""""""
3632 - 16-bit floating-point value
3635 - 16-bit "brain" floating-point value (7-bit significand). Provides the
3636 same number of exponent bits as ``float``, so that it matches its dynamic
3637 range, but with greatly reduced precision. Used in Intel's AVX-512 BF16
3638 extensions and Arm's ARMv8.6-A extensions, among others.
3641 - 32-bit floating-point value
3644 - 64-bit floating-point value
3647 - 128-bit floating-point value (113-bit significand)
3650 - 80-bit floating-point value (X87)
3653 - 128-bit floating-point value (two 64-bits)
The binary formats of half, float, double, and fp128 correspond to the
3656 IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
3664 The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few intrinsics
3666 are allowed: stride load and store, zero and dot product. No instruction is
3667 allowed for this type. There are no arguments, arrays, pointers, vectors
3668 or constants of this type.
3682 The x86_mmx type represents a value held in an MMX register on an x86
3683 machine. The operations allowed on it are quite limited: parameters and
3684 return values, load and store, and bitcast. User-specified MMX
3685 instructions are represented as intrinsic or asm calls with arguments
3686 and/or results of this type. There are no arrays, vectors or constants
3703 The pointer type ``ptr`` is used to specify memory locations. Pointers are
3704 commonly used to reference objects in memory.
3706 Pointer types may have an optional address space attribute defining
3707 the numbered address space where the pointed-to object resides. For
3708 example, ``ptr addrspace(5)`` is a pointer to address space 5.
3709 In addition to integer constants, ``addrspace`` can also reference one of the
3710 address spaces defined in the :ref:`datalayout string<langref_datalayout>`.
3711 ``addrspace("A")`` will use the alloca address space, ``addrspace("G")``
3712 the default globals address space and ``addrspace("P")`` the program address
3715 The default address space is number zero.
3717 The semantics of non-zero address spaces are target-specific. Memory
3718 access through a non-dereferenceable pointer is undefined behavior in
3719 any address space. Pointers with the bit-value 0 are only assumed to
3720 be non-dereferenceable in address space 0, unless the function is
3721 marked with the ``null_pointer_is_valid`` attribute.
3723 If an object can be proven accessible through a pointer with a
3724 different address space, the access may be modified to use that
3725 address space. Exceptions apply if the operation is ``volatile``.
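
The address space syntax can be sketched as follows. Address space 1 is chosen
arbitrarily here, since the meaning of non-zero address spaces is
target-specific; the names ``@g``, ``@load_from_as1`` and ``@to_default`` are
hypothetical.

.. code-block:: llvm

    @g = addrspace(1) global i32 0

    define i32 @load_from_as1() {
      ; A global in address space 1 has type ``ptr addrspace(1)``.
      %v = load i32, ptr addrspace(1) @g
      ret i32 %v
    }

    define ptr @to_default(ptr addrspace(1) %p) {
      ; Converting between address spaces requires an explicit addrspacecast.
      %q = addrspacecast ptr addrspace(1) %p to ptr
      ret ptr %q
    }
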
3727 Prior to LLVM 15, pointer types also specified a pointee type, such as
3728 ``i8*``, ``[4 x i32]*`` or ``i32 (i32*)*``. In LLVM 15, such "typed
3729 pointers" are still supported under non-default options. See the
3730 `opaque pointers document <OpaquePointers.html>`__ for more information.
3734 Target Extension Type
3735 """""""""""""""""""""
3739 Target extension types represent types that must be preserved through
3740 optimization, but are otherwise generally opaque to the compiler. They may be
3741 used as function parameters or arguments, and in :ref:`phi <i_phi>` or
:ref:`select <i_select>` instructions. Some types may also be used in
3743 :ref:`alloca <i_alloca>` instructions or as global values, and correspondingly
3744 it is legal to use :ref:`load <i_load>` and :ref:`store <i_store>` instructions
3745 on them. Full semantics for these types are defined by the target.
3747 The only constants that target extension types may have are ``zeroinitializer``,
3748 ``undef``, and ``poison``. Other possible values for target extension types may
3749 arise from target-specific intrinsics and functions.
3751 These types cannot be converted to other types. As such, it is not legal to use
3752 them in :ref:`bitcast <i_bitcast>` instructions (as a source or target type),
3753 nor is it legal to use them in :ref:`ptrtoint <i_ptrtoint>` or
3754 :ref:`inttoptr <i_inttoptr>` instructions. Similarly, they are not legal to use
3755 in an :ref:`icmp <i_icmp>` instruction.
3757 Target extension types have a name and optional type or integer parameters. The
3758 meanings of name and parameters are defined by the target. When being defined in
3759 LLVM IR, all of the type parameters must precede all of the integer parameters.
3761 Specific target extension types are registered with LLVM as having specific
3762 properties. These properties can be used to restrict the type from appearing in
3763 certain contexts, such as being the type of a global variable or having a
3764 ``zeroinitializer`` constant be valid. A complete list of type properties may be
3765 found in the documentation for ``llvm::TargetExtType::Property`` (`doxygen
3766 <https://llvm.org/doxygen/classllvm_1_1TargetExtType.html>`_).
3770 .. code-block:: llvm
3773 target("label", void)
3774 target("label", void, i32)
3775 target("label", 0, 1, 2)
3776 target("label", void, i32, 0, 1, 2)
3786 A vector type is a simple derived type that represents a vector of
3787 elements. Vector types are used when multiple primitive data are
3788 operated in parallel using a single instruction (SIMD). A vector type
3789 requires a size (number of elements), an underlying primitive data type,
3790 and a scalable property to represent vectors where the exact hardware
3791 vector length is unknown at compile time. Vector types are considered
3792 :ref:`first class <t_firstclass>`.
In general, vector elements are laid out in memory in the same way as
3797 :ref:`array types <t_array>`. Such an analogy works fine as long as the vector
3798 elements are byte sized. However, when the elements of the vector aren't byte
3799 sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcast to an
3801 integer type with N*M bits, and then following the rules for storing such an
3804 A bitcast from a vector type to a scalar integer type will see the elements
3805 being packed together (without padding). The order in which elements are
3806 inserted in the integer depends on endianess. For little endian element zero
3807 is put in the least significant bits of the integer, and for big endian
3808 element zero is put in the most significant bits.
3810 Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
3811 with the analogy that we can replace a vector store by a bitcast followed by
3812 an integer store, we get this for big endian:
3814 .. code-block:: llvm
3816 %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3818 ; Bitcasting from a vector to an integral type can be seen as
3819 ; concatenating the values:
3820 ; %val now has the hexadecimal value 0x1235.
3822 store i16 %val, ptr %ptr
3824 ; In memory the content will be (8-bit addressing):
3826 ; [%ptr + 0]: 00010010 (0x12)
3827 ; [%ptr + 1]: 00110101 (0x35)
3829 The same example for little endian:
3831 .. code-block:: llvm
3833 %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3835 ; Bitcasting from a vector to an integral type can be seen as
3836 ; concatenating the values:
3837 ; %val now has the hexadecimal value 0x5321.
3839 store i16 %val, ptr %ptr
3841 ; In memory the content will be (8-bit addressing):
3843 ; [%ptr + 0]: 01010011 (0x53)
3844 ; [%ptr + 1]: 00100001 (0x21)
3846 When ``<N*M>`` isn't evenly divisible by the byte size the exact memory layout
3847 is unspecified (just like it is for an integral type of the same size). This
3848 is because different targets could put the padding at different positions when
3849 the type size is smaller than the type's store size.
3855 < <# elements> x <elementtype> > ; Fixed-length vector
3856 < vscale x <# elements> x <elementtype> > ; Scalable vector
3858 The number of elements is a constant integer value larger than 0;
3859 elementtype may be any integer, floating-point or pointer type. Vectors
3860 of size zero are not allowed. For scalable vectors, the total number of
3861 elements is a constant multiple (called vscale) of the specified number
3862 of elements; vscale is a positive integer that is unknown at compile time
3863 and the same hardware-dependent constant for all scalable vectors at run
3864 time. The size of a specific scalable vector type is thus constant within
3865 IR, even if the exact size in bytes cannot be determined until run time.
3869 +------------------------+----------------------------------------------------+
3870 | ``<4 x i32>`` | Vector of 4 32-bit integer values. |
3871 +------------------------+----------------------------------------------------+
3872 | ``<8 x float>`` | Vector of 8 32-bit floating-point values. |
3873 +------------------------+----------------------------------------------------+
3874 | ``<2 x i64>`` | Vector of 2 64-bit integer values. |
3875 +------------------------+----------------------------------------------------+
3876 | ``<4 x ptr>`` | Vector of 4 pointers |
3877 +------------------------+----------------------------------------------------+
3878 | ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
3879 +------------------------+----------------------------------------------------+
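
As a minimal sketch, the same element-wise operation can be written for a
fixed-length and a scalable vector type. The function names ``@vadd`` and
``@svadd`` are hypothetical.

.. code-block:: llvm

    ; Adds four 32-bit lanes in parallel.
    define <4 x i32> @vadd(<4 x i32> %a, <4 x i32> %b) {
      %sum = add <4 x i32> %a, %b
      ret <4 x i32> %sum
    }

    ; Adds (vscale * 4) 32-bit lanes; vscale is only known at run time.
    define <vscale x 4 x i32> @svadd(<vscale x 4 x i32> %a,
                                     <vscale x 4 x i32> %b) {
      %sum = add <vscale x 4 x i32> %a, %b
      ret <vscale x 4 x i32> %sum
    }
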
3888 The label type represents code labels.
3903 The token type is used when a value is associated with an instruction
3904 but all uses of the value must not attempt to introspect or obscure it.
3905 As such, it is not appropriate to have a :ref:`phi <i_phi>` or
3906 :ref:`select <i_select>` of type token.
3923 The metadata type represents embedded metadata. No derived types may be
3924 created from metadata except for :ref:`function <t_function>` arguments.
3937 Aggregate Types are a subset of derived types that can contain multiple
3938 member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
3939 aggregate types. :ref:`Vectors <t_vector>` are not considered to be
3949 The array type is a very simple derived type that arranges elements
3950 sequentially in memory. The array type requires a size (number of
3951 elements) and an underlying data type.
3957 [<# elements> x <elementtype>]
3959 The number of elements is a constant integer value; ``elementtype`` may
3960 be any type with a size.
3964 +------------------+--------------------------------------+
3965 | ``[40 x i32]`` | Array of 40 32-bit integer values. |
3966 +------------------+--------------------------------------+
3967 | ``[41 x i32]`` | Array of 41 32-bit integer values. |
3968 +------------------+--------------------------------------+
3969 | ``[4 x i8]`` | Array of 4 8-bit integer values. |
3970 +------------------+--------------------------------------+
3972 Here are some examples of multidimensional arrays:
3974 +-----------------------------+----------------------------------------------------------+
3975 | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. |
3976 +-----------------------------+----------------------------------------------------------+
3977 | ``[12 x [10 x float]]`` | 12x10 array of single precision floating-point values. |
3978 +-----------------------------+----------------------------------------------------------+
3979 | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. |
3980 +-----------------------------+----------------------------------------------------------+
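
A hedged sketch of addressing an element of a multidimensional array with
'``getelementptr``'. The global ``@grid`` and the function ``@get`` are
hypothetical, and the caller is assumed to pass in-range indices.

.. code-block:: llvm

    @grid = global [3 x [4 x i32]] zeroinitializer

    define i32 @get(i64 %row, i64 %col) {
      ; The first index steps over the global itself; the next two select the
      ; row and the column.
      %p = getelementptr inbounds [3 x [4 x i32]], ptr @grid, i64 0, i64 %row, i64 %col
      %v = load i32, ptr %p
      ret i32 %v
    }
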
3982 There is no restriction on indexing beyond the end of the array implied
3983 by a static type (though there are restrictions on indexing beyond the
3984 bounds of an allocated object in some cases). This means that
3985 single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'Pascal-style
3987 arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
3997 The structure type is used to represent a collection of data members
3998 together in memory. The elements of a structure may be any type that has
4001 Structures in memory are accessed using '``load``' and '``store``' by
4002 getting a pointer to a field with the '``getelementptr``' instruction.
4003 Structures in registers are accessed using the '``extractvalue``' and
4004 '``insertvalue``' instructions.
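
The two access styles described above can be sketched as follows. The type
``%Pair`` and the function names are hypothetical.

.. code-block:: llvm

    %Pair = type { i32, float }

    ; In memory: compute the address of field 1, then load from it.
    define float @second_in_memory(ptr %p) {
      %field = getelementptr inbounds %Pair, ptr %p, i32 0, i32 1
      %v = load float, ptr %field
      ret float %v
    }

    ; In registers: read field 0 directly from the aggregate value.
    define i32 @first(%Pair %agg) {
      %v = extractvalue %Pair %agg, 0
      ret i32 %v
    }

    ; In registers: produce a new aggregate with field 0 replaced.
    define %Pair @set_first(%Pair %agg, i32 %x) {
      %new = insertvalue %Pair %agg, i32 %x, 0
      ret %Pair %new
    }
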
4006 Structures may optionally be "packed" structures, which indicate that
4007 the alignment of the struct is one byte, and that there is no padding
4008 between the elements. In non-packed structs, padding between field types
4009 is inserted as defined by the DataLayout string in the module, which is
4010 required to match what the underlying code generator expects.
4012 Structures can either be "literal" or "identified". A literal structure
4013 is defined inline with other types (e.g. ``[2 x {i32, i32}]``) whereas
4014 identified types are always defined at the top level with a name.
4015 Literal types are uniqued by their contents and can never be recursive
4016 or opaque since there is no way to write one. Identified types can be
4017 recursive, can be opaqued, and are never uniqued.
4023 %T1 = type { <type list> } ; Identified normal struct type
4024 %T2 = type <{ <type list> }> ; Identified packed struct type
4028 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4029 | ``{ i32, i32, i32 }`` | A triple of three ``i32`` values |
4030 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4031 | ``{ float, ptr }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>`. |
4032 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4033 | ``<{ i8, i32 }>`` | A packed struct known to be 5 bytes in size. |
4034 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4038 Opaque Structure Types
4039 """"""""""""""""""""""
4043 Opaque structure types are used to represent structure types that
4044 do not have a body specified. This corresponds (for example) to the C
4045 notion of a forward declared structure. They can be named (``%X``) or
4057 +--------------+-------------------+
4058 | ``opaque`` | An opaque type. |
4059 +--------------+-------------------+
4066 LLVM has several different basic types of constants. This section
4067 describes them all and their syntax.
4072 **Boolean constants**
4073 The two strings '``true``' and '``false``' are both valid constants
4075 **Integer constants**
4076 Standard integers (such as '4') are constants of the
4077 :ref:`integer <t_integer>` type. Negative numbers may be used with
4079 **Floating-point constants**
4080 Floating-point constants use standard decimal notation (e.g.
4081 123.421), exponential notation (e.g. 1.23421e+2), or a more precise
4082 hexadecimal notation (see below). The assembler requires the exact
4083 decimal value of a floating-point constant. For example, the
4084 assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
4085 decimal in binary. Floating-point constants must have a
4086 :ref:`floating-point <t_floating>` type.
4087 **Null pointer constants**
4088 The identifier '``null``' is recognized as a null pointer constant
4089 and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
4091 The identifier '``none``' is recognized as an empty token constant
4092 and must be of :ref:`token type <t_token>`.
4094 The one non-intuitive notation for constants is the hexadecimal form of
4095 floating-point constants. For example, the form
4096 '``double 0x432ff973cafa8000``' is equivalent to (but harder to read
4097 than) '``double 4.5e+15``'. The only time hexadecimal floating-point
4098 constants are required (and the only time that they are generated by the
4099 disassembler) is when a floating-point constant must be emitted but it
4100 cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
4102 values are represented in their IEEE hexadecimal format so that assembly
4103 and disassembly do not cause any bits to change in the constants.
4105 When using the hexadecimal form, constants of types bfloat, half, float, and
4106 double are represented using the 16-digit form shown above (which matches the
IEEE 754 representation for double); bfloat, half and float values must, however,
4108 be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
4109 precision respectively. Hexadecimal format is always used for long double, and
4110 there are three forms of long double. The 80-bit format used by x86 is
4111 represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
4112 used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
4113 hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
4114 by 32 hexadecimal digits. Long doubles will only work if they match the long
4115 double format on your target. The IEEE 16-bit format (half precision) is
4116 represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
4117 format is represented by ``0xR`` followed by 4 hexadecimal digits. All
4118 hexadecimal formats are big-endian (sign bit at the left).
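
The hexadecimal forms described above can be sketched with a few global
initializers. The global names are hypothetical. The values are ones that are
exactly representable in each type.

.. code-block:: llvm

    @a = global double 0x432FF973CAFA8000        ; same value as 4.5e+15
    @b = global float 0x3FF4000000000000         ; 1.25, in the 16-digit form
    @c = global double 0x7FF0000000000000        ; +infinity
    @d = global half 0xH3C00                     ; 1.0
    @e = global bfloat 0xR3F80                   ; 1.0
    @f = global x86_fp80 0xK3FFF8000000000000000 ; 1.0
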
There are no constants of type ``x86_mmx`` or ``x86_amx``.
4122 .. _complexconstants:
4127 Complex constants are a (potentially recursive) combination of simple
4128 constants and smaller complex constants.
4130 **Structure constants**
4131 Structure constants are represented with notation similar to
4132 structure type definitions (a comma separated list of elements,
4133 surrounded by braces (``{}``)). For example:
4134 "``{ i32 4, float 17.0, ptr @G }``", where "``@G``" is declared as
4135 "``@G = external global i32``". Structure constants must have
4136 :ref:`structure type <t_struct>`, and the number and types of elements
4137 must match those specified by the type.
**Array constants**
4139 Array constants are represented with notation similar to array type
4140 definitions (a comma separated list of elements, surrounded by
4141 square brackets (``[]``)). For example:
4142 "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
4143 :ref:`array type <t_array>`, and the number and types of elements must
4144 match those specified by the type. As a special case, character array
4145 constants may also be represented as a double-quoted string using the ``c``
4146 prefix. For example: "``c"Hello World\0A\00"``".
4147 **Vector constants**
4148 Vector constants are represented with notation similar to vector
4149 type definitions (a comma separated list of elements, surrounded by
4150 less-than/greater-than's (``<>``)). For example:
4151 "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
4152 must have :ref:`vector type <t_vector>`, and the number and types of
4153 elements must match those specified by the type.
4154 **Zero initialization**
The string '``zeroinitializer``' can be used to zero initialize a
value of *any* type, including scalar and
4157 :ref:`aggregate <t_aggregate>` types. This is often used to avoid
4158 having to print large zero initializers (e.g. for large arrays) and
4159 is always exactly equivalent to using explicit zero initializers.
**Metadata node**
4161 A metadata node is a constant tuple without types. For example:
4162 "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
4163 for example: "``!{!0, i32 0, ptr @global, ptr @function, !"str"}``".
4164 Unlike other typed constants that are meant to be interpreted as part of
4165 the instruction stream, metadata is a place to attach additional
4166 information such as debug info.
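
As a brief sketch, the structure, array, vector and zero-initializer forms
above can appear as global initializers. All global names other than ``@G``
are hypothetical.

.. code-block:: llvm

    @G = external global i32

    @s   = global { i32, float, ptr } { i32 4, float 17.0, ptr @G }
    @arr = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @str = constant [13 x i8] c"Hello World\0A\00"
    @vec = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; Equivalent to writing out 1024 explicit ``i32 0`` elements.
    @big = global [1024 x i32] zeroinitializer
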
4168 Global Variable and Function Addresses
4169 --------------------------------------
4171 The addresses of :ref:`global variables <globalvars>` and
4172 :ref:`functions <functionstructure>` are always implicitly valid
4173 (link-time) constants. These constants are explicitly referenced when
4174 the :ref:`identifier for the global <identifiers>` is used and always have
4175 :ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
4178 .. code-block:: llvm
4182 @Z = global [2 x ptr] [ ptr @X, ptr @Y ]
4189 The string '``undef``' can be used anywhere a constant is expected, and
4190 indicates that the user of the value may receive an unspecified
4191 bit-pattern. Undefined values may be of any type (other than '``label``'
4192 or '``void``') and be used anywhere a constant is permitted.
4196 A '``poison``' value (described in the next section) should be used instead of
4197 '``undef``' whenever possible. Poison values are stronger than undef, and
4198 enable more optimizations. Just the existence of '``undef``' blocks certain
4199 optimizations (see the examples below).
4201 Undefined values are useful because they indicate to the compiler that
4202 the program is well defined no matter what value is used. This gives the
4203 compiler more freedom to optimize. Here are some examples of
4204 (potentially surprising) transformations that are valid (in pseudo IR):
4206 .. code-block:: llvm
4216 This is safe because all of the output bits are affected by the undef
4217 bits. Any output bit can have a zero or one depending on the input bits.
4219 .. code-block:: llvm
4227 %A = %X ;; By choosing undef as 0
4228 %B = %X ;; By choosing undef as -1
4233 These logical operations have bits that are not always affected by the
4234 input. For example, if ``%X`` has a zero bit, then the output of the
4235 '``and``' operation will always be a zero for that bit, no matter what
4236 the corresponding bit from the '``undef``' is. As such, it is unsafe to
4237 optimize or assume that the result of the '``and``' is '``undef``'.
4238 However, it is safe to assume that all bits of the '``undef``' could be
4239 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
4240 all the bits of the '``undef``' operand to the '``or``' could be set,
4241 allowing the '``or``' to be folded to -1.
4243 .. code-block:: llvm
4245 %A = select undef, %X, %Y
4246 %B = select undef, 42, %Y
4247 %C = select %X, %Y, undef
4251 %C = %Y (if %Y is provably not poison; unsafe otherwise)
4257 This set of examples shows that undefined '``select``' (and conditional
4258 branch) conditions can go *either way*, but they have to come from one
4259 of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
4260 both known to have a clear low bit, then ``%A`` would have to have a
4261 cleared low bit. However, in the ``%C`` example, the optimizer is
4262 allowed to assume that the '``undef``' operand could be the same as
4263 ``%Y`` if ``%Y`` is provably not '``poison``', allowing the whole '``select``'
4264 to be eliminated. This is because '``poison``' is stronger than '``undef``'.
4266 .. code-block:: llvm
4268 %A = xor undef, undef
4285 This example points out that two '``undef``' operands are not
4286 necessarily the same. This can be surprising to people (and also matches
4287 C semantics) where they assume that "``X^X``" is always zero, even if
4288 ``X`` is undefined. This isn't true for a number of reasons, but the
4289 short answer is that an '``undef``' "variable" can arbitrarily change
4290 its value over its "live range". This is true because the variable
4291 doesn't actually *have a live range*. Instead, the value is logically
4292 read from arbitrary registers that happen to be around when needed, so
4293 the value is not necessarily consistent over time. In fact, ``%A`` and
4294 ``%C`` need to have the same semantics or the core LLVM "replace all
4295 uses with" concept would not hold.
4297 To ensure all uses of a given register observe the same value (even if
4298 '``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
4300 .. code-block:: llvm
4308 These examples show the crucial difference between an *undefined value*
4309 and *undefined behavior*. An undefined value (like '``undef``') is
4310 allowed to have an arbitrary bit-pattern. This means that the ``%A``
4311 operation can be constant folded to '``0``', because the '``undef``'
4312 could be zero, and zero divided by any value is zero.
4313 However, in the second example, we can make a more aggressive
4314 assumption: because the ``undef`` is allowed to be an arbitrary value,
4315 we are allowed to assume that it could be zero. Since a divide by zero
4316 has *undefined behavior*, we are allowed to assume that the operation
4317 does not execute at all. This allows us to delete the divide and all
4318 code after it. Because the undefined operation "can't happen", the
4319 optimizer can assume that it occurs in dead code.
4321 .. code-block:: text
4323 a: store undef -> %X
4324 b: store %X -> undef
4326 a: <deleted> (if the stored value in %X is provably not poison)
4329 A store *of* an undefined value can be assumed to not have any effect;
4330 we can assume that the value is overwritten with bits that happen to
4331 match what was already there. This argument is only valid if the stored value
4332 is provably not ``poison``. However, a store *to* an undefined
location could clobber arbitrary memory; therefore, it has undefined
4336 Branching on an undefined value is undefined behavior.
4337 This explains optimizations that depend on branch conditions to construct
4338 predicates, such as Correlated Value Propagation and Global Value Numbering.
In the case of a switch instruction, the branch condition should be frozen; otherwise,
4340 it is undefined behavior.
4342 .. code-block:: llvm
    br i1 undef, label %BB1, label %BB2 ; UB

    %X = and i32 undef, 255
    switch i32 %X, label %ret [ .. ] ; UB

    store i8 undef, ptr %ptr
    %X = load i8, ptr %ptr ; %X is undef
    switch i8 %X, label %ret [ .. ] ; UB

    %X = or i8 undef, 255 ; always 255
    switch i8 %X, label %ret [ .. ] ; Well-defined

    %X = freeze i1 undef
    br i1 %X, label %BB1, label %BB2 ; Well-defined (non-deterministic jump)
4368 A poison value is a result of an erroneous operation.
4369 In order to facilitate speculative execution, many instructions do not
4370 invoke immediate undefined behavior when provided with illegal operands,
4371 and return a poison value instead.
4372 The string '``poison``' can be used anywhere a constant is expected, and
4373 operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
4376 Most instructions return '``poison``' when one of their arguments is
4377 '``poison``'. A notable exception is the :ref:`select instruction <i_select>`.
4378 Propagation of poison can be stopped with the
4379 :ref:`freeze instruction <i_freeze>`.
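
A minimal sketch of stopping poison propagation with '``freeze``'; the function
name ``@low_bit_of_increment`` is hypothetical.

.. code-block:: llvm

    define i32 @low_bit_of_increment(i32 %x) {
      ; Poison if %x + 1 overflows in the signed sense.
      %inc = add nsw i32 %x, 1
      ; If %inc is poison, %frozen is an arbitrary but fixed i32 value;
      ; otherwise it equals %inc.
      %frozen = freeze i32 %inc
      ; The result is therefore never poison.
      %r = and i32 %frozen, 1
      ret i32 %r
    }
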
4381 It is correct to replace a poison value with an
4382 :ref:`undef value <undefvalues>` or any value of the type.
4384 This means that immediate undefined behavior occurs if a poison value is
4385 used as an instruction operand that has any values that trigger undefined
4386 behavior. Notably this includes (but is not limited to):
4388 - The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
4389 any other pointer dereferencing instruction (independent of address
4391 - The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
4393 - The condition operand of a :ref:`br <i_br>` instruction.
4394 - The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4396 - The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4397 instruction, when the function or invoking call site has a ``noundef``
4398 attribute in the corresponding position.
4399 - The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
4400 call site has a ``noundef`` attribute in the return value position.
4402 Here are some examples:
4404 .. code-block:: llvm
4407 %poison = sub nuw i32 0, 1 ; Results in a poison value.
4408 %poison2 = sub i32 poison, 1 ; Also results in a poison value.
4409 %still_poison = and i32 %poison, 0 ; 0, but also poison.
4410 %poison_yet_again = getelementptr i32, ptr @h, i32 %still_poison
4411 store i32 0, ptr %poison_yet_again ; Undefined behavior due to
4412 ; store to poison.
4414 store i32 %poison, ptr @g ; Poison value stored to memory.
4415 %poison3 = load i32, ptr @g ; Poison value loaded back from memory.
4417 %poison4 = load i16, ptr @g ; Returns a poison value.
4418 %poison5 = load i64, ptr @g ; Returns a poison value.
4420 %cmp = icmp slt i32 %poison, 0 ; Returns a poison value.
4421 br i1 %cmp, label %end, label %end ; undefined behavior
4425 .. _welldefinedvalues:
4427 Well-Defined Values
4428 -------------------
4430 Given a program execution, a value is *well defined* if the value does not
4431 have an undef bit and is not poison in the execution.
4432 An aggregate value or vector is well defined if its elements are well defined.
4433 The padding of an aggregate isn't considered, since it isn't visible
4434 without storing it into memory and loading it with a different type.
4436 A constant of a :ref:`single value <t_single_value>`, non-vector type is well
4437 defined if it is neither '``undef``' constant nor '``poison``' constant.
4438 The result of :ref:`freeze instruction <i_freeze>` is well defined regardless
4439 of its operand.
4443 Addresses of Basic Blocks
4444 -------------------------
4446 ``blockaddress(@function, %block)``
4448 The '``blockaddress``' constant computes the address of the specified
4449 basic block in the specified function.
4451 It always has a ``ptr addrspace(P)`` type, where ``P`` is the address space
4452 of the function containing ``%block`` (usually ``addrspace(0)``).
4454 Taking the address of the entry block is illegal.
4456 This value only has defined behavior when used as an operand to the
4457 ':ref:`indirectbr <i_indirectbr>`' or for comparisons against null. Pointer
4458 equality tests between label addresses result in undefined behavior,
4459 though, again, comparison against null is ok, and no label is equal to the null
4460 pointer. This may be passed around as an opaque pointer sized value as long as
4461 the bits are not inspected. This allows ``ptrtoint`` and arithmetic to be
4462 performed on these values so long as the original value is reconstituted before
4463 the ``indirectbr`` instruction.
4465 Finally, some targets may provide defined semantics when using the value
4466 as the operand to an inline assembly, but that is target specific.
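A minimal sketch of the intended use, assuming a hypothetical function ``@dispatch`` and a hypothetical jump table ``@targets`` (neither name comes from this document): the block addresses are stored in a constant table, loaded unchanged, and fed directly to ``indirectbr``.

.. code-block:: llvm

    @targets = private constant [2 x ptr] [ptr blockaddress(@dispatch, %even),
                                           ptr blockaddress(@dispatch, %odd)]

    define i32 @dispatch(i32 %n) {
    entry:
      %bit = and i32 %n, 1
      %slot = getelementptr inbounds [2 x ptr], ptr @targets, i32 0, i32 %bit
      %dest = load ptr, ptr %slot
      ; Every possible destination must be listed here.
      indirectbr ptr %dest, [label %even, label %odd]

    even:
      ret i32 0

    odd:
      ret i32 1
    }

Neither ``%even`` nor ``%odd`` is the entry block, and the loaded value is never inspected or compared against another label address.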
4468 .. _dso_local_equivalent:
4470 DSO Local Equivalent
4471 --------------------
4473 ``dso_local_equivalent @func``
4475 A '``dso_local_equivalent``' constant represents a function which is
4476 functionally equivalent to a given function, but is always defined in the
4477 current linkage unit. The resulting pointer has the same type as the underlying
4478 function. The resulting pointer is permitted, but not required, to be different
4479 from a pointer to the function, and it may have different values in different
4480 translation units.
4482 The target function may not have ``extern_weak`` linkage.
4484 ``dso_local_equivalent`` can be implemented as such:
4486 - If the function has local linkage, hidden visibility, or is
4487 ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
4488 to the function.
4489 - ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
4490 function. Many targets support relocations that resolve at link time to either
4491 a function or a stub for it, depending on whether the function is defined within the
4492 linkage unit; LLVM will use this when available. (This is commonly called a
4493 "PLT stub".) On other targets, the stub may need to be emitted explicitly.
4495 This can be used wherever a ``dso_local`` instance of a function is needed without
4496 needing to explicitly make the original function ``dso_local``. An instance where
4497 this can be used is for static offset calculations between a function and some other
4498 ``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
4499 where dynamic relocations for function pointers in VTables can be replaced with
4500 static relocations for offsets between the VTable and virtual functions which
4501 may not be ``dso_local``.
4503 This is currently only supported for ELF binary formats.
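As a rough sketch of the offset calculation described above, assuming an ELF target and two hypothetical symbols ``@virt`` and ``@vtable`` invented for this example, a relative table entry could be written as:

.. code-block:: llvm

    ; @virt is only declared here, so it may live in another linkage unit.
    declare void @virt()

    ; 32-bit offset from the table to a dso_local stand-in for @virt.
    @vtable = dso_local constant i32 trunc (
        i64 sub (i64 ptrtoint (ptr dso_local_equivalent @virt to i64),
                 i64 ptrtoint (ptr @vtable to i64)) to i32)

Because both operands of the subtraction are local to the linkage unit, the difference can be resolved statically instead of requiring a dynamic relocation.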
4505 .. _no_cfi:
4507 No CFI
4508 ------
4510 ``no_cfi @func``
4512 With `Control-Flow Integrity (CFI)
4513 <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, a '``no_cfi``'
4514 constant represents a function reference that does not get replaced with a
4515 reference to the CFI jump table in the ``LowerTypeTests`` pass. These constants
4516 may be useful in low-level programs, such as operating system kernels, which
4517 need to refer to the actual function body.
4521 Constant Expressions
4522 --------------------
4524 Constant expressions are used to allow expressions involving other
4525 constants to be used as constants. Constant expressions may be of any
4526 :ref:`first class <t_firstclass>` type and may involve any LLVM operation
4527 that does not have side effects (e.g. load and call are not supported).
4528 The following is the syntax for constant expressions:
4530 ``trunc (CST to TYPE)``
4531 Perform the :ref:`trunc operation <i_trunc>` on constants.
4532 ``zext (CST to TYPE)``
4533 Perform the :ref:`zext operation <i_zext>` on constants.
4534 ``sext (CST to TYPE)``
4535 Perform the :ref:`sext operation <i_sext>` on constants.
4536 ``fptrunc (CST to TYPE)``
4537 Truncate a floating-point constant to another floating-point type.
4538 The size of CST must be larger than the size of TYPE. Both types
4539 must be floating-point.
4540 ``fpext (CST to TYPE)``
4541 Floating-point extend a constant to another type. The size of CST
4542 must be smaller than or equal to the size of TYPE. Both types must be
4543 floating-point.
4544 ``fptoui (CST to TYPE)``
4545 Convert a floating-point constant to the corresponding unsigned
4546 integer constant. TYPE must be a scalar or vector integer type. CST
4547 must be of scalar or vector floating-point type. Both CST and TYPE
4548 must be scalars, or vectors of the same number of elements. If the
4549 value won't fit in the integer type, the result is a
4550 :ref:`poison value <poisonvalues>`.
4551 ``fptosi (CST to TYPE)``
4552 Convert a floating-point constant to the corresponding signed
4553 integer constant. TYPE must be a scalar or vector integer type. CST
4554 must be of scalar or vector floating-point type. Both CST and TYPE
4555 must be scalars, or vectors of the same number of elements. If the
4556 value won't fit in the integer type, the result is a
4557 :ref:`poison value <poisonvalues>`.
4558 ``uitofp (CST to TYPE)``
4559 Convert an unsigned integer constant to the corresponding
4560 floating-point constant. TYPE must be a scalar or vector floating-point
4561 type. CST must be of scalar or vector integer type. Both CST and TYPE must
4562 be scalars, or vectors of the same number of elements.
4563 ``sitofp (CST to TYPE)``
4564 Convert a signed integer constant to the corresponding floating-point
4565 constant. TYPE must be a scalar or vector floating-point type.
4566 CST must be of scalar or vector integer type. Both CST and TYPE must
4567 be scalars, or vectors of the same number of elements.
4568 ``ptrtoint (CST to TYPE)``
4569 Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
4570 ``inttoptr (CST to TYPE)``
4571 Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
4572 This one is *really* dangerous!
4573 ``bitcast (CST to TYPE)``
4574 Convert a constant, CST, to another TYPE.
4575 The constraints of the operands are the same as those for the
4576 :ref:`bitcast instruction <i_bitcast>`.
4577 ``addrspacecast (CST to TYPE)``
4578 Convert a constant pointer or constant vector of pointers, CST, to another
4579 TYPE in a different address space. The constraints of the operands are the
4580 same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
4581 ``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
4582 Perform the :ref:`getelementptr operation <i_getelementptr>` on
4583 constants. As with the :ref:`getelementptr <i_getelementptr>`
4584 instruction, the index list may have one or more indexes, which are
4585 required to make sense for the type of "pointer to TY". These indexes
4586 may be implicitly sign-extended or truncated to match the index size
4587 of CSTPTR's address space.
4588 ``icmp COND (VAL1, VAL2)``
4589 Perform the :ref:`icmp operation <i_icmp>` on constants.
4590 ``fcmp COND (VAL1, VAL2)``
4591 Perform the :ref:`fcmp operation <i_fcmp>` on constants.
4592 ``extractelement (VAL, IDX)``
4593 Perform the :ref:`extractelement operation <i_extractelement>` on
4594 constants.
4595 ``insertelement (VAL, ELT, IDX)``
4596 Perform the :ref:`insertelement operation <i_insertelement>` on
4597 constants.
4598 ``shufflevector (VEC1, VEC2, IDXMASK)``
4599 Perform the :ref:`shufflevector operation <i_shufflevector>` on
4600 constants.
4601 ``add (LHS, RHS)``
4602 Perform an addition on constants.
4603 ``sub (LHS, RHS)``
4604 Perform a subtraction on constants.
4605 ``mul (LHS, RHS)``
4606 Perform a multiplication on constants.
4607 ``shl (LHS, RHS)``
4608 Perform a left shift on constants.
4609 ``lshr (LHS, RHS)``
4610 Perform a logical right shift on constants.
4611 ``ashr (LHS, RHS)``
4612 Perform an arithmetic right shift on constants.
4613 ``and (LHS, RHS)``
4614 Perform a bitwise and on constants.
4615 ``or (LHS, RHS)``
4616 Perform a bitwise or on constants.
4617 ``xor (LHS, RHS)``
4618 Perform a bitwise xor on constants.
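To make the syntax concrete, here is a small sketch of constant expressions used as global initializers. The global names are hypothetical and chosen only for illustration.

.. code-block:: llvm

    @table = global [4 x i32] [i32 10, i32 20, i32 30, i32 40]

    ; Address of the third element, computed entirely at compile/link time.
    @third = global ptr getelementptr inbounds ([4 x i32], ptr @table, i64 0, i64 2)

    ; The address of @table reinterpreted as an integer.
    @table_addr = global i64 ptrtoint (ptr @table to i64)

Each initializer is itself a constant, so no instructions are executed at run time to produce these values.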
4625 Inline Assembler Expressions
4626 ----------------------------
4628 LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
4629 Inline Assembly <moduleasm>`) through the use of a special value. This value
4630 represents the inline assembler as a template string (containing the
4631 instructions to emit), a list of operand constraints (stored as a string), a
4632 flag that indicates whether or not the inline asm expression has side effects,
4633 and a flag indicating whether the function containing the asm needs to align its
4634 stack conservatively.
4636 The template string supports argument substitution of the operands using "``$``"
4637 followed by a number, to indicate substitution of the given register/memory
4638 location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
4639 be used, where ``MODIFIER`` is a target-specific annotation for how to print the
4640 operand (See :ref:`inline-asm-modifiers`).
4642 A literal "``$``" may be included by using "``$$``" in the template. To include
4643 other special characters into the output, the usual "``\XX``" escapes may be
4644 used, just as in other strings. Note that after template substitution, the
4645 resulting assembly string is parsed by LLVM's integrated assembler unless it is
4646 disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
4647 syntax known to LLVM.
4649 LLVM also supports a few more substitutions useful for writing inline assembly:
4651 - ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
4652 This substitution is useful when declaring a local label. Many standard
4653 compiler optimizations, such as inlining, may duplicate an inline asm blob.
4654 Adding a blob-unique identifier ensures that the two labels will not conflict
4655 during assembly. This is used to implement `GCC's %= special format
4656 string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
4657 - ``${:comment}``: Expands to the comment character of the current target's
4658 assembly dialect. This is usually ``#``, but many targets use other strings,
4659 such as ``;``, ``//``, or ``!``.
4660 - ``${:private}``: Expands to the assembler private label prefix. Labels with
4661 this prefix will not appear in the symbol table of the assembled object.
4662 Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
4663 relatively popular.
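The substitutions above can be combined as in the following sketch. It assumes an x86 target for the ``jmp`` and ``nop`` mnemonics, and the function name and label stem ``skip`` are invented for the example.

.. code-block:: llvm

    define void @local_label() {
    entry:
      ; Jump over a nop to a private label that is unique to this asm blob.
      call void asm sideeffect "jmp ${:private}skip${:uid}\0A\09nop\0A${:private}skip${:uid}: ${:comment} target of the jump", ""()
      ret void
    }

If this call is duplicated, for instance by inlining, each copy expands ``${:uid}`` to a different integer, so the labels do not collide.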
4665 LLVM's support for inline asm is modeled closely on the requirements of Clang's
4666 GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
4667 modifier codes listed here are similar or identical to those in GCC's inline asm
4668 support. However, to be clear, the syntax of the template and constraint strings
4669 described here is *not* the same as the syntax accepted by GCC and Clang, and,
4670 while most constraint letters are passed through as-is by Clang, some get
4671 translated to other codes when converting from the C source to the LLVM
4674 An example inline assembler expression is:
4676 .. code-block:: llvm
4678 i32 (i32) asm "bswap $0", "=r,r"
4680 Inline assembler expressions may **only** be used as the callee operand
4681 of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
4682 Thus, typically we have:
4684 .. code-block:: llvm
4686 %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
4688 Inline asms with side effects not visible in the constraint list must be
4689 marked as having side effects. This is done through the use of the
4690 '``sideeffect``' keyword, like so:
4692 .. code-block:: llvm
4694 call void asm sideeffect "eieio", ""()
4696 In some cases inline asms will contain code that will not work unless
4697 the stack is aligned in some way, such as calls or SSE instructions on
4698 x86, yet will not contain code that does that alignment within the asm.
4699 The compiler should make conservative assumptions about what the asm
4700 might contain and should generate its usual stack alignment code in the
4701 prologue if the '``alignstack``' keyword is present:
4703 .. code-block:: llvm
4705 call void asm alignstack "eieio", ""()
4707 Inline asms also support using non-standard assembly dialects. The
4708 assumed dialect is ATT. When the '``inteldialect``' keyword is present,
4709 the inline asm is using the Intel dialect. Currently, ATT and Intel are
4710 the only supported dialects. An example is:
4712 .. code-block:: llvm
4714 call void asm inteldialect "eieio", ""()
4716 In the case that the inline asm might unwind the stack,
4717 the '``unwind``' keyword must be used, so that the compiler emits
4718 unwinding information:
4720 .. code-block:: llvm
4722 call void asm unwind "call func", ""()
4724 If the inline asm unwinds the stack and isn't marked with
4725 the '``unwind``' keyword, the behavior is undefined.
4727 If multiple keywords appear, the '``sideeffect``' keyword must come
4728 first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
4729 third and the '``unwind``' keyword last.
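For illustration, a call that uses all four keywords in the required order might look like the sketch below. The function name is hypothetical, and an x86 target is assumed because of ``inteldialect``.

.. code-block:: llvm

    define void @all_keywords() {
    entry:
      ; Order: sideeffect, alignstack, inteldialect, unwind.
      call void asm sideeffect alignstack inteldialect unwind "nop", ""()
      ret void
    }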
4731 Inline Asm Constraint String
4732 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4734 The constraint list is a comma-separated string, each element containing one or
4735 more constraint codes.
4737 For each element in the constraint list an appropriate register or memory
4738 operand will be chosen, and it will be made available to assembly template
4739 string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
4740 second, etc.
4742 There are three different types of constraints, which are distinguished by a
4743 prefix symbol in front of the constraint code: Output, Input, and Clobber. The
4744 constraints must always be given in that order: outputs first, then inputs, then
4745 clobbers. They cannot be intermingled.
4747 There are also three different categories of constraint codes:
4749 - Register constraint. This is either a register class, or a fixed physical
4750 register. This kind of constraint will allocate a register, and if necessary,
4751 bitcast the argument or result to the appropriate type.
4752 - Memory constraint. This kind of constraint is for use with an instruction
4753 taking a memory operand. Different constraints allow for different addressing
4754 modes used by the target.
4755 - Immediate value constraint. This kind of constraint is for an integer or other
4756 immediate value which can be rendered directly into an instruction. The
4757 various target-specific constraints allow the selection of a value in the
4758 proper range for the instruction you wish to use it with.
4760 Output constraints
4761 """"""""""""""""""
4763 Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
4764 indicates that the assembly will write to this operand, and the operand will
4765 then be made available as a return value of the ``asm`` expression. Output
4766 constraints do not consume an argument from the call instruction. (Except, see
4767 below about indirect outputs).
4769 Normally, it is expected that no output locations are written to by the assembly
4770 expression until *all* of the inputs have been read. As such, LLVM may assign
4771 the same register to an output and an input. If this is not safe (e.g. if the
4772 assembly contains two instructions, where the first writes to one output, and
4773 the second reads an input and writes to a second output), then the "``&``"
4774 modifier must be used (e.g. "``=&r``") to specify that the output is an
4775 "early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
4776 will not use the same register for any inputs (other than an input tied to this
4777 output).
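A sketch of a case that needs the early-clobber modifier, assuming an x86 target with AT&T syntax and a hypothetical function name: the first instruction writes the output before the second instruction has read its input.

.. code-block:: llvm

    define i32 @add_via_copy(i32 %a, i32 %b) {
    entry:
      ; $0 is written by movl before addl reads $2, so $0 must not share a
      ; register with either input: hence "=&r" rather than "=r".
      %r = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %r
    }

With a plain "``=r``" output, the output and the second input could legally be given the same register, and the ``movl`` would overwrite ``%b`` before it is added.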
4779 Input constraints
4780 """""""""""""""""
4782 Input constraints do not have a prefix -- just the constraint codes. Each input
4783 constraint will consume one argument from the call instruction. It is not
4784 permitted for the asm to write to any input register or memory location (unless
4785 that input is tied to an output). Note also that multiple inputs may all be
4786 assigned to the same register, if LLVM can determine that they necessarily all
4787 contain the same value.
4789 Instead of providing a Constraint Code, input constraints may also "tie"
4790 themselves to an output constraint, by providing an integer as the constraint
4791 string. Tied inputs still consume an argument from the call instruction, and
4792 take up a position in the asm template numbering as is usual -- they will simply
4793 be constrained to always use the same register as the output they've been tied
4794 to. For example, a constraint string of "``=r,0``" says to assign a register for
4795 output, and use that register as an input as well (it being the 0'th
4796 constraint).
4798 It is permitted to tie an input to an "early-clobber" output. In that case, no
4799 *other* input may share the same register as the input tied to the early-clobber
4800 (even when the other input has the same value).
4802 You may only tie an input to an output which has a register constraint, not a
4803 memory constraint. Only a single input may be tied to an output.
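A minimal sketch of a tied input, assuming an x86 target and a hypothetical function name. The instruction reads and writes the same register, which is exactly what the "``=r,0``" constraint string expresses.

.. code-block:: llvm

    define i32 @swap_bytes(i32 %y) {
    entry:
      ; "=r" is operand $0 (the result); "0" ties the single input to it,
      ; so %y is placed in the same register that bswap modifies in place.
      %x = call i32 asm "bswap $0", "=r,0"(i32 %y)
      ret i32 %x
    }

The tied input still consumes the call argument ``%y`` and would be numbered ``$1`` in the template, although it always names the same register as ``$0``.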
4805 There is also an "interesting" feature which deserves a bit of explanation: if a
4806 register class constraint allocates a register which is too small for the value
4807 type operand provided as input, the input value will be split into multiple
4808 registers, and all of them passed to the inline asm.
4810 However, this feature is often not as useful as you might think.
4812 Firstly, the registers are *not* guaranteed to be consecutive. So, on those
4813 architectures that have instructions which operate on multiple consecutive
4814 registers, this is not an appropriate way to support them. (e.g. the 32-bit
4815 SparcV8 has a 64-bit load instruction, which takes a single 32-bit register. The
4816 hardware then loads into both the named register, and the next register. This
4817 feature of inline asm would not be useful to support that.)
4819 A few of the targets provide a template string modifier allowing explicit access
4820 to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
4821 ``D``). On such an architecture, you can actually access the second allocated
4822 register (yet, still, not any subsequent ones). But, in that case, you're still
4823 probably better off simply splitting the value into two separate operands, for
4824 clarity. (e.g. see the description of the ``A`` constraint on X86, which,
4825 despite existing only for use with this feature, is not really a good idea to
4826 use.)
4828 Indirect inputs and outputs
4829 """""""""""""""""""""""""""
4831 Indirect output or input constraints can be specified by the "``*``" modifier
4832 (which goes after the "``=``" in case of an output). This indicates that the asm
4833 will write to or read from the contents of an *address* provided as an input
4834 argument. (Note that in this way, indirect outputs act more like an *input* than
4835 an output: just like an input, they consume an argument of the call expression,
4836 rather than producing a return value. An indirect output constraint is an
4837 "output" only in that the asm is expected to write to the contents of the input
4838 memory location, instead of just read from it).
4840 This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
4841 address of a variable as a value.
4843 It is also possible to use an indirect *register* constraint, but only on output
4844 (e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
4845 value normally, and then, separately emit a store to the address provided as
4846 input, after the provided inline asm. (It's not clear what value this
4847 functionality provides, compared to writing the store explicitly after the asm
4848 statement, and it can only produce worse code, since it bypasses many
4849 optimization passes. Using it is not recommended.)
4851 Call arguments for indirect constraints must have pointer type and must specify
4852 the :ref:`elementtype <attr_elementtype>` attribute to indicate the pointer
4853 element type.
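An indirect memory output might be written as in this sketch, which assumes an x86 target with AT&T syntax and uses a hypothetical function name:

.. code-block:: llvm

    define void @store_seven(ptr %p) {
    entry:
      ; "=*m": the asm writes through the pointer argument; nothing is returned.
      ; "$$" emits a literal "$", so the assembler sees the immediate $7.
      call void asm sideeffect "movl $$7, $0", "=*m"(ptr elementtype(i32) %p)
      ret void
    }

Although the constraint is an output, the pointer ``%p`` is passed as a call argument and the call itself has ``void`` type.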
4855 Clobber constraints
4856 """""""""""""""""""
4858 A clobber constraint is indicated by a "``~``" prefix. A clobber does not
4859 consume an input operand, nor generate an output. Clobbers cannot use any of the
4860 general constraint code letters -- they may use only explicit register
4861 constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
4862 "``~{memory}``" indicates that the assembly writes to arbitrary undeclared
4863 memory locations -- not only the memory pointed to by a declared indirect
4864 output.
4866 Note that clobbering named registers that are also present in output
4867 constraints is not legal.
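One common use of the memory clobber is a compiler-level barrier. The sketch below uses an empty template, so it does not depend on any particular target; the function name is hypothetical.

.. code-block:: llvm

    define void @compiler_barrier() {
    entry:
      ; No instructions are emitted, but the asm is treated as possibly
      ; writing to arbitrary memory.
      call void asm sideeffect "", "~{memory}"()
      ret void
    }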
4869 Label constraints
4870 """""""""""""""""
4872 A label constraint is indicated by a "``!``" prefix and typically used in the
4873 form ``"!i"``. Instead of consuming call arguments, label constraints consume
4874 indirect destination labels of ``callbr`` instructions.
4876 Label constraints can only be used in conjunction with ``callbr`` and the
4877 number of label constraints must match the number of indirect destination
4878 labels in the ``callbr`` instruction.
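As a hedged sketch, assuming an x86 target and hypothetical function and block names, a ``callbr`` with one input and one label constraint could look like this:

.. code-block:: llvm

    define i32 @branch_if_nonzero(i32 %x) {
    entry:
      ; $0 is the "r" input; $1 is the "!i" label, printed with the "l" modifier.
      callbr void asm sideeffect "testl $0, $0\0A\09jne ${1:l}", "r,!i,~{flags}"(i32 %x)
              to label %fallthrough [label %nonzero]

    fallthrough:
      ret i32 0

    nonzero:
      ret i32 1
    }

There is exactly one "``!i``" constraint and exactly one indirect destination, ``%nonzero``, so the counts match as required.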
4880 Constraint Codes
4881 """"""""""""""""
4883 After the optional prefix comes the constraint code, or codes.
4885 A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
4886 followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
4889 The one and two letter constraint codes are typically chosen to be the same as
4890 GCC's constraint codes.
4892 A single constraint may include one or more constraint codes, leaving
4893 it up to LLVM to choose which one to use. This is included mainly for
4894 compatibility with the translation of GCC inline asm coming from clang.
4896 There are two ways to specify alternatives, and either or both may be used in an
4897 inline asm constraint list:
4899 1) Append the codes to each other, making a constraint code set. E.g. "``im``"
4900 or "``{eax}m``". This means "choose any of the options in the set". The
4901 choice of constraint is made independently for each constraint in the
4902 constraint list.
4904 2) Use "``|``" between constraint code sets, creating alternatives. Every
4905 constraint in the constraint list must have the same number of alternative
4906 sets. With this syntax, the same alternative in *all* of the items in the
4907 constraint list will be chosen together.
4909 Putting those together, you might have a two operand constraint string like
4910 ``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
4911 operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
4912 may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m.
4914 However, the use of either of the alternatives features is *NOT* recommended, as
4915 LLVM is not able to make an intelligent choice about which one to use. (At the
4916 point it currently needs to choose, not enough information is available to do so
4917 in a smart way.) Thus, it simply tries to make a choice that's most likely to
4918 compile, not one that yields optimal performance. (e.g., given "``rm``", it'll
4919 always choose to use memory, not registers). And, if given multiple registers,
4920 or multiple register classes, it will simply choose the first one. (In fact, it
4921 doesn't currently even ensure explicitly specified physical registers are
4922 unique, so specifying multiple physical registers as alternatives, like
4923 ``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
4924 intended.)
4926 Supported Constraint Code List
4927 """"""""""""""""""""""""""""""
4929 The constraint codes are, in general, expected to behave the same way they do in
4930 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4931 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4932 and GCC likely indicates a bug in LLVM.
4934 Some constraint codes are typically supported by all targets:
4936 - ``r``: A register in the target's general purpose register class.
4937 - ``m``: A memory address operand. It is target-specific what addressing modes
4938 are supported, typical examples are register, or register + register offset,
4939 or register + immediate offset (of some target-specific size).
4940 - ``p``: An address operand. Similar to ``m``, but used by "load address"
4941 type instructions without touching memory.
4942 - ``i``: An integer constant (of target-specific width). Allows either a simple
4943 immediate, or a relocatable value.
4944 - ``n``: An integer constant -- *not* including relocatable values.
4945 - ``s``: An integer constant, but allowing *only* relocatable values.
4946 - ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
4947 useful to pass a label for an asm branch or call.
4949 .. FIXME: but that surely isn't actually okay to jump out of an asm
4950 block without telling llvm about the control transfer???)
4952 - ``{register-name}``: Requires exactly the named physical register.
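For example, the explicit-register form can pin an output to one physical register. This sketch assumes an x86 target with AT&T syntax; the function name is hypothetical.

.. code-block:: llvm

    define i32 @forty_two() {
    entry:
      ; "={eax}" requires the output to live in eax, so $0 expands to that register.
      %v = call i32 asm "movl $$42, $0", "={eax}"()
      ret i32 %v
    }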
4954 Other constraints are target-specific:
4956 AArch64:
4958 - ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
4959 - ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
4960 i.e. 0 to 4095 with optional shift by 12.
4961 - ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
4962 ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
4963 - ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
4964 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
4965 - ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
4966 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
4967 - ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
4968 32-bit register. This is a superset of ``K``: in addition to the bitmask
4969 immediate, also allows immediate integers which can be loaded with a single
4970 ``MOVZ`` or ``MOVN`` instruction.
4971 - ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
4972 64-bit register. This is a superset of ``L``.
4973 - ``Q``: Memory address operand must be in a single register (no
4974 offsets). (However, LLVM currently does this for the ``m`` constraint as
4975 well.)
4976 - ``r``: A 32 or 64-bit integer register (W* or X*).
4977 - ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
4978 - ``x``: Like w, but restricted to registers 0 to 15 inclusive.
4979 - ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
4980 - ``Upl``: One of the low eight SVE predicate registers (P0 to P7)
4981 - ``Upa``: Any of the SVE predicate registers (P0 to P15)
4983 AMDGPU:
4985 - ``r``: A 32 or 64-bit integer register.
4986 - ``[0-9]v``: The 32-bit VGPR register, number 0-9.
4987 - ``[0-9]s``: The 32-bit SGPR register, number 0-9.
4988 - ``[0-9]a``: The 32-bit AGPR register, number 0-9.
4989 - ``I``: An integer inline constant in the range from -16 to 64.
4990 - ``J``: A 16-bit signed integer constant.
4991 - ``A``: An integer or a floating-point inline constant.
4992 - ``B``: A 32-bit signed integer constant.
4993 - ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
4994 - ``DA``: A 64-bit constant that can be split into two "A" constants.
4995 - ``DB``: A 64-bit constant that can be split into two "B" constants.
4997 All ARM modes:
4999 - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
5000 operand. Treated the same as operand ``m``, at the moment.
5001 - ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
5002 - ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
5004 ARM and ARM's Thumb2 mode:
5006 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
5007 - ``I``: An immediate integer valid for a data-processing instruction.
5008 - ``J``: An immediate integer between -4095 and 4095.
5009 - ``K``: An immediate integer whose bitwise inverse is valid for a
5010 data-processing instruction. (Can be used with template modifier "``B``" to
5011 print the inverted value).
5012 - ``L``: An immediate integer whose negation is valid for a data-processing
5013 instruction. (Can be used with template modifier "``n``" to print the negated
5014 value).
5015 - ``M``: A power of two or an integer between 0 and 32.
5016 - ``N``: Invalid immediate constraint.
5017 - ``O``: Invalid immediate constraint.
5018 - ``r``: A general-purpose 32-bit integer register (``r0-r15``).
5019 - ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
5020 as ``r``.
5021 - ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
5022 invalid.
5023 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5024 ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
5025 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5026 ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
5027 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5028 ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
5030 ARM's Thumb1 mode:
5032 - ``I``: An immediate integer between 0 and 255.
5033 - ``J``: An immediate integer between -255 and -1.
5034 - ``K``: An immediate integer between 0 and 255, with optional left-shift by
5035 some amount.
5036 - ``L``: An immediate integer between -7 and 7.
5037 - ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
5038 - ``N``: An immediate integer between 0 and 31.
5039 - ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
5040 - ``r``: A low 32-bit GPR register (``r0-r7``).
5041 - ``l``: A low 32-bit GPR register (``r0-r7``).
5042 - ``h``: A high GPR register (``r8-r15``).
5043 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5044 ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
5045 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5046 ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
5047 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5048 ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
5050 Hexagon:
5052 - ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
5053 at the moment.
5054 - ``r``: A 32 or 64-bit register.
5056 LoongArch:
5058 - ``f``: A floating-point register (if available).
5059 - ``k``: A memory operand whose address is formed by a base register and
5060 (optionally scaled) index register.
5061 - ``l``: A signed 16-bit constant.
5062 - ``m``: A memory operand whose address is formed by a base register and
5063 offset that is suitable for use in instructions with the same addressing
5064 mode as st.w and ld.w.
5065 - ``I``: A signed 12-bit constant (for arithmetic instructions).
5066 - ``J``: An immediate integer zero.
5067 - ``K``: An unsigned 12-bit constant (for logic instructions).
5068 - ``ZB``: An address that is held in a general-purpose register. The offset
5070 - ``ZC``: A memory operand whose address is formed by a base register and
5071 offset that is suitable for use in instructions with the same addressing
5072 mode as ll.w and sc.w.
5076 - ``r``: An 8 or 16-bit register.
5080 - ``I``: An immediate signed 16-bit integer.
5081 - ``J``: An immediate integer zero.
5082 - ``K``: An immediate unsigned 16-bit integer.
5083 - ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
5084 - ``N``: An immediate integer between -65535 and -1.
5085 - ``O``: An immediate signed 15-bit integer.
5086 - ``P``: An immediate integer between 1 and 65535.
5087 - ``m``: A memory address operand. In MIPS-SE mode, allows a base address
5088 register plus 16-bit immediate offset. In MIPS mode, just a base register.
5089 - ``R``: A memory address operand. In MIPS-SE mode, allows a base address
5090 register plus a 9-bit signed offset. In MIPS mode, the same as constraint
5092 - ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
5093 ``sc`` instruction on the given subtarget (details vary).
5094 - ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
5095 - ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
5096 (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
5097 argument modifier for compatibility with GCC.
5098 - ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
5100 - ``l``: The ``lo`` register, 32 or 64-bit.
5105 - ``b``: A 1-bit integer register.
5106 - ``c`` or ``h``: A 16-bit integer register.
5107 - ``r``: A 32-bit integer register.
5108 - ``l`` or ``N``: A 64-bit integer register.
5109 - ``f``: A 32-bit float register.
5110 - ``d``: A 64-bit float register.
5115 - ``I``: An immediate signed 16-bit integer.
5116 - ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
5117 - ``K``: An immediate unsigned 16-bit integer.
5118 - ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
5119 - ``M``: An immediate integer greater than 31.
5120 - ``N``: An immediate integer that is an exact power of 2.
5121 - ``O``: The immediate integer constant 0.
5122 - ``P``: An immediate integer constant whose negation is a signed 16-bit
5124 - ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
5125 treated the same as ``m``.
5126 - ``r``: A 32 or 64-bit integer register.
5127 - ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
- ``f``: A 32 or 64-bit float register (``F0-F31``).
5130 - ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
5131 register (``V0-V31``).
5133 - ``y``: Condition register (``CR0-CR7``).
5134 - ``wc``: An individual CR bit in a CR register.
5135 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
5136 register set (overlapping both the floating-point and vector register files).
5137 - ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
5142 - ``A``: An address operand (using a general-purpose register, without an
5144 - ``I``: A 12-bit signed integer immediate operand.
5145 - ``J``: A zero integer immediate operand.
5146 - ``K``: A 5-bit unsigned integer immediate operand.
5147 - ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
5148 - ``r``: A 32- or 64-bit general-purpose register (depending on the platform
- ``vr``: A vector register (requires V extension).
- ``vm``: A vector register for a masking operand (requires V extension).
5155 - ``I``: An immediate 13-bit signed integer.
5156 - ``r``: A 32-bit integer register.
5157 - ``f``: Any floating-point register on SparcV8, or a floating-point
5158 register in the "low" half of the registers on SparcV9.
5159 - ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
5163 - ``I``: An immediate unsigned 8-bit integer.
5164 - ``J``: An immediate unsigned 12-bit integer.
5165 - ``K``: An immediate signed 16-bit integer.
5166 - ``L``: An immediate signed 20-bit integer.
5167 - ``M``: An immediate integer 0x7fffffff.
5168 - ``Q``: A memory address operand with a base address and a 12-bit immediate
5169 unsigned displacement.
5170 - ``R``: A memory address operand with a base address, a 12-bit immediate
5171 unsigned displacement, and an index register.
5172 - ``S``: A memory address operand with a base address and a 20-bit immediate
5173 signed displacement.
5174 - ``T``: A memory address operand with a base address, a 20-bit immediate
5175 signed displacement, and an index register.
5176 - ``r`` or ``d``: A 32, 64, or 128-bit integer register.
5177 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
5178 address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
5181 - ``f``: A 32, 64, or 128-bit floating-point register.
5185 - ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
5187 - ``K``: An immediate signed 8-bit integer.
5188 - ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
5190 - ``M``: An immediate integer between 0 and 3.
5191 - ``N``: An immediate unsigned 8-bit integer.
5192 - ``O``: An immediate integer between 0 and 127.
5193 - ``e``: An immediate 32-bit signed integer.
5194 - ``Z``: An immediate 32-bit unsigned integer.
5195 - ``o``, ``v``: Treated the same as ``m``, at the moment.
5196 - ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
5197 ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
5198 registers, and on X86-64, it is all of the integer registers.
5199 - ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
5200 ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
5201 - ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
5202 - ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
5203 existed since i386, and can be accessed without the REX prefix.
5204 - ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
5205 - ``y``: A 64-bit MMX register, if MMX is enabled.
5206 - ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
5207 operand in a SSE register. If AVX is also enabled, can also be a 256-bit
5208 vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
5210 - ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
5211 - ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
5212 32-bit mode, a 64-bit integer operand will get split into two registers). It
5213 is not recommended to use this constraint, as in 64-bit mode, the 64-bit
5214 operand will get allocated only to RAX -- if two 32-bit operands are needed,
5215 you're better off splitting it yourself, before passing it to the asm
5220 - ``r``: A 32-bit integer register.
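
As an illustration of how the target-specific constraint codes listed above
appear in an inline asm call, the following is a minimal sketch for X86 (AT&T
syntax). The function name ``@rotl5`` is hypothetical, and the clobber list is
an assumption modeled on common X86 usage. The ``I`` constraint accepts the
operand only because 5 lies in the range 0 to 31, ``=r`` requests any integer
register for the output, and ``0`` ties the last input to output operand 0.

.. code-block:: llvm

    ; Hypothetical helper: rotate %x left by 5 bits.
    define i32 @rotl5(i32 %x) {
      ; Operand 0: "=r" output register.
      ; Operand 1: "I" immediate in the range 0..31 (here, 5).
      ; Operand 2: "0" input tied to the same register as operand 0.
      %r = call i32 asm "roll $1, $0", "=r,I,0,~{dirflag},~{fpsr},~{flags}"(i32 5, i32 %x)
      ret i32 %r
    }
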
5223 .. _inline-asm-modifiers:
5225 Asm template argument modifiers
5226 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5228 In the asm template string, modifiers can be used on the operand reference, like
5231 The modifiers are, in general, expected to behave the same way they do in
5232 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
5233 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
5234 and GCC likely indicates a bug in LLVM.
5238 - ``c``: Print an immediate integer constant unadorned, without
5239 the target-specific immediate punctuation (e.g. no ``$`` prefix).
5240 - ``n``: Negate and print immediate integer constant unadorned, without the
5241 target-specific immediate punctuation (e.g. no ``$`` prefix).
5242 - ``l``: Print as an unadorned label, without the target-specific label
5243 punctuation (e.g. no ``$`` prefix).
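
A minimal sketch of the ``c`` modifier, assuming an X86 target where an
immediate operand is normally printed with a ``$`` prefix. The function name
``@emit_const`` and the symbol ``demo_const`` are hypothetical.

.. code-block:: llvm

    define void @emit_const() {
      ; "$0" alone would print the immediate with the target's punctuation.
      ; "${0:c}" prints the bare constant, which is what the ".set"
      ; assembler directive expects.
      call void asm sideeffect ".set demo_const, ${0:c}", "i"(i32 42)
      ret void
    }
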
5247 - ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
5248 instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with an ``x*`` name (this is the default).
5250 - ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
5251 ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
5260 - ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
5264 - ``y``: Print a VFP single-precision register as an indexed double (e.g. print
  as ``d4[1]`` instead of ``s9``).
5266 - ``B``: Bitwise invert and print an immediate integer constant without ``#``
5268 - ``L``: Print the low 16-bits of an immediate integer constant.
5269 - ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
5270 register operands subsequent to the specified one (!), so use carefully.
5271 - ``Q``: Print the low-order register of a register-pair, or the low-order
5272 register of a two-register operand.
5273 - ``R``: Print the high-order register of a register-pair, or the high-order
5274 register of a two-register operand.
5275 - ``H``: Print the second register of a register-pair. (On a big-endian system,
5276 ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent
5279 .. FIXME: H doesn't currently support printing the second register
5280 of a two-register operand.
5282 - ``e``: Print the low doubleword register of a NEON quad register.
5283 - ``f``: Print the high doubleword register of a NEON quad register.
5284 - ``m``: Print the base register of a memory operand without the ``[`` and ``]``
5289 - ``L``: Print the second register of a two-register operand. Requires that it
5290 has been allocated consecutively to the first.
5292 .. FIXME: why is it restricted to consecutive ones? And there's
5293 nothing that ensures that happens, is there?
5295 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5296 nothing. Used to print 'addi' vs 'add' instructions.
5300 - ``z``: Print $zero register if operand is zero, otherwise print it normally.
5304 No additional modifiers.
- ``X``: Print an immediate integer as hexadecimal.
5309 - ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
5310 - ``d``: Print an immediate integer as decimal.
5311 - ``m``: Subtract one and print an immediate integer as decimal.
5312 - ``z``: Print $0 if an immediate zero, otherwise print normally.
5313 - ``L``: Print the low-order register of a two-register operand, or prints the
5314 address of the low-order word of a double-word memory operand.
5316 .. FIXME: L seems to be missing memory operand support.
5318 - ``M``: Print the high-order register of a two-register operand, or prints the
5319 address of the high-order word of a double-word memory operand.
5321 .. FIXME: M seems to be missing memory operand support.
5323 - ``D``: Print the second register of a two-register operand, or prints the
5324 second word of a double-word memory operand. (On a big-endian system, ``D`` is
5325 equivalent to ``L``, and on little-endian system, ``D`` is equivalent to
5327 - ``w``: No effect. Provided for compatibility with GCC which requires this
5328 modifier in order to print MSA registers (``W0-W31``) with the ``f``
5337 - ``L``: Print the second register of a two-register operand. Requires that it
5338 has been allocated consecutively to the first.
5340 .. FIXME: why is it restricted to consecutive ones? And there's
5341 nothing that ensures that happens, is there?
5343 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5344 nothing. Used to print 'addi' vs 'add' instructions.
5345 - ``y``: For a memory operand, prints formatter for a two-register X-form
5346 instruction. (Currently always prints ``r0,OPERAND``).
5347 - ``U``: Prints 'u' if the memory operand is an update form, and nothing
5348 otherwise. (NOTE: LLVM does not support update form, so this will currently
5349 always print nothing)
5350 - ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
5351 not support indexed form, so this will currently always print nothing)
5355 - ``i``: Print the letter 'i' if the operand is not a register, otherwise print
5356 nothing. Used to print 'addi' vs 'add' instructions, etc.
5357 - ``z``: Print the register ``zero`` if an immediate zero, otherwise print
5366 SystemZ implements only ``n``, and does *not* support any of the other
5367 target-independent modifiers.
5371 - ``c``: Print an unadorned integer or symbol name. (The latter is
5372 target-specific behavior for this typically target-independent modifier).
5373 - ``A``: Print a register name with a '``*``' before it.
5374 - ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
5376 - ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
5378 - ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
5380 - ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
5382 - ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5383 available, otherwise the 32-bit register name; do nothing on a memory operand.
5384 - ``n``: Negate and print an unadorned integer, or, for operands other than an
5385 immediate integer (e.g. a relocatable symbol expression), print a '-' before
5386 the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier.)
5388 - ``H``: Print a memory reference with additional offset +8.
5389 - ``P``: Print a memory reference used as the argument of a call instruction or
5390 used with explicit base reg and index reg as its offset. So it can not use
5391 additional regs to present the memory reference. (E.g. omit ``(rip)``, even
5392 though it's PC-relative.)
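
A minimal sketch of the X86 register-size modifiers, in AT&T syntax. The
function name ``@zext_low_byte`` is hypothetical. The input uses the ``q``
constraint so that the allocated register has an 8-bit low sub-register, and
``${1:b}`` then prints that 8-bit name (for example ``al`` rather than
``eax``).

.. code-block:: llvm

    define i32 @zext_low_byte(i32 %x) {
      ; Operand 0: "=r" any integer register for the result.
      ; Operand 1: "q" a register that can be accessed as an 8-bit register.
      %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
      ret i32 %r
    }
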
5396 No additional modifiers.
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
5406 error reporting mechanisms. This allows a front-end to correlate backend
5407 errors that occur with inline asm back to the source code that produced
5410 .. code-block:: llvm
5412 call void asm sideeffect "something bad", ""(), !srcloc !42
5414 !42 = !{ i32 1234567 }
5416 It is up to the front-end to make sense of the magic numbers it places
5417 in the IR. If the MDNode contains multiple constants, the code generator
5418 will use the one that corresponds to the line of the asm that the error
5426 LLVM IR allows metadata to be attached to instructions and global objects in the
5427 program that can convey extra information about the code to the optimizers and
5428 code generator. One example application of metadata is source-level
5429 debug information. There are two metadata primitives: strings and nodes.
5431 Metadata does not have a type, and is not a value. If referenced from a
5432 ``call`` instruction, it uses the ``metadata`` type.
5434 All metadata are identified in syntax by an exclamation point ('``!``').
5436 .. _metadata-string:
5438 Metadata Nodes and Metadata Strings
5439 -----------------------------------
5441 A metadata string is a string surrounded by double quotes. It can
5442 contain any character by escaping non-printable characters with
5443 "``\xx``" where "``xx``" is the two digit hex code. For example:
5446 Metadata nodes are represented with notation similar to structure
5447 constants (a comma separated list of elements, surrounded by braces and
5448 preceded by an exclamation point). Metadata nodes can have any values as
5449 their operand. For example:
5451 .. code-block:: llvm
5453 !{ !"test\00", i32 10}
5455 Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
5457 .. code-block:: text
5459 !0 = distinct !{!"test\00", i32 10}
5461 ``distinct`` nodes are useful when nodes shouldn't be merged based on their
5462 content. They can also occur when transformations cause uniquing collisions
5463 when metadata operands change.
5465 A :ref:`named metadata <namedmetadatastructure>` is a collection of
5466 metadata nodes, which can be looked up in the module symbol table. For
5469 .. code-block:: llvm
5473 Metadata can be used as function arguments. Here the ``llvm.dbg.value``
5474 intrinsic is using three metadata arguments:
5476 .. code-block:: llvm
5478 call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
5480 Metadata can be attached to an instruction. Here metadata ``!21`` is attached
5481 to the ``add`` instruction using the ``!dbg`` identifier:
5483 .. code-block:: llvm
5485 %indvar.next = add i64 %indvar, 1, !dbg !21
5487 Instructions may not have multiple metadata attachments with the same
5490 Metadata can also be attached to a function or a global variable. Here metadata
5491 ``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
5492 and ``g2`` using the ``!dbg`` identifier:
5494 .. code-block:: llvm
5496 declare !dbg !22 void @f1()
5497 define void @f2() !dbg !22 {
5501 @g1 = global i32 0, !dbg !22
5502 @g2 = external global i32, !dbg !22
5504 Unlike instructions, global objects (functions and global variables) may have
5505 multiple metadata attachments with the same identifier.
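
As a sketch of multiple attachments with the same identifier on one global,
the following uses two ``!type`` attachments. The global name ``@vtable`` and
the type identifier strings are hypothetical.

.. code-block:: llvm

    ; Two attachments with the same identifier on a single global variable.
    @vtable = constant [2 x ptr] zeroinitializer, !type !0, !type !1

    !0 = !{i64 0, !"typeid.base"}
    !1 = !{i64 8, !"typeid.derived"}
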
A transformation is required to drop any metadata attachment that it
does not recognize or that it knows it cannot preserve. Currently there is an
5509 exception for metadata attachment to globals for ``!func_sanitize``,
5510 ``!type``, ``!absolute_symbol`` and ``!associated`` which can't be
5511 unconditionally dropped unless the global is itself deleted.
5513 Metadata attached to a module using named metadata may not be dropped, with
5514 the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
5516 More information about specific metadata nodes recognized by the
5517 optimizers and code generator is found below.
5519 .. _specialized-metadata:
5521 Specialized Metadata Nodes
5522 ^^^^^^^^^^^^^^^^^^^^^^^^^^
5524 Specialized metadata nodes are custom data structures in metadata (as opposed
5525 to generic tuples). Their fields are labelled, and can be specified in any
5528 These aren't inherently debug info centric, but currently all the specialized
5529 metadata nodes are related to debug info.
5536 ``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
5537 ``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
5538 containing the debug info to be emitted along with the compile unit, regardless
5539 of code optimizations (some nodes are only emitted if there are references to
5540 them from instructions). The ``debugInfoForProfiling:`` field is a boolean
5541 indicating whether or not line-table discriminators are updated to provide
5542 more-accurate debug info for profiling results.
5544 .. code-block:: text
5546 !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
5547 isOptimized: true, flags: "-O2", runtimeVersion: 2,
5548 splitDebugFilename: "abc.debug", emissionKind: FullDebug,
5549 enums: !2, retainedTypes: !3, globals: !4, imports: !5,
5550 macros: !6, dwoId: 0x0abcd)
5552 Compile unit descriptors provide the root scope for objects declared in a
5553 specific compilation unit. File descriptors are defined using this scope. These
5554 descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
5555 track of global variables, type information, and imported entities (declarations
5563 ``DIFile`` nodes represent files. The ``filename:`` can include slashes.
5565 .. code-block:: none
5567 !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
5568 checksumkind: CSK_MD5,
5569 checksum: "000102030405060708090a0b0c0d0e0f")
5571 Files are sometimes used in ``scope:`` fields, and are the only valid target
5572 for ``file:`` fields.
Valid values for the ``checksumkind:`` field are: {CSK_None, CSK_MD5, CSK_SHA1, CSK_SHA256}
5580 ``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
5581 ``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
5583 .. code-block:: text
5585 !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5586 encoding: DW_ATE_unsigned_char)
5587 !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
5589 The ``encoding:`` describes the details of the type. Usually it's one of the
5592 .. code-block:: text
5598 DW_ATE_signed_char = 6
5600 DW_ATE_unsigned_char = 8
5602 .. _DISubroutineType:
5607 ``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
5608 refers to a tuple; the first operand is the return type, while the rest are the
5609 types of the formal arguments in order. If the first operand is ``null``, that
5610 represents a function with no return value (such as ``void foo() {}`` in C++).
5612 .. code-block:: text
    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
5623 ``DIDerivedType`` nodes represent types derived from other types, such as
5626 .. code-block:: text
5628 !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5629 encoding: DW_ATE_unsigned_char)
5630 !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
5633 The following ``tag:`` values are valid:
5635 .. code-block:: text
5638 DW_TAG_pointer_type = 15
5639 DW_TAG_reference_type = 16
5641 DW_TAG_inheritance = 28
5642 DW_TAG_ptr_to_member_type = 31
5643 DW_TAG_const_type = 38
5645 DW_TAG_volatile_type = 53
5646 DW_TAG_restrict_type = 55
5647 DW_TAG_atomic_type = 71
5648 DW_TAG_immutable_type = 75
5650 .. _DIDerivedTypeMember:
5652 ``DW_TAG_member`` is used to define a member of a :ref:`composite type
5653 <DICompositeType>`. The type of the member is the ``baseType:``. The
5654 ``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
5656 uniqued based only on its ``name:`` and ``scope:``.
5658 ``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
5659 field of :ref:`composite types <DICompositeType>` to describe parents and
5662 ``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
5664 ``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
5665 ``DW_TAG_volatile_type``, ``DW_TAG_restrict_type``, ``DW_TAG_atomic_type`` and
5666 ``DW_TAG_immutable_type`` are used to qualify the ``baseType:``.
5668 Note that the ``void *`` type is expressed as a type derived from NULL.
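
The qualifier, typedef, and ``void *`` cases described above can be sketched
as a chain of derived types. The node numbering, the typedef name ``cstr``,
the file, and the 64-bit pointer size are assumptions made for illustration.

.. code-block:: text

    !0 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
    ; const char
    !1 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !0)
    ; const char *
    !2 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !1, size: 64)
    ; typedef const char *cstr;
    !3 = !DIDerivedType(tag: DW_TAG_typedef, name: "cstr", file: !5, line: 1,
                        baseType: !2)
    ; void * (a pointer whose base type is null)
    !4 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
    !5 = !DIFile(filename: "types.c", directory: "/path/to/dir")
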
5670 .. _DICompositeType:
5675 ``DICompositeType`` nodes represent types composed of other types, like
5676 structures and unions. ``elements:`` points to a tuple of the composed types.
5678 If the source language supports ODR, the ``identifier:`` field gives the unique
5679 identifier used for type merging between modules. When specified,
5680 :ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
5681 derived types <DIDerivedTypeMember>` that reference the ODR-type in their
5682 ``scope:`` change uniquing rules.
5684 For a given ``identifier:``, there should only be a single composite type that
5685 does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
5686 together will unique such definitions at parse time via the ``identifier:``
5687 field, even if the nodes are ``distinct``.
5689 .. code-block:: text
    !0 = !DIEnumerator(name: "SixKind", value: 6)
5692 !1 = !DIEnumerator(name: "SevenKind", value: 7)
5693 !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5694 !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
5695 line: 2, size: 32, align: 32, identifier: "_M4Enum",
5696 elements: !{!0, !1, !2})
5698 The following ``tag:`` values are valid:
5700 .. code-block:: text
5702 DW_TAG_array_type = 1
5703 DW_TAG_class_type = 2
5704 DW_TAG_enumeration_type = 4
5705 DW_TAG_structure_type = 19
5706 DW_TAG_union_type = 23
5708 For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
5709 descriptors <DISubrange>`, each representing the range of subscripts at that
5710 level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
5711 array type is a native packed vector. The optional ``dataLocation`` is a
5712 DIExpression that describes how to get from an object's address to the actual
5713 raw data, if they aren't equivalent. This is only supported for array types,
5714 particularly to describe Fortran arrays, which have an array descriptor in
addition to the array data. Alternatively, it can be a DIVariable which
has the address of the actual raw data. Fortran supports pointer
arrays that can be attached to actual arrays; this attachment between pointer
and pointee is called association. The optional ``associated`` is a
DIExpression that describes whether the pointer array is currently associated.
The optional ``allocated`` is a DIExpression that describes whether the
allocatable array is currently allocated. The optional ``rank`` is a
DIExpression that describes the rank (number of dimensions) of a Fortran
assumed-rank array (the rank is known only at runtime).
5725 For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
5726 descriptors <DIEnumerator>`, each representing the definition of an enumeration
5727 value for the set. All enumeration type descriptors are collected in the
5728 ``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
5730 For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
5731 ``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
5732 <DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
5733 ``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
5734 ``isDefinition: false``.
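
A minimal sketch of a structure type whose ``elements:`` are ``DW_TAG_member``
derived types, roughly corresponding to ``struct Point { int x; int y; };``.
The node numbering, names, file, and line numbers are assumptions made for
illustration.

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DICompositeType(tag: DW_TAG_structure_type, name: "Point", file: !4,
                          line: 1, size: 64, elements: !{!2, !3})
    ; Each member names the composite type as its scope and gives a bit offset.
    !2 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !1, file: !4,
                        line: 2, baseType: !0, size: 32)
    !3 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !1, file: !4,
                        line: 3, baseType: !0, size: 32, offset: 32)
    !4 = !DIFile(filename: "point.c", directory: "/path/to/dir")
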
5741 ``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
5742 :ref:`DICompositeType`.
5744 - ``count: -1`` indicates an empty array.
5745 - ``count: !10`` describes the count with a :ref:`DILocalVariable`.
5746 - ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
5748 .. code-block:: text
5750 !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
5751 !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
5752 !2 = !DISubrange(count: -1) ; empty array.
5754 ; Scopes used in rest of example
5755 !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
5756 !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
5757 !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
5759 ; Use of local variable as count value
5760 !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5761 !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
5762 !11 = !DISubrange(count: !10, lowerBound: 0)
5764 ; Use of global variable as count value
5765 !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
5766 !13 = !DISubrange(count: !12, lowerBound: 0)
5773 ``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
5774 variants of :ref:`DICompositeType`.
5776 .. code-block:: text
    !0 = !DIEnumerator(name: "SixKind", value: 6)
5779 !1 = !DIEnumerator(name: "SevenKind", value: 7)
5780 !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5782 DITemplateTypeParameter
5783 """""""""""""""""""""""
5785 ``DITemplateTypeParameter`` nodes represent type parameters to generic source
5786 language constructs. They are used (optionally) in :ref:`DICompositeType` and
5787 :ref:`DISubprogram` ``templateParams:`` fields.
5789 .. code-block:: text
5791 !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
5793 DITemplateValueParameter
5794 """"""""""""""""""""""""
5796 ``DITemplateValueParameter`` nodes represent value parameters to generic source
5797 language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
5798 but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
5799 ``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
5800 :ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
5802 .. code-block:: text
5804 !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
5809 ``DINamespace`` nodes represent namespaces in the source language.
5811 .. code-block:: text
5813 !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
5815 .. _DIGlobalVariable:
5820 ``DIGlobalVariable`` nodes represent global variables in the source language.
5822 .. code-block:: text
5824 @foo = global i32, !dbg !0
5825 !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
5826 !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
5827 file: !3, line: 7, type: !4, isLocal: true,
5828 isDefinition: false, declaration: !5)
5831 DIGlobalVariableExpression
5832 """"""""""""""""""""""""""
5834 ``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
5835 with a :ref:`DIExpression`.
5837 .. code-block:: text
5839 @lower = global i32, !dbg !0
5840 @upper = global i32, !dbg !1
5841 !0 = !DIGlobalVariableExpression(
5843 expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
5845 !1 = !DIGlobalVariableExpression(
5847 expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
5849 !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
5850 file: !4, line: 8, type: !5, declaration: !6)
All global variable expressions should be referenced by the ``globals:`` field of
5853 a :ref:`compile unit <DICompileUnit>`.
5860 ``DISubprogram`` nodes represent functions from the source language. A distinct
5861 ``DISubprogram`` may be attached to a function definition using ``!dbg``
5862 metadata. A unique ``DISubprogram`` may be attached to a function declaration
5863 used for call site debug info. The ``retainedNodes:`` field is a list of
5864 :ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
5865 retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.
5868 .. _DISubprogramDeclaration:
5870 When ``spFlags: DISPFlagDefinition`` is not present, subprograms describe a
5871 declaration in the type tree as opposed to a definition of a function. In this
5872 case, the ``declaration`` field must be empty. If the scope is a composite type
with an ODR ``identifier:`` and that does not set ``flags: DIFlagFwdDecl``, then
5874 the subprogram declaration is uniqued based only on its ``linkageName:`` and
5877 .. code-block:: text
5879 define void @_Z3foov() !dbg !0 {
    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
5884 file: !2, line: 7, type: !3,
5885 spFlags: DISPFlagDefinition | DISPFlagLocalToUnit,
5886 scopeLine: 8, containingType: !4,
5887 virtuality: DW_VIRTUALITY_pure_virtual,
5888 virtualIndex: 10, flags: DIFlagPrototyped,
5889 isOptimized: true, unit: !5, templateParams: !6,
5890 declaration: !7, retainedNodes: !8,
``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
5903 .. code-block:: text
5905 !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
5907 Usually lexical blocks are ``distinct`` to prevent node merging based on
5910 .. _DILexicalBlockFile:
5915 ``DILexicalBlockFile`` nodes are used to discriminate between sections of a
5916 :ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
5917 indicate textual inclusion, or the ``discriminator:`` field can be used to
5918 discriminate between control flow within a single block in the source language.
5920 .. code-block:: text
5922 !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
5923 !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
5924 !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
5931 ``DILocation`` nodes represent source debug locations. The ``scope:`` field is
5932 mandatory, and points at a :ref:`DILexicalBlockFile`, a
5933 :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5935 .. code-block:: text
5937 !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
5939 .. _DILocalVariable:
5944 ``DILocalVariable`` nodes represent local variables in the source language. If
5945 the ``arg:`` field is set to non-zero, then this variable is a subprogram
5946 parameter, and it will be included in the ``retainedNodes:`` field of its
5947 :ref:`DISubprogram`.
5949 .. code-block:: text
5951 !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
5952 type: !3, flags: DIFlagArtificial)
5953 !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
5955 !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
5962 ``DIExpression`` nodes represent expressions that are inspired by the DWARF
5963 expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
5964 (such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
5965 referenced LLVM variable relates to the source language variable. Debug
5966 expressions are interpreted left-to-right: start by pushing the value/address
5967 operand of the intrinsic onto a stack, then repeatedly push and evaluate
5968 opcodes from the DIExpression until the final variable description is produced.
5970 The currently supported opcode vocabulary is limited:
5972 - ``DW_OP_deref`` dereferences the top of the expression stack.
5973 - ``DW_OP_plus`` pops the last two entries from the expression stack, adds
5974 them together and appends the result to the expression stack.
5975 - ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
5976 the last entry from the second last entry and appends the result to the
5978 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
5979 - ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
5980 here, respectively) of the variable fragment from the working expression. Note
5981 that contrary to DW_OP_bit_piece, the offset is describing the location
5982 within the described source variable.
5983 - ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
5984 (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
5985 expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
5986 that references a base type constructed from the supplied values.
5987 - ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
5988 optionally applied to the pointer. The memory tag is derived from the
5989 given tag offset in an implementation-defined manner.
5990 - ``DW_OP_swap`` swaps the top two stack entries.
5991 - ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the top
5992 of the stack is treated as an address. The second stack entry is treated as an
5993 address space identifier.
5994 - ``DW_OP_stack_value`` marks a constant value.
5995 - ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
5996 function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
5997 DIExpression(DW_OP_LLVM_entry_value, 1, ...))`` is lowered to
5998 ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
5999 function entry onto the DWARF expression stack.
6001 The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
6002 block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
6003 DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
6004 the entry value of ``reg`` is pushed onto the stack, and is added with 123.
6005 Due to framework limitations ``N`` must be 1, in other words,
6006 ``DW_OP_entry_value`` always refers to the value/address operand of the
6009 Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
6010 usually used in MIR, but it is also allowed in LLVM IR when targeting a
6011 :ref:`swiftasync <swiftasync>` argument. The operation is introduced by:
6013 - ``LiveDebugValues`` pass, which applies it to function parameters that
6014 are unmodified throughout the function. Support is limited to simple
6015 register location descriptions, or as indirect locations (e.g.,
6016 parameters passed-by-value to a callee via a pointer to a temporary copy
6017 made in the caller).
6018 - ``AsmPrinter`` pass when a call site parameter value
6019 (``DW_AT_call_site_parameter_value``) is represented as entry value of
6021 - ``CoroSplit`` pass, which may move variables from allocas into a
6022 coroutine frame. If the coroutine frame is a
6023 :ref:`swiftasync <swiftasync>` argument, the variable is described with
6024 a ``DW_OP_LLVM_entry_value`` operation.
6026 - ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
6027 value, such as one that calculates the sum of two registers. This is always
6028 used in combination with an ordered list of values, such that
6029 ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
6030 example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
6031 DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
6032 ``%reg1 - %reg2``. This list of values should be provided by the containing
6033 intrinsic/instruction.
6034 - ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the contents at the provided
6035 signed offset from the specified register. The opcode is only generated by the
6036 ``AsmPrinter`` pass to describe a call site parameter value that requires an
6037 expression over two registers.
6038 - ``DW_OP_push_object_address`` pushes the address of the object, which can then
6039 serve as a descriptor in subsequent calculations. This opcode can be used to
6040 calculate the bounds of a Fortran allocatable array that has an array descriptor.
6041 - ``DW_OP_over`` duplicates the entry currently second in the stack at the top
6042 of the stack. This opcode can be used to calculate the bounds of a Fortran
6043 assumed-rank array whose rank is known only at run time, where the current
6044 dimension number is implicitly the first element of the stack.
6045 - ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
6046 be used to represent pointer variables that are optimized out when the value
6047 they point to is known. This operator is needed because it differs from the DWARF
6048 operator ``DW_OP_implicit_pointer`` in representation and specification (number
6049 and types of operands), and the latter cannot be used for multiple levels of indirection.
6051 .. code-block:: text
6055 call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
6056 !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
6058 !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
6059 !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6060 !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)
6064 call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
6065 !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
6067 !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
6068 !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
6069 !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6070 !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
6071 DW_OP_LLVM_implicit_pointer))
6073 DWARF specifies three kinds of simple location descriptions: register, memory,
6074 and implicit location descriptions. Note that a location description is
6075 defined over certain ranges of a program, i.e., the location of a variable may
6076 change over the course of the program. Register and memory location
6077 descriptions describe the *concrete location* of a source variable (in the
6078 sense that a debugger might modify its value), whereas *implicit locations*
6079 describe merely the actual *value* of a source variable which might not exist
6080 in registers or in memory (see ``DW_OP_stack_value``).
6082 An ``llvm.dbg.declare`` intrinsic describes an indirect value (the address) of a
6083 source variable. The first operand of the intrinsic must be an address of some
6084 kind. A DIExpression attached to the intrinsic refines this address to produce a
6085 concrete location for the source variable.
6087 An ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
6088 The first operand of the intrinsic may be a direct or indirect value. A
6089 DIExpression attached to the intrinsic refines the first operand to produce a
6090 direct value. For example, if the first operand is an indirect value, it may be
6091 necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
6092 valid debug intrinsic.
6096 A DIExpression is interpreted in the same way regardless of which kind of
6097 debug intrinsic it's attached to.
6099 .. code-block:: text
6101 !0 = !DIExpression(DW_OP_deref)
6102 !1 = !DIExpression(DW_OP_plus_uconst, 3)
6103 !1 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
6104 !2 = !DIExpression(DW_OP_bit_piece, 3, 7)
6105 !3 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
6106 !4 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
6107 !5 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
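The left-to-right evaluation described above can be sketched in Python. This is a simplified model under stated assumptions, not LLVM's implementation: it handles only a handful of the opcodes listed in this section, the ``memory`` dictionary is a hypothetical stand-in for target memory, and the function name ``eval_diexpression`` is invented for illustration.

```python
def eval_diexpression(ops, operand, memory=None):
    """Evaluate a small subset of DIExpression opcodes, left to right.

    `ops` is a flat list such as ["DW_OP_plus_uconst", 3]; `operand` is the
    value/address operand of the debug intrinsic; `memory` is a hypothetical
    address -> value mapping used only by DW_OP_deref.
    """
    memory = memory or {}
    stack = [operand]      # start by pushing the intrinsic's operand
    is_value = False       # set once DW_OP_stack_value is seen
    i = 0
    while i < len(ops):
        op = ops[i]
        i += 1
        if op == "DW_OP_constu":
            stack.append(ops[i])
            i += 1
        elif op == "DW_OP_plus_uconst":
            stack.append(stack.pop() + ops[i])
            i += 1
        elif op == "DW_OP_plus":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "DW_OP_minus":
            # Subtract the last entry from the second-last entry.
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs - rhs)
        elif op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DW_OP_over":
            stack.append(stack[-2])
        elif op == "DW_OP_deref":
            stack.append(memory[stack.pop()])
        elif op == "DW_OP_stack_value":
            is_value = True
        else:
            raise ValueError(f"opcode not modelled in this sketch: {op}")
    return stack[-1], is_value
```

Under this model, ``DW_OP_plus_uconst, 3`` and ``DW_OP_constu, 3, DW_OP_plus`` produce the same result for any operand, which matches the two equivalent spellings shown in the examples.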
6112 ``DIAssignID`` nodes have no operands and are always distinct. They are used to
6113 link together ``@llvm.dbg.assign`` intrinsics (:ref:`debug
6114 intrinsics<dbg_intrinsics>`) and instructions that store in IR. See `Debug Info
6115 Assignment Tracking <AssignmentTracking.html>`_ for more info.
6117 .. code-block:: llvm
6119 store i32 %a, ptr %a.addr, align 4, !DIAssignID !2
6120 llvm.dbg.assign(metadata %a, metadata !1, metadata !DIExpression(), !2, metadata %a.addr, metadata !DIExpression()), !dbg !3
6122 !2 = distinct !DIAssignID()
6127 ``DIArgList`` nodes hold a list of constant or SSA value references. These are
6128 used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in
6129 ``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the
6130 ``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
6131 within a function, it must only be used as a function argument, must always be
6132 inlined, and cannot appear in named metadata.
6134 .. code-block:: text
6136 llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b),
6138 metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus))
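How ``DW_OP_LLVM_arg`` indexes into a ``DIArgList`` can be illustrated with a short Python sketch. It is only a model: the function name ``eval_with_arglist`` is hypothetical, plain integers stand in for SSA values, and only the opcodes needed for the examples in this section are handled.

```python
def eval_with_arglist(ops, args):
    """Evaluate a DIExpression that reads its inputs via DW_OP_LLVM_arg.

    `args` plays the role of the DIArgList; "DW_OP_LLVM_arg", N pushes
    args[N] onto the expression stack.
    """
    stack = []
    i = 0
    while i < len(ops):
        op = ops[i]
        i += 1
        if op == "DW_OP_LLVM_arg":
            stack.append(args[ops[i]])
            i += 1
        elif op == "DW_OP_plus":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "DW_OP_minus":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs - rhs)
        elif op == "DW_OP_stack_value":
            pass  # marks the result as a value; no effect on the stack
        else:
            raise ValueError(f"opcode not modelled in this sketch: {op}")
    return stack[-1]
```

With the list ``(5, 7)`` standing in for ``(%a, %b)``, the expression from the example above evaluates to their sum.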
6143 These flags encode various properties of DINodes.
6145 The ``ExportSymbols`` flag marks a class, struct or union whose members
6146 may be referenced as if they were defined in the containing class or
6147 union. This flag is used to decide whether the ``DW_AT_export_symbols``
6148 attribute can be used for the structure type.
6153 ``DIObjCProperty`` nodes represent Objective-C property nodes.
6155 .. code-block:: text
6157 !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
6158 getter: "getFoo", attributes: 7, type: !2)
6163 ``DIImportedEntity`` nodes represent entities (such as modules) imported into a
6164 compile unit. The ``elements`` field is a list of renamed entities (such as
6165 variables and subprograms) in the imported entity (such as a module).
6167 .. code-block:: text
6169 !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
6170 entity: !1, line: 7, elements: !3)
6172 !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
6173 entity: !5, line: 7)
6178 ``DIMacro`` nodes represent the definition or undefinition of a macro identifier.
6179 The ``name:`` field is the macro identifier, followed by macro parameters when
6180 defining a function-like macro, and the ``value`` field is the token-string
6181 used to expand the macro identifier.
6183 .. code-block:: text
6185 !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
6187 !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")
6192 ``DIMacroFile`` nodes represent inclusion of source files.
6193 The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
6194 appear in the included source file.
6196 .. code-block:: text
6198 !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2,
6206 ``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
6207 a ``DILabel`` are mandatory. The ``scope:`` field must be a
6208 :ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6209 The ``name:`` field is the label identifier. The ``file:`` field is the
6210 :ref:`DIFile` the label is present in. The ``line:`` field is the source line
6211 within the file where the label is declared.
6213 .. code-block:: text
6215 !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
6220 In LLVM IR, memory does not have types, so LLVM's own type system is not
6221 suitable for doing type based alias analysis (TBAA). Instead, metadata is
6222 added to the IR to describe a type system of a higher level language. This
6223 can be used to implement C/C++ strict type aliasing rules, but it can also
6224 be used to implement custom alias analysis behavior for other languages.
6226 This description of LLVM's TBAA system is broken into two parts:
6227 :ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
6228 :ref:`Representation<tbaa_node_representation>` talks about the metadata
6229 encoding of various entities.
6231 It is always possible to trace any TBAA node to a "root" TBAA node (details
6232 in the :ref:`Representation<tbaa_node_representation>` section). TBAA
6233 nodes with different roots have an unknown aliasing relationship, and LLVM
6234 conservatively infers ``MayAlias`` between them. The rules mentioned in
6235 this section only pertain to TBAA nodes living under the same root.
6237 .. _tbaa_node_semantics:
6242 The TBAA metadata system, referred to as "struct path TBAA" (not to be
6243 confused with ``tbaa.struct``), consists of the following high level
6244 concepts: *Type Descriptors*, further subdivided into scalar type
6245 descriptors and struct type descriptors; and *Access Tags*.
6247 **Type descriptors** describe the type system of the higher level language
6248 being compiled. **Scalar type descriptors** describe types that do not
6249 contain other types. Each scalar type has a parent type, which must also
6250 be a scalar type or the TBAA root. Via this parent relation, scalar types
6251 within a TBAA root form a tree. **Struct type descriptors** denote types
6252 that contain a sequence of other type descriptors, at known offsets. These
6253 contained type descriptors can either be struct type descriptors themselves
6254 or scalar type descriptors.
6256 **Access tags** are metadata nodes attached to load and store instructions.
6257 Access tags use type descriptors to describe the *location* being accessed
6258 in terms of the type system of the higher level language. Access tags are
6259 tuples consisting of a base type, an access type and an offset. The base
6260 type is a scalar type descriptor or a struct type descriptor, the access
6261 type is a scalar type descriptor, and the offset is a constant integer.
6263 The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
6266 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
6267 or store) of a value of type ``AccessTy`` contained in the struct type
6268 ``BaseTy`` at offset ``Offset``.
6270 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
6271 ``AccessTy`` must be the same; and the access tag describes a scalar
6272 access with scalar type ``AccessTy``.
6274 We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
6277 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
6278 ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
6279 described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
6280 undefined if ``Offset`` is non-zero.
6282 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
6283 is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
6284 ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
6285 to be relative within that inner type.
6287 A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
6288 aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
6289 Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
6290 Offset2)`` via the ``ImmediateParent`` relation or vice versa.
6292 As a concrete example, the type descriptor graph for the following program
6298 float f; // offset 4
6302 float f; // offset 0
6303 double d; // offset 4
6304 struct Inner inner_a; // offset 12
6307 void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
6308 outer->f = 0; // tag0: (OuterStructTy, FloatScalarTy, 0)
6309 outer->inner_a.i = 0; // tag1: (OuterStructTy, IntScalarTy, 12)
6310 outer->inner_a.f = 0.0; // tag2: (OuterStructTy, FloatScalarTy, 16)
6311 *f = 0.0; // tag3: (FloatScalarTy, FloatScalarTy, 0)
6314 is (note that in C and C++, ``char`` can be used to access any arbitrary
6317 .. code-block:: text
6320 CharScalarTy = ("char", Root, 0)
6321 FloatScalarTy = ("float", CharScalarTy, 0)
6322 DoubleScalarTy = ("double", CharScalarTy, 0)
6323 IntScalarTy = ("int", CharScalarTy, 0)
6324 InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
6325 OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
6326 (InnerStructTy, 12)}
6329 with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
6330 0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
6331 ``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
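The ``ImmediateParent`` relation and the aliasing rule can be sketched in Python using the type descriptor graph above. This is an illustrative model, not LLVM's TBAA implementation. Two simplifications are assumptions of the sketch: a struct offset is resolved to the last field that starts at or before it (field sizes are not modelled), and the walk stops once the TBAA root would be reached.

```python
ROOT = "Root"

# Scalar descriptors map a name to its parent; struct descriptors map a
# name to a list of (field type, offset) pairs.
SCALARS = {
    "CharScalarTy": ROOT,
    "FloatScalarTy": "CharScalarTy",
    "DoubleScalarTy": "CharScalarTy",
    "IntScalarTy": "CharScalarTy",
}
STRUCTS = {
    "InnerStructTy": [("IntScalarTy", 0), ("FloatScalarTy", 4)],
    "OuterStructTy": [("FloatScalarTy", 0), ("DoubleScalarTy", 4),
                      ("InnerStructTy", 12)],
}


def immediate_parent(base, offset):
    """Return ImmediateParent(base, offset), or None where the walk ends."""
    if base in SCALARS:
        parent = SCALARS[base]
        if offset != 0 or parent == ROOT:
            return None
        return (parent, 0)
    # Struct: pick the last field starting at or before `offset`.
    candidates = [(ty, off) for ty, off in STRUCTS[base] if off <= offset]
    if not candidates:
        return None
    ty, off = max(candidates, key=lambda field: field[1])
    return (ty, offset - off)


def reachable(src, dst):
    """True if `dst` is reached from `src` by following ImmediateParent."""
    node = src
    while node is not None:
        if node == dst:
            return True
        node = immediate_parent(*node)
    return False


def may_alias(tag1, tag2):
    """Apply the aliasing rule to two (BaseTy, AccessTy, Offset) tags."""
    a = (tag1[0], tag1[2])
    b = (tag2[0], tag2[2])
    return reachable(a, b) or reachable(b, a)
```

In this model ``tag2`` may alias ``tag3`` because ``(FloatScalarTy, 0)`` is reachable from ``(OuterStructTy, 16)``, whereas ``tag1`` and ``tag3`` do not alias: neither pair is reachable from the other.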
6333 .. _tbaa_node_representation:
6338 The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
6339 with exactly one ``MDString`` operand.
6341 Scalar type descriptors are represented as ``MDNode`` s with two
6342 operands. The first operand is an ``MDString`` denoting the name of the
6343 scalar type. LLVM does not assign meaning to the value of this operand; it
6344 only cares about it being an ``MDString``. The second operand is an
6345 ``MDNode`` which points to the parent for said scalar type descriptor,
6346 which is either another scalar type descriptor or the TBAA root. Scalar
6347 type descriptors can have an optional third operand, but that must be the
6348 constant integer zero.
6350 Struct type descriptors are represented as ``MDNode`` s with an odd number
6351 of operands greater than 1. The first operand is an ``MDString`` denoting
6352 the name of the struct type. Like in scalar type descriptors the actual
6353 value of this name operand is irrelevant to LLVM. After the name operand,
6354 the struct type descriptors have a sequence of alternating ``MDNode`` and
6355 ``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
6356 an ``MDNode``, denotes a contained field, and the 2N th operand, a
6357 ``ConstantInt``, is the offset of the said contained field. The offsets
6358 must be in non-decreasing order.
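The operand layout of a struct type descriptor can be checked with a few lines of Python. This is a sketch only: the function name ``decode_struct_descriptor`` is hypothetical, and a plain list with strings and integers stands in for the ``MDNode`` operands.

```python
def decode_struct_descriptor(operands):
    """Split a struct type descriptor's operands into (name, fields).

    `operands` models the MDNode operand list: a name string followed by
    alternating (field descriptor, offset) entries.
    """
    if len(operands) <= 1 or len(operands) % 2 == 0:
        raise ValueError("expected an odd number of operands greater than 1")
    name = operands[0]
    if not isinstance(name, str):
        raise ValueError("first operand must be the name string")
    # Operand 2N - 1 is the field, operand 2N is its offset (N from 1).
    fields = list(zip(operands[1::2], operands[2::2]))
    offsets = [offset for _, offset in fields]
    if offsets != sorted(offsets):
        raise ValueError("field offsets must be in non-decreasing order")
    return name, fields
```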
6360 Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
6361 The first operand is an ``MDNode`` pointing to the node representing the
6362 base type. The second operand is an ``MDNode`` pointing to the node
6363 representing the access type. The third operand is a ``ConstantInt`` that
6364 states the offset of the access. If a fourth field is present, it must be
6365 a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
6366 that the location being accessed is "constant" (meaning
6367 ``pointsToConstantMemory`` should return true; see `other useful
6368 AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
6369 the access type and the base type of an access tag must be the same, and
6370 that is the TBAA root of the access tag.
6372 '``tbaa.struct``' Metadata
6373 ^^^^^^^^^^^^^^^^^^^^^^^^^^
6375 The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
6376 aggregate assignment operations in C and similar languages; however, it
6377 is defined to copy a contiguous region of memory, which is more than
6378 strictly necessary for aggregate types which contain holes due to
6379 padding. Also, it doesn't contain any TBAA information about the fields
6382 ``!tbaa.struct`` metadata can describe which memory subregions in a
6383 memcpy are padding and what the TBAA tags of the struct are.
6385 The current metadata format is very simple. ``!tbaa.struct`` metadata
6386 nodes are a list of operands which are in conceptual groups of three.
6387 For each group of three, the first operand gives the offset of a
6388 field in bytes, the second gives its size in bytes, and the third gives
6391 .. code-block:: llvm
6393 !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
6395 This describes a struct with two fields. The first is at offset 0 bytes
6396 with size 4 bytes, and has TBAA tag !1. The second is at offset 8 bytes
6397 and has size 4 bytes and has TBAA tag !2.
6399 Note that the fields need not be contiguous. In this example, there is a
6400 4 byte gap between the two fields. This gap represents padding which
6401 does not carry useful data and need not be preserved.
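The grouping of operands and the resulting padding can be worked through in Python. This is a minimal sketch: the function names are hypothetical, and strings such as ``"!1"`` stand in for the TBAA tag nodes.

```python
def parse_tbaa_struct(operands):
    """Group a flat !tbaa.struct operand list into (offset, size, tag)."""
    if len(operands) % 3 != 0:
        raise ValueError("operands must come in groups of three")
    return [tuple(operands[i:i + 3]) for i in range(0, len(operands), 3)]


def padding_gaps(fields):
    """Return (offset, size) for each gap not covered by any field."""
    gaps = []
    end = 0
    for offset, size, _tag in sorted(fields, key=lambda field: field[0]):
        if offset > end:
            gaps.append((end, offset - end))
        end = max(end, offset + size)
    return gaps
```

For the node shown above, the model reports a single gap of 4 bytes starting at offset 4, matching the padding described in the text.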
6403 '``noalias``' and '``alias.scope``' Metadata
6404 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6406 ``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
6407 noalias memory-access sets. This means that some collection of memory access
6408 instructions (loads, stores, memory-accessing calls, etc.) that carry
6409 ``noalias`` metadata can be specified not to alias with some other
6410 collection of memory access instructions that carry ``alias.scope`` metadata.
6411 Each type of metadata specifies a list of scopes where each scope has an id and
6414 When evaluating an aliasing query, if for some domain, the set
6415 of scopes with that domain in one instruction's ``alias.scope`` list is a
6416 subset of (or equal to) the set of scopes for that domain in another
6417 instruction's ``noalias`` list, then the two memory accesses are assumed not to
6420 Because scopes in one domain don't affect scopes in other domains, separate
6421 domains can be used to compose multiple independent noalias sets. This is
6422 used for example during inlining. As the noalias function parameters are
6423 turned into noalias scope metadata, a new domain is used every time the
6424 function is inlined.
6426 The metadata identifying each domain is itself a list containing one or two
6427 entries. The first entry is the name of the domain. Note that if the name is a
6428 string then it can be combined across functions and translation units. A
6429 self-reference can be used to create globally unique domain names. A
6430 descriptive string may optionally be provided as a second list entry.
6432 The metadata identifying each scope is also itself a list containing two or
6433 three entries. The first entry is the name of the scope. Note that if the name
6434 is a string then it can be combined across functions and translation units. A
6435 self-reference can be used to create globally unique scope names. A metadata
6436 reference to the scope's domain is the second entry. A descriptive string may
6437 optionally be provided as a third list entry.
6441 .. code-block:: llvm
6443 ; Two scope domains:
6447 ; Some scopes in these domains:
6453 !5 = !{!4} ; A list containing only scope !4
6457 ; These two instructions don't alias:
6458 %0 = load float, ptr %c, align 4, !alias.scope !5
6459 store float %0, ptr %arrayidx.i, align 4, !noalias !5
6461 ; These two instructions also don't alias (for domain !1, the set of scopes
6462 ; in the !alias.scope equals that in the !noalias list):
6463 %2 = load float, ptr %c, align 4, !alias.scope !5
6464 store float %2, ptr %arrayidx.i2, align 4, !noalias !6
6466 ; These two instructions may alias (for domain !0, the set of scopes in
6467 ; the !noalias list is not a superset of, or equal to, the scopes in the
6468 ; !alias.scope list):
6469 %3 = load float, ptr %c, align 4, !alias.scope !6
6470 store float %3, ptr %arrayidx.i, align 4, !noalias !7
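The per-domain subset rule can be sketched in Python. The scope and domain names below are hypothetical, and the sketch models only one direction of the query (one instruction's ``alias.scope`` list against the other's ``noalias`` list). It also assumes that a domain contributing no ``alias.scope`` entries proves nothing, so such a domain is skipped rather than treated as a trivial match.

```python
def scopes_by_domain(scope_list, domain_of):
    """Group the scopes in a scope list by their domain."""
    grouped = {}
    for scope in scope_list:
        grouped.setdefault(domain_of[scope], set()).add(scope)
    return grouped


def no_alias(alias_scope_list, noalias_list, domain_of):
    """True if the subset rule proves the two accesses do not alias.

    `alias_scope_list` is one instruction's !alias.scope list and
    `noalias_list` is the other instruction's !noalias list.
    """
    scoped = scopes_by_domain(alias_scope_list, domain_of)
    excluded = scopes_by_domain(noalias_list, domain_of)
    for domain, excluded_scopes in excluded.items():
        in_domain = scoped.get(domain, set())
        # Skip domains with no !alias.scope entries (sketch assumption).
        if in_domain and in_domain <= excluded_scopes:
            return True
    return False
```

A single matching domain is enough to conclude no-alias in this model, which is why independent domains can be composed without interfering with each other.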
6472 '``fpmath``' Metadata
6473 ^^^^^^^^^^^^^^^^^^^^^
6475 ``fpmath`` metadata may be attached to any instruction of floating-point
6476 type. It can be used to express the maximum acceptable error in the
6477 result of that instruction, in ULPs, thus potentially allowing the
6478 compiler to use a more efficient but less accurate method of computing
6479 it. ULP is defined as follows:
6481 If ``x`` is a real number that lies between two finite consecutive
6482 floating-point numbers ``a`` and ``b``, without being equal to one
6483 of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
6484 distance between the two non-equal finite floating-point numbers
6485 nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
6487 The metadata node shall consist of a single positive float type number
6488 representing the maximum relative error, for example:
6490 .. code-block:: llvm
6492 !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
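To make the ULP bound concrete, the following Python sketch measures the error of an approximate result in ULPs. It is illustrative only: Python floats are IEEE double precision rather than the ``float`` type used in the example above, ``math.ulp`` requires Python 3.9 or later, and the helper names are hypothetical.

```python
import math


def error_in_ulps(approx, exact):
    """Distance between `approx` and `exact`, in ULPs of `exact`."""
    return abs(approx - exact) / math.ulp(exact)


def within_fpmath(approx, exact, max_ulps):
    """True if `approx` satisfies an !fpmath bound of `max_ulps`."""
    return error_in_ulps(approx, exact) <= max_ulps
```

With a bound of 2.5 ULPs, a result that is two representable values away from the exact answer is acceptable, while one that is three away is not.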
6496 '``range``' Metadata
6497 ^^^^^^^^^^^^^^^^^^^^
6499 ``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
6500 integer or vector of integer types. It expresses the possible ranges the loaded
6501 value or the value returned by the called function at this call site is in. If
6502 the loaded or returned value is not in the specified range, a poison value is
6503 returned instead. The ranges are represented with a flattened list of integers.
6504 The loaded value or the value returned is known to be in the union of the ranges
6505 defined by each consecutive pair. Each pair has the following properties:
6507 - The type must match the scalar type of the instruction.
6508 - The pair ``a,b`` represents the range ``[a,b)``.
6509 - Both ``a`` and ``b`` are constants.
6510 - The range is allowed to wrap.
6511 - The range should not represent the full or empty set. That is,
6514 In addition, the pairs must be in signed order of the lower bound and
6515 they must be non-contiguous.
6517 For vector-typed instructions, the range is applied element-wise.
6521 .. code-block:: llvm
6523 %a = load i8, ptr %x, align 1, !range !0 ; Can only be 0 or 1
6524 %b = load i8, ptr %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
6525 %c = call i8 @foo(), !range !2 ; Can only be 0, 1, 3, 4 or 5
6526 %d = invoke i8 @bar() to label %cont
6527 unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
6528 %e = load <2 x i8>, ptr %x, !range !0 ; Can only be <0 or 1, 0 or 1>
6530 !0 = !{ i8 0, i8 2 }
6531 !1 = !{ i8 255, i8 2 }
6532 !2 = !{ i8 0, i8 2, i8 3, i8 6 }
6533 !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
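The interpretation of the flattened pair list, including wrapping ranges, can be sketched in Python. This is a model rather than LLVM's implementation: values are compared as unsigned integers of the given bit width, and the function name ``in_range`` is hypothetical.

```python
def in_range(value, pairs, bits):
    """True if `value` lies in the union of the [a, b) pairs.

    `pairs` is the flattened list from the metadata node and `bits` is the
    integer width. All values are reduced to unsigned `bits`-bit integers.
    """
    mask = (1 << bits) - 1
    v = value & mask
    for lo, hi in zip(pairs[0::2], pairs[1::2]):
        lo &= mask
        hi &= mask
        if lo < hi:
            if lo <= v < hi:
                return True
        elif v >= lo or v < hi:
            # The range wraps around the top of the unsigned space.
            return True
    return False
```

Enumerating all ``i8`` values against ``!1`` in this model yields 0, 1 and 255, which agrees with the comment on ``%b`` above.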
6535 '``absolute_symbol``' Metadata
6536 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6538 ``absolute_symbol`` metadata may be attached to a global variable
6539 declaration. It marks the declaration as a reference to an absolute symbol,
6540 which causes the backend to use absolute relocations for the symbol even
6541 in position independent code, and expresses the possible ranges that the
6542 global variable's *address* (not its value) is in, in the same format as
6543 ``range`` metadata, with the extension that the pair ``all-ones,all-ones``
6544 may be used to represent the full set.
6546 Example (assuming 64-bit pointers):
6548 .. code-block:: llvm
6550 @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
6551 @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
6554 !0 = !{ i64 0, i64 256 }
6555 !1 = !{ i64 -1, i64 -1 }
6557 '``callees``' Metadata
6558 ^^^^^^^^^^^^^^^^^^^^^^
6560 ``callees`` metadata may be attached to indirect call sites. If ``callees``
6561 metadata is attached to a call site, and any callee is not among the set of
6562 functions provided by the metadata, the behavior is undefined. The intent of
6563 this metadata is to facilitate optimizations such as indirect-call promotion.
6564 For example, in the code below, the call instruction may only target the
6565 ``add`` or ``sub`` functions:
6567 .. code-block:: llvm
6569 %result = call i64 %binop(i64 %x, i64 %y), !callees !0
6572 !0 = !{ptr @add, ptr @sub}
6574 '``callback``' Metadata
6575 ^^^^^^^^^^^^^^^^^^^^^^^
6577 ``callback`` metadata may be attached to a function declaration or definition.
6578 (Call sites are excluded only due to the lack of a use case.) For ease of
6579 exposition, we'll refer to the function annotated with metadata as a broker
6580 function. The metadata describes how the arguments of a call to the broker are
6581 in turn passed to the callback function specified by the metadata. Thus, the
6582 ``callback`` metadata provides a partial description of a call site inside the
6583 broker function with regard to the arguments of a call to the broker. The only
6584 semantic restriction on the broker function itself is that it is not allowed to
6585 inspect or modify arguments referenced in the ``callback`` metadata as
6586 pass-through to the callback function.
6588 The broker is not required to actually invoke the callback function at runtime.
6589 However, the assumptions about not inspecting or modifying arguments that would
6590 be passed to the specified callback function still hold, even if the callback
6591 function is not dynamically invoked. The broker is allowed to invoke the
6592 callback function more than once per invocation of the broker. The broker is
6593 also allowed to invoke (directly or indirectly) the function passed as a
6594 callback through another use. Finally, the broker is also allowed to relay the
6595 callback callee invocation to a different thread.
6597 The metadata is structured as follows: At the outer level, ``callback``
6598 metadata is a list of ``callback`` encodings. Each encoding starts with a
6599 constant ``i64`` which describes the argument position of the callback function
6600 in the call to the broker. The following elements, except the last, describe
6601 what arguments are passed to the callback function. Each element is again an
6602 ``i64`` constant identifying the argument of the broker that is passed through,
6603 or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
6604 they are listed has to be the same in which they are passed to the callback
6605 callee. The last element of the encoding is a boolean which specifies how
6606 variadic arguments of the broker are handled. If it is true, all variadic
6607 arguments of the broker are passed through to the callback function *after* the
6608 arguments encoded explicitly before.
6610 In the code below, the ``pthread_create`` function is marked as a broker
6611 through the ``!callback !1`` metadata. In the example, there is only one
6612 callback encoding, namely ``!2``, associated with the broker. This encoding
6613 identifies the callback function as the broker argument at position 2 (``i64
6614 2``, counting from zero) and the sole argument of the callback function as the
6615 broker argument at position 3 (``i64 3``).
6617 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6618 error if the below is set to highlight as 'llvm', despite that we
6619 have misc.highlighting_failure set?
.. code-block:: text

    declare !callback !1 dso_local i32 @pthread_create(ptr, ptr, ptr, ptr)

    ...
    !2 = !{i64 2, i64 3, i1 false}
    !1 = !{!2}
Another example is shown below. The callback callee is the
``__kmpc_fork_call`` argument at position 2 (``i64 2``, the third parameter
when counting from zero). The callee is given two unknown values (each
identified by an ``i64 -1``) and afterwards all variadic arguments that are
passed to the ``__kmpc_fork_call`` call (due to the final ``i1 true``).
6635 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6636 error if the below is set to highlight as 'llvm', despite that we
6637 have misc.highlighting_failure set?
.. code-block:: text

    declare !callback !0 dso_local void @__kmpc_fork_call(ptr, i32, ptr, ...)

    ...
    !1 = !{i64 2, i64 -1, i64 -1, i1 true}
    !0 = !{!1}
6647 '``exclude``' Metadata
6648 ^^^^^^^^^^^^^^^^^^^^^^
6650 ``exclude`` metadata may be attached to a global variable to signify that its
6651 section should not be included in the final executable or shared library. This
6652 option is only valid for global variables with an explicit section targeting ELF
6653 or COFF. This is done using the ``SHF_EXCLUDE`` flag on ELF targets and the
6654 ``IMAGE_SCN_LNK_REMOVE`` and ``IMAGE_SCN_MEM_DISCARDABLE`` flags for COFF
6655 targets. Additionally, this metadata is only used as a flag, so the associated
6656 node must be empty. The explicit section should not conflict with any other
6657 sections that the user does not want removed after linking.
.. code-block:: text

    @object = private constant [1 x i8] c"\00", section ".foo", !exclude !0
    !0 = !{}
6666 '``unpredictable``' Metadata
6667 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6669 ``unpredictable`` metadata may be attached to any branch or switch
6670 instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
6672 optimizations related to compare and branch instructions. The metadata
6673 is treated as a boolean value; if it exists, it signals that the branch
6674 or switch that it is attached to is completely unpredictable.
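
As a minimal sketch (the value and label names here are hypothetical), the
metadata can be attached to a conditional branch. Because only the presence of
the attachment matters, the node it refers to is assumed to be empty:

.. code-block:: llvm

    br i1 %cond, label %if.then, label %if.else, !unpredictable !0
    ...
    !0 = !{}
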
6676 .. _md_dereferenceable:
6678 '``dereferenceable``' Metadata
6679 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6681 The existence of the ``!dereferenceable`` metadata on the instruction
6682 tells the optimizer that the value loaded is known to be dereferenceable,
6683 otherwise the behavior is undefined.
6684 The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
6686 attribute on parameters and return values.
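
For illustration, the following sketch (with hypothetical value names) loads a
pointer that is known to be dereferenceable for at least 8 bytes. The byte
count is given as a single ``i64`` entry in the referenced node:

.. code-block:: llvm

    %p = load ptr, ptr %pp, !dereferenceable !0
    ...
    !0 = !{i64 8}
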
6688 .. _md_dereferenceable_or_null:
6690 '``dereferenceable_or_null``' Metadata
6691 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6693 The existence of the ``!dereferenceable_or_null`` metadata on the
6694 instruction tells the optimizer that the value loaded is known to be either
6695 dereferenceable or null, otherwise the behavior is undefined.
6696 The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
6698 attribute on parameters and return values.
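
A comparable sketch, again with hypothetical value names, marks a loaded
pointer as either null or dereferenceable for at least 16 bytes:

.. code-block:: llvm

    %q = load ptr, ptr %qq, !dereferenceable_or_null !0
    ...
    !0 = !{i64 16}
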
6705 It is sometimes useful to attach information to loop constructs. Currently,
6706 loop metadata is implemented as metadata attached to the branch instruction
6707 in the loop latch block. The loop metadata node is a list of
6708 other metadata nodes, each representing a property of the loop. Usually,
6709 the first item of the property node is a string. For example, the
``llvm.loop.unroll.count`` suggests an unroll factor to the loop unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll.enable"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}
6721 For legacy reasons, the first item of a loop metadata node must be a
6722 reference to itself. Before the advent of the 'distinct' keyword, this
6723 forced the preservation of otherwise identical metadata nodes. Since
6724 the loop-metadata node can be attached to multiple nodes, the 'distinct'
6725 keyword has become unnecessary.
6727 Prior to the property nodes, one or two ``DILocation`` (debug location)
6728 nodes can be present in the list. The first, if present, identifies the
6729 source-code location where the loop begins. The second, if present,
6730 identifies the source-code location where the loop ends.
6732 Loop metadata nodes cannot be used as unique identifiers. They are
6733 neither persistent for the same loop through transformations nor
6734 necessarily unique to just one loop.
6736 '``llvm.loop.disable_nonforced``'
6737 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6739 This metadata disables all optional loop transformations unless
6740 explicitly instructed using other transformation metadata such as
6741 ``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed into a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion can make
6745 other transformations impossible. Mandatory loop canonicalizations such
6746 as loop rotation are still applied.
It is recommended to use this metadata in addition to any ``llvm.loop.*``
6749 transformation directive. Also, any loop should have at most one
6750 directive applied to it (and a sequence of transformations built using
6751 followup-attributes). Otherwise, which transformation will be applied
6752 depends on implementation details such as the pass pipeline order.
6754 See :ref:`transformation-metadata` for details.
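
The following sketch shows one way this might be combined with a forced
transformation. The block and value names are hypothetical, and the unroll
count of 4 is chosen only for illustration:

.. code-block:: llvm

    br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = distinct !{!0, !1, !2}
    !1 = !{!"llvm.loop.disable_nonforced"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}

With such metadata, unrolling by the requested count is the only optional loop
transformation that is asked for; heuristically chosen transformations are
disabled for this loop.
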
6756 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
6757 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6759 Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
6760 used to control per-loop vectorization and interleaving parameters such as
6761 vectorization width and interleave count. These metadata should be used in
6762 conjunction with ``llvm.loop`` loop identification metadata. The
6763 ``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
6764 optimization hints and the optimizer will only interleave and vectorize loops if
6765 it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata
6766 which contains information about loop-carried memory dependencies can be helpful
6767 in determining the safety of these transformations.
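
As a sketch of how several of these hints might be combined on one loop (the
block names and the specific width and count are illustrative assumptions),
each hint is a separate property node listed in the loop's ``llvm.loop`` node:

.. code-block:: llvm

    br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    !0 = distinct !{!0, !1, !2, !3}
    !1 = !{!"llvm.loop.vectorize.enable", i1 true}
    !2 = !{!"llvm.loop.vectorize.width", i32 4}
    !3 = !{!"llvm.loop.interleave.count", i32 2}
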
6769 '``llvm.loop.interleave.count``' Metadata
6770 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6772 This metadata suggests an interleave count to the loop interleaver.
6773 The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For example:
6777 .. code-block:: llvm
6779 !0 = !{!"llvm.loop.interleave.count", i32 4}
6781 Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
6782 multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
6783 then the interleave count will be determined automatically.
6785 '``llvm.loop.vectorize.enable``' Metadata
6786 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6788 This metadata selectively enables or disables vectorization for the loop. The
6789 first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
6790 is a bit. If the bit operand value is 1 vectorization is enabled. A value of
6791 0 disables vectorization:
6793 .. code-block:: llvm
6795 !0 = !{!"llvm.loop.vectorize.enable", i1 0}
6796 !1 = !{!"llvm.loop.vectorize.enable", i1 1}
6798 '``llvm.loop.vectorize.predicate.enable``' Metadata
6799 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6801 This metadata selectively enables or disables creating predicated instructions
6802 for the loop, which can enable folding of the scalar epilogue loop into the
6803 main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1, predication is enabled. A value of 0 disables it:
6808 .. code-block:: llvm
6810 !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
6811 !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
6813 '``llvm.loop.vectorize.scalable.enable``' Metadata
6814 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6816 This metadata selectively enables or disables scalable vectorization for the
6817 loop, and only has any effect if vectorization for the loop is already enabled.
6818 The first operand is the string ``llvm.loop.vectorize.scalable.enable``
6819 and the second operand is a bit. If the bit operand value is 1 scalable
6820 vectorization is enabled, whereas a value of 0 reverts to the default fixed
6821 width vectorization:
6823 .. code-block:: llvm
6825 !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
6826 !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
6828 '``llvm.loop.vectorize.width``' Metadata
6829 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6831 This metadata sets the target width of the vectorizer. The first
6832 operand is the string ``llvm.loop.vectorize.width`` and the second
6833 operand is an integer specifying the width. For example:
6835 .. code-block:: llvm
6837 !0 = !{!"llvm.loop.vectorize.width", i32 4}
6839 Note that setting ``llvm.loop.vectorize.width`` to 1 disables
6840 vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
6841 0 or if the loop does not have this metadata the width will be
6842 determined automatically.
6844 '``llvm.loop.vectorize.followup_vectorized``' Metadata
6845 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6847 This metadata defines which loop attributes the vectorized loop will
6848 have. See :ref:`transformation-metadata` for details.
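
A rough sketch of the shape such a followup attribute could take is given
below. It assumes that the operands after the name string are the property
nodes to place on the resulting loop; the choice of an unroll count as the
followup property is purely illustrative:

.. code-block:: llvm

    !0 = distinct !{!0, !1, !2}
    !1 = !{!"llvm.loop.vectorize.enable", i1 true}
    !2 = !{!"llvm.loop.vectorize.followup_vectorized", !3}
    !3 = !{!"llvm.loop.unroll.count", i32 2}
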
6850 '``llvm.loop.vectorize.followup_epilogue``' Metadata
6851 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6853 This metadata defines which loop attributes the epilogue will have. The
6854 epilogue is not vectorized and is executed when either the vectorized
6855 loop is not known to preserve semantics (because e.g., it processes two
6856 arrays that are found to alias by a runtime check) or for the last
6857 iterations that do not fill a complete set of vector lanes. See
6858 :ref:`Transformation Metadata <transformation-metadata>` for details.
6860 '``llvm.loop.vectorize.followup_all``' Metadata
6861 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Attributes in the metadata will be added to both the vectorized and
epilogue loop.
6865 See :ref:`Transformation Metadata <transformation-metadata>` for details.
6867 '``llvm.loop.unroll``'
6868 ^^^^^^^^^^^^^^^^^^^^^^
6870 Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
6871 optimization hints such as the unroll factor. ``llvm.loop.unroll``
6872 metadata should be used in conjunction with ``llvm.loop`` loop
6873 identification metadata. The ``llvm.loop.unroll`` metadata are only
6874 optimization hints and the unrolling will only be performed if the
6875 optimizer believes it is safe to do so.
6877 '``llvm.loop.unroll.count``' Metadata
6878 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6880 This metadata suggests an unroll factor to the loop unroller. The
6881 first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For example:
6885 .. code-block:: llvm
6887 !0 = !{!"llvm.loop.unroll.count", i32 4}
6889 If the trip count of the loop is less than the unroll count the loop
6890 will be partially unrolled.
6892 '``llvm.loop.unroll.disable``' Metadata
6893 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6895 This metadata disables loop unrolling. The metadata has a single operand
6896 which is the string ``llvm.loop.unroll.disable``. For example:
6898 .. code-block:: llvm
6900 !0 = !{!"llvm.loop.unroll.disable"}
6902 '``llvm.loop.unroll.runtime.disable``' Metadata
6903 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6905 This metadata disables runtime loop unrolling. The metadata has a single
6906 operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
6908 .. code-block:: llvm
6910 !0 = !{!"llvm.loop.unroll.runtime.disable"}
6912 '``llvm.loop.unroll.enable``' Metadata
6913 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6915 This metadata suggests that the loop should be fully unrolled if the trip count
6916 is known at compile time and partially unrolled if the trip count is not known
6917 at compile time. The metadata has a single operand which is the string
6918 ``llvm.loop.unroll.enable``. For example:
6920 .. code-block:: llvm
6922 !0 = !{!"llvm.loop.unroll.enable"}
6924 '``llvm.loop.unroll.full``' Metadata
6925 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6927 This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:
6931 .. code-block:: llvm
6933 !0 = !{!"llvm.loop.unroll.full"}
6935 '``llvm.loop.unroll.followup``' Metadata
6936 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6938 This metadata defines which loop attributes the unrolled loop will have.
6939 See :ref:`Transformation Metadata <transformation-metadata>` for details.
6941 '``llvm.loop.unroll.followup_remainder``' Metadata
6942 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6944 This metadata defines which loop attributes the remainder loop after
6945 partial/runtime unrolling will have. See
6946 :ref:`Transformation Metadata <transformation-metadata>` for details.
6948 '``llvm.loop.unroll_and_jam``'
6949 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
6958 The metadata for unroll and jam otherwise is the same as for ``unroll``.
6959 ``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
6960 ``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
6961 ``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
6962 and the normal safety checks will still be performed.
6964 '``llvm.loop.unroll_and_jam.count``' Metadata
6965 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6967 This metadata suggests an unroll and jam factor to use, similarly to
6968 ``llvm.loop.unroll.count``. The first operand is the string
6969 ``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
6970 specifying the unroll factor. For example:
6972 .. code-block:: llvm
6974 !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.
6979 '``llvm.loop.unroll_and_jam.disable``' Metadata
6980 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6982 This metadata disables loop unroll and jamming. The metadata has a single
6983 operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
6985 .. code-block:: llvm
6987 !0 = !{!"llvm.loop.unroll_and_jam.disable"}
6989 '``llvm.loop.unroll_and_jam.enable``' Metadata
6990 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata suggests that the loop should be fully unrolled and jammed if the
6993 trip count is known at compile time and partially unrolled if the trip count is
6994 not known at compile time. The metadata has a single operand which is the
6995 string ``llvm.loop.unroll_and_jam.enable``. For example:
6997 .. code-block:: llvm
6999 !0 = !{!"llvm.loop.unroll_and_jam.enable"}
7001 '``llvm.loop.unroll_and_jam.followup_outer``' Metadata
7002 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7004 This metadata defines which loop attributes the outer unrolled loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.
7008 '``llvm.loop.unroll_and_jam.followup_inner``' Metadata
7009 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7011 This metadata defines which loop attributes the inner jammed loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.
7015 '``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
7016 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7018 This metadata defines which attributes the epilogue of the outer loop
7019 will have. This loop is usually unrolled, meaning there is no such
7020 loop. This attribute will be ignored in this case. See
7021 :ref:`Transformation Metadata <transformation-metadata>` for details.
7023 '``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
7024 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7026 This metadata defines which attributes the inner loop of the epilogue
7027 will have. The outer epilogue will usually be unrolled, meaning there
7028 can be multiple inner remainder loops. See
7029 :ref:`Transformation Metadata <transformation-metadata>` for details.
7031 '``llvm.loop.unroll_and_jam.followup_all``' Metadata
7032 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Attributes specified in the metadata are added to all
7035 ``llvm.loop.unroll_and_jam.*`` loops. See
7036 :ref:`Transformation Metadata <transformation-metadata>` for details.
7038 '``llvm.loop.licm_versioning.disable``' Metadata
7039 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7041 This metadata indicates that the loop should not be versioned for the purpose
7042 of enabling loop-invariant code motion (LICM). The metadata has a single operand
7043 which is the string ``llvm.loop.licm_versioning.disable``. For example:
7045 .. code-block:: llvm
7047 !0 = !{!"llvm.loop.licm_versioning.disable"}
7049 '``llvm.loop.distribute.enable``' Metadata
7050 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7052 Loop distribution allows splitting a loop into multiple loops. Currently,
7053 this is only performed if the entire loop cannot be vectorized due to unsafe
7054 memory dependencies. The transformation will attempt to isolate the unsafe
7055 dependencies into their own loop.
7057 This metadata can be used to selectively enable or disable distribution of the
7058 loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
7059 second operand is a bit. If the bit operand value is 1 distribution is
7060 enabled. A value of 0 disables distribution:
7062 .. code-block:: llvm
7064 !0 = !{!"llvm.loop.distribute.enable", i1 0}
7065 !1 = !{!"llvm.loop.distribute.enable", i1 1}
7067 This metadata should be used in conjunction with ``llvm.loop`` loop
7068 identification metadata.
7070 '``llvm.loop.distribute.followup_coincident``' Metadata
7071 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7073 This metadata defines which attributes extracted loops with no cyclic
7074 dependencies will have (i.e. can be vectorized). See
7075 :ref:`Transformation Metadata <transformation-metadata>` for details.
7077 '``llvm.loop.distribute.followup_sequential``' Metadata
7078 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7080 This metadata defines which attributes the isolated loops with unsafe
7081 memory dependencies will have. See
7082 :ref:`Transformation Metadata <transformation-metadata>` for details.
7084 '``llvm.loop.distribute.followup_fallback``' Metadata
7085 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If loop versioning is necessary, this metadata defines the attributes
7088 the non-distributed fallback version will have. See
7089 :ref:`Transformation Metadata <transformation-metadata>` for details.
7091 '``llvm.loop.distribute.followup_all``' Metadata
7092 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The attributes in this metadata are added to all followup loops of the
7095 loop distribution pass. See
7096 :ref:`Transformation Metadata <transformation-metadata>` for details.
7098 '``llvm.licm.disable``' Metadata
7099 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7101 This metadata indicates that loop-invariant code motion (LICM) should not be
7102 performed on this loop. The metadata has a single operand which is the string
7103 ``llvm.licm.disable``. For example:
7105 .. code-block:: llvm
7107 !0 = !{!"llvm.licm.disable"}
Note that although it operates per loop, it isn't given the ``llvm.loop`` prefix
7110 as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.
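
As a sketch, and assuming the property is listed in the loop's ``llvm.loop``
node like the other per-loop properties in this section, the attachment might
look as follows (block and value names are hypothetical):

.. code-block:: llvm

    br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = distinct !{!0, !1}
    !1 = !{!"llvm.licm.disable"}
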
7112 '``llvm.access.group``' Metadata
7113 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7115 ``llvm.access.group`` metadata can be attached to any instruction that
7116 potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, illustrated by the following example.
.. code-block:: llvm

    %val = load i32, ptr %arrayidx, !llvm.access.group !0
    ...
    !0 = !{!1, !2}
    !1 = distinct !{}
    !2 = distinct !{}
7130 It is illegal for the list node to be empty since it might be confused
7131 with an access group.
7133 The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which can be used to distinguish an access group
metadata node from a list of access groups. Being empty avoids the
situation that the content must be updated which, because metadata is
immutable by design, would require finding and updating all references
7139 to the access group node.
7141 The access group can be used to refer to a memory access instruction
7142 without pointing to it directly (which is not possible in global
7143 metadata). Currently, the only metadata making use of it is
7144 ``llvm.loop.parallel_accesses``.
7146 '``llvm.loop.parallel_accesses``' Metadata
7147 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7149 The ``llvm.loop.parallel_accesses`` metadata refers to one or more
7150 access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between it and other instructions
7152 in the loop with this metadata.
7154 Let ``m1`` and ``m2`` be two instructions that both have the
7155 ``llvm.access.group`` metadata to the access group ``g1``, respectively
7156 ``g2`` (which might be identical). If a loop contains both access groups
7157 in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
7158 assume that there is no dependency between ``m1`` and ``m2`` carried by
7159 this loop. Instructions that belong to multiple access groups are
7160 considered having this property if at least one of the access groups
7161 matches the ``llvm.loop.parallel_accesses`` list.
7163 If all memory-accessing instructions in a loop have
7164 ``llvm.access.group`` metadata that each refer to one of the access
7165 groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependences and is considered to be a
parallel loop.
7169 Note that if not all memory access instructions belong to an access
7170 group referred to by ``llvm.loop.parallel_accesses``, then the loop must
7171 not be considered trivially parallel. Additional
7172 memory dependence analysis is required to make that determination. As a fail
7173 safe mechanism, this causes loops that were originally parallel to be considered
7174 sequential (if optimization passes that are unaware of the parallel semantics
7175 insert new memory instructions into the loop body).
7177 Example of a loop that is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata:
.. code-block:: llvm

    for.body:
      ...
      %val0 = load i32, ptr %arrayidx, !llvm.access.group !1
      store i32 %val0, ptr %arrayidx1, !llvm.access.group !1
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
    !1 = distinct !{}
7196 It is also possible to have nested parallel loops:
7198 .. code-block:: llvm
7202 %val1 = load i32, ptr %arrayidx3, !llvm.access.group !4
7204 br label %inner.for.body
7208 %val0 = load i32, ptr %arrayidx1, !llvm.access.group !3
7210 store i32 %val0, ptr %arrayidx2, !llvm.access.group !3
7212 br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
7216 store i32 %val1, ptr %arrayidx4, !llvm.access.group !4
7218 br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
7220 outer.for.end: ; preds = %for.body
7222 !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}} ; metadata for the inner loop
7223 !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
7224 !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
7225 !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
7227 .. _langref_llvm_loop_mustprogress:
7229 '``llvm.loop.mustprogress``' Metadata
7230 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7232 The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
7233 terminate, unwind, or interact with the environment in an observable way e.g.
7234 via a volatile memory access, I/O, or other synchronization. If such a loop is
7235 not found to interact with the environment in an observable way, the loop may
7236 be removed. This corresponds to the ``mustprogress`` function attribute.
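
A minimal sketch of a loop carrying this property is shown below. The block
name is hypothetical, and the property is assumed to be listed in the loop's
``llvm.loop`` node in the same way as the other loop properties:

.. code-block:: llvm

    loop:
      ...
      br label %loop, !llvm.loop !0
    ...
    !0 = distinct !{!0, !1}
    !1 = !{!"llvm.loop.mustprogress"}
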
7238 '``irr_loop``' Metadata
7239 ^^^^^^^^^^^^^^^^^^^^^^^
7241 ``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
7244 terminator instruction of a basic block that is not really an irreducible loop
7245 header, the behavior is undefined. The intent of this metadata is to improve the
7246 accuracy of the block frequency propagation. For example, in the code below, the
7247 block ``header0`` may have a loop header weight (relative to the other headers of
7248 the irreducible loop) of 100:
.. code-block:: llvm

    header0:
    br i1 %cmp, label %t1, label %t2, !irr_loop !0
    ...
    !0 = !{!"loop_header_weight", i64 100}
7259 Irreducible loop header weights are typically based on profile data.
7261 .. _md_invariant.group:
7263 '``invariant.group``' Metadata
7264 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7266 The experimental ``invariant.group`` metadata may be attached to
7267 ``load``/``store`` instructions referencing a single metadata with no entries.
7268 The existence of the ``invariant.group`` metadata on the instruction tells
7269 the optimizer that every ``load`` and ``store`` to the same pointer operand
7270 can be assumed to load or store the same
7271 value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
7272 when two pointers are considered the same). Pointers returned by bitcast or
7273 getelementptr with only zero indices are considered the same.
7277 .. code-block:: llvm
7279 @unknownPtr = external global i8
7282 store i8 42, ptr %ptr, !invariant.group !0
7283 call void @foo(ptr %ptr)
7285 %a = load i8, ptr %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
7286 call void @foo(ptr %ptr)
7288 %newPtr = call ptr @getPointer(ptr %ptr)
7289 %c = load i8, ptr %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
7291 %unknownValue = load i8, ptr @unknownPtr
7292 store i8 %unknownValue, ptr %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
7294 call void @foo(ptr %ptr)
7295 %newPtr2 = call ptr @llvm.launder.invariant.group.p0(ptr %ptr)
7296 %d = load i8, ptr %newPtr2, !invariant.group !0 ; Can't step through launder.invariant.group to get value of %ptr
7299 declare void @foo(ptr)
7300 declare ptr @getPointer(ptr)
7301 declare ptr @llvm.launder.invariant.group.p0(ptr)
7305 The invariant.group metadata must be dropped when replacing one pointer by
7306 another based on aliasing information. This is because invariant.group is tied
7307 to the SSA value of the pointer operand.
7309 .. code-block:: llvm
7311 %v = load i8, ptr %x, !invariant.group !0
7312 ; if %x mustalias %y then we can replace the above instruction with
7313 %v = load i8, ptr %y
7315 Note that this is an experimental feature, which means that its semantics might
7316 change in the future.
7321 See :doc:`TypeMetadata`.
7323 '``associated``' Metadata
7324 ^^^^^^^^^^^^^^^^^^^^^^^^^
7326 The ``associated`` metadata may be attached to a global variable definition with
7327 a single argument that references a global object (optionally through an alias).
7329 This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
7330 discarding of the global variable in linker GC unless the referenced object is
7331 also discarded. The linker support for this feature is spotty. For best
7332 compatibility, globals carrying this metadata should:
7334 - Be in ``@llvm.compiler.used``.
7335 - If the referenced global variable is in a comdat, be in the same comdat.
``!associated`` cannot express a many-to-one relationship. A global variable with
7338 the metadata should generally not be referenced by a function: the function may
7339 be inlined into other functions, leading to more references to the metadata.
7340 Ideally we would want to keep metadata alive as long as any inline location is
7341 alive, but this many-to-one relationship is not representable. Moreover, if the
7342 metadata is retained while the function is discarded, the linker will report an
7343 error of a relocation referencing a discarded section.
7345 The metadata is often used with an explicit section consisting of valid C
7346 identifiers so that the runtime can find the metadata section with
7347 linker-defined encapsulation symbols ``__start_<section_name>`` and
7348 ``__stop_<section_name>``.
7350 It does not have any effect on non-ELF targets.
.. code-block:: text

    $a = comdat any
    @a = global i32 1, comdat $a
    @b = internal global i32 2, comdat $a, section "abc", !associated !0
    !0 = !{ptr @a}
7365 The ``prof`` metadata is used to record profile data in the IR.
7366 The first operand of the metadata node indicates the profile metadata
7367 type. There are currently 3 types:
7368 :ref:`branch_weights<prof_node_branch_weights>`,
7369 :ref:`function_entry_count<prof_node_function_entry_count>`, and
7370 :ref:`VP<prof_node_VP>`.
7372 .. _prof_node_branch_weights:
7377 Branch weight metadata attached to a branch, select, switch or call instruction
7378 represents the likeliness of the associated branch being taken.
7379 For more information, see :doc:`BranchWeightMetadata`.
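
As a brief sketch (label names and weights are illustrative), a conditional
branch whose first successor is far more likely than its second could be
annotated like this, with one weight per successor:

.. code-block:: llvm

    br i1 %cond, label %likely, label %unlikely, !prof !0
    ...
    !0 = !{!"branch_weights", i32 2000, i32 1}
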
7381 .. _prof_node_function_entry_count:
7383 function_entry_count
7384 """"""""""""""""""""
7386 Function entry count metadata can be attached to function definitions
7387 to record the number of times the function is called. Used with BFI
7388 information, it is also used to derive the basic block profile count.
7389 For more information, see :doc:`BranchWeightMetadata`.
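
For illustration, a function definition recorded as having been entered 1000
times might carry the metadata as sketched below (the function name and count
are hypothetical):

.. code-block:: llvm

    define void @f() !prof !0 {
      ret void
    }

    !0 = !{!"function_entry_count", i64 1000}
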
7396 VP (value profile) metadata can be attached to instructions that have
7397 value profile information. Currently this is indirect calls (where it
7398 records the hottest callees) and calls to memory intrinsics such as memcpy,
7399 memmove, and memset (where it records the hottest byte lengths).
Each VP metadata node contains the string "VP", then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t profile value and its
execution count. The value profiling kind is 0 for indirect call targets and 1
for memory operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.
7409 Note that the value counts do not need to add up to the total count
7410 listed in the third operand (in practice only the top hottest values
7411 are tracked and reported).
7413 Indirect call example:
7415 .. code-block:: llvm
7417 call void %f(), !prof !1
7418 !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
7420 Note that the VP type is 0 (the second operand), which indicates that this is
7421 indirect call value profile data. The third operand indicates that the
7422 indirect call executed 1600 times. The 4th and 6th operands give the
7423 hashes of the 2 hottest target functions' names (this is the same hash used
7424 to represent function names in the profile database), and the 5th and 7th
7425 operands give the number of times each of the respective preceding target
7426 functions was called.
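A memory intrinsic can be annotated in the same way. The following is a sketch
under assumed values: the function ``@copy``, the byte lengths, and the counts
are hypothetical. It uses kind 1 and lists byte lengths instead of callee
hashes:

.. code-block:: llvm

    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define void @copy(ptr %dst, ptr %src, i64 %n) {
      call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 %n, i1 false), !prof !0
      ret void
    }

    ; Kind 1 (memory operation), executed 1000 times in total:
    ; length 8 was seen 600 times and length 16 was seen 300 times.
    !0 = !{!"VP", i32 1, i64 1000, i64 8, i64 600, i64 16, i64 300}

Here the listed counts sum to 900 rather than 1000, which is permitted because
only the hottest values are recorded.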
7430 '``annotation``' Metadata
7431 ^^^^^^^^^^^^^^^^^^^^^^^^^
7433 The ``annotation`` metadata can be used to attach a tuple of annotation strings
7434 or a tuple of a tuple of annotation strings to any instruction. This metadata does
7435 not impact the semantics of the program and may only be used to provide additional
7436 insight about the program and transformations to users.
7440 .. code-block:: text
7442 %a.addr = alloca ptr, align 8, !annotation !0
7443 !0 = !{!"auto-init"}
7445 Embedding tuple of strings example:
7447 .. code-block:: text
7449 %a.ptr = getelementptr ptr, ptr %base, i64 0, !annotation !0
7451 !1 = !{!"gep offset", !"0"}
7453 '``func_sanitize``' Metadata
7454 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7456 The ``func_sanitize`` metadata is used to attach two values for the function
7457 sanitizer instrumentation. The first value is the ubsan function signature.
7458 The second value is the address of the proxy variable which stores the address
7459 of the RTTI descriptor. If :ref:`prologue <prologuedata>` and '``func_sanitize``'
7460 are used at the same time, :ref:`prologue <prologuedata>` is emitted before
7461 '``func_sanitize``' in the output.
7465 .. code-block:: text
7467 @__llvm_rtti_proxy = private unnamed_addr constant ptr @_ZTIFvvE
7468 define void @_Z3funv() !func_sanitize !0 {
7471 !0 = !{i32 846595819, ptr @__llvm_rtti_proxy}
7475 '``kcfi_type``' Metadata
7476 ^^^^^^^^^^^^^^^^^^^^^^^^
7478 The ``kcfi_type`` metadata can be used to attach a type identifier to
7479 functions that can be called indirectly. The type data is emitted before the
7480 function entry in the assembly. Indirect calls with the :ref:`kcfi operand
7481 bundle<ob_kcfi>` will emit a check that compares the type identifier to the
7486 .. code-block:: text
7488 define dso_local i32 @f() !kcfi_type !0 {
7491 !0 = !{i32 12345678}
7493 Clang emits ``kcfi_type`` metadata nodes for address-taken functions with
7494 ``-fsanitize=kcfi``.
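On the caller side, a sketch of an indirect call that carries the matching
operand bundle might look as follows. The function ``@caller`` is hypothetical,
and the identifier simply repeats the value from the example above:

.. code-block:: llvm

    define i32 @caller(ptr %fp) {
      ; The bundle operand is the type identifier expected of the callee.
      %r = call i32 %fp() [ "kcfi"(i32 12345678) ]
      ret i32 %r
    }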
7498 '``memprof``' Metadata
7499 ^^^^^^^^^^^^^^^^^^^^^^^^
7501 The ``memprof`` metadata is used to record memory profile data on heap
7502 allocation calls. Multiple context-sensitive profiles can be represented
7503 with a single ``memprof`` metadata attachment.
7507 .. code-block:: text
7509 %call = call ptr @_Znam(i64 10), !memprof !0, !callsite !5
7512 !2 = !{i64 4854880825882961848, i64 1905834578520680781}
7513 !3 = !{!4, !"notcold"}
7514 !4 = !{i64 4854880825882961848, i64 -6528110295079665978}
7515 !5 = !{i64 4854880825882961848}
7517 Each operand in the ``memprof`` metadata attachment describes the profiled
7518 behavior of memory allocated by the associated allocation for a given context.
7519 In the above example, there were 2 profiled contexts, one allocating memory
7520 that was typically cold and one allocating memory that was typically not cold.
7522 The format of the metadata describing a context specific profile (e.g.
7523 ``!1`` and ``!3`` above) requires a first operand that is a metadata node
7524 describing the context, followed by a list of string metadata tags describing
7525 the profile behavior (e.g. ``cold`` and ``notcold`` above). The metadata nodes
7526 describing the context (e.g. ``!2`` and ``!4`` above) are unique ids
7527 corresponding to callsites, which can be matched to associated IR calls via
7528 :ref:`callsite metadata<md_callsite>`. In practice these ids are formed via
7529 a hash of the callsite's debug info, and the associated call may be in a
7530 different module. The contexts are listed in order from leaf-most call (the
7531 allocation itself) to the outermost callsite context required for uniquely
7532 identifying the described profile behavior (note this may not be the top of
7533 the profiled call stack).
7537 '``callsite``' Metadata
7538 ^^^^^^^^^^^^^^^^^^^^^^^^
7540 The ``callsite`` metadata is used to identify callsites involved in memory
7541 profile contexts described in :ref:`memprof metadata<md_memprof>`.
7543 It is attached both to the profiled allocation calls (see the example in
7544 :ref:`memprof metadata<md_memprof>`), as well as to other callsites
7545 in profiled contexts described in heap allocation ``memprof`` metadata.
7549 .. code-block:: text
7551 %call = call ptr @_Z1Bb(void), !callsite !0
7552 !0 = !{i64 -6528110295079665978, i64 5462047985461644151}
7554 Each operand in the ``callsite`` metadata attachment is a unique id
7555 corresponding to a callsite (possibly inlined). In practice these ids are
7556 formed via a hash of the callsite's debug info. If the call was not inlined
7557 into any callers it will contain a single operand (id). If it was inlined
7558 it will contain a list of ids, including the ids of the callsites in the
7559 full inline sequence, in order from the leaf-most call's id to the outermost
7562 Module Flags Metadata
7563 =====================
7565 Information about the module as a whole is difficult to convey to LLVM's
7566 subsystems. The LLVM IR isn't sufficient to transmit this information.
7567 The ``llvm.module.flags`` named metadata exists in order to facilitate
7568 this. These flags are in the form of key / value pairs --- much like a
7569 dictionary --- making it easy for any subsystem that cares about a flag to
7572 The ``llvm.module.flags`` metadata contains a list of metadata triplets.
7573 Each triplet has the following form:
7575 - The first element is a *behavior* flag, which specifies the behavior
7576 when two (or more) modules are merged together and the linker encounters two
7577 (or more) metadata with the same ID. The supported behaviors are
7579 - The second element is a metadata string that is a unique ID for the
7580 metadata. Each module may only have one flag entry for each unique ID (not
7581 including entries with the **Require** behavior).
7582 - The third element is the value of the flag.
7584 When two (or more) modules are merged together, the resulting
7585 ``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
7586 each unique metadata ID string, there will be exactly one entry in the merged
7587 module's ``llvm.module.flags`` metadata table, and the value for that entry will
7588 be determined by the merge behavior flag, as described below. The only exception
7589 is that entries with the **Require** behavior are always preserved.
7591 The following behaviors are supported:
7602 Emits an error if two values disagree, otherwise the resulting value
7603 is that of the operands.
7607 Emits a warning if two values disagree. The result value will be the
7608 operand for the flag from the first module being linked, or the max
7609 if the other module uses **Max** (in which case the resulting flag
7614 Adds a requirement that another module flag be present and have a
7615 specified value after linking is performed. The value must be a
7616 metadata pair, where the first element of the pair is the ID of the
7617 module flag to be restricted, and the second element of the pair is
7618 the value the module flag should be restricted to. This behavior can
7619 be used to restrict the allowable results (via triggering of an
7620 error) of linking IDs with the **Override** behavior.
7624 Uses the specified value, regardless of the behavior or value of the
7625 other module. If both modules specify **Override**, but the values
7626 differ, an error will be emitted.
7630 Appends the two values, which are required to be metadata nodes.
7634 Appends the two values, which are required to be metadata
7635 nodes. However, duplicate entries in the second list are dropped
7636 during the append operation.
7640 Takes the max of the two values, which are required to be integers.
7644 Takes the min of the two values, which are required to be non-negative integers.
7645 An absent module flag is treated as having the value 0.
7647 It is an error for a particular unique flag ID to have multiple behaviors,
7648 except in the case of **Require** (which adds restrictions on another metadata
7649 value) or **Override**.
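As a sketch of how a merge resolves a value, consider two hypothetical modules
that both define the flag ``example_level`` with the **Max** behavior (behavior
value 7). The flag name and its values are illustrative assumptions, and the
three fragments below stand for separate modules rather than one file:

.. code-block:: llvm

    ; Module A
    !llvm.module.flags = !{!0}
    !0 = !{i32 7, !"example_level", i32 1}

    ; Module B
    !llvm.module.flags = !{!0}
    !0 = !{i32 7, !"example_level", i32 2}

    ; Result of linking A and B: one entry, holding the larger value.
    !llvm.module.flags = !{!0}
    !0 = !{i32 7, !"example_level", i32 2}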
7651 An example of module flags:
7653 .. code-block:: llvm
7655 !0 = !{ i32 1, !"foo", i32 1 }
7656 !1 = !{ i32 4, !"bar", i32 37 }
7657 !2 = !{ i32 2, !"qux", i32 42 }
7658 !3 = !{ i32 3, !"qux",
7663 !llvm.module.flags = !{ !0, !1, !2, !3 }
7665 - Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
7666 if two or more ``!"foo"`` flags are seen is to emit an error if their
7667 values are not equal.
7669 - Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
7670 behavior if two or more ``!"bar"`` flags are seen is to use the value
7673 - Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
7674 behavior if two or more ``!"qux"`` flags are seen is to emit a
7675 warning if their values are not equal.
7677 - Metadata ``!3`` has the ID ``!"qux"`` and the value:
7683 The behavior is to emit an error if the ``llvm.module.flags`` does not
7684 contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
7687 Synthesized Functions Module Flags Metadata
7688 -------------------------------------------
7690 These metadata specify the default attributes synthesized functions should have.
7691 These metadata are currently respected by a few instrumentation passes, such as
7694 These metadata correspond to a few function attributes with significant code
7695 generation behaviors. Function attributes with just optimization purposes
7696 should not be listed because the performance impact of these synthesized
7699 - "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
7700 will get the "frame-pointer" function attribute, with value being "none",
7701 "non-leaf", or "all", respectively.
7702 - "function_return_thunk_extern": The synthesized function will get the
7703 ``fn_return_thunk_extern`` function attribute.
7704 - "uwtable": **Max**. The value can be 0, 1, or 2. If the value is 1, a synthesized
7705 function will get the ``uwtable(sync)`` function attribute, if the value is 2,
7706 a synthesized function will get the ``uwtable(async)`` function attribute.
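For illustration, a module that wants synthesized functions to keep all frame
pointers and to carry asynchronous unwind tables could encode that request as
follows. This is a sketch; the particular values are chosen only as an example:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}

    ; 2 maps to "frame-pointer"="all" on synthesized functions.
    !0 = !{i32 7, !"frame-pointer", i32 2}
    ; 2 maps to the uwtable(async) attribute on synthesized functions.
    !1 = !{i32 7, !"uwtable", i32 2}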
7708 Objective-C Garbage Collection Module Flags Metadata
7709 ----------------------------------------------------
7711 On the Mach-O platform, Objective-C stores metadata about garbage
7712 collection in a special section called "image info". The metadata
7713 consists of a version number and a bitmask specifying what types of
7714 garbage collection are supported (if any) by the file. If two or more
7715 modules are linked together their garbage collection metadata needs to
7716 be merged rather than appended together.
7718 The Objective-C garbage collection module flags metadata consists of the
7719 following key-value pairs:
7728 * - ``Objective-C Version``
7729 - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
7731 * - ``Objective-C Image Info Version``
7732 - **[Required]** --- The version of the image info section. Currently
7735 * - ``Objective-C Image Info Section``
7736 - **[Required]** --- The section to place the metadata. Valid values are
7737 ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
7738 ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
7739 Objective-C ABI version 2.
7741 * - ``Objective-C Garbage Collection``
7742 - **[Required]** --- Specifies whether garbage collection is supported or
7743 not. Valid values are 0, for no garbage collection, and 2, for garbage
7744 collection supported.
7746 * - ``Objective-C GC Only``
7747 - **[Optional]** --- Specifies that only garbage collection is supported.
7748 If present, its value must be 6. This flag requires that the
7749 ``Objective-C Garbage Collection`` flag have the value 2.
7751 Some important flag interactions:
7753 - If a module with ``Objective-C Garbage Collection`` set to 0 is
7754 merged with a module with ``Objective-C Garbage Collection`` set to
7755 2, then the resulting module has the
7756 ``Objective-C Garbage Collection`` flag set to 0.
7757 - A module with ``Objective-C Garbage Collection`` set to 0 cannot be
7758 merged with a module with ``Objective-C GC Only`` set to 6.
7760 C type width Module Flags Metadata
7761 ----------------------------------
7763 The ARM backend emits a section into each generated object file describing the
7764 options that it was compiled with (in a compiler-independent way) to prevent
7765 linking incompatible objects, and to allow automatic library selection. Some
7766 of these options are not visible at the IR level, namely wchar_t width and enum
7769 To pass this information to the backend, these options are encoded in module
7770 flags metadata, using the following key-value pairs:
7780 - * 0 --- sizeof(wchar_t) == 4
7781 * 1 --- sizeof(wchar_t) == 2
7784 - * 0 --- Enums are at least as large as an ``int``.
7785 * 1 --- Enums are stored in the smallest integer type which can
7786 represent all of its values.
7788 For example, the following metadata section specifies that the module was
7789 compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
7790 enum is the smallest type which can represent all of its values::
7792 !llvm.module.flags = !{!0, !1}
7793 !0 = !{i32 1, !"short_wchar", i32 0}
7794 !1 = !{i32 1, !"short_enum", i32 1}
7796 LTO Post-Link Module Flags Metadata
7797 -----------------------------------
7799 Some optimisations are only valid when the entire LTO unit is present in the current
7800 module. This is represented by the ``LTOPostLink`` module flags metadata, which
7801 will be created with a value of ``1`` when LTO linking occurs.
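A post-link module would then contain an entry along these lines. This is a
sketch; the merge behavior shown (1, **Error**) is an assumption about how the
flag is created:

.. code-block:: llvm

    !llvm.module.flags = !{!0}
    !0 = !{i32 1, !"LTOPostLink", i32 1}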
7803 Embedded Objects Names Metadata
7804 ===============================
7806 Offloading compilations need to embed device code into the host section table to
7807 create a fat binary. This metadata node references each global that will be
7808 embedded in the module. The primary use for this is to make referencing these
7809 globals more efficient in the IR. The metadata references nodes containing
7810 pointers to the global to be embedded followed by the section name it will be
7813 !llvm.embedded.objects = !{!0}
7814 !0 = !{ptr @object, !".section"}
7816 Automatic Linker Flags Named Metadata
7817 =====================================
7819 Some targets support embedding of flags to the linker inside individual object
7820 files. Typically this is used in conjunction with language extensions which
7821 allow source files to contain linker command line options, and have these
7822 automatically be transmitted to the linker via object files.
7824 These flags are encoded in the IR using named metadata with the name
7825 ``!llvm.linker.options``. Each operand is expected to be a metadata node
7826 which should be a list of other metadata nodes, each of which should be a
7827 list of metadata strings defining linker options.
7829 For example, the following metadata section specifies two separate sets of
7830 linker options, presumably to link against ``libz`` and the ``Cocoa``
7834 !1 = !{ !"-framework", !"Cocoa" }
7835 !llvm.linker.options = !{ !0, !1 }
7837 The metadata encoding as lists of lists of options, as opposed to a collapsed
7838 list of options, is chosen so that the IR encoding can use multiple option
7839 strings to specify e.g., a single library, while still having that specifier be
7840 preserved as an atomic element that can be recognized by a target specific
7841 assembly writer or object file emitter.
7843 Each individual option is required to be either a valid option for the target's
7844 linker, or an option that is reserved by the target specific assembly writer or
7845 object file emitter. No other aspect of these options is defined by the IR.
7847 Dependent Libs Named Metadata
7848 =============================
7850 Some targets support embedding of strings into object files to indicate
7851 a set of libraries to add to the link. Typically this is used in conjunction
7852 with language extensions which allow source files to explicitly declare the
7853 libraries they depend on, and have these automatically be transmitted to the
7854 linker via object files.
7856 The list is encoded in the IR using named metadata with the name
7857 ``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
7858 which should contain a single string operand.
7860 For example, the following metadata section contains two library specifiers::
7862 !0 = !{!"a library specifier"}
7863 !1 = !{!"another library specifier"}
7864 !llvm.dependent-libraries = !{ !0, !1 }
7866 Each library specifier will be handled independently by the consuming linker.
7867 The effect of the library specifiers is defined by the consuming linker.
7874 Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
7875 causes the building of a compact summary of the module that is emitted into
7876 the bitcode. The summary is emitted into the LLVM assembly and identified
7877 in syntax by a caret ('``^``').
7879 The summary is parsed into a bitcode output, along with the Module
7880 IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
7881 of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
7882 summary entries (just as they currently ignore summary entries in a bitcode
7885 Eventually, the summary will be parsed into a ModuleSummaryIndex object under
7886 the same conditions where summary index is currently built from bitcode.
7887 Specifically, tools that test the Thin Link portion of a ThinLTO compile
7888 (i.e. llvm-lto and llvm-lto2), or when parsing a combined index
7889 for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag
7890 (this part is not yet implemented, use llvm-as to create a bitcode object
7891 before feeding into thin link tools for now).
7893 There are currently 3 types of summary entries in the LLVM assembly:
7894 :ref:`module paths<module_path_summary>`,
7895 :ref:`global values<gv_summary>`, and
7896 :ref:`type identifiers<typeid_summary>`.
7898 .. _module_path_summary:
7900 Module Path Summary Entry
7901 -------------------------
7903 Each module path summary entry lists a module containing global values included
7904 in the summary. For a single IR module there will be one such entry, but
7905 in a combined summary index produced during the thin link, there will be
7906 one module path entry per linked module with summary.
7910 .. code-block:: text
7912 ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
7914 The ``path`` field is a string path to the bitcode file, and the ``hash``
7915 field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
7916 incremental builds and caching.
7920 Global Value Summary Entry
7921 --------------------------
7923 Each global value summary entry corresponds to a global value defined or
7924 referenced by a summarized module.
7928 .. code-block:: text
7930 ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
7932 For declarations, there will not be a summary list. For definitions, a
7933 global value will contain a list of summaries, one per module containing
7934 a definition. There can be multiple entries in a combined summary index
7935 for symbols with weak linkage.
7937 Each ``Summary`` format will depend on whether the global value is a
7938 :ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
7939 :ref:`alias<alias_summary>`.
7941 .. _function_summary:
7946 If the global value is a function, the ``Summary`` entry will look like:
7948 .. code-block:: text
7950 function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
7952 The ``module`` field includes the summary entry id for the module containing
7953 this definition, and the ``flags`` field contains information such as
7954 the linkage type, a flag indicating whether it is legal to import the
7955 definition, whether it is globally live and whether the linker resolved it
7956 to a local definition (the latter two are populated during the thin link).
7957 The ``insts`` field contains the number of IR instructions in the function.
7958 Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
7959 :ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
7960 :ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.
7962 .. _variable_summary:
7964 Global Variable Summary
7965 ^^^^^^^^^^^^^^^^^^^^^^^
7967 If the global value is a variable, the ``Summary`` entry will look like:
7969 .. code-block:: text
7971 variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
7973 The variable entry contains a subset of the fields in a
7974 :ref:`function summary <function_summary>`, see the descriptions there.
7981 If the global value is an alias, the ``Summary`` entry will look like:
7983 .. code-block:: text
7985 alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
7987 The ``module`` and ``flags`` fields are as described for a
7988 :ref:`function summary <function_summary>`. The ``aliasee`` field
7989 contains a reference to the global value summary entry of the aliasee.
7991 .. _funcflags_summary:
7996 The optional ``FuncFlags`` field looks like:
7998 .. code-block:: text
8000 funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)
8002 If unspecified, flags are assumed to hold the conservative ``false`` value of
8010 The optional ``Calls`` field looks like:
8012 .. code-block:: text
8014 calls: ((Callee)[, (Callee)]*)
8016 where each ``Callee`` looks like:
8018 .. code-block:: text
8020 callee: ^1[, hotness: None]?[, relbf: 0]?
8022 The ``callee`` refers to the summary entry id of the callee. At most one
8023 of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
8024 ``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
8025 branch frequency relative to the entry frequency, scaled down by 2^8)
8026 may be specified. The defaults are ``Unknown`` and ``0``, respectively.
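Putting the pieces together, a small summary in which ``caller`` calls a
declared function ``callee`` might be written as below. This is a sketch
assembled from the grammar above: the names, the path, and the zeroed hash are
placeholders, and the trailing ``; guid = ...`` comments are omitted:

.. code-block:: text

    ^0 = module: (path: "a.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "callee")
    ^2 = gv: (name: "caller", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 3, calls: ((callee: ^1)))))

Entry ``^1`` has no summary list because ``callee`` is only declared, while
``^2`` carries one function summary whose single call edge points at ``^1``.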
8033 The optional ``Params`` is used by ``StackSafety`` and looks like:
8035 .. code-block:: text
8037 params: ((Param)[, (Param)]*)
8039 where each ``Param`` describes pointer parameter access inside of the
8040 function and looks like:
8042 .. code-block:: text
8044 param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?
8046 where ``param`` is the number of the parameter it describes, and
8047 ``offset`` is the inclusive range of offsets from the pointer parameter to bytes
8048 that can be accessed by the function. This range does not include accesses by
8049 function calls from the ``calls`` list.
8051 Each ``Callee`` describes how the parameter is forwarded into other
8052 functions and looks like:
8054 .. code-block:: text
8056 callee: ^3, param: 5, offset: [-3, 3]
8058 The ``callee`` refers to the summary entry id of the callee, ``param`` is
8059 the number of the callee parameter which points into the caller's parameter
8060 with offset known to be inside of the ``offset`` range. ``calls`` will be
8061 consumed and removed by the thin link stage to update ``Param::offset`` so it
8062 covers all accesses possible by ``calls``.
8064 A pointer parameter without a corresponding ``Param`` is considered unsafe, and we
8065 assume that access with any offset is possible.
8069 If we have the following function:
8071 .. code-block:: text
8073 define i64 @foo(ptr %0, ptr %1, ptr %2, i8 %3) {
8074 store ptr %1, ptr @x
8075 %5 = getelementptr inbounds i8, ptr %2, i64 5
8076 %6 = load i8, ptr %5
8077 %7 = getelementptr inbounds i8, ptr %2, i8 %3
8078 tail call void @bar(i8 %3, ptr %7)
8079 %8 = load i64, ptr %0
8083 We can expect a record like this:
8085 .. code-block:: text
8087 params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
8089 The function may access just 8 bytes of the parameter %0. ``calls`` is empty,
8090 so the parameter is either not used for function calls or ``offset`` already
8091 covers all accesses from nested function calls.
8092 Parameter %1 escapes, so access is unknown.
8093 The function itself can access just a single byte of the parameter %2. Additional
8094 access is possible inside of ``@bar``, i.e. ``^3``. The function adds a signed
8095 offset to the pointer and passes the result as the argument %1 into ``^3``.
8096 This record itself does not tell us how ``^3`` will access the parameter.
8097 Parameter %3 is not a pointer.
8104 The optional ``Refs`` field looks like:
8106 .. code-block:: text
8108 refs: ((Ref)[, (Ref)]*)
8110 where each ``Ref`` contains a reference to the summary id of the referenced
8111 value (e.g. ``^1``).
8113 .. _typeidinfo_summary:
8118 The optional ``TypeIdInfo`` field, used for
8119 `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
8122 .. code-block:: text
8124 typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
8126 These optional fields have the following forms:
8131 .. code-block:: text
8133 typeTests: (TypeIdRef[, TypeIdRef]*)
8135 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
8136 by summary id or ``GUID``.
8138 TypeTestAssumeVCalls
8139 """"""""""""""""""""
8141 .. code-block:: text
8143 typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
8145 Where each VFuncId has the format:
8147 .. code-block:: text
8149 vFuncId: (TypeIdRef, offset: 16)
8151 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
8152 by summary id or ``GUID`` preceded by a ``guid:`` tag.
8154 TypeCheckedLoadVCalls
8155 """""""""""""""""""""
8157 .. code-block:: text
8159 typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
8161 Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
8163 TypeTestAssumeConstVCalls
8164 """""""""""""""""""""""""
8166 .. code-block:: text
8168 typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
8170 Where each ConstVCall has the format:
8172 .. code-block:: text
8174 (VFuncId, args: (Arg[, Arg]*))
8176 and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
8177 and each Arg is an integer argument number.
8179 TypeCheckedLoadConstVCalls
8180 """"""""""""""""""""""""""
8182 .. code-block:: text
8184 typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
8186 Where each ConstVCall has the format described for
8187 ``TypeTestAssumeConstVCalls``.
8191 Type ID Summary Entry
8192 ---------------------
8194 Each type id summary entry corresponds to a type identifier resolution
8195 which is generated during the LTO link portion of the compile when building
8196 with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
8197 so these are only present in a combined summary index.
8201 .. code-block:: text
8203 ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
8205 The ``typeTestRes`` gives the type test resolution ``kind`` (which may
8206 be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
8207 the ``size-1`` bit width. It is followed by optional flags, which default to 0,
8208 and an optional WpdResolutions (whole program devirtualization resolution)
8209 field that looks like:
8211 .. code-block:: text
8213 wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
8215 where each entry is a mapping from the given byte offset to the whole-program
8216 devirtualization resolution WpdRes, that has one of the following formats:
8218 .. code-block:: text
8220 wpdRes: (kind: branchFunnel)
8221 wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
8222 wpdRes: (kind: indir)
8224 Additionally, each wpdRes has an optional ``resByArg`` field, which
8225 describes the resolutions for calls with all constant integer arguments:
8227 .. code-block:: text
8229 resByArg: (ResByArg[, ResByArg]*)
8233 .. code-block:: text
8235 args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])
8237 Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
8238 or ``VirtualConstProp``. The ``info`` field is only used if the kind
8239 is ``UniformRetVal`` (indicates the uniform return value), or
8240 ``UniqueRetVal`` (holds the return value associated with the unique vtable
8241 (0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
8242 not support the use of absolute symbols to store constants.
8244 .. _intrinsicglobalvariables:
8246 Intrinsic Global Variables
8247 ==========================
8249 LLVM has a number of "magic" global variables that contain data that
8250 affect code generation or other IR semantics. These are documented here.
8251 All globals of this sort should have a section specified as
8252 "``llvm.metadata``". This section and all globals that start with
8253 "``llvm.``" are reserved for use by LLVM.
8257 The '``llvm.used``' Global Variable
8258 -----------------------------------
8260 The ``@llvm.used`` global is an array which has
8261 :ref:`appending linkage <linkage_appending>`. This array contains a list of
8262 pointers to named global variables, functions and aliases which may optionally
8263 have a pointer cast formed of bitcast or getelementptr. For example, a legal
8266 .. code-block:: llvm
8271 @llvm.used = appending global [2 x ptr] [
8274 ], section "llvm.metadata"
8276 If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
8277 and linker are required to treat the symbol as if there is a reference to the
8278 symbol that it cannot see (which is why they have to be named). For example, if
8279 a variable has internal linkage and no references other than that from the
8280 ``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
8281 references from inline asms and other things the compiler cannot "see", and
8282 corresponds to "``__attribute__((used))``" in GNU C.
8284 On some targets, the code generator must emit a directive to the
8285 assembler or object file to prevent the assembler and linker from
8286 removing the symbol.
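A minimal sketch of the effect, using a hypothetical internal variable
``@keep_me`` that has no other references:

.. code-block:: llvm

    ; Without the @llvm.used entry below, this unreferenced internal
    ; global could be deleted.
    @keep_me = internal global i32 0

    @llvm.used = appending global [1 x ptr] [ptr @keep_me], section "llvm.metadata"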
8288 .. _gv_llvmcompilerused:
8290 The '``llvm.compiler.used``' Global Variable
8291 --------------------------------------------
8293 The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
8294 directive, except that it only prevents the compiler from touching the
8295 symbol. On targets that support it, this allows an intelligent linker to
8296 optimize references to the symbol without being impeded as it would be
8299 This construct should only be used in rare circumstances,
8300 and should not be exposed to source languages.
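Syntactically the array has the same shape as ``@llvm.used``. A sketch with a
hypothetical internal constant ``@table``:

.. code-block:: llvm

    @table = internal constant [2 x i32] [i32 1, i32 2]

    ; The compiler must keep @table, but the linker is not told to.
    @llvm.compiler.used = appending global [1 x ptr] [ptr @table], section "llvm.metadata"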
8302 .. _gv_llvmglobalctors:
8304 The '``llvm.global_ctors``' Global Variable
8305 -------------------------------------------
8307 .. code-block:: llvm
8309 %0 = type { i32, ptr, ptr }
8310 @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, ptr @ctor, ptr @data }]
8312 The ``@llvm.global_ctors`` array contains a list of constructor
8313 functions, priorities, and an associated global or function.
8314 The functions referenced by this array will be called in ascending order
8315 of priority (i.e. lowest first) when the module is loaded. The order of
8316 functions with the same priority is not defined.
8318 If the third field is non-null, and points to a global variable
8319 or function, the initializer function will only run if the associated
8320 data from the current module is not discarded.
8321 On ELF the referenced global variable or function must be in a comdat.
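The ordering rule can be illustrated with a small module. This is only a
sketch: the function names ``@init_early`` and ``@init_late`` and the priority
value 101 are assumptions made for the example, and the third field is left
null so that neither constructor is tied to associated data.

.. code-block:: llvm

    @initialized = internal global i32 0

    define internal void @init_early() {
      store i32 1, ptr @initialized
      ret void
    }

    define internal void @init_late() {
      ret void
    }

    ; Constructors run in ascending priority order, so @init_early
    ; (priority 101) is called before @init_late (priority 65535).
    @llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [
      { i32, ptr, ptr } { i32 101, ptr @init_early, ptr null },
      { i32, ptr, ptr } { i32 65535, ptr @init_late, ptr null }
    ]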
8323 .. _llvmglobaldtors:
8325 The '``llvm.global_dtors``' Global Variable
8326 -------------------------------------------
8328 .. code-block:: llvm
8330 %0 = type { i32, ptr, ptr }
8331 @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, ptr @dtor, ptr @data }]
8333 The ``@llvm.global_dtors`` array contains a list of destructor
8334 functions, priorities, and an associated global or function.
8335 The functions referenced by this array will be called in descending
8336 order of priority (i.e. highest first) when the module is unloaded. The
8337 order of functions with the same priority is not defined.
8339 If the third field is non-null, and points to a global variable
8340 or function, the destructor function will only run if the associated
8341 data from the current module is not discarded.
8342 On ELF the referenced global variable or function must be in a comdat.
8344 Instruction Reference
8345 =====================
8347 The LLVM instruction set consists of several different classifications
8348 of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
8349 instructions <binaryops>`, :ref:`bitwise binary
8350 instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
8351 :ref:`other instructions <otherops>`.
8355 Terminator Instructions
8356 -----------------------
8358 As mentioned :ref:`previously <functionstructure>`, every basic block in a
8359 program ends with a "Terminator" instruction, which indicates which
8360 block should be executed after the current block is finished. These
8361 terminator instructions typically yield a '``void``' value: they produce
8362 control flow, not values (the one exception being the
8363 ':ref:`invoke <i_invoke>`' instruction).
8365 The terminator instructions are: ':ref:`ret <i_ret>`',
8366 ':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
8367 ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
8369 ':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
8370 ':ref:`catchret <i_catchret>`',
8371 ':ref:`cleanupret <i_cleanupret>`',
8372 and ':ref:`unreachable <i_unreachable>`'.
8376 '``ret``' Instruction
8377 ^^^^^^^^^^^^^^^^^^^^^
8384 ret <type> <value> ; Return a value from a non-void function
8385 ret void ; Return from void function
8390 The '``ret``' instruction is used to return control flow (and optionally
8391 a value) from a function back to the caller.
There are two forms of the '``ret``' instruction: one that returns a
value and then causes control flow, and one that just causes control
flow to occur.
8400 The '``ret``' instruction optionally accepts a single argument, the
8401 return value. The type of the return value must be a ':ref:`first
8402 class <t_firstclass>`' type.
A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.
8413 When the '``ret``' instruction is executed, control flow returns back to
8414 the calling function's context. If the caller is a
8415 ":ref:`call <i_call>`" instruction, execution continues at the
8416 instruction after the call. If the caller was an
8417 ":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value shall set the call or invoke instruction's return
value.
8425 .. code-block:: llvm
8427 ret i32 5 ; Return an integer value of 5
8428 ret void ; Return from a void function
8429 ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
8433 '``br``' Instruction
8434 ^^^^^^^^^^^^^^^^^^^^
8441 br i1 <cond>, label <iftrue>, label <iffalse>
8442 br label <dest> ; Unconditional branch
8447 The '``br``' instruction is used to cause control flow to transfer to a
8448 different basic block in the current function. There are two forms of
8449 this instruction, corresponding to a conditional branch and an
8450 unconditional branch.
8455 The conditional branch form of the '``br``' instruction takes a single
8456 '``i1``' value and two '``label``' values. The unconditional form of the
8457 '``br``' instruction takes a single '``label``' value as a target.
Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If '``cond``' is ``false``, control flows
to the '``iffalse``' ``label`` argument.
If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
behavior.
8472 .. code-block:: llvm
8475 %cond = icmp eq i32 %a, %b
8476 br i1 %cond, label %IfEqual, label %IfUnequal
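For a complete picture, here is a minimal sketch of a whole function that uses
both forms of '``br``'. The function name ``@max`` and its block names are
hypothetical.

.. code-block:: llvm

    define i32 @max(i32 %a, i32 %b) {
    entry:
      %a_greater = icmp sgt i32 %a, %b
      ; Conditional form: pick a successor based on the i1 value.
      br i1 %a_greater, label %take_a, label %take_b

    take_a:
      ; Unconditional form: always continue at %done.
      br label %done

    take_b:
      br label %done

    done:
      %result = phi i32 [ %a, %take_a ], [ %b, %take_b ]
      ret i32 %result
    }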
8484 '``switch``' Instruction
8485 ^^^^^^^^^^^^^^^^^^^^^^^^
8492 switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.
8505 The '``switch``' instruction uses three parameters: an integer
8506 comparison value '``value``', a default '``label``' destination, and an
8507 array of pairs of comparison value constants and '``label``'s. The table
8508 is not allowed to contain duplicate constant entries.
8513 The ``switch`` instruction specifies a table of values and destinations.
8514 When the '``switch``' instruction is executed, this table is searched
8515 for the given value. If the value is found, control flow is transferred
8516 to the corresponding destination; otherwise, control flow is transferred
8517 to the default destination.
If '``value``' is ``poison`` or ``undef``, this instruction has undefined
behavior.
8524 Depending on properties of the target machine and the particular
8525 ``switch`` instruction, this instruction may be code generated in
8526 different ways. For example, it could be generated as a series of
8527 chained conditional branches or with a lookup table.
8532 .. code-block:: llvm
8534 ; Emulate a conditional br instruction
8535 %Val = zext i1 %value to i32
8536 switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
8538 ; Emulate an unconditional br instruction
8539 switch i32 0, label %dest [ ]
8541 ; Implement a jump table:
8542 switch i32 %val, label %otherwise [ i32 0, label %onzero
8544 i32 2, label %ontwo ]
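The table lookup described above can be seen in a self-contained function. This
is an illustrative sketch; the function name ``@classify`` and the returned
constants are arbitrary assumptions.

.. code-block:: llvm

    define i32 @classify(i32 %val) {
    entry:
      ; Values 0, 1, and 2 each have their own destination; every other
      ; value goes to the default destination %otherwise.
      switch i32 %val, label %otherwise [ i32 0, label %onzero
                                          i32 1, label %onone
                                          i32 2, label %ontwo ]

    onzero:
      ret i32 10

    onone:
      ret i32 11

    ontwo:
      ret i32 12

    otherwise:
      ret i32 -1
    }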
8548 '``indirectbr``' Instruction
8549 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8556 indirectbr ptr <address>, [ label <dest1>, label <dest2>, ... ]
8561 The '``indirectbr``' instruction implements an indirect branch to a
8562 label within the current function, whose address is specified by
8563 "``address``". Address must be derived from a
8564 :ref:`blockaddress <blockaddress>` constant.
8569 The '``address``' argument is the address of the label to jump to. The
8570 rest of the arguments indicate the full set of possible destinations
8571 that the address may point to. Blocks are allowed to occur multiple
8572 times in the destination list, though this isn't particularly useful.
8574 This destination list is required so that dataflow analysis has an
8575 accurate understanding of the CFG.
8580 Control transfers to the block specified in the address argument. All
8581 possible destination blocks must be listed in the label list, otherwise
8582 this instruction has undefined behavior. This implies that jumps to
8583 labels defined in other functions have undefined behavior as well.
If '``address``' is ``poison`` or ``undef``, this instruction has undefined
behavior.
8590 This is typically implemented with a jump through a register.
8595 .. code-block:: llvm
8597 indirectbr ptr %Addr, [ label %bb1, label %bb2, label %bb3 ]
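The following sketch shows one way an address derived from ``blockaddress``
constants might reach an '``indirectbr``'. The table ``@targets`` and the
function ``@dispatch`` are hypothetical, and the example assumes the caller
only passes 0 or 1 for ``%idx``.

.. code-block:: llvm

    ; A table of block addresses taken from @dispatch.
    @targets = private constant [2 x ptr] [
      ptr blockaddress(@dispatch, %bb1),
      ptr blockaddress(@dispatch, %bb2)
    ]

    define i32 @dispatch(i32 %idx) {
    entry:
      %slot = getelementptr inbounds [2 x ptr], ptr @targets, i32 0, i32 %idx
      %addr = load ptr, ptr %slot
      ; Every block the address may refer to is listed as a destination.
      indirectbr ptr %addr, [ label %bb1, label %bb2 ]

    bb1:
      ret i32 1

    bb2:
      ret i32 2
    }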
8601 '``invoke``' Instruction
8602 ^^^^^^^^^^^^^^^^^^^^^^^^
8609 <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8610 [operand bundles] to label <normal label> unwind label <exception label>
8615 The '``invoke``' instruction causes control to transfer to a specified
8616 function, with the possibility of control flow transfer to either the
8617 '``normal``' label or the '``exception``' label. If the callee function
8618 returns with the "``ret``" instruction, control flow will return to the
8619 "normal" label. If the callee (or any indirect callees) returns via the
8620 ":ref:`resume <i_resume>`" instruction or other exception handling
8621 mechanism, control is interrupted and continued at the dynamically
8622 nearest "exception" label.
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
8637 This instruction requires several arguments:
8639 #. The optional "cconv" marker indicates which :ref:`calling
8640 convention <callingconv>` the call should use. If none is
8641 specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
8645 #. The optional addrspace attribute can be used to indicate the address space
8646 of the called function. If it is not specified, the program address space
8647 from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
8651 #. '``fnty``': shall be the signature of the function being invoked. The
8652 argument types must match the types implied by this signature. This
8653 type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked. In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
8658 #. '``function args``': argument list whose types match the function
8659 signature argument types and parameter attributes. All arguments must
8660 be of :ref:`first class <t_firstclass>` type. If the function signature
8661 indicates the function accepts a variable number of arguments, the
8662 extra arguments can be specified.
8663 #. '``normal label``': the label reached when the called function
8664 executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
8668 #. The optional :ref:`function attributes <fnattrs>` list.
8669 #. The optional :ref:`operand bundles <opbundles>` list.
8674 This instruction is designed to operate as a standard '``call``'
8675 instruction in most regards. The primary difference is that it
8676 establishes an association with a label, which is used by the runtime
8677 library to unwind the stack.
8679 This instruction is used in languages with destructors to ensure that
8680 proper cleanup is performed in the case of either a ``longjmp`` or a
8681 thrown exception. Additionally, this is important for implementation of
8682 '``catch``' clauses in high-level languages that support them.
8684 For the purposes of the SSA form, the definition of the value returned
8685 by the '``invoke``' instruction is deemed to occur on the edge from the
8686 current block to the "normal" label. If the callee unwinds then no
8687 return value is available.
8692 .. code-block:: llvm
8694 %retval = invoke i32 @Test(i32 15) to label %Continue
8695 unwind label %TestCleanup ; i32:retval set
8696 %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
8697 unwind label %TestCleanup ; i32:retval set
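To show how the pieces fit together, here is a minimal sketch of a complete
function containing an '``invoke``', its landing pad, and a '``resume``'. The
callee ``@may_throw`` and the caller ``@caller`` are hypothetical, and the
example assumes the Itanium C++ personality routine ``@__gxx_personality_v0``.

.. code-block:: llvm

    declare i32 @may_throw(i32)
    declare i32 @__gxx_personality_v0(...)

    define i32 @caller() personality ptr @__gxx_personality_v0 {
    entry:
      %retval = invoke i32 @may_throw(i32 15)
              to label %continue unwind label %cleanup

    continue:
      ; %retval is only defined along the normal edge.
      ret i32 %retval

    cleanup:
      ; The landingpad must be the first non-PHI instruction of the block.
      %exn = landingpad { ptr, i32 }
              cleanup
      ; No handler here, so keep propagating the in-flight exception.
      resume { ptr, i32 } %exn
    }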
8701 '``callbr``' Instruction
8702 ^^^^^^^^^^^^^^^^^^^^^^^^
8709 <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8710 [operand bundles] to label <fallthrough label> [indirect labels]
8715 The '``callbr``' instruction causes control to transfer to a specified
8716 function, with the possibility of control flow transfer to either the
8717 '``fallthrough``' label or one of the '``indirect``' labels.
8719 This instruction should only be used to implement the "goto" feature of gcc
8720 style inline assembly. Any other usage is an error in the IR verifier.
8722 Note that in order to support outputs along indirect edges, LLVM may need to
8723 split critical edges, which may require synthesizing a replacement block for
8724 the ``indirect labels``. Therefore, the address of a label as seen by another
8725 ``callbr`` instruction, or for a :ref:`blockaddress <blockaddress>` constant,
8726 may not be equal to the address provided for the same block to this
8727 instruction's ``indirect labels`` operand. The assembly code may only transfer
8728 control to addresses provided via this instruction's ``indirect labels``.
8733 This instruction requires several arguments:
8735 #. The optional "cconv" marker indicates which :ref:`calling
8736 convention <callingconv>` the call should use. If none is
8737 specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
8741 #. The optional addrspace attribute can be used to indicate the address space
8742 of the called function. If it is not specified, the program address space
8743 from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
8747 #. '``fnty``': shall be the signature of the function being called. The
8748 argument types must match the types implied by this signature. This
8749 type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   other ``callbr``'s are just as possible, calling an arbitrary pointer
   to function value.
8754 #. '``function args``': argument list whose types match the function
8755 signature argument types and parameter attributes. All arguments must
8756 be of :ref:`first class <t_firstclass>` type. If the function signature
8757 indicates the function accepts a variable number of arguments, the
8758 extra arguments can be specified.
8759 #. '``fallthrough label``': the label reached when the inline assembly's
8760 execution exits the bottom.
8761 #. '``indirect labels``': the labels reached when a callee transfers control
8762 to a location other than the '``fallthrough label``'. Label constraints
8763 refer to these destinations.
8764 #. The optional :ref:`function attributes <fnattrs>` list.
8765 #. The optional :ref:`operand bundles <opbundles>` list.
8770 This instruction is designed to operate as a standard '``call``'
8771 instruction in most regards. The primary difference is that it
8772 establishes an association with additional labels to define where control
8773 flow goes after the call.
The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).
8778 The only use of this today is to implement the "goto" feature of gcc inline
8779 assembly where additional labels can be provided as locations for the inline
8780 assembly to jump to.
8785 .. code-block:: llvm
8787 ; "asm goto" without output constraints.
8788 callbr void asm "", "r,!i"(i32 %x)
8789 to label %fallthrough [label %indirect]
8791 ; "asm goto" with output constraints.
8792 <result> = callbr i32 asm "", "=r,r,!i"(i32 %x)
8793 to label %fallthrough [label %indirect]
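The snippet below places the form without output constraints inside a complete
function, as a rough sketch. The function name ``@asm_goto`` is hypothetical,
and the empty assembly string never actually jumps, so at run time only the
fallthrough path would be taken.

.. code-block:: llvm

    define i32 @asm_goto(i32 %x) {
    entry:
      ; One "!i" label constraint pairs with the one indirect label.
      callbr void asm sideeffect "", "r,!i"(i32 %x)
              to label %fallthrough [label %indirect]

    fallthrough:
      ret i32 0

    indirect:
      ret i32 1
    }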
8797 '``resume``' Instruction
8798 ^^^^^^^^^^^^^^^^^^^^^^^^
8805 resume <type> <value>
The '``resume``' instruction is a terminator instruction that has no
successors.
The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.
8823 The '``resume``' instruction resumes propagation of an existing
8824 (in-flight) exception whose unwinding was interrupted with a
8825 :ref:`landingpad <i_landingpad>` instruction.
8830 .. code-block:: llvm
8832 resume { ptr, i32 } %exn
8836 '``catchswitch``' Instruction
8837 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8844 <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
8845 <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
8850 The '``catchswitch``' instruction is used by `LLVM's exception handling system
8851 <ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
8852 that may be executed by the :ref:`EH personality routine <personalityfn>`.
8857 The ``parent`` argument is the token of the funclet that contains the
8858 ``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
8859 this operand may be the token ``none``.
8861 The ``default`` argument is the label of another basic block beginning with
8862 either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
8863 must be a legal target with respect to the ``parent`` links, as described in
8864 the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8866 The ``handlers`` are a nonempty list of successor blocks that each begin with a
8867 :ref:`catchpad <i_catchpad>` instruction.
Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
present.
8876 The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
8877 it must be both the first non-phi instruction and last instruction in the basic
8878 block. Therefore, it must be the only non-phi instruction in the block.
8883 .. code-block:: text
8886 %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
8888 %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
8892 '``catchret``' Instruction
8893 ^^^^^^^^^^^^^^^^^^^^^^^^^^
8900 catchret from <token> to label <normal>
The '``catchret``' instruction is a terminator instruction that has a
single successor.
The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.
The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception. Control then transfers to
``normal``.
8926 The ``token`` argument must be a token produced by a ``catchpad`` instruction.
8927 If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
8928 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8929 the ``catchret``'s behavior is undefined.
8934 .. code-block:: text
8936 catchret from %catch to label %continue
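A minimal sketch of how '``catchswitch``', '``catchpad``', and '``catchret``'
combine in one function follows. It assumes the MSVC C++ personality routine
``@__CxxFrameHandler3`` and uses the operand list
``[ptr null, i32 64, ptr null]``, which that personality treats as a catch-all
handler. The names ``@may_throw`` and ``@try_catch`` are hypothetical.

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__CxxFrameHandler3(...)

    define void @try_catch() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      ; Not nested in another funclet, so the parent token is 'none'.
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      %catch = catchpad within %cs [ptr null, i32 64, ptr null]
      ; Leave the handler and continue normal execution at %done.
      catchret from %catch to label %done

    done:
      ret void
    }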
8940 '``cleanupret``' Instruction
8941 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8948 cleanupret from <value> unwind label <continue>
8949 cleanupret from <value> unwind to caller
8954 The '``cleanupret``' instruction is a terminator instruction that has
8955 an optional successor.
8961 The '``cleanupret``' instruction requires one argument, which indicates
8962 which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
8963 If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
8964 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8965 the ``cleanupret``'s behavior is undefined.
8967 The '``cleanupret``' instruction also has an optional successor, ``continue``,
8968 which must be the label of another basic block beginning with either a
8969 ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
8970 be a legal target with respect to the ``parent`` links, as described in the
8971 `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8976 The '``cleanupret``' instruction indicates to the
8977 :ref:`personality function <personalityfn>` that one
8978 :ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
8979 It transfers control to ``continue`` or unwinds out of the function.
8984 .. code-block:: text
8986 cleanupret from %cleanup unwind to caller
8987 cleanupret from %cleanup unwind label %continue
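As a sketch of the first form in context, the function below runs a cleanup
when the callee unwinds and then continues unwinding into the caller. It
assumes the MSVC C++ personality routine ``@__CxxFrameHandler3``; the names
``@may_throw``, ``@release``, and ``@with_cleanup`` are hypothetical.

.. code-block:: llvm

    declare void @may_throw()
    declare void @release()
    declare i32 @__CxxFrameHandler3(...)

    define void @with_cleanup() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      %cp = cleanuppad within none []
      ; Calls inside the funclet carry a "funclet" operand bundle.
      call void @release() [ "funclet"(token %cp) ]
      ; Exit the cleanup and keep unwinding out of this function.
      cleanupret from %cp unwind to caller

    done:
      call void @release()
      ret void
    }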
8991 '``unreachable``' Instruction
8992 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9004 The '``unreachable``' instruction has no defined semantics. This
9005 instruction is used to inform the optimizer that a particular portion of
9006 the code is not reachable. This can be used to indicate that the code
9007 after a no-return function cannot be reached, and other facts.
9012 The '``unreachable``' instruction has no defined semantics.
9019 Unary operators require a single operand, execute an operation on
9020 it, and produce a single value. The operand might represent multiple
9021 data, as is the case with the :ref:`vector <t_vector>` data type. The
9022 result value has the same type as its operand.
9026 '``fneg``' Instruction
9027 ^^^^^^^^^^^^^^^^^^^^^^
9034 <result> = fneg [fast-math flags]* <ty> <op1> ; yields ty:result
9039 The '``fneg``' instruction returns the negation of its operand.
9044 The argument to the '``fneg``' instruction must be a
9045 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9046 floating-point values.
9051 The value produced is a copy of the operand with its sign bit flipped.
9052 This instruction can also take any number of :ref:`fast-math
9053 flags <fastmath>`, which are optimization hints to enable otherwise
9054 unsafe floating-point optimizations:
9059 .. code-block:: text
    <result> = fneg float %val          ; yields float:result = -%val
9068 Binary operators are used to do most of the computation in a program.
9069 They require two operands of the same type, execute an operation on
9070 them, and produce a single value. The operands might represent multiple
9071 data, as is the case with the :ref:`vector <t_vector>` data type. The
9072 result value has the same type as its operands.
9074 There are several different binary operators:
9078 '``add``' Instruction
9079 ^^^^^^^^^^^^^^^^^^^^^
9086 <result> = add <ty> <op1>, <op2> ; yields ty:result
9087 <result> = add nuw <ty> <op1>, <op2> ; yields ty:result
9088 <result> = add nsw <ty> <op1>, <op2> ; yields ty:result
9089 <result> = add nuw nsw <ty> <op1>, <op2> ; yields ty:result
9094 The '``add``' instruction returns the sum of its two operands.
9099 The two arguments to the '``add``' instruction must be
9100 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9101 arguments must have identical types.
9106 The value produced is the integer sum of the two operands.
If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.
9112 Because LLVM integers use a two's complement representation, this
9113 instruction is appropriate for both signed and unsigned integers.
9115 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9116 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9117 result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
9118 unsigned and/or signed overflow, respectively, occurs.
9123 .. code-block:: text
9125 <result> = add i32 4, %var ; yields i32:result = 4 + %var
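The difference between the wrapping and ``nuw`` forms can be sketched with two
small functions. The function names are hypothetical, and the concrete inputs
mentioned in the comments are only an illustration.

.. code-block:: llvm

    define i8 @add_wrapping(i8 %a, i8 %b) {
      ; With unsigned inputs 200 and 100, the mathematical sum is 300,
      ; and the result is 300 modulo 2^8, which is 44.
      %sum = add i8 %a, %b
      ret i8 %sum
    }

    define i8 @add_no_unsigned_wrap(i8 %a, i8 %b) {
      ; With the same inputs the unsigned sum overflows, so %sum is a
      ; poison value rather than 44.
      %sum = add nuw i8 %a, %b
      ret i8 %sum
    }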
9129 '``fadd``' Instruction
9130 ^^^^^^^^^^^^^^^^^^^^^^
9137 <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9142 The '``fadd``' instruction returns the sum of its two operands.
9147 The two arguments to the '``fadd``' instruction must be
9148 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9149 floating-point values. Both arguments must have identical types.
9154 The value produced is the floating-point sum of the two operands.
9155 This instruction is assumed to execute in the default :ref:`floating-point
9156 environment <floatenv>`.
9157 This instruction can also take any number of :ref:`fast-math
9158 flags <fastmath>`, which are optimization hints to enable otherwise
9159 unsafe floating-point optimizations:
9164 .. code-block:: text
9166 <result> = fadd float 4.0, %var ; yields float:result = 4.0 + %var
9170 '``sub``' Instruction
9171 ^^^^^^^^^^^^^^^^^^^^^
9178 <result> = sub <ty> <op1>, <op2> ; yields ty:result
9179 <result> = sub nuw <ty> <op1>, <op2> ; yields ty:result
9180 <result> = sub nsw <ty> <op1>, <op2> ; yields ty:result
9181 <result> = sub nuw nsw <ty> <op1>, <op2> ; yields ty:result
9186 The '``sub``' instruction returns the difference of its two operands.
9188 Note that the '``sub``' instruction is used to represent the '``neg``'
9189 instruction present in most other intermediate representations.
9194 The two arguments to the '``sub``' instruction must be
9195 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9196 arguments must have identical types.
9201 The value produced is the integer difference of the two operands.
If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.
9207 Because LLVM integers use a two's complement representation, this
9208 instruction is appropriate for both signed and unsigned integers.
9210 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9211 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9212 result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
9213 unsigned and/or signed overflow, respectively, occurs.
9218 .. code-block:: text
9220 <result> = sub i32 4, %var ; yields i32:result = 4 - %var
    <result> = sub i32 0, %val          ; yields i32:result = -%val
9225 '``fsub``' Instruction
9226 ^^^^^^^^^^^^^^^^^^^^^^
9233 <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9238 The '``fsub``' instruction returns the difference of its two operands.
9243 The two arguments to the '``fsub``' instruction must be
9244 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9245 floating-point values. Both arguments must have identical types.
9250 The value produced is the floating-point difference of the two operands.
9251 This instruction is assumed to execute in the default :ref:`floating-point
9252 environment <floatenv>`.
9253 This instruction can also take any number of :ref:`fast-math
9254 flags <fastmath>`, which are optimization hints to enable otherwise
9255 unsafe floating-point optimizations:
9260 .. code-block:: text
9262 <result> = fsub float 4.0, %var ; yields float:result = 4.0 - %var
    <result> = fsub float -0.0, %val    ; yields float:result = -%val
9267 '``mul``' Instruction
9268 ^^^^^^^^^^^^^^^^^^^^^
9275 <result> = mul <ty> <op1>, <op2> ; yields ty:result
9276 <result> = mul nuw <ty> <op1>, <op2> ; yields ty:result
9277 <result> = mul nsw <ty> <op1>, <op2> ; yields ty:result
9278 <result> = mul nuw nsw <ty> <op1>, <op2> ; yields ty:result
9283 The '``mul``' instruction returns the product of its two operands.
9288 The two arguments to the '``mul``' instruction must be
9289 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9290 arguments must have identical types.
9295 The value produced is the integer product of the two operands.
9297 If the result of the multiplication has unsigned overflow, the result
9298 returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
9299 bit width of the result.
9301 Because LLVM integers use a two's complement representation, and the
9302 result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.
9308 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9309 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9310 result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
9311 unsigned and/or signed overflow, respectively, occurs.
9316 .. code-block:: text
9318 <result> = mul i32 4, %var ; yields i32:result = 4 * %var
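The full-product advice above can be sketched as follows. The function names
are hypothetical. Because the widened operands each fit in 32 bits, the 64-bit
product cannot overflow, so the ``nsw`` and ``nuw`` flags used here never
produce poison.

.. code-block:: llvm

    define i64 @full_signed_product(i32 %a, i32 %b) {
      ; Sign-extend both operands to the width of the full product.
      %a.wide = sext i32 %a to i64
      %b.wide = sext i32 %b to i64
      %product = mul nsw i64 %a.wide, %b.wide
      ret i64 %product
    }

    define i64 @full_unsigned_product(i32 %a, i32 %b) {
      ; Zero-extend both operands for the unsigned interpretation.
      %a.wide = zext i32 %a to i64
      %b.wide = zext i32 %b to i64
      %product = mul nuw i64 %a.wide, %b.wide
      ret i64 %product
    }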
9322 '``fmul``' Instruction
9323 ^^^^^^^^^^^^^^^^^^^^^^
9330 <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9335 The '``fmul``' instruction returns the product of its two operands.
9340 The two arguments to the '``fmul``' instruction must be
9341 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9342 floating-point values. Both arguments must have identical types.
9347 The value produced is the floating-point product of the two operands.
9348 This instruction is assumed to execute in the default :ref:`floating-point
9349 environment <floatenv>`.
9350 This instruction can also take any number of :ref:`fast-math
9351 flags <fastmath>`, which are optimization hints to enable otherwise
9352 unsafe floating-point optimizations:
9357 .. code-block:: text
9359 <result> = fmul float 4.0, %var ; yields float:result = 4.0 * %var
9363 '``udiv``' Instruction
9364 ^^^^^^^^^^^^^^^^^^^^^^
9371 <result> = udiv <ty> <op1>, <op2> ; yields ty:result
9372 <result> = udiv exact <ty> <op1>, <op2> ; yields ty:result
9377 The '``udiv``' instruction returns the quotient of its two operands.
9382 The two arguments to the '``udiv``' instruction must be
9383 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9384 arguments must have identical types.
9389 The value produced is the unsigned integer quotient of the two operands.
9391 Note that unsigned integer division and signed integer division are
9392 distinct operations; for signed integer division, use '``sdiv``'.
9394 Division by zero is undefined behavior. For vectors, if any element
9395 of the divisor is zero, the operation has undefined behavior.
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2`` (as
such, "((a udiv exact b) mul b) == a").
9405 .. code-block:: text
9407 <result> = udiv i32 4, %var ; yields i32:result = 4 / %var
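A typical use of ``exact`` is a division that the frontend knows has no
remainder. The sketch below is hypothetical: it assumes the caller guarantees
that ``%bytes`` is a multiple of 4.

.. code-block:: llvm

    define i32 @bytes_to_words(i32 %bytes) {
      ; If %bytes is not a multiple of 4, %words is a poison value.
      %words = udiv exact i32 %bytes, 4
      ret i32 %words
    }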
9411 '``sdiv``' Instruction
9412 ^^^^^^^^^^^^^^^^^^^^^^
9419 <result> = sdiv <ty> <op1>, <op2> ; yields ty:result
9420 <result> = sdiv exact <ty> <op1>, <op2> ; yields ty:result
9425 The '``sdiv``' instruction returns the quotient of its two operands.
9430 The two arguments to the '``sdiv``' instruction must be
9431 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9432 arguments must have identical types.
9437 The value produced is the signed integer quotient of the two operands
9438 rounded towards zero.
9440 Note that signed integer division and unsigned integer division are
9441 distinct operations; for unsigned integer division, use '``udiv``'.
9443 Division by zero is undefined behavior. For vectors, if any element
9444 of the divisor is zero, the operation has undefined behavior.
9445 Overflow also leads to undefined behavior; this is a rare case, but can
9446 occur, for example, by doing a 32-bit division of -2147483648 by -1.
9448 If the ``exact`` keyword is present, the result value of the ``sdiv`` is
9449 a :ref:`poison value <poisonvalues>` if the result would be rounded.
9454 .. code-block:: text
9456 <result> = sdiv i32 4, %var ; yields i32:result = 4 / %var
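
Because both division by zero and the ``-2147483648 / -1`` overflow case are
undefined behavior, a frontend may need to guard the division. The sketch
below shows one way to do so; the helper name ``@safe_sdiv`` and the choice of
returning 0 in the guarded cases are assumptions made for illustration.

.. code-block:: llvm

    ; Hypothetical helper: returns %a / %b, or 0 when sdiv would be undefined.
    define i32 @safe_sdiv(i32 %a, i32 %b) {
    entry:
      %b.zero = icmp eq i32 %b, 0
      %a.min = icmp eq i32 %a, -2147483648
      %b.neg1 = icmp eq i32 %b, -1
      %overflow = and i1 %a.min, %b.neg1
      %unsafe = or i1 %b.zero, %overflow
      br i1 %unsafe, label %fallback, label %divide

    divide:
      ; Rounds toward zero: for example, -7 / 2 yields -3.
      %q = sdiv i32 %a, %b
      ret i32 %q

    fallback:
      ret i32 0
    }
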
9460 '``fdiv``' Instruction
9461 ^^^^^^^^^^^^^^^^^^^^^^
9468 <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9473 The '``fdiv``' instruction returns the quotient of its two operands.
9478 The two arguments to the '``fdiv``' instruction must be
9479 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9480 floating-point values. Both arguments must have identical types.
9485 The value produced is the floating-point quotient of the two operands.
9486 This instruction is assumed to execute in the default :ref:`floating-point
9487 environment <floatenv>`.
9488 This instruction can also take any number of :ref:`fast-math
9489 flags <fastmath>`, which are optimization hints to enable otherwise
9490 unsafe floating-point optimizations:
9495 .. code-block:: text
9497 <result> = fdiv float 4.0, %var ; yields float:result = 4.0 / %var
9501 '``urem``' Instruction
9502 ^^^^^^^^^^^^^^^^^^^^^^
9509 <result> = urem <ty> <op1>, <op2> ; yields ty:result
9514 The '``urem``' instruction returns the remainder from the unsigned
9515 division of its two arguments.
9520 The two arguments to the '``urem``' instruction must be
9521 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9522 arguments must have identical types.
9527 This instruction returns the unsigned integer *remainder* of a division.
9528 This instruction always performs an unsigned division to get the remainder.
9531 Note that unsigned integer remainder and signed integer remainder are
9532 distinct operations; for signed integer remainder, use '``srem``'.
9534 Taking the remainder of a division by zero is undefined behavior.
9535 For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
9541 .. code-block:: text
9543 <result> = urem i32 4, %var ; yields i32:result = 4 % %var
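
For a power-of-two divisor, the unsigned remainder equals a bitwise mask of the
low bits. The sketch below compares the two forms; the function name
``@urem_matches_mask`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical function; names are illustrative only.
    define i1 @urem_matches_mask(i32 %x) {
      %r = urem i32 %x, 8
      ; Masking with 7 keeps the low three bits, which is %x modulo 8.
      %m = and i32 %x, 7
      ; The two results agree for every value of %x.
      %same = icmp eq i32 %r, %m
      ret i1 %same
    }
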
9547 '``srem``' Instruction
9548 ^^^^^^^^^^^^^^^^^^^^^^
9555 <result> = srem <ty> <op1>, <op2> ; yields ty:result
9560 The '``srem``' instruction returns the remainder from the signed
9561 division of its two operands. This instruction can also take
9562 :ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.
9568 The two arguments to the '``srem``' instruction must be
9569 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9570 arguments must have identical types.
9575 This instruction returns the *remainder* of a division (where the result
9576 is either zero or has the same sign as the dividend, ``op1``), not the
9577 *modulo* operator (where the result is either zero or has the same sign
9578 as the divisor, ``op2``) of a value. For more information about the
9579 difference, see `The Math
9580 Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
9581 table of how this is implemented in various languages, please see
9583 `Wikipedia: modulo operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
9585 Note that signed integer remainder and unsigned integer remainder are
9586 distinct operations; for unsigned integer remainder, use '``urem``'.
9588 Taking the remainder of a division by zero is undefined behavior.
9589 For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
9591 Overflow also leads to undefined behavior; this is a rare case, but can
9592 occur, for example, by taking the remainder of a 32-bit division of
9593 -2147483648 by -1. (The remainder doesn't actually overflow, but this
9594 rule lets ``srem`` be implemented using instructions that return both the
9595 result of the division and the remainder.)
9600 .. code-block:: text
9602 <result> = srem i32 4, %var ; yields i32:result = 4 % %var
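
To make the remainder/modulo distinction concrete, the following sketch derives
a floored modulo (result takes the sign of the divisor) from ``srem``. The
helper name ``@floor_mod`` is hypothetical, and the code assumes the caller has
already excluded a zero divisor and the ``-2147483648``/``-1`` case.

.. code-block:: llvm

    ; Hypothetical helper; assumes %m != 0 and no srem overflow.
    define i32 @floor_mod(i32 %a, i32 %m) {
      ; The sign of %r follows %a: srem -7, 3 yields -1.
      %r = srem i32 %a, %m
      %r.nonzero = icmp ne i32 %r, 0
      ; The xor is negative exactly when %r and %m differ in sign.
      %x = xor i32 %r, %m
      %signs.differ = icmp slt i32 %x, 0
      %fix = and i1 %r.nonzero, %signs.differ
      %r.plus.m = add i32 %r, %m
      ; For %a = -7 and %m = 3 this selects -1 + 3 = 2.
      %mod = select i1 %fix, i32 %r.plus.m, i32 %r
      ret i32 %mod
    }
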
9606 '``frem``' Instruction
9607 ^^^^^^^^^^^^^^^^^^^^^^
9614 <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9619 The '``frem``' instruction returns the remainder from the division of
its two operands.
9624 The instruction is implemented as a call to libm's '``fmod``'
9625 for some targets, and using the instruction may thus require linking libm.
9631 The two arguments to the '``frem``' instruction must be
9632 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9633 floating-point values. Both arguments must have identical types.
9638 The value produced is the floating-point remainder of the two operands.
9639 This is the same output as a libm '``fmod``' function, but without any
9640 possibility of setting ``errno``. The remainder has the same sign as the
dividend.
9642 This instruction is assumed to execute in the default :ref:`floating-point
9643 environment <floatenv>`.
9644 This instruction can also take any number of :ref:`fast-math
9645 flags <fastmath>`, which are optimization hints to enable otherwise
9646 unsafe floating-point optimizations:
9651 .. code-block:: text
9653 <result> = frem float 4.0, %var ; yields float:result = 4.0 % %var
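
The sign behavior can be seen with constant operands. This is a small sketch
with a hypothetical function name, using values that are exactly representable
as ``float``.

.. code-block:: llvm

    ; Hypothetical function; names are illustrative only.
    define float @frem_demo() {
      %r1 = frem float 5.5, 2.0    ; yields 1.5
      %r2 = frem float -5.5, 2.0   ; yields -1.5, following the dividend's sign
      %r3 = frem float 5.5, -2.0   ; yields 1.5, the divisor's sign is ignored
      ret float %r2
    }
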
9657 Bitwise Binary Operations
9658 -------------------------
9660 Bitwise binary operators are used to do various forms of bit-twiddling
9661 in a program. They are generally very efficient instructions and can
9662 commonly be strength reduced from other instructions. They require two
9663 operands of the same type, execute an operation on them, and produce a
9664 single value. The resulting value is the same type as its operands.
9668 '``shl``' Instruction
9669 ^^^^^^^^^^^^^^^^^^^^^
9676 <result> = shl <ty> <op1>, <op2> ; yields ty:result
9677 <result> = shl nuw <ty> <op1>, <op2> ; yields ty:result
9678 <result> = shl nsw <ty> <op1>, <op2> ; yields ty:result
9679 <result> = shl nuw nsw <ty> <op1>, <op2> ; yields ty:result
9684 The '``shl``' instruction returns the first operand shifted to the left
9685 a specified number of bits.
9690 Both arguments to the '``shl``' instruction must be the same
9691 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9692 '``op2``' is treated as an unsigned value.
9697 The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
9698 where ``n`` is the width of the result. If ``op2`` is (statically or
9699 dynamically) equal to or larger than the number of bits in
9700 ``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
9701 If the arguments are vectors, each vector element of ``op1`` is shifted
9702 by the corresponding shift amount in ``op2``.
9704 If the ``nuw`` keyword is present, then the shift produces a poison
9705 value if it shifts out any non-zero bits.
9706 If the ``nsw`` keyword is present, then the shift produces a poison
9707 value if it shifts out any bits that disagree with the resultant sign bit.
9712 .. code-block:: text
9714 <result> = shl i32 4, %var ; yields i32: 4 << %var
9715 <result> = shl i32 4, 2 ; yields i32: 16
9716 <result> = shl i32 1, 10 ; yields i32: 1024
9717 <result> = shl i32 1, 32 ; yields a poison value
9718 <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4>
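
The effect of ``nuw`` and ``nsw`` is easiest to see on a narrow type. The
following sketch uses ``i8`` constants; the function name ``@shl_flags`` is
hypothetical.

.. code-block:: llvm

    ; Hypothetical function; names are illustrative only.
    define i8 @shl_flags() {
      ; 64 is 0b01000000; without flags the result is 0b10000000, i.e. i8 -128.
      %a = shl i8 64, 1
      ; Only a zero bit is shifted out, so nuw is satisfied: yields i8 -128.
      %b = shl nuw i8 64, 1
      ; The shifted-out bit (0) disagrees with the result sign bit (1): poison.
      %c = shl nsw i8 64, 1
      ; -128 is 0b10000000; a non-zero bit is shifted out: poison.
      %d = shl nuw i8 -128, 1
      ; The shifted-out bit (1) matches the result sign bit (1): yields i8 -2.
      %e = shl nsw i8 -1, 1
      ret i8 %a
    }
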
9723 '``lshr``' Instruction
9724 ^^^^^^^^^^^^^^^^^^^^^^
9731 <result> = lshr <ty> <op1>, <op2> ; yields ty:result
9732 <result> = lshr exact <ty> <op1>, <op2> ; yields ty:result
9737 The '``lshr``' instruction (logical shift right) returns the first
9738 operand shifted to the right a specified number of bits with zero fill.
9743 Both arguments to the '``lshr``' instruction must be the same
9744 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9745 '``op2``' is treated as an unsigned value.
9750 This instruction always performs a logical shift right operation. The
9751 most significant bits of the result will be filled with zero bits after
9752 the shift. If ``op2`` is (statically or dynamically) equal to or larger
9753 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9754 value <poisonvalues>`. If the arguments are vectors, each vector element
9755 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9757 If the ``exact`` keyword is present, the result value of the ``lshr`` is
9758 a poison value if any of the bits shifted out are non-zero.
9763 .. code-block:: text
9765 <result> = lshr i32 4, 1 ; yields i32:result = 2
9766 <result> = lshr i32 4, 2 ; yields i32:result = 1
9767 <result> = lshr i8 4, 3 ; yields i8:result = 0
9768 <result> = lshr i8 -2, 1 ; yields i8:result = 0x7F
9769 <result> = lshr i32 1, 32 ; yields a poison value
9770 <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
9774 '``ashr``' Instruction
9775 ^^^^^^^^^^^^^^^^^^^^^^
9782 <result> = ashr <ty> <op1>, <op2> ; yields ty:result
9783 <result> = ashr exact <ty> <op1>, <op2> ; yields ty:result
9788 The '``ashr``' instruction (arithmetic shift right) returns the first
9789 operand shifted to the right a specified number of bits with sign
extension.
9795 Both arguments to the '``ashr``' instruction must be the same
9796 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9797 '``op2``' is treated as an unsigned value.
9802 This instruction always performs an arithmetic shift right operation.
9803 The most significant bits of the result will be filled with the sign bit
9804 of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
9805 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9806 value <poisonvalues>`. If the arguments are vectors, each vector element
9807 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9809 If the ``exact`` keyword is present, the result value of the ``ashr`` is
9810 a poison value if any of the bits shifted out are non-zero.
9815 .. code-block:: text
9817 <result> = ashr i32 4, 1 ; yields i32:result = 2
9818 <result> = ashr i32 4, 2 ; yields i32:result = 1
9819 <result> = ashr i8 4, 3 ; yields i8:result = 0
9820 <result> = ashr i8 -2, 1 ; yields i8:result = -1
9821 <result> = ashr i32 1, 32 ; yields a poison value
9822 <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0>
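
The difference between zero fill and sign fill, and the effect of ``exact``,
can be sketched on a single negative ``i8`` input. The function name
``@shift_right_demo`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical function; names are illustrative only.
    define i8 @shift_right_demo() {
      ; -16 is 0xF0. Zero fill gives 0x3C, i.e. i8 60.
      %l = lshr i8 -16, 2
      ; Sign fill gives 0xFC, i.e. i8 -4.
      %a = ashr i8 -16, 2
      ; The four shifted-out bits are all zero: yields i8 -1.
      %ok = ashr exact i8 -16, 4
      ; -15 is 0xF1; a non-zero bit is shifted out: poison.
      %bad = ashr exact i8 -15, 1
      ret i8 %a
    }
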
9826 '``and``' Instruction
9827 ^^^^^^^^^^^^^^^^^^^^^
9834 <result> = and <ty> <op1>, <op2> ; yields ty:result
9839 The '``and``' instruction returns the bitwise logical and of its two
operands.
9845 The two arguments to the '``and``' instruction must be
9846 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9847 arguments must have identical types.
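9852 The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+
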
9869 .. code-block:: text
9871 <result> = and i32 4, %var ; yields i32:result = 4 & %var
9872 <result> = and i32 15, 40 ; yields i32:result = 8
9873 <result> = and i32 4, 8 ; yields i32:result = 0
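
A common use of ``and`` is masking out a bit field after a shift. The sketch
below extracts bits 4 through 7 of a value; the helper name ``@nibble1`` is
hypothetical.

.. code-block:: llvm

    ; Hypothetical helper: returns bits 4..7 of %x in the low four bits.
    define i32 @nibble1(i32 %x) {
      %shifted = lshr i32 %x, 4
      ; 15 is 0b1111, so only the low four bits survive.
      %nibble = and i32 %shifted, 15
      ret i32 %nibble
    }
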
9877 '``or``' Instruction
9878 ^^^^^^^^^^^^^^^^^^^^
9885 <result> = or <ty> <op1>, <op2> ; yields ty:result
9890 The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.
9896 The two arguments to the '``or``' instruction must be
9897 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9898 arguments must have identical types.
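9903 The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

.. code-block:: text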
9922 <result> = or i32 4, %var ; yields i32:result = 4 | %var
9923 <result> = or i32 15, 40 ; yields i32:result = 47
9924 <result> = or i32 4, 8 ; yields i32:result = 12
9928 '``xor``' Instruction
9929 ^^^^^^^^^^^^^^^^^^^^^
9936 <result> = xor <ty> <op1>, <op2> ; yields ty:result
9941 The '``xor``' instruction returns the bitwise logical exclusive or of
9942 its two operands. The ``xor`` is used to implement the "one's
9943 complement" operation, which is the "~" operator in C.
9948 The two arguments to the '``xor``' instruction must be
9949 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9950 arguments must have identical types.
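9955 The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+
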
9972 .. code-block:: text
9974 <result> = xor i32 4, %var ; yields i32:result = 4 ^ %var
9975 <result> = xor i32 15, 40 ; yields i32:result = 39
9976 <result> = xor i32 4, 8 ; yields i32:result = 12
9977 <result> = xor i32 %V, -1 ; yields i32:result = ~%V
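
Building on the one's complement idiom, a two's complement negation can be
sketched as a bitwise not followed by an increment. The function name
``@negate`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical function; names are illustrative only.
    define i32 @negate(i32 %v) {
      ; xor with all ones flips every bit: ~%v.
      %not = xor i32 %v, -1
      ; In two's complement, -%v equals ~%v + 1 (wrapping for the minimum value).
      %neg = add i32 %not, 1
      ret i32 %neg
    }
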
Vector Operations
-----------------

9982 LLVM supports several instructions to represent vector operations in a
9983 target-independent manner. These instructions cover the element-access
9984 and vector-specific operations needed to process vectors effectively.
9985 While LLVM does directly support these vector operations, many
9986 sophisticated algorithms will want to use target-specific intrinsics to
9987 take full advantage of a specific target.
9989 .. _i_extractelement:
9991 '``extractelement``' Instruction
9992 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9999 <result> = extractelement <n x <ty>> <val>, <ty2> <idx> ; yields <ty>
10000 <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>
10005 The '``extractelement``' instruction extracts a single scalar element
10006 from a vector at a specified index.
10011 The first operand of an '``extractelement``' instruction is a value of
10012 :ref:`vector <t_vector>` type. The second operand is an index indicating
10013 the position from which to extract the element. The index may be a
10014 variable of any integer type, and will be treated as an unsigned integer.
10019 The result is a scalar of the same type as the element type of ``val``.
10020 Its value is the value at position ``idx`` of ``val``. If ``idx``
10021 exceeds the length of ``val`` for a fixed-length vector, the result is a
10022 :ref:`poison value <poisonvalues>`. For a scalable vector, if the value
10023 of ``idx`` exceeds the runtime length of the vector, the result is a
10024 :ref:`poison value <poisonvalues>`.
10029 .. code-block:: text
10031 <result> = extractelement <4 x i32> %vec, i32 0 ; yields i32
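
Since an out-of-range index yields a poison value, code that takes a variable
index may want to clamp it first. The following is a sketch of one such guard;
the helper name ``@lane_or_zero`` and the fallback value 0 are assumptions made
for illustration.

.. code-block:: llvm

    ; Hypothetical helper: returns lane %idx of %vec, or 0 if %idx is out of range.
    define i32 @lane_or_zero(<4 x i32> %vec, i64 %idx) {
      ; The index is treated as unsigned, so an unsigned comparison is used.
      %in.range = icmp ult i64 %idx, 4
      %safe.idx = select i1 %in.range, i64 %idx, i64 0
      %elt = extractelement <4 x i32> %vec, i64 %safe.idx
      %res = select i1 %in.range, i32 %elt, i32 0
      ret i32 %res
    }
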
10033 .. _i_insertelement:
10035 '``insertelement``' Instruction
10036 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10043 <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <n x <ty>>
10044 <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>
10049 The '``insertelement``' instruction inserts a scalar element into a
10050 vector at a specified index.
10055 The first operand of an '``insertelement``' instruction is a value of
10056 :ref:`vector <t_vector>` type. The second operand is a scalar value whose
10057 type must equal the element type of the first operand. The third operand
10058 is an index indicating the position at which to insert the value. The
10059 index may be a variable of any integer type, and will be treated as an
unsigned integer.
10065 The result is a vector of the same type as ``val``. Its element values
10066 are those of ``val`` except at position ``idx``, where it gets the value
10067 ``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector,
10068 the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
10069 if the value of ``idx`` exceeds the runtime length of the vector, the result
10070 is a :ref:`poison value <poisonvalues>`.
10075 .. code-block:: text
10077 <result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>
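
Combining ``extractelement`` and ``insertelement``, the sketch below swaps the
first two lanes of a vector. Each ``insertelement`` produces a new SSA value
and leaves its input unchanged. The function name ``@swap01`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical function; names are illustrative only.
    define <4 x i32> @swap01(<4 x i32> %vec) {
      %e0 = extractelement <4 x i32> %vec, i32 0
      %e1 = extractelement <4 x i32> %vec, i32 1
      ; Write the old lane 1 into lane 0, then the old lane 0 into lane 1.
      %t = insertelement <4 x i32> %vec, i32 %e1, i32 0
      %r = insertelement <4 x i32> %t, i32 %e0, i32 1
      ret <4 x i32> %r
    }
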
10079 .. _i_shufflevector:
10081 '``shufflevector``' Instruction
10082 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10089 <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>>
10090 <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask> ; yields <vscale x m x <ty>>
10095 The '``shufflevector``' instruction constructs a permutation of elements
10096 from two input vectors, returning a vector with the same element type as
10097 the input and length that is the same as the shuffle mask.
10102 The first two operands of a '``shufflevector``' instruction are vectors
10103 with the same type. The third argument is a shuffle mask vector constant
10104 whose element type is ``i32``. The mask vector elements must be constant
10105 integers or ``poison`` values. The result of the instruction is a vector
10106 whose length is the same as the shuffle mask and whose element type is the
10107 same as the element type of the first two operands.
10112 The elements of the two input vectors are numbered from left to right
10113 across both of the vectors. For each element of the result vector, the
10114 shuffle mask selects an element from one of the input vectors to copy
10115 to the result. Non-negative elements in the mask represent an index
10116 into the concatenated pair of input vectors.
10118 A ``poison`` element in the mask vector specifies that the resulting element
is ``poison``.
10120 For backwards-compatibility reasons, LLVM temporarily also accepts ``undef``
10121 mask elements, which will be interpreted the same way as ``poison`` elements.
10122 If the shuffle mask selects an ``undef`` element from one of the input
10123 vectors, the resulting element is ``undef``.
10125 For scalable vectors, the only valid mask values at present are
10126 ``zeroinitializer``, ``undef`` and ``poison``, since we cannot write all indices as
10127 literals for a vector with a length unknown at compile time.
10132 .. code-block:: text
10134 <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
10135 <4 x i32> <i32 0, i32 4, i32 1, i32 5> ; yields <4 x i32>
10136 <result> = shufflevector <4 x i32> %v1, <4 x i32> poison,
10137 <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle.
10138 <result> = shufflevector <8 x i32> %v1, <8 x i32> poison,
10139 <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32>
10140 <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
10141 <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 > ; yields <8 x i32>
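
Two further patterns are sketched below: broadcasting a scalar to every lane
with an all-zero mask, and reversing the lanes of a vector. The function names
``@splat`` and ``@reverse`` are hypothetical.

.. code-block:: llvm

    ; Hypothetical function: broadcasts %x to all four lanes.
    define <4 x i32> @splat(i32 %x) {
      %ins = insertelement <4 x i32> poison, i32 %x, i32 0
      ; Every mask element is 0, so every result lane copies lane 0 of %ins.
      %res = shufflevector <4 x i32> %ins, <4 x i32> poison, <4 x i32> zeroinitializer
      ret <4 x i32> %res
    }

    ; Hypothetical function: reverses the lane order of %v.
    define <4 x i32> @reverse(<4 x i32> %v) {
      %res = shufflevector <4 x i32> %v, <4 x i32> poison,
                           <4 x i32> <i32 3, i32 2, i32 1, i32 0>
      ret <4 x i32> %res
    }
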
10143 Aggregate Operations
10144 --------------------
10146 LLVM supports several instructions for working with
10147 :ref:`aggregate <t_aggregate>` values.
10149 .. _i_extractvalue:
10151 '``extractvalue``' Instruction
10152 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10159 <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
10164 The '``extractvalue``' instruction extracts the value of a member field
10165 from an :ref:`aggregate <t_aggregate>` value.
10170 The first operand of an '``extractvalue``' instruction is a value of
10171 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
10172 constant indices to specify which value to extract in a similar manner
10173 as indices in a '``getelementptr``' instruction.
10175 The major differences to ``getelementptr`` indexing are:
10177 - Since the value being indexed is not a pointer, the first index is
10178 omitted and assumed to be zero.
10179 - At least one index must be specified.
10180 - Not only struct indices but also array indices must be in bounds.
10185 The result is the value at the position in the aggregate specified by
10186 the index operands.
10191 .. code-block:: text
10193 <result> = extractvalue {i32, float} %agg, 0 ; yields i32
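
The sketch below shows a multi-index extraction from a nested aggregate, and
the unpacking of a struct returned by the ``llvm.sadd.with.overflow.i32``
intrinsic. The function names and the choice to return 0 on overflow are
assumptions made for illustration.

.. code-block:: llvm

    declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

    ; Hypothetical function: indices 1, 0 select the first float of the array field.
    define float @first_inner({i32, [2 x float]} %agg) {
      %f = extractvalue {i32, [2 x float]} %agg, 1, 0
      ret float %f
    }

    ; Hypothetical function: returns %a + %b, or 0 if the signed add overflows.
    define i32 @add_or_zero(i32 %a, i32 %b) {
      %pair = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %pair, 0
      %ovf = extractvalue {i32, i1} %pair, 1
      %res = select i1 %ovf, i32 0, i32 %sum
      ret i32 %res
    }
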
10197 '``insertvalue``' Instruction
10198 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10205 <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}* ; yields <aggregate type>
10210 The '``insertvalue``' instruction inserts a value into a member field in
10211 an :ref:`aggregate <t_aggregate>` value.
10216 The first operand of an '``insertvalue``' instruction is a value of
10217 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
10218 a first-class value to insert. The following operands are constant
10219 indices indicating the position at which to insert the value in a
10220 similar manner as indices in an '``extractvalue``' instruction. The value
10221 to insert must have the same type as the value identified by the
indices.
10227 The result is an aggregate of the same type as ``val``. Its value is
10228 that of ``val`` except that the value at the position specified by the
10229 indices is that of ``elt``.
10234 .. code-block:: llvm
10236 %agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef}
10237 %agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val}
10238 %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0 ; yields {i32 undef, {float %val}}
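
A typical use is assembling a struct return value field by field. The sketch
below starts from ``poison`` rather than ``undef``; the function name
``@make_pair`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical function: packs %i and %f into a struct and returns it.
    define {i32, float} @make_pair(i32 %i, float %f) {
      %p0 = insertvalue {i32, float} poison, i32 %i, 0
      ; %p0 is unchanged; %p1 is a new value with both fields populated.
      %p1 = insertvalue {i32, float} %p0, float %f, 1
      ret {i32, float} %p1
    }
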
10242 Memory Access and Addressing Operations
10243 ---------------------------------------
10245 A key design point of an SSA-based representation is how it represents
10246 memory. In LLVM, no memory locations are in SSA form, which makes things
10247 very simple. This section describes how to read, write, and allocate
memory in LLVM.
10252 '``alloca``' Instruction
10253 ^^^^^^^^^^^^^^^^^^^^^^^^
10260 <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)] ; yields type addrspace(num)*:result
10265 The '``alloca``' instruction allocates memory on the stack frame of the
10266 currently executing function, to be automatically released when this
10267 function returns to its caller. If the address space is not explicitly
10268 specified, the object is allocated in the alloca address space from the
10269 :ref:`datalayout string<langref_datalayout>`.
10274 The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
10275 bytes of memory on the runtime stack, returning a pointer of the
10276 appropriate type to the program. If "NumElements" is specified, it is
10277 the number of elements allocated, otherwise "NumElements" is defaulted
to be one.
10280 If a constant alignment is specified, the value result of the
10281 allocation is guaranteed to be aligned to at least that boundary. The
10282 alignment may not be greater than ``1 << 32``.
10284 The alignment is only optional when parsing textual IR; for in-memory IR,
10285 it is always present. If not specified, the target can choose to align the
10286 allocation on any convenient boundary compatible with the type.
10288 '``type``' may be any sized type.
10293 Memory is allocated; a pointer is returned. The allocated memory is
10294 uninitialized, and loading from uninitialized memory produces an undefined
10295 value. The operation itself is undefined if there is insufficient stack
10296 space for the allocation. '``alloca``'d memory is automatically released
10297 when the function returns. The '``alloca``' instruction is commonly used
10298 to represent automatic variables that must have an address available. When
10299 the function returns (either with the ``ret`` or ``resume`` instructions),
10300 the memory is reclaimed. Allocating zero bytes is legal, but the returned
10301 pointer may not be unique. The order in which memory is allocated (i.e.,
10302 which way the stack grows) is not specified.
10304 Note that '``alloca``' outside of the alloca address space from the
10305 :ref:`datalayout string<langref_datalayout>` is meaningful only if the
10306 target has assigned it semantics.
10308 If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
10309 the returned object is initially dead.
10310 See :ref:`llvm.lifetime.start <int_lifestart>` and
10311 :ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
10312 lifetime-manipulating intrinsics.
10317 .. code-block:: llvm
10319 %ptr = alloca i32 ; yields ptr
10320 %ptr = alloca i32, i32 4 ; yields ptr
10321 %ptr = alloca i32, i32 4, align 1024 ; yields ptr
10322 %ptr = alloca i32, align 1024 ; yields ptr
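
As a sketch of how an ``alloca`` with ``NumElements`` is typically used, the
function below allocates three ``i32`` slots, initializes each one before
reading it, and sums them. The function name ``@sum3`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical function; names are illustrative only.
    define i32 @sum3() {
    entry:
      ; Three i32 slots on the stack, released when the function returns.
      %buf = alloca i32, i32 3, align 4
      %p1 = getelementptr inbounds i32, ptr %buf, i64 1
      %p2 = getelementptr inbounds i32, ptr %buf, i64 2
      ; The memory starts uninitialized, so every slot is stored before it is loaded.
      store i32 10, ptr %buf, align 4
      store i32 20, ptr %p1, align 4
      store i32 30, ptr %p2, align 4
      %a = load i32, ptr %buf, align 4
      %b = load i32, ptr %p1, align 4
      %c = load i32, ptr %p2, align 4
      %ab = add i32 %a, %b
      %abc = add i32 %ab, %c
      ret i32 %abc
    }
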
10326 '``load``' Instruction
10327 ^^^^^^^^^^^^^^^^^^^^^^
10334 <result> = load [volatile] <ty>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
10335 <result> = load atomic [volatile] <ty>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
10336 !<nontemp_node> = !{ i32 1 }
10337 !<empty_node> = !{}
10338 !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
10339 !<align_node> = !{ i64 <value_alignment> }
10344 The '``load``' instruction is used to read from memory.
10349 The argument to the ``load`` instruction specifies the memory address from which
10350 to load. The type specified must be a :ref:`first class <t_firstclass>` type of
10351 known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
10352 the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
10353 modify the number or order of execution of this ``load`` with other
10354 :ref:`volatile operations <volatile>`.
10356 If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
10357 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
10358 ``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
10359 Atomic loads produce :ref:`defined <memmodel>` results when they may see
10360 multiple atomic stores. The type of the pointee must be an integer, pointer, or
10361 floating-point type whose bit width is a power of two greater than or equal to
10362 eight and less than or equal to a target-specific size limit. ``align`` must be
10363 explicitly specified on atomic loads, and the load has undefined behavior if the
10364 alignment is not set to a value which is at least the size in bytes of the
10365 pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
10367 The optional constant ``align`` argument specifies the alignment of the
10368 operation (that is, the alignment of the memory address). It is the
10369 responsibility of the code emitter to ensure that the alignment information is
10370 correct. Overestimating the alignment results in undefined behavior.
10371 Underestimating the alignment may produce less efficient code. An alignment of
10372 1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
10373 value higher than the size of the loaded type implies memory up to the
10374 alignment value bytes can be safely loaded without trapping in the default
10375 address space. Access of the high bytes can interfere with debugging tools, so
10376 should not be accessed if the function has the ``sanitize_thread`` or
10377 ``sanitize_address`` attributes.
10379 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10380 always present. An omitted ``align`` argument means that the operation has the
10381 ABI alignment for the target.
10383 The optional ``!nontemporal`` metadata must reference a single
10384 metadata name ``<nontemp_node>`` corresponding to a metadata node with one
10385 ``i32`` entry of value 1. The existence of the ``!nontemporal``
10386 metadata on the instruction tells the optimizer and code generator
10387 that this load is not expected to be reused in the cache. The code
10388 generator may select special instructions to save cache bandwidth, such
10389 as the ``MOVNT`` instruction on x86.
10391 The optional ``!invariant.load`` metadata must reference a single
10392 metadata name ``<empty_node>`` corresponding to a metadata node with no
10393 entries. If a load instruction tagged with the ``!invariant.load``
10394 metadata is executed, the memory location referenced by the load has
10395 to contain the same value at all points in the program where the
10396 memory location is dereferenceable; otherwise, the behavior is
undefined.
10399 The optional ``!invariant.group`` metadata must reference a single metadata name
10400 ``<empty_node>`` corresponding to a metadata node with no entries.
10401 See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.
10403 The optional ``!nonnull`` metadata must reference a single
10404 metadata name ``<empty_node>`` corresponding to a metadata node with no
10405 entries. The existence of the ``!nonnull`` metadata on the
10406 instruction tells the optimizer that the value loaded is known to
10407 never be null. If the value is null at runtime, a poison value is returned
10408 instead. This is analogous to the ``nonnull`` attribute on parameters and
10409 return values. This metadata can only be applied to loads of a pointer type.
10411 The optional ``!dereferenceable`` metadata must reference a single metadata
10412 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
10414 See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.
10416 The optional ``!dereferenceable_or_null`` metadata must reference a single
10417 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
10419 See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
10420 <md_dereferenceable_or_null>`.
10422 The optional ``!align`` metadata must reference a single metadata name
10423 ``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
10424 The existence of the ``!align`` metadata on the instruction tells the
10425 optimizer that the value loaded is known to be aligned to a boundary specified
10426 by the integer value in the metadata node. The alignment must be a power of 2.
10427 This is analogous to the ``align`` attribute on parameters and return values.
10428 This metadata can only be applied to loads of a pointer type. If the returned
10429 value is not appropriately aligned at runtime, a poison value is returned
instead.
10432 The optional ``!noundef`` metadata must reference a single metadata name
10433 ``<empty_node>`` corresponding to a node with no entries. The existence of
10434 ``!noundef`` metadata on the instruction tells the optimizer that the value
10435 loaded is known to be :ref:`well defined <welldefinedvalues>`.
10436 If the value isn't well defined, the behavior is undefined. If the ``!noundef``
10437 metadata is combined with poison-generating metadata like ``!nonnull``,
10438 violation of that metadata constraint will also result in undefined behavior.
10443 The location of memory pointed to is loaded. If the value being loaded
10444 is of scalar type then the number of bytes read does not exceed the
10445 minimum number of bytes needed to hold all bits of the type. For
10446 example, loading an ``i24`` reads at most three bytes. When loading a
10447 value of a type like ``i20`` with a size that is not an integral number
10448 of bytes, the result is undefined if the value was not originally
10449 written using a store of the same type.
10450 If the value being loaded is of aggregate type, the bytes that correspond to
10451 padding may be accessed but are ignored, because it is impossible to observe
10452 padding from the loaded aggregate value.
10453 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
10458 .. code-block:: llvm
10460 %ptr = alloca i32 ; yields ptr
10461 store i32 3, ptr %ptr ; yields void
10462 %val = load i32, ptr %ptr ; yields i32:val = i32 3
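
The variants described above can be sketched together: a ``volatile`` load, an
``atomic`` load with an explicit ordering and alignment, and a pointer load
carrying ``!nonnull`` and ``!align`` metadata. The function name, the metadata
node numbers, and the assumption that pointers are 8-byte aligned on the target
are all illustrative.

.. code-block:: llvm

    ; Hypothetical function; names and alignments are illustrative only.
    define ptr @load_demo(ptr %p, ptr %pp) {
      ; Not reordered with respect to other volatile operations.
      %v = load volatile i32, ptr %p, align 4
      ; Atomic loads need an ordering and an explicit alignment.
      %a = load atomic i32, ptr %p acquire, align 4
      ; The loaded pointer is asserted to be non-null and 16-byte aligned;
      ; if either assertion is false at runtime, %q is a poison value.
      %q = load ptr, ptr %pp, align 8, !nonnull !0, !align !1
      ret ptr %q
    }

    !0 = !{}
    !1 = !{i64 16}
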
10466 '``store``' Instruction
10467 ^^^^^^^^^^^^^^^^^^^^^^^
10474 store [volatile] <ty> <value>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>] ; yields void
10475 store atomic [volatile] <ty> <value>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
10476 !<nontemp_node> = !{ i32 1 }
10477 !<empty_node> = !{}
10482 The '``store``' instruction is used to write to memory.
10487 There are two arguments to the ``store`` instruction: a value to store and an
10488 address at which to store it. The type of the ``<pointer>`` operand must be a
10489 pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
10490 operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
10491 allowed to modify the number or order of execution of this ``store`` with other
10492 :ref:`volatile operations <volatile>`. Only values of :ref:`first class
10493 <t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
10494 structural type <t_opaque>`) can be stored.
10496 If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
10497 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
10498 ``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
10499 Atomic loads produce :ref:`defined <memmodel>` results when they may see
10500 multiple atomic stores. The type of the pointee must be an integer, pointer, or
10501 floating-point type whose bit width is a power of two greater than or equal to
10502 eight and less than or equal to a target-specific size limit. ``align`` must be
10503 explicitly specified on atomic stores, and the store has undefined behavior if
10504 the alignment is not set to a value which is at least the size in bytes of the
10505 pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
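
The following sketch shows two atomic stores that satisfy these constraints.
The function name ``@publish`` is hypothetical. Each store names a permitted
ordering and gives an explicit alignment equal to the size of the ``i32``
pointee.

.. code-block:: llvm

   define void @publish(ptr %flag) {
     ; Release store: an ordering that is valid on store instructions.
     store atomic i32 1, ptr %flag release, align 4
     ; Same kind of store with an explicit synchronization scope.
     store atomic i32 2, ptr %flag syncscope("singlethread") monotonic, align 4
     ret void
   }

Replacing either ordering with ``acquire`` or ``acq_rel`` would make the
instruction invalid.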
10507 The optional constant ``align`` argument specifies the alignment of the
10508 operation (that is, the alignment of the memory address). It is the
10509 responsibility of the code emitter to ensure that the alignment information is
10510 correct. Overestimating the alignment results in undefined behavior.
10511 Underestimating the alignment may produce less efficient code. An alignment of
10512 1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
value higher than the size of the stored type implies memory up to the
alignment value bytes can be safely stored to without trapping in the default
10515 address space. Access of the high bytes can interfere with debugging tools, so
10516 should not be accessed if the function has the ``sanitize_thread`` or
10517 ``sanitize_address`` attributes.
10519 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10520 always present. An omitted ``align`` argument means that the operation has the
10521 ABI alignment for the target.
10523 The optional ``!nontemporal`` metadata must reference a single metadata
10524 name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
10525 of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
10527 be reused in the cache. The code generator may select special
10528 instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
10531 The optional ``!invariant.group`` metadata must reference a
10532 single metadata name ``<empty_node>``. See ``invariant.group`` metadata.
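
A minimal sketch of a non-temporal store follows. The function name
``@stream`` and the node number ``!0`` are illustrative; the node has the
required single ``i32`` entry of value 1.

.. code-block:: llvm

   define void @stream(ptr %dst, i32 %v) {
     ; Hint that the stored data is not expected to be reused in the cache.
     store i32 %v, ptr %dst, align 4, !nontemporal !0
     ret void
   }

   !0 = !{ i32 1 }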
10537 The contents of memory are updated to contain ``<value>`` at the
10538 location specified by the ``<pointer>`` operand. If ``<value>`` is
10539 of scalar type then the number of bytes written does not exceed the
10540 minimum number of bytes needed to hold all bits of the type. For
10541 example, storing an ``i24`` writes at most three bytes. When writing a
10542 value of a type like ``i20`` with a size that is not an integral number
10543 of bytes, it is unspecified what happens to the extra bits that do not
10544 belong to the type, but they will typically be overwritten.
10545 If ``<value>`` is of aggregate type, padding is filled with
10546 :ref:`undef <undefvalues>`.
10547 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
10552 .. code-block:: llvm
10554 %ptr = alloca i32 ; yields ptr
10555 store i32 3, ptr %ptr ; yields void
10556 %val = load i32, ptr %ptr ; yields i32:val = i32 3
10560 '``fence``' Instruction
10561 ^^^^^^^^^^^^^^^^^^^^^^^
10568 fence [syncscope("<target-scope>")] <ordering> ; yields void
10573 The '``fence``' instruction is used to introduce happens-before edges
10574 between operations.
10579 '``fence``' instructions take an :ref:`ordering <ordering>` argument which
10580 defines what *synchronizes-with* edges they add. They can only be given
10581 ``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
10586 A fence A which has (at least) ``release`` ordering semantics
10587 *synchronizes with* a fence B with (at least) ``acquire`` ordering
10588 semantics if and only if there exist atomic operations X and Y, both
10589 operating on some atomic object M, such that A is sequenced before X, X
10590 modifies M (either directly or through some side effect of a sequence
10591 headed by X), Y is sequenced before B, and Y observes M. This provides a
10592 *happens-before* dependency between A and B. Rather than an explicit
10593 ``fence``, one (but not both) of the atomic operations X or Y might
10594 provide a ``release`` or ``acquire`` (resp.) ordering constraint and
10595 still *synchronize-with* the explicit ``fence`` and establish the
10596 *happens-before* edge.
10598 A ``fence`` which has ``seq_cst`` ordering, in addition to having both
10599 ``acquire`` and ``release`` semantics specified above, participates in
10600 the global program order of other ``seq_cst`` operations and/or fences.
10602 A ``fence`` instruction can also take an optional
10603 ":ref:`syncscope <syncscope>`" argument.
10608 .. code-block:: text
10610 fence acquire ; yields void
10611 fence syncscope("singlethread") seq_cst ; yields void
10612 fence syncscope("agent") seq_cst ; yields void
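
The synchronizes-with rule above can be sketched as a message-passing pair.
The globals ``@data`` and ``@ready`` and both function names are hypothetical.
The ``release`` fence plays the role of A, the atomic store to ``@ready`` is X,
the atomic load is Y, and the ``acquire`` fence is B.

.. code-block:: llvm

   @data = global i32 0
   @ready = global i32 0

   define void @producer() {
     store i32 42, ptr @data, align 4
     fence release                                      ; fence A
     store atomic i32 1, ptr @ready monotonic, align 4  ; operation X
     ret void
   }

   define i32 @consumer() {
   entry:
     br label %spin
   spin:
     %r = load atomic i32, ptr @ready monotonic, align 4  ; operation Y
     %is_set = icmp eq i32 %r, 1
     br i1 %is_set, label %done, label %spin
   done:
     fence acquire                                        ; fence B
     %v = load i32, ptr @data, align 4
     ret i32 %v
   }

Once the consumer observes the value written by X, fence A happens before
fence B, so the non-atomic load of ``@data`` reads 42.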
10616 '``cmpxchg``' Instruction
10617 ^^^^^^^^^^^^^^^^^^^^^^^^^
10624 cmpxchg [weak] [volatile] ptr <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields { ty, i1 }
10629 The '``cmpxchg``' instruction is used to atomically modify memory. It
10630 loads a value in memory and compares it to a given value. If they are
10631 equal, it tries to store a new value into the memory.
10636 There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
10638 address, and a new value to place at that address if the compared values
10639 are equal. The type of '<cmp>' must be an integer or pointer type whose
10640 bit width is a power of two greater than or equal to eight and less
10641 than or equal to a target-specific size limit. '<cmp>' and '<new>' must
10642 have the same type, and the type of '<pointer>' must be a pointer to
10643 that type. If the ``cmpxchg`` is marked as ``volatile``, then the
10644 optimizer is not allowed to modify the number or order of execution of
10645 this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
10647 The success and failure :ref:`ordering <ordering>` arguments specify how this
10648 ``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``; the failure ordering cannot be either
``release`` or ``acq_rel``.
10652 A ``cmpxchg`` instruction can also take an optional
10653 ":ref:`syncscope <syncscope>`" argument.
The alignment must be a power of two greater than or equal to the size of the
10658 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10659 always present. If unspecified, the alignment is assumed to be equal to the
10660 size of the '<value>' type. Note that this default alignment assumption is
10661 different from the alignment used for the load/store instructions when align
10664 The pointer passed into cmpxchg must have alignment greater than or
10665 equal to the size in memory of the operand.
10670 The contents of memory at the location specified by the '``<pointer>``' operand
10671 is read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
10672 written to the location. The original value at the location is returned,
10673 together with a flag indicating success (true) or failure (false).
10675 If the cmpxchg operation is marked as ``weak`` then a spurious failure is
10676 permitted: the operation may not write ``<new>`` even if the comparison
10679 If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
10680 if the value loaded equals ``cmp``.
10682 A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
10683 identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
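
As a sketch of the strong and weak forms, the two hypothetical functions below
treat an ``i32`` at ``%lock`` as a lock word (0 is unlocked, 1 is locked). The
names and the lock encoding are assumptions made for illustration.

.. code-block:: llvm

   ; Strong cmpxchg: the i1 result is true exactly when the old value was 0.
   define i1 @try_lock(ptr %lock) {
     %pair = cmpxchg ptr %lock, i32 0, i32 1 acquire monotonic, align 4
     %acquired = extractvalue { i32, i1 } %pair, 1
     ret i1 %acquired
   }

   ; Weak cmpxchg: it may fail spuriously, so it is retried in a loop.
   define void @lock(ptr %lock) {
   entry:
     br label %retry
   retry:
     %pair = cmpxchg weak ptr %lock, i32 0, i32 1 acquire monotonic, align 4
     %ok = extractvalue { i32, i1 } %pair, 1
     br i1 %ok, label %locked, label %retry
   locked:
     ret void
   }

The failure ordering is ``monotonic`` in both cases because a failed attempt
only performs a load.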
10689 .. code-block:: llvm
10692 %orig = load atomic i32, ptr %ptr unordered, align 4 ; yields i32
10696 %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
10697 %squared = mul i32 %cmp, %cmp
10698 %val_success = cmpxchg ptr %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 }
10699 %value_loaded = extractvalue { i32, i1 } %val_success, 0
10700 %success = extractvalue { i32, i1 } %val_success, 1
10701 br i1 %success, label %done, label %loop
10708 '``atomicrmw``' Instruction
10709 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
10716 atomicrmw [volatile] <operation> ptr <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>] ; yields ty
10721 The '``atomicrmw``' instruction is used to atomically modify memory.
10726 There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
10728 operation. The operation must be one of the following keywords:
10748 For most of these operations, the type of '<value>' must be an integer
10749 type whose bit width is a power of two greater than or equal to eight
10750 and less than or equal to a target-specific size limit. For xchg, this
10751 may also be a floating point or a pointer type with the same size constraints
10752 as integers. For fadd/fsub/fmax/fmin, this must be a floating point type. The
10753 type of the '``<pointer>``' operand must be a pointer to that type. If
10754 the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
10755 allowed to modify the number or order of execution of this
10756 ``atomicrmw`` with other :ref:`volatile operations <volatile>`.
The alignment must be a power of two greater than or equal to the size of the
10761 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10762 always present. If unspecified, the alignment is assumed to be equal to the
10763 size of the '<value>' type. Note that this default alignment assumption is
10764 different from the alignment used for the load/store instructions when align
An ``atomicrmw`` instruction can also take an optional
10768 ":ref:`syncscope <syncscope>`" argument.
10773 The contents of memory at the location specified by the '``<pointer>``'
10774 operand are atomically read, modified, and written back. The original
10775 value at the location is returned. The modification is specified by the
10776 operation argument:
10778 - xchg: ``*ptr = val``
10779 - add: ``*ptr = *ptr + val``
10780 - sub: ``*ptr = *ptr - val``
10781 - and: ``*ptr = *ptr & val``
10782 - nand: ``*ptr = ~(*ptr & val)``
10783 - or: ``*ptr = *ptr | val``
10784 - xor: ``*ptr = *ptr ^ val``
10785 - max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
10786 - min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
10787 - umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
10788 - umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
10789 - fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
10790 - fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
- fmax: ``*ptr = maxnum(*ptr, val)`` (match the ``llvm.maxnum.*`` intrinsic)
- fmin: ``*ptr = minnum(*ptr, val)`` (match the ``llvm.minnum.*`` intrinsic)
10793 - uinc_wrap: ``*ptr = (*ptr u>= val) ? 0 : (*ptr + 1)`` (increment value with wraparound to zero when incremented above input value)
10794 - udec_wrap: ``*ptr = ((*ptr == 0) || (*ptr u> val)) ? val : (*ptr - 1)`` (decrement with wraparound to input value when decremented below zero).
10800 .. code-block:: llvm
10802 %old = atomicrmw add ptr %ptr, i32 1 acquire ; yields i32
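
Two further operations from the list above are sketched below. The function
names and the ring-buffer interpretation are hypothetical.

.. code-block:: llvm

   ; Cursor over slots 0..7: returns the old slot and advances it,
   ; wrapping back to 0 once the stored value has reached 7.
   define i32 @next_slot(ptr %cursor) {
     %old = atomicrmw uinc_wrap ptr %cursor, i32 7 monotonic, align 4
     ret i32 %old
   }

   ; Atomically adds %x to a float accumulator and returns the previous sum.
   define float @accumulate(ptr %sum, float %x) {
     %old = atomicrmw fadd ptr %sum, float %x seq_cst, align 4
     ret float %old
   }

In both cases the result is the value that was in memory before the
modification, not the updated value.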
10804 .. _i_getelementptr:
10806 '``getelementptr``' Instruction
10807 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10814 <result> = getelementptr <ty>, ptr <ptrval>{, [inrange] <ty> <idx>}*
10815 <result> = getelementptr inbounds <ty>, ptr <ptrval>{, [inrange] <ty> <idx>}*
10816 <result> = getelementptr <ty>, <N x ptr> <ptrval>, [inrange] <vector index type> <idx>
10821 The '``getelementptr``' instruction is used to get the address of a
10822 subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
10823 address calculation only and does not access memory. The instruction can also
10824 be used to calculate a vector of such addresses.
10829 The first argument is always a type used as the basis for the calculations.
10830 The second argument is always a pointer or a vector of pointers, and is the
10831 base address to start from. The remaining arguments are indices
10832 that indicate which of the elements of the aggregate object are indexed.
10833 The interpretation of each index is dependent on the type being indexed
10834 into. The first index always indexes the pointer value given as the
10835 second argument, the second index indexes a value of the type pointed to
10836 (not necessarily the value directly pointed to, since the first index
10837 can be non-zero), etc. The first type indexed into must be a pointer
10838 value, subsequent types can be arrays, vectors, and structs. Note that
10839 subsequent types being indexed into can never be pointers, since that
10840 would require loading the pointer before continuing calculation.
10842 The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
10844 **constants** are allowed (when using a vector of indices they must all
10845 be the **same** ``i32`` integer constant). When indexing into an array,
10846 pointer or vector, integers of any width are allowed, and they are not
10847 required to be constant. These integers are treated as signed values
10850 For example, let's consider a C code fragment and how it gets compiled
10866 int *foo(struct ST *s) {
10867 return &s[1].Z.B[5][13];
10870 The LLVM code generated by Clang is approximately:
10872 .. code-block:: llvm
10874 %struct.RT = type { i8, [10 x [20 x i32]], i8 }
10875 %struct.ST = type { i32, double, %struct.RT }
10877 define ptr @foo(ptr %s) {
10879 %arrayidx = getelementptr inbounds %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13
10886 In the example above, the first index is indexing into the
10887 '``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
10888 = '``{ i32, double, %struct.RT }``' type, a structure. The second index
10889 indexes into the third element of the structure, yielding a
10890 '``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
10891 structure. The third index indexes into the second element of the
10892 structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
10893 dimensions of the array are subscripted into, yielding an '``i32``'
10894 type. The '``getelementptr``' instruction returns a pointer to this
10897 Note that it is perfectly legal to index partially through a structure,
10898 returning a pointer to an inner element. Because of this, the LLVM code
10899 for the given testcase is equivalent to:
10901 .. code-block:: llvm
10903 define ptr @foo(ptr %s) {
10904 %t1 = getelementptr %struct.ST, ptr %s, i32 1
10905 %t2 = getelementptr %struct.ST, ptr %t1, i32 0, i32 2
10906 %t3 = getelementptr %struct.RT, ptr %t2, i32 0, i32 1
10907 %t4 = getelementptr [10 x [20 x i32]], ptr %t3, i32 0, i32 5
10908 %t5 = getelementptr [20 x i32], ptr %t4, i32 0, i32 13
10912 If the ``inbounds`` keyword is present, the result value of the
10913 ``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
10914 following rules is violated:
10916 * The base pointer has an *in bounds* address of an allocated object, which
10917 means that it points into an allocated object, or to its end. The only
10918 *in bounds* address for a null pointer in the default address-space is the
10919 null pointer itself.
10920 * If the type of an index is larger than the pointer index type, the
10921 truncation to the pointer index type preserves the signed value.
10922 * The multiplication of an index by the type size does not wrap the pointer
10923 index type in a signed sense (``nsw``).
10924 * The successive addition of offsets (without adding the base address) does
10925 not wrap the pointer index type in a signed sense (``nsw``).
10926 * The successive addition of the current address, interpreted as an unsigned
10927 number, and an offset, interpreted as a signed number, does not wrap the
10928 unsigned address space and remains *in bounds* of the allocated object.
10929 As a corollary, if the added offset is non-negative, the addition does not
10930 wrap in an unsigned sense (``nuw``).
10931 * In cases where the base is a vector of pointers, the ``inbounds`` keyword
10932 applies to each of the computations element-wise.
10934 These rules are based on the assumption that no allocated object may cross
10935 the unsigned address space boundary, and no allocated object may be larger
10936 than half the pointer index type space.
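
A minimal sketch of these rules on a four-element stack array follows. The
function and value names are illustrative. The last instruction omits
``inbounds`` for contrast.

.. code-block:: llvm

   define void @bounds() {
     %buf = alloca [4 x i32], align 4
     ; Points at the last element: in bounds.
     %last = getelementptr inbounds [4 x i32], ptr %buf, i64 0, i64 3
     ; Points one past the end of the object: still in bounds.
     %end = getelementptr inbounds [4 x i32], ptr %buf, i64 0, i64 4
     ; Points beyond the end of the object: the result is a poison value.
     %past = getelementptr inbounds [4 x i32], ptr %buf, i64 0, i64 5
     ; Same offset without inbounds: not poison, but the result must not
     ; be used to access memory.
     %plain = getelementptr [4 x i32], ptr %buf, i64 0, i64 5
     ret void
   }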
10938 If the ``inbounds`` keyword is not present, the offsets are added to the
10939 base address with silently-wrapping two's complement arithmetic. If the
10940 offsets have a different width from the pointer's index type, they are
10941 sign-extended or truncated to the width of the pointer's index type. The result
10942 value of the ``getelementptr`` may be outside the object pointed to by the base
10943 pointer. The result value may not necessarily be used to access memory
10944 though, even if it happens to point into allocated storage. See the
10945 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
10948 If the ``inrange`` keyword is present before any index, loading from or
10949 storing to any pointer derived from the ``getelementptr`` has undefined
10950 behavior if the load or store would access memory outside of the bounds of
10951 the element selected by the index marked as ``inrange``. The result of a
10952 pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
10953 involving memory) involving a pointer derived from a ``getelementptr`` with
10954 the ``inrange`` keyword is undefined, with the exception of comparisons
10955 in the case where both operands are in the range of the element selected
10956 by the ``inrange`` keyword, inclusive of the address one past the end of
10957 that element. Note that the ``inrange`` keyword is currently only allowed
10958 in constant ``getelementptr`` expressions.
10960 The getelementptr instruction is often confusing. For some more insight
10961 into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
10966 .. code-block:: llvm
10968 %aptr = getelementptr {i32, [12 x i8]}, ptr %saptr, i64 0, i32 1
10969 %vptr = getelementptr {i32, <2 x i8>}, ptr %svptr, i64 0, i32 1, i32 1
10970 %eptr = getelementptr [12 x i8], ptr %aptr, i64 0, i32 1
10971 %iptr = getelementptr [10 x i32], ptr @arr, i16 0, i16 0
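
Because ``getelementptr`` only computes an address, the ``@foo`` example from
earlier in this section can also be written as a single byte offset. The
sketch below assumes a typical 64-bit data layout in which ``i32`` is 4-byte
aligned and ``double`` is 8-byte aligned, giving ``%struct.RT`` a size of 808
bytes and ``%struct.ST`` a size of 824 bytes. The function name
``@foo_bytes`` is hypothetical, and other data layouts produce a different
constant.

.. code-block:: llvm

   %struct.RT = type { i8, [10 x [20 x i32]], i8 }
   %struct.ST = type { i32, double, %struct.RT }

   define ptr @foo_bytes(ptr %s) {
     ;   1 * 824   index 1 over %struct.ST
     ; +      16   offset of field 2 (%struct.RT) within %struct.ST
     ; +       4   offset of field 1 (the array) within %struct.RT
     ; +  5 * 80   index 5 over [20 x i32]
     ; + 13 *  4   index 13 over i32
     ; =    1296
     %p = getelementptr inbounds i8, ptr %s, i64 1296
     ret ptr %p
   }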
10973 Vector of pointers:
10974 """""""""""""""""""
10976 The ``getelementptr`` returns a vector of pointers, instead of a single address,
10977 when one or more of its arguments is a vector. In such cases, all vector
10978 arguments should have the same number of elements, and every scalar argument
10979 will be effectively broadcast into a vector during address calculation.
10981 .. code-block:: llvm
10983 ; All arguments are vectors:
10984 ; A[i] = ptrs[i] + offsets[i]*sizeof(i8)
   %A = getelementptr i8, <4 x ptr> %ptrs, <4 x i64> %offsets
10987 ; Add the same scalar offset to each pointer of a vector:
10988 ; A[i] = ptrs[i] + offset*sizeof(i8)
10989 %A = getelementptr i8, <4 x ptr> %ptrs, i64 %offset
10991 ; Add distinct offsets to the same pointer:
10992 ; A[i] = ptr + offsets[i]*sizeof(i8)
10993 %A = getelementptr i8, ptr %ptr, <4 x i64> %offsets
10995 ; In all cases described above the type of the result is <4 x ptr>
10997 The two following instructions are equivalent:
10999 .. code-block:: llvm
11001 getelementptr %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
11002 <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
11003 <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
11005 <4 x i64> <i64 13, i64 13, i64 13, i64 13>
11007 getelementptr %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
11008 i32 2, i32 1, <4 x i32> %ind4, i64 13
11010 Let's look at the C code, where the vector version of ``getelementptr``
11015 // Let's assume that we vectorize the following loop:
11016 double *A, *B; int *C;
11017 for (int i = 0; i < size; ++i) {
11021 .. code-block:: llvm
11023 ; get pointers for 8 elements from array B
11024 %ptrs = getelementptr double, ptr %B, <8 x i32> %C
11025 ; load 8 elements from array B into A
11026 %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x ptr> %ptrs,
11027 i32 8, <8 x i1> %mask, <8 x double> %passthru)
11029 Conversion Operations
11030 ---------------------
11032 The instructions in this category are the conversion instructions
11033 (casting) which all take a single operand and a type. They perform
11034 various bit conversions on the operand.
11038 '``trunc .. to``' Instruction
11039 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11046 <result> = trunc <ty> <value> to <ty2> ; yields ty2
11051 The '``trunc``' instruction truncates its operand to the type ``ty2``.
11056 The '``trunc``' instruction takes a value to trunc, and a type to trunc
11057 it to. Both types must be of :ref:`integer <t_integer>` types, or vectors
11058 of the same number of integers. The bit size of the ``value`` must be
11059 larger than the bit size of the destination type, ``ty2``. Equal sized
11060 types are not allowed.
11065 The '``trunc``' instruction truncates the high order bits in ``value``
11066 and converts the remaining bits to ``ty2``. Since the source size must
11067 be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
11068 It will always truncate bits.
11073 .. code-block:: llvm
11075 %X = trunc i32 257 to i8 ; yields i8:1
11076 %Y = trunc i32 123 to i1 ; yields i1:true
11077 %Z = trunc i32 122 to i1 ; yields i1:false
11078 %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
11082 '``zext .. to``' Instruction
11083 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11090 <result> = zext <ty> <value> to <ty2> ; yields ty2
11095 The '``zext``' instruction zero extends its operand to type ``ty2``.
11100 The '``zext``' instruction takes a value to cast, and a type to cast it
11101 to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
11102 the same number of integers. The bit size of the ``value`` must be
11103 smaller than the bit size of the destination type, ``ty2``.
11108 The ``zext`` fills the high order bits of the ``value`` with zero bits
11109 until it reaches the size of the destination type, ``ty2``.
11111 When zero extending from i1, the result will always be either 0 or 1.
11116 .. code-block:: llvm
11118 %X = zext i32 257 to i64 ; yields i64:257
11119 %Y = zext i1 true to i32 ; yields i32:1
11120 %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
11124 '``sext .. to``' Instruction
11125 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11132 <result> = sext <ty> <value> to <ty2> ; yields ty2
11137 The '``sext``' sign extends ``value`` to the type ``ty2``.
11142 The '``sext``' instruction takes a value to cast, and a type to cast it
11143 to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
11144 the same number of integers. The bit size of the ``value`` must be
11145 smaller than the bit size of the destination type, ``ty2``.
11150 The '``sext``' instruction performs a sign extension by copying the sign
11151 bit (highest order bit) of the ``value`` until it reaches the bit size
11152 of the type ``ty2``.
11154 When sign extending from i1, the extension always results in -1 or 0.
11159 .. code-block:: llvm
   %X = sext i8 -1 to i16 ; yields i16:65535
11162 %Y = sext i1 true to i32 ; yields i32:-1
11163 %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
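
The difference between ``zext`` and ``sext`` is easiest to see when both are
applied to the same bit pattern. In this sketch (the function name is
illustrative) the ``i8`` constant ``-128`` has only its sign bit set.

.. code-block:: llvm

   define i32 @widen_diff() {
     %z = zext i8 -128 to i32   ; high bits filled with zeros: 0x00000080 (128)
     %s = sext i8 -128 to i32   ; high bits copy the sign bit: 0xFFFFFF80 (-128)
     %d = sub i32 %z, %s        ; 128 - (-128) = 256
     ret i32 %d
   }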
11165 '``fptrunc .. to``' Instruction
11166 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11173 <result> = fptrunc <ty> <value> to <ty2> ; yields ty2
11178 The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
11183 The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
11184 value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
11185 The size of ``value`` must be larger than the size of ``ty2``. This
11186 implies that ``fptrunc`` cannot be used to make a *no-op cast*.
11191 The '``fptrunc``' instruction casts a ``value`` from a larger
11192 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
11193 <t_floating>` type.
11194 This instruction is assumed to execute in the default :ref:`floating-point
11195 environment <floatenv>`.
11200 .. code-block:: llvm
11202 %X = fptrunc double 16777217.0 to float ; yields float:16777216.0
11203 %Y = fptrunc double 1.0E+300 to half ; yields half:+infinity
11205 '``fpext .. to``' Instruction
11206 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11213 <result> = fpext <ty> <value> to <ty2> ; yields ty2
11218 The '``fpext``' extends a floating-point ``value`` to a larger floating-point
11224 The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
11225 ``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
11226 to. The source type must be smaller than the destination type.
11231 The '``fpext``' instruction extends the ``value`` from a smaller
11232 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
11233 <t_floating>` type. The ``fpext`` cannot be used to make a
11234 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
11235 *no-op cast* for a floating-point cast.
11240 .. code-block:: llvm
11242 %X = fpext float 3.125 to double ; yields double:3.125000e+00
11243 %Y = fpext double %X to fp128 ; yields fp128:0xL00000000000000004000900000000000
11245 '``fptoui .. to``' Instruction
11246 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11253 <result> = fptoui <ty> <value> to <ty2> ; yields ty2
11258 The '``fptoui``' converts a floating-point ``value`` to its unsigned
11259 integer equivalent of type ``ty2``.
11264 The '``fptoui``' instruction takes a value to cast, which must be a
11265 scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
11273 The '``fptoui``' instruction converts its :ref:`floating-point
11274 <t_floating>` operand into the nearest (rounding towards zero)
11275 unsigned integer value. If the value cannot fit in ``ty2``, the result
11276 is a :ref:`poison value <poisonvalues>`.
11281 .. code-block:: llvm
11283 %X = fptoui double 123.0 to i32 ; yields i32:123
11284 %Y = fptoui float 1.0E+300 to i1 ; yields undefined:1
11285 %Z = fptoui float 1.04E+17 to i8 ; yields undefined:1
11287 '``fptosi .. to``' Instruction
11288 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11295 <result> = fptosi <ty> <value> to <ty2> ; yields ty2
11300 The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
11301 ``value`` to type ``ty2``.
11306 The '``fptosi``' instruction takes a value to cast, which must be a
11307 scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
11315 The '``fptosi``' instruction converts its :ref:`floating-point
11316 <t_floating>` operand into the nearest (rounding towards zero)
11317 signed integer value. If the value cannot fit in ``ty2``, the result
11318 is a :ref:`poison value <poisonvalues>`.
11323 .. code-block:: llvm
11325 %X = fptosi double -123.0 to i32 ; yields i32:-123
11326 %Y = fptosi float 1.0E-247 to i1 ; yields undefined:1
11327 %Z = fptosi float 1.04E+17 to i8 ; yields undefined:1
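
For in-range inputs, both ``fptoui`` and ``fptosi`` round towards zero. A
minimal sketch with a hypothetical function name:

.. code-block:: llvm

   define i32 @round_toward_zero() {
     %a = fptosi float -2.75 to i32   ; -2, not -3
     %b = fptoui float 2.75 to i32    ; 2, not 3
     %sum = add i32 %a, %b            ; -2 + 2 = 0
     ret i32 %sum
   }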
11329 '``uitofp .. to``' Instruction
11330 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11337 <result> = uitofp <ty> <value> to <ty2> ; yields ty2
11342 The '``uitofp``' instruction regards ``value`` as an unsigned integer
11343 and converts that value to the ``ty2`` type.
11348 The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
11357 The '``uitofp``' instruction interprets its operand as an unsigned
11358 integer quantity and converts it to the corresponding floating-point
11359 value. If the value cannot be exactly represented, it is rounded using
11360 the default rounding mode.
11366 .. code-block:: llvm
11368 %X = uitofp i32 257 to float ; yields float:257.0
11369 %Y = uitofp i8 -1 to double ; yields double:255.0
11371 '``sitofp .. to``' Instruction
11372 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11379 <result> = sitofp <ty> <value> to <ty2> ; yields ty2
11384 The '``sitofp``' instruction regards ``value`` as a signed integer and
11385 converts that value to the ``ty2`` type.
11390 The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
11399 The '``sitofp``' instruction interprets its operand as a signed integer
11400 quantity and converts it to the corresponding floating-point value. If the
11401 value cannot be exactly represented, it is rounded using the default rounding
11407 .. code-block:: llvm
11409 %X = sitofp i32 257 to float ; yields float:257.0
11410 %Y = sitofp i8 -1 to double ; yields double:-1.0
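
Both ``uitofp`` and ``sitofp`` round when the integer has more significant
bits than the destination format can hold. The sketch below assumes the
default rounding mode is round-to-nearest with ties to even. The function
names are illustrative.

.. code-block:: llvm

   define float @signed_inexact() {
     ; 16777217 is 2^24 + 1, which float cannot represent exactly.
     %f = sitofp i32 16777217 to float   ; 16777216.0
     ret float %f
   }

   define float @unsigned_inexact() {
     ; As an unsigned value, i32 -1 is 4294967295 (2^32 - 1).
     %f = uitofp i32 -1 to float         ; 4294967296.0
     ret float %f
   }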
11414 '``ptrtoint .. to``' Instruction
11415 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11422 <result> = ptrtoint <ty> <value> to <ty2> ; yields ty2
11427 The '``ptrtoint``' instruction converts the pointer or a vector of
11428 pointers ``value`` to the integer (or vector of integers) type ``ty2``.
11433 The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
11434 a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type or
a vector of integers.
11441 The '``ptrtoint``' instruction converts ``value`` to integer type
11442 ``ty2`` by interpreting the pointer value as an integer and either
11443 truncating or zero extending that value to the size of the integer type.
11444 If ``value`` is smaller than ``ty2`` then a zero extension is done. If
11445 ``value`` is larger than ``ty2`` then a truncation is done. If they are
11446 the same size, then nothing is done (*no-op cast*) other than a type
11452 .. code-block:: llvm
11454 %X = ptrtoint ptr %P to i8 ; yields truncation on 32-bit architecture
11455 %Y = ptrtoint ptr %P to i64 ; yields zero extension on 32-bit architecture
11456 %Z = ptrtoint <4 x ptr> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture
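As an illustration of a typical use, the sketch below computes the distance
in bytes between two pointers. It assumes pointers in the default address
space are no wider than 64 bits, so each conversion is either a no-op or a
zero extension; the function name ``@byte_distance`` is hypothetical.

.. code-block:: llvm

    define i64 @byte_distance(ptr %begin, ptr %end) {
      %b = ptrtoint ptr %begin to i64   ; address of %begin as an integer
      %e = ptrtoint ptr %end to i64     ; address of %end as an integer
      %d = sub i64 %e, %b               ; difference of the two addresses
      ret i64 %d
    }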
11460 '``inttoptr .. to``' Instruction
11461 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11468 <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>] ; yields ty2
11473 The '``inttoptr``' instruction converts an integer ``value`` to a
11474 pointer type, ``ty2``.
11479 The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.
11483 The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
11486 See ``dereferenceable`` metadata.
11488 The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
11491 See ``dereferenceable_or_null`` metadata.
11496 The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
11497 applying either a zero extension or a truncation depending on the size
11498 of the integer ``value``. If ``value`` is larger than the size of a
11499 pointer then a truncation is done. If ``value`` is smaller than the size
11500 of a pointer then a zero extension is done. If they are the same size,
11501 nothing is done (*no-op cast*).
11506 .. code-block:: llvm
11508 %X = inttoptr i32 255 to ptr ; yields zero extension on 64-bit architecture
11509 %Y = inttoptr i32 255 to ptr ; yields no-op on 32-bit architecture
11510 %Z = inttoptr i64 0 to ptr ; yields truncation on 32-bit architecture
%W = inttoptr <4 x i32> %G to <4 x ptr> ; yields truncation of vector G to four pointers
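The optional metadata can be attached as in the following sketch. The
function name ``@load_from_address`` and the choice of 4 dereferenceable
bytes are assumptions made for illustration only.

.. code-block:: llvm

    define i32 @load_from_address(i64 %addr) {
      ; !0 states that the resulting pointer is dereferenceable for 4 bytes.
      %p = inttoptr i64 %addr to ptr, !dereferenceable !0
      %v = load i32, ptr %p
      ret i32 %v
    }

    ; Metadata node with a single i64 entry, as required above.
    !0 = !{i64 4}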
11515 '``bitcast .. to``' Instruction
11516 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11523 <result> = bitcast <ty> <value> to <ty2> ; yields ty2
The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.
11534 The '``bitcast``' instruction takes a value to cast, which must be a
11535 non-aggregate first class value, and a type to cast it to, which must
11536 also be a non-aggregate :ref:`first class <t_firstclass>` type. The
11537 bit sizes of ``value`` and the destination type, ``ty2``, must be
11538 identical. If the source type is a pointer, the destination type must
11539 also be a pointer of the same size. This instruction supports bitwise
11540 conversion of vectors to integers and to vectors of other types (as
11541 long as they have the same size).
11546 The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
11547 is always a *no-op cast* because no bits change with this
11548 conversion. The conversion is done as if the ``value`` had been stored
11549 to memory and read back as type ``ty2``. Pointer (or vector of
11550 pointers) types may only be converted to other pointer (or vector of
11551 pointers) types with the same address space through this instruction.
11552 To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
11553 or :ref:`ptrtoint <i_ptrtoint>` instructions first.
11555 There is a caveat for bitcasts involving vector types in relation to
endianness. For example, ``bitcast <2 x i8> <value> to i16`` puts element zero
11557 of the vector in the least significant bits of the i16 for little-endian while
11558 element zero ends up in the most significant bits for big-endian.
11563 .. code-block:: text
11565 %X = bitcast i8 255 to i8 ; yields i8 :-1
11566 %Y = bitcast i32* %x to i16* ; yields i16*:%x
%Z = bitcast <2 x i32> %V to i64 ; yields i64: %V (depends on endianness)
%W = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>
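The endianness caveat can be made concrete with constant operands. The
sketch below follows the rule stated above; the function name
``@pack_bytes`` is hypothetical.

.. code-block:: llvm

    define i16 @pack_bytes() {
      ; Element 0 is 0x01 and element 1 is 0x02.
      %r = bitcast <2 x i8> <i8 1, i8 2> to i16
      ; little-endian: element 0 is in the low byte, giving 0x0201 (513)
      ; big-endian:    element 0 is in the high byte, giving 0x0102 (258)
      ret i16 %r
    }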
11570 .. _i_addrspacecast:
11572 '``addrspacecast .. to``' Instruction
11573 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11580 <result> = addrspacecast <pty> <ptrval> to <pty2> ; yields pty2
11585 The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
11586 address space ``n`` to type ``pty2`` in address space ``m``.
11591 The '``addrspacecast``' instruction takes a pointer or vector of pointer value
to cast and a pointer type to cast it to, which must have a different
address space.
11598 The '``addrspacecast``' instruction converts the pointer value
11599 ``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
11600 value modification, depending on the target and the address space
11601 pair. Pointer conversions within the same address space must be
11602 performed with the ``bitcast`` instruction. Note that if the address
11603 space conversion produces a dereferenceable result then both result
11604 and operand refer to the same memory location. The conversion must
11605 have no side effects, and must not capture the value of the pointer.
11607 If the source is :ref:`poison <poisonvalues>`, the result is
11608 :ref:`poison <poisonvalues>`.
11610 If the source is not :ref:`poison <poisonvalues>`, and both source and
11611 destination are :ref:`integral pointers <nointptrtype>`, and the
11612 result pointer is dereferenceable, the cast is assumed to be
11613 reversible (i.e. casting the result back to the original address space
11614 should yield the original bit pattern).
11619 .. code-block:: llvm
11621 %X = addrspacecast ptr %x to ptr addrspace(1)
11622 %Y = addrspacecast ptr addrspace(1) %y to ptr addrspace(2)
11623 %Z = addrspacecast <4 x ptr> %z to <4 x ptr addrspace(3)>
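The reversibility assumption can be sketched as a round trip through another
address space. The example assumes that address spaces 0 and 1 both use
integral pointers and that the intermediate pointer is dereferenceable; the
function name ``@round_trip`` is hypothetical.

.. code-block:: llvm

    define ptr @round_trip(ptr %p) {
      %q = addrspacecast ptr %p to ptr addrspace(1)
      ; Under the assumptions above, %r has the same bit pattern as %p.
      %r = addrspacecast ptr addrspace(1) %q to ptr
      ret ptr %r
    }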
11630 The instructions in this category are the "miscellaneous" instructions,
11631 which defy better classification.
11635 '``icmp``' Instruction
11636 ^^^^^^^^^^^^^^^^^^^^^^
11643 <result> = icmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
11648 The '``icmp``' instruction returns a boolean value or a vector of
11649 boolean values based on comparison of its two integer, integer vector,
11650 pointer, or pointer vector operands.
11655 The '``icmp``' instruction takes three operands. The first operand is
11656 the condition code indicating the kind of comparison to perform. It is
11657 not a value, just a keyword. The possible condition codes are:
#. ``eq``: equal
#. ``ne``: not equal
11663 #. ``ugt``: unsigned greater than
11664 #. ``uge``: unsigned greater or equal
11665 #. ``ult``: unsigned less than
11666 #. ``ule``: unsigned less or equal
11667 #. ``sgt``: signed greater than
11668 #. ``sge``: signed greater or equal
11669 #. ``slt``: signed less than
11670 #. ``sle``: signed less or equal
11672 The remaining two arguments must be :ref:`integer <t_integer>` or
11673 :ref:`pointer <t_pointer>` or integer :ref:`vector <t_vector>` typed. They
11674 must also be identical types.
The '``icmp``' instruction compares ``op1`` and ``op2`` according to the condition
11680 code given as ``cond``. The comparison performed always yields either an
11681 :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
11683 .. _icmp_md_cc_sem:
11685 #. ``eq``: yields ``true`` if the operands are equal, ``false``
11686 otherwise. No sign interpretation is necessary or performed.
11687 #. ``ne``: yields ``true`` if the operands are unequal, ``false``
11688 otherwise. No sign interpretation is necessary or performed.
11689 #. ``ugt``: interprets the operands as unsigned values and yields
11690 ``true`` if ``op1`` is greater than ``op2``.
11691 #. ``uge``: interprets the operands as unsigned values and yields
11692 ``true`` if ``op1`` is greater than or equal to ``op2``.
11693 #. ``ult``: interprets the operands as unsigned values and yields
11694 ``true`` if ``op1`` is less than ``op2``.
11695 #. ``ule``: interprets the operands as unsigned values and yields
11696 ``true`` if ``op1`` is less than or equal to ``op2``.
11697 #. ``sgt``: interprets the operands as signed values and yields ``true``
11698 if ``op1`` is greater than ``op2``.
11699 #. ``sge``: interprets the operands as signed values and yields ``true``
11700 if ``op1`` is greater than or equal to ``op2``.
11701 #. ``slt``: interprets the operands as signed values and yields ``true``
11702 if ``op1`` is less than ``op2``.
11703 #. ``sle``: interprets the operands as signed values and yields ``true``
11704 if ``op1`` is less than or equal to ``op2``.
11706 If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
11707 are compared as if they were integers.
11709 If the operands are integer vectors, then they are compared element by
11710 element. The result is an ``i1`` vector with the same number of elements
11711 as the values being compared. Otherwise, the result is an ``i1``.
11716 .. code-block:: text
11718 <result> = icmp eq i32 4, 5 ; yields: result=false
11719 <result> = icmp ne ptr %X, %X ; yields: result=false
11720 <result> = icmp ult i16 4, 5 ; yields: result=true
11721 <result> = icmp sgt i16 4, 5 ; yields: result=false
11722 <result> = icmp ule i16 -4, 5 ; yields: result=false
11723 <result> = icmp sge i16 4, 5 ; yields: result=false
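The following sketch shows a vector comparison and the effect of sign
interpretation on identical bit patterns. The function names are
hypothetical.

.. code-block:: llvm

    define <4 x i1> @less_than(<4 x i32> %a, <4 x i32> %b) {
      ; Element-wise signed comparison; the result has one i1 per element.
      %m = icmp slt <4 x i32> %a, %b
      ret <4 x i1> %m
    }

    define i1 @same_bits_different_answer() {
      %u = icmp ult i8 -1, 1   ; false: -1 is 255 when read as unsigned
      %s = icmp slt i8 -1, 1   ; true: -1 is less than 1 when read as signed
      %r = xor i1 %u, %s       ; true, since the two results differ
      ret i1 %r
    }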
11727 '``fcmp``' Instruction
11728 ^^^^^^^^^^^^^^^^^^^^^^
11735 <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
11740 The '``fcmp``' instruction returns a boolean value or vector of boolean
11741 values based on comparison of its operands.
11743 If the operands are floating-point scalars, then the result type is a
11744 boolean (:ref:`i1 <t_integer>`).
11746 If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.
11753 The '``fcmp``' instruction takes three operands. The first operand is
11754 the condition code indicating the kind of comparison to perform. It is
11755 not a value, just a keyword. The possible condition codes are:
11757 #. ``false``: no comparison, always returns false
11758 #. ``oeq``: ordered and equal
11759 #. ``ogt``: ordered and greater than
11760 #. ``oge``: ordered and greater than or equal
11761 #. ``olt``: ordered and less than
11762 #. ``ole``: ordered and less than or equal
11763 #. ``one``: ordered and not equal
11764 #. ``ord``: ordered (no nans)
11765 #. ``ueq``: unordered or equal
11766 #. ``ugt``: unordered or greater than
11767 #. ``uge``: unordered or greater than or equal
11768 #. ``ult``: unordered or less than
11769 #. ``ule``: unordered or less than or equal
11770 #. ``une``: unordered or not equal
11771 #. ``uno``: unordered (either nans)
11772 #. ``true``: no comparison, always returns true
11774 *Ordered* means that neither operand is a QNAN while *unordered* means
11775 that either operand may be a QNAN.
Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
11778 <t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
11779 They must have identical types.
11784 The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
11785 condition code given as ``cond``. If the operands are vectors, then the
11786 vectors are compared element by element. Each comparison performed
11787 always yields an :ref:`i1 <t_integer>` result, as follows:
11789 #. ``false``: always yields ``false``, regardless of operands.
11790 #. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
11791 is equal to ``op2``.
11792 #. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
11793 is greater than ``op2``.
11794 #. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
11795 is greater than or equal to ``op2``.
11796 #. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
11797 is less than ``op2``.
11798 #. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
11799 is less than or equal to ``op2``.
11800 #. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
11801 is not equal to ``op2``.
11802 #. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
equal to ``op2``.
11805 #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
11806 greater than ``op2``.
11807 #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
11808 greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
less than ``op2``.
11811 #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
11812 less than or equal to ``op2``.
11813 #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
11814 not equal to ``op2``.
11815 #. ``uno``: yields ``true`` if either operand is a QNAN.
11816 #. ``true``: always yields ``true``, regardless of operands.
11818 The ``fcmp`` instruction can also optionally take any number of
11819 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11820 otherwise unsafe floating-point optimizations.
Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
11823 only flags that have any effect on its semantics are those that allow
11824 assumptions to be made about the values of input arguments; namely
11825 ``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
11830 .. code-block:: text
11832 <result> = fcmp oeq float 4.0, 5.0 ; yields: result=false
11833 <result> = fcmp one float 4.0, 5.0 ; yields: result=true
11834 <result> = fcmp olt float 4.0, 5.0 ; yields: result=true
11835 <result> = fcmp ueq double 1.0, 2.0 ; yields: result=false
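The difference between ordered and unordered predicates only shows up when
an operand is a NaN. The sketch below uses ``0x7FF8000000000000``, the bit
pattern of a quiet NaN for ``double``; the function names are hypothetical.

.. code-block:: llvm

    define i1 @is_nan(double %x) {
      ; A value compared with itself is unordered only if it is a NaN.
      %r = fcmp uno double %x, %x
      ret i1 %r
    }

    define i1 @nan_compare() {
      %o = fcmp oeq double 0x7FF8000000000000, 1.0  ; false: an operand is a NaN
      %u = fcmp ueq double 0x7FF8000000000000, 1.0  ; true: an operand is a NaN
      %r = xor i1 %o, %u                            ; true, the results differ
      ret i1 %r
    }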
11839 '``phi``' Instruction
11840 ^^^^^^^^^^^^^^^^^^^^^
11847 <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
11852 The '``phi``' instruction is used to implement the φ node in the SSA
11853 graph representing the function.
11858 The type of the incoming values is specified with the first type field.
11859 After this, the '``phi``' instruction takes a list of pairs as
11860 arguments, with one pair for each predecessor basic block of the current
11861 block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.
11865 There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e., PHI instructions must be first in a basic
block.
11869 For the purposes of the SSA form, the use of each incoming value is
11870 deemed to occur on the edge from the corresponding predecessor block to
11871 the current block (but after any definition of an '``invoke``'
11872 instruction's return value on the same edge).
11874 The optional ``fast-math-flags`` marker indicates that the phi has one
11875 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
11876 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
11877 are only valid for phis that return a floating-point scalar or vector
type, or an array (nested to any depth) of floating-point scalar or vector
types.
11884 At runtime, the '``phi``' instruction logically takes on the value
11885 specified by the pair corresponding to the predecessor basic block that
11886 executed just prior to the current block.
11891 .. code-block:: llvm
11893 Loop: ; Infinite loop that counts from 0 on up...
11894 %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
11895 %nextindvar = add i32 %indvar, 1
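For context, a complete function with a terminating loop might look like the
following sketch. It has one incoming pair per predecessor of ``%loop``, and
all PHI instructions come first in their block. The function name
``@sum_below`` and the block names are hypothetical.

.. code-block:: llvm

    define i32 @sum_below(i32 %n) {
    entry:
      br label %loop

    loop:
      ; Values arriving from %entry start the loop; values from %body continue it.
      %i = phi i32 [ 0, %entry ], [ %i.next, %body ]
      %acc = phi i32 [ 0, %entry ], [ %acc.next, %body ]
      %more = icmp slt i32 %i, %n
      br i1 %more, label %body, label %exit

    body:
      %acc.next = add i32 %acc, %i
      %i.next = add i32 %i, 1
      br label %loop

    exit:
      ret i32 %acc
    }

Called with ``%n`` equal to 4, this function adds 0, 1, 2 and 3.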
11900 '``select``' Instruction
11901 ^^^^^^^^^^^^^^^^^^^^^^^^
11908 <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty
11910 selty is either i1 or {<N x i1>}
11915 The '``select``' instruction is used to choose one value based on a
11916 condition, without IR-level branching.
11921 The '``select``' instruction requires an 'i1' value or a vector of 'i1'
11922 values indicating the condition, and two values of the same :ref:`first
11923 class <t_firstclass>` type.
11925 #. The optional ``fast-math flags`` marker indicates that the select has one or more
11926 :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
11927 otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11928 for selects that return a floating-point scalar or vector type, or an array
11929 (nested to any depth) of floating-point scalar or vector types.
11934 If the condition is an i1 and it evaluates to 1, the instruction returns
the first value argument; otherwise, it returns the second value
argument.
11938 If the condition is a vector of i1, then the value arguments must be
11939 vectors of the same size, and the selection is done element by element.
11941 If the condition is an i1 and the value arguments are vectors of the
11942 same size, then an entire vector is selected.
11947 .. code-block:: llvm
11949 %X = select i1 true, i8 17, i8 42 ; yields i8:17
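Two further sketches: a signed maximum built from ``icmp`` and ``select``,
and an element-by-element selection with a vector condition. The function
names are hypothetical.

.. code-block:: llvm

    define i32 @smax(i32 %a, i32 %b) {
      %gt = icmp sgt i32 %a, %b
      %r = select i1 %gt, i32 %a, i32 %b   ; %a if %a > %b, otherwise %b
      ret i32 %r
    }

    define <2 x i32> @blend(<2 x i32> %a, <2 x i32> %b) {
      ; Element 0 is taken from %a, element 1 from %b.
      %r = select <2 x i1> <i1 true, i1 false>, <2 x i32> %a, <2 x i32> %b
      ret <2 x i32> %r
    }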
11954 '``freeze``' Instruction
11955 ^^^^^^^^^^^^^^^^^^^^^^^^
11962 <result> = freeze ty <val> ; yields ty:result
11967 The '``freeze``' instruction is used to stop propagation of
11968 :ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
11973 The '``freeze``' instruction takes a single argument.
11978 If the argument is ``undef`` or ``poison``, '``freeze``' returns an
11979 arbitrary, but fixed, value of type '``ty``'.
11980 Otherwise, this instruction is a no-op and returns the input argument.
11981 All uses of a value returned by the same '``freeze``' instruction are
11982 guaranteed to always observe the same value, while different '``freeze``'
11983 instructions may yield different values.
11985 While ``undef`` and ``poison`` pointers can be frozen, the result is a
11986 non-dereferenceable pointer. See the
11987 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
11988 If an aggregate value or vector is frozen, the operand is frozen element-wise.
11989 The padding of an aggregate isn't considered, since it isn't visible
11990 without storing it into memory and loading it with a different type.
11996 .. code-block:: text
%w = i32 undef
%x = freeze i32 %w
%y = add i32 %w, %w ; undef
%z = add i32 %x, %x ; even number because all uses of %x observe
; the same value
12003 %x2 = freeze i32 %w
12004 %cmp = icmp eq i32 %x, %x2 ; can be true or false
12006 ; example with vectors
12007 %v = <2 x i32> <i32 undef, i32 poison>
12008 %a = extractelement <2 x i32> %v, i32 0 ; undef
12009 %b = extractelement <2 x i32> %v, i32 1 ; poison
12010 %add = add i32 %a, %a ; undef
12012 %v.fr = freeze <2 x i32> %v ; element-wise freeze
12013 %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
12014 %add.f = add i32 %d, %d ; even number
12016 ; branching on frozen value
12017 %poison = add nsw i1 %k, undef ; poison
12018 %c = freeze i1 %poison
12019 br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar
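A common use is to freeze a value before it reaches a conditional branch,
so that the branch no longer depends on ``undef`` or ``poison``. The sketch
below is illustrative only; the function name ``@pick`` and the block names
are hypothetical.

.. code-block:: llvm

    define i32 @pick(i32 %x, i32 %a, i32 %b) {
    entry:
      ; If %x is undef or poison, %x.fr is some arbitrary but fixed i32.
      %x.fr = freeze i32 %x
      %c = icmp sgt i32 %x.fr, 0
      br i1 %c, label %then, label %else

    then:
      ret i32 %a

    else:
      ret i32 %b
    }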
12024 '``call``' Instruction
12025 ^^^^^^^^^^^^^^^^^^^^^^
12032 <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
12033 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
12038 The '``call``' instruction represents a simple function call.
12043 This instruction requires several arguments:
12045 #. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
12046 should perform tail call optimization. The ``tail`` marker is a hint that
12047 `can be ignored <CodeGenerator.html#tail-call-optimization>`_. The
12048 ``musttail`` marker means that the call must be tail call optimized in order
12049 for the program to be correct. This is true even in the presence of
attributes like "disable-tail-calls". The ``musttail`` marker provides these
guarantees:
12053 #. The call will not cause unbounded stack growth if it is part of a
12054 recursive cycle in the call graph.
12055 #. Arguments with the :ref:`inalloca <attr_inalloca>` or
12056 :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
12057 #. If the musttail call appears in a function with the ``"thunk"`` attribute
and the caller and callee both have varargs, then any unprototyped
12059 arguments in register or memory are forwarded to the callee. Similarly,
12060 the return value of the callee is returned to the caller's caller, even
12061 if a void return type is in use.
12063 Both markers imply that the callee does not access allocas from the caller.
12064 The ``tail`` marker additionally implies that the callee does not access
varargs from the caller. Calls marked ``musttail`` must obey the following
additional rules:
12068 - The call must immediately precede a :ref:`ret <i_ret>` instruction,
12069 or a pointer bitcast followed by a ret instruction.
12070 - The ret instruction must return the (possibly bitcasted) value
12071 produced by the call, undef, or void.
12072 - The calling conventions of the caller and callee must match.
12073 - The callee must be varargs iff the caller is varargs. Bitcasting a
12074 non-varargs function to the appropriate varargs type is legal so
12075 long as the non-varargs prefixes obey the other rules.
12076 - The return type must not undergo automatic conversion to an `sret` pointer.
12078 In addition, if the calling convention is not `swifttailcc` or `tailcc`:
12080 - All ABI-impacting function attributes, such as sret, byval, inreg,
12081 returned, and inalloca, must match.
12082 - The caller and callee prototypes must match. Pointer types of parameters
12083 or return types may differ in pointee type, but not in address space.
On the other hand, if the calling convention is `swifttailcc` or `tailcc`:
- Only these ABI-impacting attributes are allowed: sret, byval,
12088 swiftself, and swiftasync.
12089 - Prototypes are not required to match.
12091 Tail call optimization for calls marked ``tail`` is guaranteed to occur if
12092 the following conditions are met:
12094 - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
12095 - The call is in tail position (ret immediately follows call and ret
12096 uses value of call or is void).
12097 - Option ``-tailcallopt`` is enabled,
``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
is ``tailcc``.
12100 - `Platform-specific constraints are
12101 met. <CodeGenerator.html#tailcallopt>`_
12103 #. The optional ``notail`` marker indicates that the optimizers should not add
12104 ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
12105 call optimization from being performed on the call.
12107 #. The optional ``fast-math flags`` marker indicates that the call has one or more
12108 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
12109 otherwise unsafe floating-point optimizations. Fast-math flags are only valid
12110 for calls that return a floating-point scalar or vector type, or an array
12111 (nested to any depth) of floating-point scalar or vector types.
12113 #. The optional "cconv" marker indicates which :ref:`calling
12114 convention <callingconv>` the call should use. If none is
12115 specified, the call defaults to using C calling conventions. The
12116 calling convention of the call must match the calling convention of
12117 the target function, or else the behavior is undefined.
12118 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
are valid here.
12121 #. The optional addrspace attribute can be used to indicate the address space
12122 of the called function. If it is not specified, the program address space
12123 from the :ref:`datalayout string<langref_datalayout>` will be used.
12124 #. '``ty``': the type of the call instruction itself which is also the
type of the return value. Functions that return no value are marked
``void``.
12127 #. '``fnty``': shall be the signature of the function being called. The
12128 argument types must match the types implied by this signature. This
12129 type can be omitted if the function is not varargs.
12130 #. '``fnptrval``': An LLVM value containing a pointer to a function to
12131 be called. In most cases, this is a direct function call, but
indirect ``call`` instructions are just as possible, calling an arbitrary pointer
to function value.
12134 #. '``function args``': argument list whose types match the function
12135 signature argument types and parameter attributes. All arguments must
12136 be of :ref:`first class <t_firstclass>` type. If the function signature
12137 indicates the function accepts a variable number of arguments, the
12138 extra arguments can be specified.
12139 #. The optional :ref:`function attributes <fnattrs>` list.
12140 #. The optional :ref:`operand bundles <opbundles>` list.
12145 The '``call``' instruction is used to cause control flow to transfer to
12146 a specified function, with its incoming arguments bound to the specified
12147 values. Upon a '``ret``' instruction in the called function, control
12148 flow continues with the instruction after the function call, and the
12149 return value of the function is bound to the result argument.
12154 .. code-block:: llvm
12156 %retval = call i32 @test(i32 %argc)
12157 call i32 (ptr, ...) @printf(ptr %msg, i32 12, i8 42) ; yields i32
12158 %X = tail call i32 @foo() ; yields i32
12159 %Y = tail call fastcc i32 @foo() ; yields i32
12160 call void %foo(i8 signext 97)
12162 %struct.A = type { i32, i8 }
12163 %r = call %struct.A @foo() ; yields { i32, i8 }
12164 %gr = extractvalue %struct.A %r, 0 ; yields i32
12165 %gr1 = extractvalue %struct.A %r, 1 ; yields i8
call void @foo() noreturn ; indicates that @foo never returns normally
%ZZ = call zeroext i32 @bar() ; return value is zero extended
LLVM treats calls to some functions with names and arguments that match
12170 the standard C99 library as being the C99 library functions, and may
12171 perform optimizations or generate code for them under that assumption.
12172 This is something we'd like to change in the future to provide better
12173 support for freestanding environments and non-C-based languages.
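The ``musttail`` rules listed earlier can be illustrated with a small
forwarding function, together with an indirect call through a function
pointer. This is a sketch only; ``@callee``, ``@forward`` and ``@apply`` are
hypothetical names, and both functions use the default C calling convention.

.. code-block:: llvm

    declare i32 @callee(i32)

    define i32 @forward(i32 %x) {
      ; The call immediately precedes the ret, the ret returns the call's
      ; value, and caller and callee have matching prototypes and conventions.
      %r = musttail call i32 @callee(i32 %x)
      ret i32 %r
    }

    define i32 @apply(ptr %fn, i32 %x) {
      ; Indirect call: %fn is an arbitrary pointer to a function.
      %r = call i32 %fn(i32 %x)
      ret i32 %r
    }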
12177 '``va_arg``' Instruction
12178 ^^^^^^^^^^^^^^^^^^^^^^^^
12185 <resultval> = va_arg <va_list*> <arglist>, <argty>
12190 The '``va_arg``' instruction is used to access arguments passed through
12191 the "variable argument" area of a function call. It is used to implement
12192 the ``va_arg`` macro in C.
12197 This instruction takes a ``va_list*`` value and the type of the
12198 argument. It returns a value of the specified argument type and
12199 increments the ``va_list`` to point to the next argument. The actual
12200 type of ``va_list`` is target specific.
12205 The '``va_arg``' instruction loads an argument of the specified type
12206 from the specified ``va_list`` and causes the ``va_list`` to point to
12207 the next argument. For more information, see the variable argument
12208 handling :ref:`Intrinsic Functions <int_varargs>`.
12210 It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.
12214 ``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
12215 function <intrinsics>` because it takes a type as an argument.
12220 See the :ref:`variable argument processing <int_varargs>` section.
12222 Note that the code generator does not yet fully support va\_arg on many
12223 targets. Also, it does not currently support va\_arg with aggregate
12224 types on any target.
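A minimal sketch of reading one variadic argument follows. It assumes a
target on which ``va_list`` is a single pointer, which is not true
everywhere, and it assumes the intrinsics are declared as
``@llvm.va_start(ptr)`` and ``@llvm.va_end(ptr)``; the exact spelling of
these declarations may differ between LLVM releases. The function name
``@first_vararg`` is hypothetical.

.. code-block:: llvm

    declare void @llvm.va_start(ptr)
    declare void @llvm.va_end(ptr)

    define i32 @first_vararg(i32 %count, ...) {
      ; Assumption: va_list fits in one pointer on this target.
      %ap = alloca ptr
      call void @llvm.va_start(ptr %ap)
      ; Load one i32 argument and advance the va_list.
      %v = va_arg ptr %ap, i32
      call void @llvm.va_end(ptr %ap)
      ret i32 %v
    }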
12228 '``landingpad``' Instruction
12229 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12236 <resultval> = landingpad <resultty> <clause>+
12237 <resultval> = landingpad <resultty> cleanup <clause>*
12239 <clause> := catch <type> <value>
12240 <clause> := filter <array constant type> <array constant>
12245 The '``landingpad``' instruction is used by `LLVM's exception handling
12246 system <ExceptionHandling.html#overview>`_ to specify that a basic block
12247 is a landing pad --- one where the exception lands, and corresponds to the
12248 code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
12249 defines values supplied by the :ref:`personality function <personalityfn>` upon
12250 re-entry to the function. The ``resultval`` has the type ``resultty``.
The optional ``cleanup`` flag indicates that the landing pad block is a cleanup.
12258 A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
12259 contains the global variable representing the "type" that may be caught
12260 or filtered respectively. Unlike the ``catch`` clause, the ``filter``
12261 clause takes an array constant as its argument. Use
12262 "``[0 x ptr] undef``" for a filter which cannot throw. The
12263 '``landingpad``' instruction must contain *at least* one ``clause`` or
12264 the ``cleanup`` flag.
12269 The '``landingpad``' instruction defines the values which are set by the
12270 :ref:`personality function <personalityfn>` upon re-entry to the function, and
12271 therefore the "result type" of the ``landingpad`` instruction. As with
12272 calling conventions, how the personality function results are
12273 represented in LLVM IR is target specific.
12275 The clauses are applied in order from top to bottom. If two
12276 ``landingpad`` instructions are merged together through inlining, the
12277 clauses from the calling function are appended to the list of clauses.
12278 When the call stack is being unwound due to an exception being thrown,
12279 the exception is compared against each ``clause`` in turn. If it doesn't
12280 match any of the clauses, and the ``cleanup`` flag is not set, then
12281 unwinding continues further up the call stack.
12283 The ``landingpad`` instruction has several restrictions:
12285 - A landing pad block is a basic block which is the unwind destination
12286 of an '``invoke``' instruction.
12287 - A landing pad block must have a '``landingpad``' instruction as its
12288 first non-PHI instruction.
- There can be only one '``landingpad``' instruction within the landing
pad block.
12291 - A basic block that is not a landing pad block may not include a
12292 '``landingpad``' instruction.
12297 .. code-block:: llvm
;; A landing pad which can catch an integer.
%res = landingpad { ptr, i32 }
catch ptr @_ZTIi
;; A landing pad that is a cleanup.
%res = landingpad { ptr, i32 }
cleanup
;; A landing pad which can catch an integer and can only throw a double.
%res = landingpad { ptr, i32 }
catch ptr @_ZTIi
filter [1 x ptr] [ptr @_ZTId]
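To show how the restrictions above fit together, the following sketch places
a cleanup landing pad at the unwind destination of an ``invoke``. The
personality function ``@__gxx_personality_v0`` is the one commonly used for
Itanium-style C++ exceptions; ``@may_throw`` and ``@guarded`` are
hypothetical names.

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__gxx_personality_v0(...)

    define void @guarded() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %ok unwind label %lpad

    ok:
      ret void

    lpad:
      ; First non-PHI instruction of the landing pad block.
      %res = landingpad { ptr, i32 }
              cleanup
      ; Cleanup code would run here; then unwinding continues.
      resume { ptr, i32 } %res
    }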
12312 '``catchpad``' Instruction
12313 ^^^^^^^^^^^^^^^^^^^^^^^^^^
12320 <resultval> = catchpad within <catchswitch> [<args>*]
12325 The '``catchpad``' instruction is used by `LLVM's exception handling
12326 system <ExceptionHandling.html#overview>`_ to specify that a basic block
12327 begins a catch handler --- one where a personality routine attempts to transfer
12328 control to catch an exception.
12333 The ``catchswitch`` operand must always be a token produced by a
12334 :ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
12335 ensures that each ``catchpad`` has exactly one predecessor block, and it always
12336 terminates in a ``catchswitch``.
12338 The ``args`` correspond to whatever information the personality routine
12339 requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.
12343 The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.
12350 When the call stack is being unwound due to an exception being thrown, the
12351 exception is compared against the ``args``. If it doesn't match, control will
12352 not reach the ``catchpad`` instruction. The representation of ``args`` is
12353 entirely target and personality function-specific.
12355 Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
12356 instruction must be the first non-phi of its parent basic block.
12358 The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
12359 instructions is described in the
12360 `Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
12362 When a ``catchpad`` has been "entered" but not yet "exited" (as
12363 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
12364 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
12365 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
12370 .. code-block:: text
12373 %cs = catchswitch within none [label %handler0] unwind to caller
12374 ;; A catch block which can catch an integer.
12376 %tok = catchpad within %cs [ptr @_ZTIi]
12380 '``cleanuppad``' Instruction
12381 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12388 <resultval> = cleanuppad within <parent> [<args>*]
12393 The '``cleanuppad``' instruction is used by `LLVM's exception handling
12394 system <ExceptionHandling.html#overview>`_ to specify that a basic block
12395 is a cleanup block --- one where a personality routine attempts to
12396 transfer control to run cleanup actions.
12397 The ``args`` correspond to whatever additional
12398 information the :ref:`personality function <personalityfn>` requires to
12399 execute the cleanup.
12400 The ``resultval`` has the type :ref:`token <t_token>` and is used to
12401 match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
12402 The ``parent`` argument is the token of the funclet that contains the
12403 ``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
12404 this operand may be the token ``none``.
12409 The instruction takes a list of arbitrary values which are interpreted
12410 by the :ref:`personality function <personalityfn>`.
12415 When the call stack is being unwound due to an exception being thrown,
12416 the :ref:`personality function <personalityfn>` transfers control to the
12417 ``cleanuppad`` with the aid of the personality-specific arguments.
12418 As with calling conventions, how the personality function results are
12419 represented in LLVM IR is target specific.
12421 The ``cleanuppad`` instruction has several restrictions:
12423 - A cleanup block is a basic block which is the unwind destination of
12424 an exceptional instruction.
12425 - A cleanup block must have a '``cleanuppad``' instruction as its
12426 first non-PHI instruction.
- There can be only one '``cleanuppad``' instruction within the
  cleanup block.
12429 - A basic block that is not a cleanup block may not include a
12430 '``cleanuppad``' instruction.
12432 When a ``cleanuppad`` has been "entered" but not yet "exited" (as
12433 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
12434 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
12435 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
12440 .. code-block:: text
12442 %tok = cleanuppad within %cs []
12446 Intrinsic Functions
12447 ===================
12449 LLVM supports the notion of an "intrinsic function". These functions
12450 have well known names and semantics and are required to follow certain
12451 restrictions. Overall, these intrinsics represent an extension mechanism
12452 for the LLVM language that does not require changing all of the
12453 transformations in LLVM when adding to the language (or the bitcode
12454 reader/writer, the parser, etc...).
12456 Intrinsic function names must all start with an "``llvm.``" prefix. This
12457 prefix is reserved in LLVM for intrinsic names; thus, function names may
12458 not begin with this prefix. Intrinsic functions must always be external
12459 functions: you cannot define the body of intrinsic functions. Intrinsic
12460 functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any intrinsic that is
added must be documented here.
12465 Some intrinsic functions can be overloaded, i.e., the intrinsic
12466 represents a family of functions that perform the same operation but on
12467 different data types. Because LLVM can represent over 8 million
12468 different integer types, overloading is used commonly to allow an
12469 intrinsic function to operate on any integer type. One or more of the
12470 argument types or the result type can be overloaded to accept any
12471 integer type. Argument types may also be defined as exactly matching a
12472 previous argument's type or the result type. This allows an intrinsic
12473 function which accepts multiple arguments, but needs all of them to be
12474 of the same type, to only be overloaded with respect to a single
12475 argument or the result.
Overloaded intrinsics have the names of their overloaded argument
types encoded into their function names, each preceded by a period. Only
12479 those types which are overloaded result in a name suffix. Arguments
12480 whose type is matched against another type do not. For example, the
12481 ``llvm.ctpop`` function can take an integer of any width and returns an
12482 integer of exactly the same integer width. This leads to a family of
12483 functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.
:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
that depend on an unnamed type in one of their overloaded argument types get an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
12492 different unnamed types as arguments. (For example:
12493 ``llvm.ssa.copy.p0s_s.2(%42*)``) The number is tracked in the LLVM module and
12494 it ensures unique names in the module. While linking together two modules, it is
12495 still possible to get a name clash. In that case one of the names will be
12496 changed by getting a new number.
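As a brief illustration of the naming rules above, the declarations below show
how name suffixes follow the overloaded types. This is only a sketch:
``llvm.memcpy`` is used purely as an example of an intrinsic that is overloaded
on more than one type, and ``@count_bits`` is a hypothetical function name.

.. code-block:: llvm

    ; One overloaded type (the return type), so one suffix.
    declare i8 @llvm.ctpop.i8(i8)
    declare i29 @llvm.ctpop.i29(i29)

    ; Vector types are encoded as v<element count><element type>.
    declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

    ; Three overloaded types (destination pointer, source pointer, length),
    ; so three suffixes; the trailing i1 is not overloaded.
    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define i8 @count_bits(i8 %val) {
      ; The callee name must carry the suffix matching the types used.
      %n = call i8 @llvm.ctpop.i8(i8 %val)
      ret i8 %n
    }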
12498 For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
12500 integer or floating point types should not be relied upon for correct
12501 code generation. In such cases, the recommended approach for target
12502 maintainers when defining intrinsics is to create separate integer and
12503 FP intrinsics rather than rely on overloading. For example, if different
12504 codegen is required for ``llvm.target.foo(<4 x i32>)`` and
12505 ``llvm.target.foo(<4 x float>)`` then these should be split into
12506 different intrinsics.
12508 To learn how to add an intrinsic function, please see the `Extending
12509 LLVM Guide <ExtendingLLVM.html>`_.
12513 Variable Argument Handling Intrinsics
12514 -------------------------------------
12516 Variable argument support is defined in LLVM with the
12517 :ref:`va_arg <i_va_arg>` instruction and these three intrinsic
12518 functions. These functions are related to the similarly named macros
12519 defined in the ``<stdarg.h>`` header file.
12521 All of these functions operate on arguments that use a target-specific
12522 value type "``va_list``". The LLVM assembly language reference manual
12523 does not define what this type is, so all transformations should be
12524 prepared to handle these functions regardless of the type used.
12526 This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
12527 variable argument handling intrinsic functions are used.
.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely a ptr.
    %struct.va_list = type { ptr }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, ptr, ptr }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      call void @llvm.va_start(ptr %ap)

      ; Read a single integer argument
      %tmp = va_arg ptr %ap, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca ptr
      call void @llvm.va_copy(ptr %aq, ptr %ap)
      call void @llvm.va_end(ptr %aq)

      ; Stop processing of arguments.
      call void @llvm.va_end(ptr %ap)
      ret i32 %tmp
    }

    declare void @llvm.va_start(ptr)
    declare void @llvm.va_copy(ptr, ptr)
    declare void @llvm.va_end(ptr)
12562 '``llvm.va_start``' Intrinsic
12563 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12570 declare void @llvm.va_start(ptr <arglist>)
12575 The '``llvm.va_start``' intrinsic initializes ``<arglist>`` for
12576 subsequent use by ``va_arg``.
12581 The argument is a pointer to a ``va_list`` element to initialize.
12586 The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
12587 available in C. In a target-dependent way, it initializes the
12588 ``va_list`` element to which the argument points, so that the next call
12589 to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.
12594 '``llvm.va_end``' Intrinsic
12595 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12602 declare void @llvm.va_end(ptr <arglist>)
12607 The '``llvm.va_end``' intrinsic destroys ``<arglist>``, which has been
12608 initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
12613 The argument is a pointer to a ``va_list`` to destroy.
12618 The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
12619 available in C. In a target-dependent way, it destroys the ``va_list``
12620 element to which the argument points. Calls to
12621 :ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.
12627 '``llvm.va_copy``' Intrinsic
12628 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12635 declare void @llvm.va_copy(ptr <destarglist>, ptr <srcarglist>)
12640 The '``llvm.va_copy``' intrinsic copies the current argument position
12641 from the source argument list to the destination argument list.
12646 The first argument is a pointer to a ``va_list`` element to initialize.
12647 The second argument is a pointer to a ``va_list`` element to copy from.
12652 The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
12653 available in C. In a target-dependent way, it copies the source
12654 ``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
12656 arbitrarily complex and require, for example, memory allocation.
12658 Accurate Garbage Collection Intrinsics
12659 --------------------------------------
12661 LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
12662 (GC) requires the frontend to generate code containing appropriate intrinsic
12663 calls and select an appropriate GC strategy which knows how to lower these
12664 intrinsics in a manner which is appropriate for the target collector.
12666 These intrinsics allow identification of :ref:`GC roots on the
12667 stack <int_gcroot>`, as well as garbage collector implementations that
12668 require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
12669 Frontends for type-safe garbage collected languages should generate
12670 these intrinsics to make use of the LLVM garbage collectors. For more
12671 details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
LLVM provides a second experimental set of intrinsics for describing garbage
12674 collection safepoints in compiled code. These intrinsics are an alternative
12675 to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
12676 :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
12677 differences in approach are covered in the `Garbage Collection with LLVM
12678 <GarbageCollection.html>`_ documentation. The intrinsics themselves are
12679 described in :doc:`Statepoints`.
12683 '``llvm.gcroot``' Intrinsic
12684 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12691 declare void @llvm.gcroot(ptr %ptrloc, ptr %metadata)
12696 The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
12697 the code generator, and allows some metadata to be associated with it.
12702 The first argument specifies the address of a stack object that contains
12703 the root pointer. The second pointer (which must be either a constant or
a global value address) contains the meta-data to be associated with the
root.
12710 At runtime, a call to this intrinsic stores a null pointer into the
12711 "ptrloc" location. At compile-time, the code generator generates
12712 information to allow the runtime to find the pointer at GC safe points.
12713 The '``llvm.gcroot``' intrinsic may only be used in a function which
12714 :ref:`specifies a GC algorithm <gc>`.
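A minimal sketch of registering a root, assuming the built-in
``shadow-stack`` GC strategy and no metadata for the root. The function name
``@example`` is hypothetical.

.. code-block:: llvm

    define void @example() gc "shadow-stack" {
    entry:
      ; The root lives in a stack slot created by alloca.
      %root = alloca ptr
      ; Register the slot as a GC root with no associated metadata.
      call void @llvm.gcroot(ptr %root, ptr null)
      ; The slot holds null here until the frontend stores a reference into it.
      ret void
    }

    declare void @llvm.gcroot(ptr, ptr)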
12718 '``llvm.gcread``' Intrinsic
12719 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12726 declare ptr @llvm.gcread(ptr %ObjPtr, ptr %Ptr)
12731 The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.
12738 The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
12740 pointer to the start of the referenced object, if needed by the language
12741 runtime (otherwise null).
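The argument order can be sketched as follows, assuming the built-in
``shadow-stack`` GC strategy. The function name ``@load_field`` and the 8-byte
field offset are hypothetical.

.. code-block:: llvm

    define ptr @load_field(ptr %obj) gc "shadow-stack" {
    entry:
      ; Address of the field to read; the offset is arbitrary.
      %field = getelementptr i8, ptr %obj, i64 8
      ; First argument: start of the object. Second argument: address to read.
      %ref = call ptr @llvm.gcread(ptr %obj, ptr %field)
      ret ptr %ref
    }

    declare ptr @llvm.gcread(ptr, ptr)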
12746 The '``llvm.gcread``' intrinsic has the same semantics as a load
12747 instruction, but may be replaced with substantially more complex code by
12748 the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.
12754 '``llvm.gcwrite``' Intrinsic
12755 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12762 declare void @llvm.gcwrite(ptr %P1, ptr %Obj, ptr %P2)
12767 The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
12768 locations, allowing garbage collector implementations that require write
12769 barriers (such as generational or reference counting collectors).
12774 The first argument is the reference to store, the second is the start of
12775 the object to store it to, and the third is the address of the field of
12776 Obj to store to. If the runtime does not require a pointer to the
12777 object, Obj may be null.
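The argument order can be sketched as follows, assuming the built-in
``shadow-stack`` GC strategy. The function name ``@store_field`` and the 8-byte
field offset are hypothetical.

.. code-block:: llvm

    define void @store_field(ptr %obj, ptr %val) gc "shadow-stack" {
    entry:
      ; Address of the field to write; the offset is arbitrary.
      %field = getelementptr i8, ptr %obj, i64 8
      ; Arguments: reference to store, start of the object, field address.
      call void @llvm.gcwrite(ptr %val, ptr %obj, ptr %field)
      ret void
    }

    declare void @llvm.gcwrite(ptr, ptr, ptr)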
12782 The '``llvm.gcwrite``' intrinsic has the same semantics as a store
12783 instruction, but may be replaced with substantially more complex code by
12784 the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.
12791 '``llvm.experimental.gc.statepoint``' Intrinsic
12792 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12800 @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
12801 ptr elementtype(func_type) <target>,
12802 i64 <#call args>, i64 <flags>,
12803 ... (call parameters),
The statepoint intrinsic represents a call which is parse-able by the
runtime.
12815 The 'id' operand is a constant integer that is reported as the ID
12816 field in the generated stackmap. LLVM does not interpret this
12817 parameter in any way and its meaning is up to the statepoint user to
12818 decide. Note that LLVM is free to duplicate code containing
12819 statepoint calls, and this may transform IR that had a unique 'id' per
12820 lexical call to statepoint to IR that does not.
12822 If 'num patch bytes' is non-zero then the call instruction
12823 corresponding to the statepoint is not emitted and LLVM emits 'num
12824 patch bytes' bytes of nops in its place. LLVM will emit code to
12825 prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
12827 sequence and the latter after the nop sequence. It is expected that
12828 the user will patch over the 'num patch bytes' bytes of nops with a
12829 calling sequence specific to their runtime before executing the
12830 generated machine code. There are no guarantees with respect to the
alignment of the nop sequence. Unlike :doc:`StackMaps`, statepoints do
12832 not have a concept of shadow bytes. Note that semantically the
12833 statepoint still represents a call or invoke to 'target', and the nop
12834 sequence after patching is expected to represent an operation
12835 equivalent to a call or invoke to 'target'.
12837 The 'target' operand is the function actually being called. The operand
12838 must have an :ref:`elementtype <attr_elementtype>` attribute specifying
12839 the function type of the target. The target can be specified as either
12840 a symbolic LLVM function, or as an arbitrary Value of pointer type. Note
12841 that the function type must match the signature of the callee and the
12842 types of the 'call parameters' arguments.
12844 The '#call args' operand is the number of arguments to the actual
12845 call. It must exactly match the number of arguments passed in the
12846 'call parameters' variable length section.
12848 The 'flags' operand is used to specify extra information about the
12849 statepoint. This is currently only used to mark certain statepoints
12850 as GC transitions. This operand is a 64-bit integer with the following
12851 layout, where bit 0 is the least significant bit:
+-------+---------------------------------------------------+
| Bit # | Usage                                             |
+=======+===================================================+
|     0 | Set if the statepoint is a GC transition, cleared |
|       | otherwise.                                        |
+-------+---------------------------------------------------+
|  1-63 | Reserved for future use; must be cleared.         |
+-------+---------------------------------------------------+
12862 The 'call parameters' arguments are simply the arguments which need to
12863 be passed to the call target. They will be lowered according to the
12864 specified calling convention and otherwise handled like a normal call
12865 instruction. The number of arguments must exactly match what is
specified in '#call args'. The types must match the signature of
'target'.

The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced by the corresponding operand bundles. In a future
revision, these now-redundant arguments will be removed.
12878 A statepoint is assumed to read and write all memory. As a result,
12879 memory operations can not be reordered past a statepoint. It is
12880 illegal to mark a statepoint as being either 'readonly' or 'readnone'.
12882 Note that legal IR can not perform any memory operation on a 'gc
12883 pointer' argument of the statepoint in a location statically reachable
12884 from the statepoint. Instead, the explicitly relocated value (from a
12885 ``gc.relocate``) must be used.
12887 '``llvm.experimental.gc.result``' Intrinsic
12888 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12896 @llvm.experimental.gc.result(token %statepoint_token)
12901 ``gc.result`` extracts the result of the original call instruction
12902 which was replaced by the ``gc.statepoint``. The ``gc.result``
12903 intrinsic is actually a family of three intrinsics due to an
12904 implementation limitation. Other than the type of the return value,
12905 the semantics are the same.
12910 The first and only argument is the ``gc.statepoint`` which starts
12911 the safepoint sequence of which this ``gc.result`` is a part.
12912 Despite the typing of this as a generic token, *only* the value defined
12913 by a ``gc.statepoint`` is legal here.
12918 The ``gc.result`` represents the return value of the call target of
the ``statepoint``. The type of the ``gc.result`` must exactly match
the return type of the target. If the call target returns void, there will
12921 be no ``gc.result``.
12923 A ``gc.result`` is modeled as a 'readnone' pure function. It has no
12924 side effects since it is just a projection of the return value of the
12925 previous call represented by the ``gc.statepoint``.
12927 '``llvm.experimental.gc.relocate``' Intrinsic
12928 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12935 declare <pointer type>
12936 @llvm.experimental.gc.relocate(token %statepoint_token,
12938 i32 %pointer_offset)
A ``gc.relocate`` returns the potentially relocated value of a pointer
at the safepoint.
12949 The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
12951 Despite the typing of this as a generic token, *only* the value defined
12952 by a ``gc.statepoint`` is legal here.
12954 The second and third arguments are both indices into operands of the
12955 corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
12957 The second argument is an index which specifies the allocation for the pointer
12958 being relocated. The associated value must be within the object with which the
12959 pointer being relocated is associated. The optimizer is free to change *which*
12960 interior derived pointer is reported, provided that it does not replace an
12961 actual base pointer with another interior derived pointer. Collectors are
allowed to rely on the base pointer operand remaining an actual base pointer if
so constructed.
The third argument is an index which specifies the (potentially) derived pointer
12966 being relocated. It is legal for this index to be the same as the second
12967 argument if-and-only-if a base pointer is being relocated.
12972 The return value of ``gc.relocate`` is the potentially relocated value
12973 of the pointer specified by its arguments. It is unspecified how the
12974 value of the returned pointer relates to the argument to the
12975 ``gc.statepoint`` other than that a) it points to the same source
12976 language object with the same offset, and b) the 'based-on'
12977 relationship of the newly relocated pointers is a projection of the
12978 unrelocated pointers. In particular, the integer value of the pointer
12979 returned is unspecified.
12981 A ``gc.relocate`` is modeled as a ``readnone`` pure function. It has no
12982 side effects since it is just a way to extract information about work
12983 done during the actual call modeled by the ``gc.statepoint``.
12985 .. _gc.get.pointer.base:
12987 '``llvm.experimental.gc.get.pointer.base``' Intrinsic
12988 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12995 declare <pointer type>
12996 @llvm.experimental.gc.get.pointer.base(
12997 <pointer type> readnone nocapture %derived_ptr)
12998 nounwind willreturn memory(none)
13003 ``gc.get.pointer.base`` for a derived pointer returns its base pointer.
13008 The only argument is a pointer which is based on some object with
13009 an unknown offset from the base of said object.
13014 This intrinsic is used in the abstract machine model for GC to represent
13015 the base pointer for an arbitrary derived pointer.
This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the base pointer value of the
derived pointer. The replacement is done as part of the lowering to the
13020 explicit statepoint model.
13022 The return pointer type must be the same as the type of the parameter.
13025 '``llvm.experimental.gc.get.pointer.offset``' Intrinsic
13026 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13034 @llvm.experimental.gc.get.pointer.offset(
13035 <pointer type> readnone nocapture %derived_ptr)
13036 nounwind willreturn memory(none)
``gc.get.pointer.offset`` for a derived pointer returns the offset from its
base pointer.
13047 The only argument is a pointer which is based on some object with
13048 an unknown offset from the base of said object.
13053 This intrinsic is used in the abstract machine model for GC to represent
13054 the offset of an arbitrary derived pointer from its base pointer.
13056 This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
13057 replacing all uses of this callsite with the offset of a derived pointer from
13058 its base pointer value. The replacement is done as part of the lowering to the
13059 explicit statepoint model.
In effect, this call computes the difference between the derived pointer and its
base pointer (see :ref:`gc.get.pointer.base`), with both converted to integers
via ``ptrtoint``. Performing that conversion outside the
:ref:`RewriteStatepointsForGC` pass, however, could cause the pointers to be
lost to the later lowering from the abstract model to the explicit physical one.
13067 Code Generator Intrinsics
13068 -------------------------
13070 These intrinsics are provided by LLVM to expose special features that
13071 may only be implemented with code generator support.
13073 '``llvm.returnaddress``' Intrinsic
13074 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13081 declare ptr @llvm.returnaddress(i32 <level>)
13086 The '``llvm.returnaddress``' intrinsic attempts to compute a
13087 target-specific value indicating the return address of the current
13088 function or one of its callers.
13093 The argument to this intrinsic indicates which function to return the
13094 address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.
13101 The '``llvm.returnaddress``' intrinsic either returns a pointer
13102 indicating the return address of the specified call frame, or zero if it
13103 cannot be identified. The value returned by this intrinsic is likely to
13104 be incorrect or 0 for arguments other than zero, so it should only be
13105 used for debugging purposes.
13107 Note that calling this intrinsic does not prevent function inlining or
13108 other aggressive transformations, so the value returned may not be that
13109 of the obvious source-language caller.
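A minimal sketch of querying the return address of the current function. The
function name ``@get_return_address`` is hypothetical, and the level argument
is the constant zero, the only level for which the result is likely to be
meaningful.

.. code-block:: llvm

    define ptr @get_return_address() {
    entry:
      ; Level 0 refers to the function containing this call.
      %ra = call ptr @llvm.returnaddress(i32 0)
      ret ptr %ra
    }

    declare ptr @llvm.returnaddress(i32)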
13111 '``llvm.addressofreturnaddress``' Intrinsic
13112 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13119 declare ptr @llvm.addressofreturnaddress()
13124 The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
13125 pointer to the place in the stack frame where the return address of the
13126 current function is stored.
13131 Note that calling this intrinsic does not prevent function inlining or
13132 other aggressive transformations, so the value returned may not be that
13133 of the obvious source-language caller.
13135 This intrinsic is only implemented for x86 and aarch64.
13137 '``llvm.sponentry``' Intrinsic
13138 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13145 declare ptr @llvm.sponentry()
13150 The '``llvm.sponentry``' intrinsic returns the stack pointer value at
13151 the entry of the current function calling this intrinsic.
13156 Note this intrinsic is only verified on AArch64 and ARM.
13158 '``llvm.frameaddress``' Intrinsic
13159 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13166 declare ptr @llvm.frameaddress(i32 <level>)
13171 The '``llvm.frameaddress``' intrinsic attempts to return the
13172 target-specific frame pointer value for the specified stack frame.
13177 The argument to this intrinsic indicates which function to return the
13178 frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.
13185 The '``llvm.frameaddress``' intrinsic either returns a pointer
13186 indicating the frame address of the specified call frame, or zero if it
13187 cannot be identified. The value returned by this intrinsic is likely to
13188 be incorrect or 0 for arguments other than zero, so it should only be
13189 used for debugging purposes.
13191 Note that calling this intrinsic does not prevent function inlining or
13192 other aggressive transformations, so the value returned may not be that
13193 of the obvious source-language caller.
13195 '``llvm.swift.async.context.addr``' Intrinsic
13196 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13203 declare ptr @llvm.swift.async.context.addr()
13208 The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
13209 the part of the extended frame record containing the asynchronous
13210 context of a Swift execution.
13215 If the caller has a ``swiftasync`` parameter, that argument will initially
13216 be stored at the returned address. If not, it will be initialized to null.
13218 '``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
13219 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13226 declare void @llvm.localescape(...)
13227 declare ptr @llvm.localrecover(ptr %func, ptr %fp, i32 %idx)
13232 The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
13233 allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
13234 live frame pointer to recover the address of the allocation. The offset is
13235 computed during frame layout of the caller of ``llvm.localescape``.
13240 All arguments to '``llvm.localescape``' must be pointers to static allocas or
13241 casts of static allocas. Each function can only call '``llvm.localescape``'
13242 once, and it can only do so from the entry block.
13244 The ``func`` argument to '``llvm.localrecover``' must be a constant
13245 bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.
13249 The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
13250 call frame that is currently live. The return value of '``llvm.localaddress``'
13251 is one way to produce such a value, but various runtimes also expose a suitable
13252 pointer in platform-specific ways.
13254 The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
13255 '``llvm.localescape``' to recover. It is zero-indexed.
13260 These intrinsics allow a group of functions to share access to a set of local
stack allocations of a single parent function. The parent function may call the
13262 '``llvm.localescape``' intrinsic once from the function entry block, and the
13263 child functions can use '``llvm.localrecover``' to access the escaped allocas.
13264 The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
13265 the escaped allocas are allocated, which would break attempts to use
13266 '``llvm.localrecover``'.
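The parent/child pattern described above can be sketched as follows. The
function names ``@parent`` and ``@child`` are hypothetical, and the sketch
assumes a target on which ``llvm.localaddress`` yields a frame pointer value
suitable for ``llvm.localrecover``.

.. code-block:: llvm

    define void @parent() {
    entry:
      %x = alloca i32
      ; Escape %x; this call appears once, in the entry block.
      call void (...) @llvm.localescape(ptr %x)
      ; Obtain a frame pointer value for the live frame of @parent.
      %fp = call ptr @llvm.localaddress()
      call void @child(ptr %fp)
      ret void
    }

    define internal void @child(ptr %fp) {
    entry:
      ; Recover the address of the alloca at index 0 in @parent's frame.
      %x.addr = call ptr @llvm.localrecover(ptr @parent, ptr %fp, i32 0)
      store i32 42, ptr %x.addr
      ret void
    }

    declare void @llvm.localescape(...)
    declare ptr @llvm.localaddress()
    declare ptr @llvm.localrecover(ptr, ptr, i32)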
13268 '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
13269 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13276 declare void @llvm.seh.try.begin()
13277 declare void @llvm.seh.try.end()
13282 The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a _try region for Windows SEH Asynchronous Exception Handling.
When a C function is compiled with the Windows SEH Asynchronous Exception option,
-feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark the _try
boundary and to prevent potential exceptions from being moved across the boundary.
13291 Any set of operations can then be confined to the region by reading their leaf
13292 inputs via volatile loads and writing their root outputs via volatile stores.
13294 '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
13295 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13302 declare void @llvm.seh.scope.begin()
13303 declare void @llvm.seh.scope.end()
13308 The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
13310 Handling (MSVC option -EHa).
13315 LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke`` instructions, which normally correspond only to call sites. To
13317 support arbitrary faulting instructions, it must be possible to recover the current
13318 EH scope for any instruction. Turning every operation in LLVM that could fault
13319 into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
13320 large number of intrinsics, impede optimization of those operations, and make
13321 compilation slower by introducing many extra basic blocks. These intrinsics can
13322 be used instead to mark the region protected by a cleanup, such as for a local
13323 C++ object with a non-trivial destructor. ``llvm.seh.scope.begin`` is used to mark
13324 the start of the region; it is always called with ``invoke``, with the unwind block
13325 being the desired unwind destination for any potentially-throwing instructions
13326 within the region. ``llvm.seh.scope.end`` is used to mark when the scope ends
13327 and the EH cleanup is no longer required (e.g. because the destructor is being
13330 .. _int_read_register:
13331 .. _int_read_volatile_register:
13332 .. _int_write_register:
13334 '``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
13335 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13342 declare i32 @llvm.read_register.i32(metadata)
13343 declare i64 @llvm.read_register.i64(metadata)
13344 declare i32 @llvm.read_volatile_register.i32(metadata)
13345 declare i64 @llvm.read_volatile_register.i64(metadata)
13346 declare void @llvm.write_register.i32(metadata, i32 <value>)
13347 declare void @llvm.write_register.i64(metadata, i64 <value>)
13353 The '``llvm.read_register``', '``llvm.read_volatile_register``', and
13354 '``llvm.write_register``' intrinsics provide access to the named register.
13355 The register must be valid on the architecture being compiled to. The type
13356 needs to be compatible with the register being read.
13361 The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
13362 return the current value of the register, where possible. The
13363 '``llvm.write_register``' intrinsic sets the current value of the register,
13366 A call to '``llvm.read_volatile_register``' is assumed to have side-effects
13367 and possibly return a different value each time (e.g. for a timer register).
13369 This is useful to implement named register global variables that need
13370 to always be mapped to a specific register, as is common practice on
13371 bare-metal programs including OS kernels.
13373 The compiler doesn't check for register availability or for use of the named
13374 register in surrounding code, including inline assembly. Because of that,
13375 allocatable registers are not supported.
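As a minimal sketch of reading a named register, the following function returns
the current stack pointer. The register name ``sp`` in the metadata node and the
function name ``@current_stack_pointer`` are assumptions chosen for illustration;
the name must be one the target actually accepts.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      ; Hypothetical helper: returns the value of the register named in !0.
      define i64 @current_stack_pointer() {
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      ; The register name is passed as a metadata string (assumed: "sp").
      !0 = !{!"sp"}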
13377 Warning: So far it only works with the stack pointer on selected
13378 architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
13379 work is needed to support other registers and even more so, allocatable
13384 '``llvm.stacksave``' Intrinsic
13385 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13392 declare ptr @llvm.stacksave()
13397 The '``llvm.stacksave``' intrinsic is used to remember the current state
13398 of the function stack, for use with
13399 :ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
13400 implementing language features like scoped automatic variable sized
13406 This intrinsic returns an opaque pointer value that can be passed to
13407 :ref:`llvm.stackrestore <int_stackrestore>`. When an
13408 ``llvm.stackrestore`` intrinsic is executed with a value saved from
13409 ``llvm.stacksave``, it effectively restores the state of the stack to
13410 the state it was in when the ``llvm.stacksave`` intrinsic executed. In
13411 practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
13412 were allocated after the ``llvm.stacksave`` was executed.
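The save/restore pairing described above can be sketched as follows for a
scoped variable-sized array. The function ``@scoped_vla`` and the external
``@use`` are hypothetical names introduced only for this example.

.. code-block:: llvm

      declare ptr @llvm.stacksave()
      declare void @llvm.stackrestore(ptr)
      declare void @use(ptr)                ; hypothetical consumer of the array

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state before the dynamic allocation.
        %saved = call ptr @llvm.stacksave()
        ; Dynamically sized alloca, as a C99 VLA would produce.
        %vla = alloca i32, i32 %n
        call void @use(ptr %vla)
        ; Pop %vla (and anything else allocated since the save).
        call void @llvm.stackrestore(ptr %saved)
        ret void
      }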
13414 .. _int_stackrestore:
13416 '``llvm.stackrestore``' Intrinsic
13417 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13424 declare void @llvm.stackrestore(ptr %ptr)
13429 The '``llvm.stackrestore``' intrinsic is used to restore the state of
13430 the function stack to the state it was in when the corresponding
13431 :ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
13432 useful for implementing language features like scoped automatic variable
13433 sized arrays in C99.
13438 See the description for :ref:`llvm.stacksave <int_stacksave>`.
13440 .. _int_get_dynamic_area_offset:
13442 '``llvm.get.dynamic.area.offset``' Intrinsic
13443 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13450 declare i32 @llvm.get.dynamic.area.offset.i32()
13451 declare i64 @llvm.get.dynamic.area.offset.i64()
13456 The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
13457 get the offset from native stack pointer to the address of the most
13458 recent dynamic alloca on the caller's stack. These intrinsics are
13459 intended for use in combination with
13460 :ref:`llvm.stacksave <int_stacksave>` to get a
13461 pointer to the most recent dynamic alloca. This is useful, for example,
13462 for AddressSanitizer's stack unpoisoning routines.
13467 These intrinsics return a non-negative integer value that can be used to
13468 get the address of the most recent dynamic alloca, allocated by :ref:`alloca <i_alloca>`
13469 on the caller's stack. In particular, for targets where stack grows downwards,
13470 adding this offset to the native stack pointer would get the address of the most
13471 recent dynamic alloca. For targets where stack grows upwards, the situation is a bit more
13472 complicated, because subtracting this value from stack pointer would get the address
13473 one past the end of the most recent dynamic alloca.
13475 Although for most targets :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
13476 returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
13477 compile-time-known constant value.
13479 The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
13480 must match the target's default address space's (address space 0) pointer type.
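A rough sketch of the combination with ``llvm.stacksave`` mentioned above,
assuming a target with 64-bit pointers in address space 0 and a
downward-growing stack. The function name is hypothetical, and the value
returned by ``llvm.stacksave`` is treated here as the native stack pointer,
which is an assumption about the target.

.. code-block:: llvm

      declare ptr @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()

      ; Hypothetical helper: address of the most recent dynamic alloca.
      define ptr @most_recent_dynamic_alloca() {
        %sp = call ptr @llvm.stacksave()
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; Stack grows downwards: add the offset to the stack pointer.
        %addr = getelementptr i8, ptr %sp, i64 %off
        ret ptr %addr
      }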
13482 '``llvm.prefetch``' Intrinsic
13483 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13490 declare void @llvm.prefetch(ptr <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
13495 The '``llvm.prefetch``' intrinsic is a hint to the code generator to
13496 insert a prefetch instruction if supported; otherwise, it is a noop.
13497 Prefetches have no effect on the behavior of the program but can change
13498 its performance characteristics.
13503 ``address`` is the address to be prefetched, ``rw`` is the specifier
13504 determining if the fetch should be for a read (0) or write (1), and
13505 ``locality`` is a temporal locality specifier ranging from (0) - no
13506 locality, to (3) - extremely local, keep in cache. The ``cache type``
13507 specifies whether the prefetch is performed on the data (1) or
13508 instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
13509 arguments must be constant integers.
13514 This intrinsic does not modify the behavior of the program. In
13515 particular, prefetches cannot trap and do not produce a value. On
13516 targets that support this intrinsic, the prefetch can provide hints to
13517 the processor cache for better performance.
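A minimal sketch of a read prefetch into the data cache with maximum locality.
The ``.p0`` suffix on the intrinsic name (the pointer address-space overload)
and the function name ``@load_with_prefetch`` are assumptions made for this
example; the prefetch distance of 16 elements is arbitrary.

.. code-block:: llvm

      declare void @llvm.prefetch.p0(ptr, i32, i32, i32)

      define i32 @load_with_prefetch(ptr %p) {
        ; Address some distance ahead of the current element.
        %next = getelementptr i32, ptr %p, i64 16
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
        call void @llvm.prefetch.p0(ptr %next, i32 0, i32 3, i32 1)
        %v = load i32, ptr %p
        ret i32 %v
      }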
13519 '``llvm.pcmarker``' Intrinsic
13520 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13527 declare void @llvm.pcmarker(i32 <id>)
13532 The '``llvm.pcmarker``' intrinsic is a method to export a Program
13533 Counter (PC) in a region of code to simulators and other tools. The
13534 method is target specific, but it is expected that the marker will use
13535 exported symbols to transmit the PC of the marker. The marker makes no
13536 guarantees that it will remain with any specific instruction after
13537 optimizations. It is possible that the presence of a marker will inhibit
13538 optimizations. The intended use is to be inserted after optimizations to
13539 allow correlations of simulation runs.
13544 ``id`` is a numerical id identifying the marker.
13549 This intrinsic does not modify the behavior of the program. Backends
13550 that do not support this intrinsic may ignore it.
13552 '``llvm.readcyclecounter``' Intrinsic
13553 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13560 declare i64 @llvm.readcyclecounter()
13565 The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
13566 counter register (or similar low latency, high accuracy clocks) on those
13567 targets that support it. On X86, it should map to RDTSC. On Alpha, it
13568 should map to RPCC. As the backing counters overflow quickly (on the
13569 order of 9 seconds on Alpha), this should only be used for small
13575 When directly supported, reading the cycle counter should not modify any
13576 memory. Implementations are allowed to either return an application
13577 specific value or a system wide value. On backends without support, this
13578 is lowered to a constant 0.
13580 Note that runtime support may be conditional on the privilege level the code is
13581 running at and on the host platform.
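One way to use the counter for a short timing is sketched below. The functions
``@time_work`` and ``@work`` are hypothetical; on a backend without support
both reads are lowered to 0, so the difference is 0 as well.

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()                  ; hypothetical code under measurement

      define i64 @time_work() {
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed cycles; only meaningful for short intervals.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }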
13583 '``llvm.clear_cache``' Intrinsic
13584 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13591 declare void @llvm.clear_cache(ptr, ptr)
13596 The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
13597 in the specified range to the execution unit of the processor. On
13598 targets with non-unified instruction and data cache, the implementation
13599 flushes the instruction cache.
13604 On platforms with coherent instruction and data caches (e.g. x86), this
13605 intrinsic is a nop. On platforms with non-coherent instruction and data
13606 cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
13607 instructions or a system call, if cache flushing requires special
13610 The default behavior is to emit a call to ``__clear_cache`` from the run
13613 This intrinsic does *not* empty the instruction pipeline. Modifications
13614 of the current function are outside the scope of the intrinsic.
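A minimal sketch of making freshly written code visible before executing it.
It assumes the two pointer arguments are the start and the end of the range,
mirroring ``__clear_cache``; ``@flush_code`` and its parameters are
hypothetical names.

.. code-block:: llvm

      declare void @llvm.clear_cache(ptr, ptr)

      define void @flush_code(ptr %begin, i64 %size) {
        ; Compute the end of the modified range.
        %end = getelementptr i8, ptr %begin, i64 %size
        call void @llvm.clear_cache(ptr %begin, ptr %end)
        ret void
      }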
13616 '``llvm.instrprof.increment``' Intrinsic
13617 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13624 declare void @llvm.instrprof.increment(ptr <name>, i64 <hash>,
13625 i32 <num-counters>, i32 <index>)
13630 The '``llvm.instrprof.increment``' intrinsic can be emitted by a
13631 frontend for use with instrumentation based profiling. These will be
13632 lowered by the ``-instrprof`` pass to generate execution counts of a
13633 program at runtime.
13638 The first argument is a pointer to a global variable containing the
13639 name of the entity being instrumented. This should generally be the
13640 (mangled) function name for a set of counters.
13642 The second argument is a hash value that can be used by the consumer
13643 of the profile data to detect changes to the instrumented source, and
13644 the third is the number of counters associated with ``name``. It is an
13645 error if ``hash`` or ``num-counters`` differ between two instances of
13646 ``instrprof.increment`` that refer to the same name.
13648 The last argument refers to which of the counters for ``name`` should
13649 be incremented. It should be a value between 0 and ``num-counters``.
13654 This intrinsic represents an increment of a profiling counter. It will
13655 cause the ``-instrprof`` pass to generate the appropriate data
13656 structures and the code to increment the appropriate value, in a
13657 format that can be written out by a compiler runtime and consumed via
13658 the ``llvm-profdata`` tool.
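A sketch of what a frontend might emit for a function with a single counter.
The global ``@__profn_foo``, the hash value ``0``, and the function ``@foo``
are placeholders chosen for illustration, not values any particular frontend
is guaranteed to produce.

.. code-block:: llvm

      ; Name of the instrumented entity (assumed naming scheme).
      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(ptr, i64, i32, i32)

      define void @foo() {
        ; name, hash (placeholder), num-counters = 1, index = 0
        call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 1, i32 0)
        ret void
      }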
13660 '``llvm.instrprof.increment.step``' Intrinsic
13661 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13668 declare void @llvm.instrprof.increment.step(ptr <name>, i64 <hash>,
13669 i32 <num-counters>,
13670 i32 <index>, i64 <step>)
13675 The '``llvm.instrprof.increment.step``' intrinsic is an extension to
13676 the '``llvm.instrprof.increment``' intrinsic with an additional fifth
13677 argument to specify the step of the increment.
13681 The first four arguments are the same as '``llvm.instrprof.increment``'
13684 The last argument specifies the value of the increment of the counter variable.
13688 See description of '``llvm.instrprof.increment``' intrinsic.
13690 '``llvm.instrprof.timestamp``' Intrinsic
13691 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13698 declare void @llvm.instrprof.timestamp(ptr <name>, i64 <hash>,
13699 i32 <num-counters>, i32 <index>)
13704 The '``llvm.instrprof.timestamp``' intrinsic is used to implement temporal
13709 The arguments are the same as '``llvm.instrprof.increment``'. The ``index`` is
13710 expected to always be zero.
13714 Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores a
13715 timestamp representing when this function was executed for the first time.
13717 '``llvm.instrprof.cover``' Intrinsic
13718 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13725 declare void @llvm.instrprof.cover(ptr <name>, i64 <hash>,
13726 i32 <num-counters>, i32 <index>)
13731 The '``llvm.instrprof.cover``' intrinsic is used to implement coverage
13736 The arguments are the same as the first four arguments of
13737 '``llvm.instrprof.increment``'.
13741 Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores zero to
13742 the profiling variable to signify that the function has been covered. We store
13743 zero because this is more efficient on some targets.
13745 '``llvm.instrprof.value.profile``' Intrinsic
13746 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13753 declare void @llvm.instrprof.value.profile(ptr <name>, i64 <hash>,
13754 i64 <value>, i32 <value_kind>,
13760 The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
13761 frontend for use with instrumentation based profiling. This will be
13762 lowered by the ``-instrprof`` pass to find out the target values that
13763 instrumented expressions take in a program at runtime.
13768 The first argument is a pointer to a global variable containing the
13769 name of the entity being instrumented. ``name`` should generally be the
13770 (mangled) function name for a set of counters.
13772 The second argument is a hash value that can be used by the consumer
13773 of the profile data to detect changes to the instrumented source. It
13774 is an error if ``hash`` differs between two instances of
13775 ``llvm.instrprof.*`` that refer to the same name.
13777 The third argument is the value of the expression being profiled. The profiled
13778 expression's value should be representable as an unsigned 64-bit value. The
13779 fourth argument represents the kind of value profiling that is being done. The
13780 supported value profiling kinds are enumerated through the
13781 ``InstrProfValueKind`` type declared in the
13782 ``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
13783 index of the instrumented expression within ``name``. It should be >= 0.
13788 This intrinsic represents the point where a call to a runtime routine
13789 should be inserted for value profiling of target expressions. The ``-instrprof``
13790 pass will generate the appropriate data structures and replace the
13791 ``llvm.instrprof.value.profile`` intrinsic with the call to the profile
13792 runtime library with proper arguments.
13794 '``llvm.thread.pointer``' Intrinsic
13795 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13802 declare ptr @llvm.thread.pointer()
13807 The '``llvm.thread.pointer``' intrinsic returns the value of the thread
13813 The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
13814 for the current thread. The exact semantics of this value are target
13815 specific: it may point to the start of TLS area, to the end, or somewhere
13816 in the middle. Depending on the target, this intrinsic may read a register,
13817 call a helper function, read from an alternate memory space, or perform
13818 other operations necessary to locate the TLS area. Not all targets support
13821 '``llvm.call.preallocated.setup``' Intrinsic
13822 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13829 declare token @llvm.call.preallocated.setup(i32 %num_args)
13834 The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
13835 be used with a call's ``"preallocated"`` operand bundle to indicate that
13836 certain arguments are allocated and initialized before the call.
13841 The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
13842 associated with at most one call. The token can be passed to
13843 '``@llvm.call.preallocated.arg``' to get a pointer to the
13844 corresponding argument. The token must be the parameter to a
13845 ``"preallocated"`` operand bundle for the corresponding call.
13847 Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
13848 be properly nested. e.g.
13850 .. code-block:: llvm
13852 %t1 = call token @llvm.call.preallocated.setup(i32 0)
13853 %t2 = call token @llvm.call.preallocated.setup(i32 0)
13854 call void @foo() ["preallocated"(token %t2)]
13855 call void @foo() ["preallocated"(token %t1)]
13857 is allowed, but not
13859 .. code-block:: llvm
13861 %t1 = call token @llvm.call.preallocated.setup(i32 0)
13862 %t2 = call token @llvm.call.preallocated.setup(i32 0)
13863 call void @foo() ["preallocated"(token %t1)]
13864 call void @foo() ["preallocated"(token %t2)]
13866 .. _int_call_preallocated_arg:
13868 '``llvm.call.preallocated.arg``' Intrinsic
13869 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13876 declare ptr @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
13881 The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
13882 corresponding preallocated argument for the preallocated call.
13887 The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
13888 ``%arg_index``th argument with the ``preallocated`` attribute for
13889 the call associated with the ``%setup_token``, which must be from
13890 '``llvm.call.preallocated.setup``'.
13892 A call to '``llvm.call.preallocated.arg``' must have a call site
13893 ``preallocated`` attribute. The type of the ``preallocated`` attribute must
13894 match the type used by the ``preallocated`` attribute of the corresponding
13895 argument at the preallocated call. The type is used in the case that an
13896 ``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
13897 to DCE), where otherwise we cannot know how large the arguments are.
13899 It is undefined behavior if this is called with a token from an
13900 '``llvm.call.preallocated.setup``' if another
13901 '``llvm.call.preallocated.setup``' has already been called or if the
13902 preallocated call corresponding to the '``llvm.call.preallocated.setup``'
13903 has already been called.
13905 .. _int_call_preallocated_teardown:
13907 '``llvm.call.preallocated.teardown``' Intrinsic
13908 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13915 declare void @llvm.call.preallocated.teardown(token %setup_token)
13920 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13921 created by a '``llvm.call.preallocated.setup``'.
13926 The token argument must be a '``llvm.call.preallocated.setup``'.
13928 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13929 allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
13930 one of this or the preallocated call must be called to prevent stack leaks.
13931 It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
13932 and the preallocated call for a given '``llvm.call.preallocated.setup``'.
13934 For example, if the stack is allocated for a preallocated call by a
13935 '``llvm.call.preallocated.setup``', then an initializer function called on an
13936 allocated argument throws an exception, there should be a
13937 '``llvm.call.preallocated.teardown``' in the exception handler to prevent
13940 Following the nesting rules in '``llvm.call.preallocated.setup``', nested
13941 calls to '``llvm.call.preallocated.setup``' and
13942 '``llvm.call.preallocated.teardown``' are allowed but must be properly
13948 .. code-block:: llvm
13950 %cs = call token @llvm.call.preallocated.setup(i32 1)
13951 %x = call ptr @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
13952 invoke void @constructor(ptr %x) to label %conta unwind label %contb
13954 call void @foo1(ptr preallocated(i32) %x) ["preallocated"(token %cs)]
13957 %s = catchswitch within none [label %catch] unwind to caller
13959 %p = catchpad within %s []
13960 call void @llvm.call.preallocated.teardown(token %cs)
13963 Standard C/C++ Library Intrinsics
13964 ---------------------------------
13966 LLVM provides intrinsics for a few important standard C/C++ library
13967 functions. These intrinsics allow source-language front-ends to pass
13968 information about the alignment of the pointer arguments to the code
13969 generator, providing opportunity for more efficient code generation.
13973 '``llvm.abs.*``' Intrinsic
13974 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13979 This is an overloaded intrinsic. You can use ``llvm.abs`` on any
13980 integer bit width or any vector of integer elements.
13984 declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
13985 declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
13990 The '``llvm.abs``' family of intrinsic functions returns the absolute value
13996 The first argument is the value for which the absolute value is to be returned.
13997 This argument may be of any integer type or a vector with integer element type.
13998 The return type must match the first argument type.
14000 The second argument must be a constant and is a flag to indicate whether the
14001 result value of the '``llvm.abs``' intrinsic is a
14002 :ref:`poison value <poisonvalues>` if the argument is statically or dynamically
14003 an ``INT_MIN`` value.
14008 The '``llvm.abs``' intrinsic returns the magnitude (always positive) of the
14009 argument or each element of a vector argument. If the argument is ``INT_MIN``,
14010 then the result is also ``INT_MIN`` if ``is_int_min_poison == 0`` and
14011 ``poison`` otherwise.
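The effect of the second argument can be sketched with two wrappers; the
function names are hypothetical. For an input of ``INT_MIN``, the first returns
``INT_MIN`` and the second returns ``poison``; for every other input both
return the same value.

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)

      ; is_int_min_poison = false: abs(INT_MIN) is INT_MIN.
      define i32 @abs_wrapping(i32 %x) {
        %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
        ret i32 %r
      }

      ; is_int_min_poison = true: abs(INT_MIN) is poison.
      define i32 @abs_poison(i32 %x) {
        %r = call i32 @llvm.abs.i32(i32 %x, i1 true)
        ret i32 %r
      }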
14016 '``llvm.smax.*``' Intrinsic
14017 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14022 This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
14023 integer bit width or any vector of integer elements.
14027 declare i32 @llvm.smax.i32(i32 %a, i32 %b)
14028 declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
14033 Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
14034 Vector intrinsics operate on a per-element basis. The larger element of ``%a``
14035 and ``%b`` at a given index is returned for that index.
14040 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14041 integer element type. The argument types must match each other, and the return
14042 type must match the argument type.
14047 '``llvm.smin.*``' Intrinsic
14048 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14053 This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
14054 integer bit width or any vector of integer elements.
14058 declare i32 @llvm.smin.i32(i32 %a, i32 %b)
14059 declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
14064 Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
14065 Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
14066 and ``%b`` at a given index is returned for that index.
14071 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14072 integer element type. The argument types must match each other, and the return
14073 type must match the argument type.
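As an illustration, ``llvm.smax`` and ``llvm.smin`` can be combined into a
signed clamp. The function name ``@clamp_0_255`` and the bounds are chosen for
this sketch only.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare i32 @llvm.smin.i32(i32, i32)

      define i32 @clamp_0_255(i32 %x) {
        ; Raise negative inputs to 0.
        %lo = call i32 @llvm.smax.i32(i32 %x, i32 0)
        ; Lower anything above 255 to 255.
        %r = call i32 @llvm.smin.i32(i32 %lo, i32 255)
        ret i32 %r
      }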
14078 '``llvm.umax.*``' Intrinsic
14079 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14084 This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
14085 integer bit width or any vector of integer elements.
14089 declare i32 @llvm.umax.i32(i32 %a, i32 %b)
14090 declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
14095 Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
14096 integers. Vector intrinsics operate on a per-element basis. The larger element
14097 of ``%a`` and ``%b`` at a given index is returned for that index.
14102 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14103 integer element type. The argument types must match each other, and the return
14104 type must match the argument type.
14109 '``llvm.umin.*``' Intrinsic
14110 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14115 This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
14116 integer bit width or any vector of integer elements.
14120 declare i32 @llvm.umin.i32(i32 %a, i32 %b)
14121 declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
14126 Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
14127 integers. Vector intrinsics operate on a per-element basis. The smaller element
14128 of ``%a`` and ``%b`` at a given index is returned for that index.
14133 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14134 integer element type. The argument types must match each other, and the return
14135 type must match the argument type.
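The per-element behavior of the vector forms can be sketched with an unsigned
absolute difference, which subtracts the element-wise minimum from the
element-wise maximum. The function name ``@absdiff_u`` is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.umax.v4i32(<4 x i32>, <4 x i32>)
      declare <4 x i32> @llvm.umin.v4i32(<4 x i32>, <4 x i32>)

      define <4 x i32> @absdiff_u(<4 x i32> %a, <4 x i32> %b) {
        %hi = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
        %lo = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
        ; hi >= lo (unsigned) in every lane, so the subtraction cannot wrap.
        %d = sub <4 x i32> %hi, %lo
        ret <4 x i32> %d
      }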
14140 '``llvm.memcpy``' Intrinsic
14141 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14146 This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
14147 integer bit width and for different address spaces. Not all targets
14148 support all bit widths however.
14152 declare void @llvm.memcpy.p0.p0.i32(ptr <dest>, ptr <src>,
14153 i32 <len>, i1 <isvolatile>)
14154 declare void @llvm.memcpy.p0.p0.i64(ptr <dest>, ptr <src>,
14155 i64 <len>, i1 <isvolatile>)
14160 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
14161 source location to the destination location.
14163 Note that, unlike the standard libc function, the ``llvm.memcpy.*``
14164 intrinsics do not return a value, take an extra ``isvolatile``
14165 argument, and the pointers can be in specified address spaces.
14170 The first argument is a pointer to the destination, the second is a
14171 pointer to the source. The third argument is an integer argument
14172 specifying the number of bytes to copy, and the fourth is a
14173 boolean indicating a volatile access.
14175 The :ref:`align <attr_align>` parameter attribute can be provided
14176 for the first and second arguments.
14178 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
14179 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
14180 very cleanly specified and it is unwise to depend on it.
14185 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
14186 location to the destination location, which must either be equal or
14187 non-overlapping. It copies "len" bytes of memory over. If the argument is known
14188 to be aligned to some boundary, this can be specified as an attribute on the
14191 If ``<len>`` is 0, it is no-op modulo the behavior of attributes attached to
14193 If ``<len>`` is not a well-defined value, the behavior is undefined.
14194 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
14195 otherwise the behavior is undefined.
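A minimal sketch of a non-volatile 16-byte copy with alignment given as
call-site attributes. The function name ``@copy16`` and the 4-byte alignment
are assumptions for this example; the caller must ensure the two regions are
equal or do not overlap.

.. code-block:: llvm

      declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy16(ptr %dst, ptr %src) {
        ; dest, src, len = 16 bytes, isvolatile = false
        call void @llvm.memcpy.p0.p0.i64(ptr align 4 %dst, ptr align 4 %src, i64 16, i1 false)
        ret void
      }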
14197 .. _int_memcpy_inline:
14199 '``llvm.memcpy.inline``' Intrinsic
14200 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14205 This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
14206 integer bit width and for different address spaces. Not all targets
14207 support all bit widths however.
14211 declare void @llvm.memcpy.inline.p0.p0.i32(ptr <dest>, ptr <src>,
14212 i32 <len>, i1 <isvolatile>)
14213 declare void @llvm.memcpy.inline.p0.p0.i64(ptr <dest>, ptr <src>,
14214 i64 <len>, i1 <isvolatile>)
14219 The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
14220 source location to the destination location and guarantee that no external
14221 functions are called.
14223 Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
14224 intrinsics do not return a value, take an extra ``isvolatile``
14225 argument, and the pointers can be in specified address spaces.
14230 The first argument is a pointer to the destination, the second is a
14231 pointer to the source. The third argument is a constant integer argument
14232 specifying the number of bytes to copy, and the fourth is a
14233 boolean indicating a volatile access.
14235 The :ref:`align <attr_align>` parameter attribute can be provided
14236 for the first and second arguments.
14238 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
14239 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
14240 very cleanly specified and it is unwise to depend on it.
14245 The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
14246 source location to the destination location, which are not allowed to
14247 overlap. It copies "len" bytes of memory over. If the argument is known
14248 to be aligned to some boundary, this can be specified as an attribute on
14250 The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
14251 '``llvm.memcpy.*``', but the generated code is guaranteed not to call any
14252 external functions.
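A sketch of the inline variant, which differs from the call to
``llvm.memcpy`` only in that the length must be a constant. The function name
``@copy_header`` and the 32-byte size are hypothetical.

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy_header(ptr %dst, ptr %src) {
        ; The length (32) is a constant, as this intrinsic requires.
        call void @llvm.memcpy.inline.p0.p0.i64(ptr align 8 %dst, ptr align 8 %src, i64 32, i1 false)
        ret void
      }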
14256 '``llvm.memmove``' Intrinsic
14257 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14262 This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
14263 bit width and for different address spaces. Not all targets support all
14264 bit widths however.
14268 declare void @llvm.memmove.p0.p0.i32(ptr <dest>, ptr <src>,
14269 i32 <len>, i1 <isvolatile>)
14270 declare void @llvm.memmove.p0.p0.i64(ptr <dest>, ptr <src>,
14271 i64 <len>, i1 <isvolatile>)
14276 The '``llvm.memmove.*``' intrinsics move a block of memory from the
14277 source location to the destination location. It is similar to the
14278 '``llvm.memcpy``' intrinsic but allows the two memory locations to
14281 Note that, unlike the standard libc function, the ``llvm.memmove.*``
14282 intrinsics do not return a value, take an extra ``isvolatile``
14283 argument, and the pointers can be in specified address spaces.
14288 The first argument is a pointer to the destination, the second is a
14289 pointer to the source. The third argument is an integer argument
14290 specifying the number of bytes to copy, and the fourth is a
14291 boolean indicating a volatile access.
14293 The :ref:`align <attr_align>` parameter attribute can be provided
14294 for the first and second arguments.
14296 If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
14297 is a :ref:`volatile operation <volatile>`. The detailed access behavior is
14298 not very cleanly specified and it is unwise to depend on it.
14303 The '``llvm.memmove.*``' intrinsics copy a block of memory from the
14304 source location to the destination location, which may overlap. It
14305 copies "len" bytes of memory over. If the argument is known to be
14306 aligned to some boundary, this can be specified as an attribute on
14309 If ``<len>`` is 0, it is no-op modulo the behavior of attributes attached to
14311 If ``<len>`` is not a well-defined value, the behavior is undefined.
14312 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
14313 otherwise the behavior is undefined.
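Because the regions may overlap, ``llvm.memmove`` can shift data within one
buffer. The sketch below moves ``%n`` bytes down by one position; the function
name is hypothetical, and the buffer is assumed to hold at least ``%n + 1``
bytes.

.. code-block:: llvm

      declare void @llvm.memmove.p0.p0.i64(ptr, ptr, i64, i1)

      define void @shift_left_one(ptr %buf, i64 %n) {
        ; Source starts one byte after the destination, so the ranges overlap.
        %src = getelementptr i8, ptr %buf, i64 1
        call void @llvm.memmove.p0.p0.i64(ptr %buf, ptr %src, i64 %n, i1 false)
        ret void
      }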
14317 '``llvm.memset.*``' Intrinsics
14318 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14323 This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
14324 bit width and for different address spaces. However, not all targets
14325 support all bit widths.
14329 declare void @llvm.memset.p0.i32(ptr <dest>, i8 <val>,
14330 i32 <len>, i1 <isvolatile>)
14331 declare void @llvm.memset.p0.i64(ptr <dest>, i8 <val>,
14332 i64 <len>, i1 <isvolatile>)
14337 The '``llvm.memset.*``' intrinsics fill a block of memory with a
14338 particular byte value.
14340 Note that, unlike the standard libc function, the ``llvm.memset``
14341 intrinsic does not return a value and takes an extra volatile
14342 argument. Also, the destination can be in an arbitrary address space.
14347 The first argument is a pointer to the destination to fill, the second
14348 is the byte value with which to fill it, the third argument is an
14349 integer argument specifying the number of bytes to fill, and the fourth
14350 is a boolean indicating a volatile access.
14352 The :ref:`align <attr_align>` parameter attribute can be provided
14353 for the first argument.
14355 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
14356 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
14357 very cleanly specified and it is unwise to depend on it.
14362 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
14363 at the destination location. If the argument is known to be
14364 aligned to some boundary, this can be specified as an attribute on
14367 If ``<len>`` is 0, it is no-op modulo the behavior of attributes attached to
14369 If ``<len>`` is not a well-defined value, the behavior is undefined.
14370 If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
14371 behavior is undefined.
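
As a minimal sketch of a call (the function name ``@zero_buffer`` and the
16-byte alignment are assumptions made for illustration), the following fills
64 bytes with zero using a non-volatile access:

.. code-block:: llvm

      declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

      define void @zero_buffer(ptr %buf) {
        ; Write 64 zero bytes starting at %buf. The align attribute on the
        ; first argument asserts that %buf is 16-byte aligned.
        call void @llvm.memset.p0.i64(ptr align 16 %buf, i8 0, i64 64, i1 false)
        ret void
      }
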
14373 .. _int_memset_inline:
14375 '``llvm.memset.inline``' Intrinsic
14376 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14381 This is an overloaded intrinsic. You can use ``llvm.memset.inline`` on any
integer bit width and for different address spaces. However, not all targets
support all bit widths.
declare void @llvm.memset.inline.p0.i32(ptr <dest>, i8 <val>,
14388 i32 <len>, i1 <isvolatile>)
declare void @llvm.memset.inline.p0.i64(ptr <dest>, i8 <val>,
14390 i64 <len>, i1 <isvolatile>)
14395 The '``llvm.memset.inline.*``' intrinsics fill a block of memory with a
particular byte value and guarantee that no external functions are called.
14398 Note that, unlike the standard libc function, the ``llvm.memset.inline.*``
intrinsics do not return a value and take an extra ``isvolatile`` argument.
Also, the destination pointer can be in a specified address space.
14405 The first argument is a pointer to the destination to fill, the second
14406 is the byte value with which to fill it, the third argument is a constant
14407 integer argument specifying the number of bytes to fill, and the fourth
14408 is a boolean indicating a volatile access.
14410 The :ref:`align <attr_align>` parameter attribute can be provided
14411 for the first argument.
14413 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset.inline`` call is
14414 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
14415 very cleanly specified and it is unwise to depend on it.
14420 The '``llvm.memset.inline.*``' intrinsics fill "len" bytes of memory starting
14421 at the destination location. If the argument is known to be
14422 aligned to some boundary, this can be specified as an attribute on
14425 ``len`` must be a constant expression.
14426 If ``<len>`` is 0, it is no-op modulo the behavior of attributes attached to
14428 If ``<len>`` is not a well-defined value, the behavior is undefined.
14429 If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
14430 behavior is undefined.
14432 The behavior of '``llvm.memset.inline.*``' is equivalent to the behavior of
14433 '``llvm.memset.*``', but the generated code is guaranteed not to call any
14434 external functions.
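
A minimal sketch, assuming a hypothetical function ``@fill_ones`` and a
4-byte-aligned destination, of a call with a constant length:

.. code-block:: llvm

      declare void @llvm.memset.inline.p0.i64(ptr, i8, i64, i1)

      define void @fill_ones(ptr %buf) {
        ; Set 16 bytes to 0xFF (written as i8 -1). The length is a constant,
        ; and the expansion must not call an external function.
        call void @llvm.memset.inline.p0.i64(ptr align 4 %buf, i8 -1, i64 16, i1 false)
        ret void
      }
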
14438 '``llvm.sqrt.*``' Intrinsic
14439 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14444 This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
14445 floating-point or vector of floating-point type. Not all targets support
14450 declare float @llvm.sqrt.f32(float %Val)
14451 declare double @llvm.sqrt.f64(double %Val)
14452 declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val)
14453 declare fp128 @llvm.sqrt.f128(fp128 %Val)
14454 declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
14459 The '``llvm.sqrt``' intrinsics return the square root of the specified value.
14464 The argument and return value are floating-point numbers of the same type.
14469 Return the same value as a corresponding libm '``sqrt``' function but without
14470 trapping or setting ``errno``. For types specified by IEEE-754, the result
14471 matches a conforming libm implementation.
14473 When specified with the fast-math-flag 'afn', the result may be approximated
14474 using a less accurate calculation.
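
The two functions below are an illustrative sketch (the names ``@root`` and
``@approx_root`` are hypothetical) contrasting a plain call with one that
carries the ``afn`` flag:

.. code-block:: llvm

      declare double @llvm.sqrt.f64(double)

      define double @root(double %x) {
        ; Plain call: no fast-math flags attached.
        %r = call double @llvm.sqrt.f64(double %x)
        ret double %r
      }

      define double @approx_root(double %x) {
        ; With 'afn', the result may be computed by a less accurate
        ; approximation.
        %r = call afn double @llvm.sqrt.f64(double %x)
        ret double %r
      }
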
14476 '``llvm.powi.*``' Intrinsic
14477 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14482 This is an overloaded intrinsic. You can use ``llvm.powi`` on any
14483 floating-point or vector of floating-point type. Not all targets support
14486 Generally, the only supported type for the exponent is the one matching
14487 with the C type ``int``.
14491 declare float @llvm.powi.f32.i32(float %Val, i32 %power)
14492 declare double @llvm.powi.f64.i16(double %Val, i16 %power)
14493 declare x86_fp80 @llvm.powi.f80.i32(x86_fp80 %Val, i32 %power)
14494 declare fp128 @llvm.powi.f128.i32(fp128 %Val, i32 %power)
14495 declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128 %Val, i32 %power)
14500 The '``llvm.powi.*``' intrinsics return the first operand raised to the
14501 specified (positive or negative) power. The order of evaluation of
14502 multiplications is not defined. When a vector of floating-point type is
14503 used, the second argument remains a scalar integer value.
14508 The second argument is an integer power, and the first is a value to
14509 raise to that power.
14514 This function returns the first value raised to the second power with an
14515 unspecified sequence of rounding operations.
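
As a sketch (the function names are hypothetical), the scalar and vector forms
can be called as follows. In the vector case the exponent is still a single
scalar integer:

.. code-block:: llvm

      declare double @llvm.powi.f64.i32(double, i32)
      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      define double @cube(double %x) {
        ; %x raised to the power 3.
        %r = call double @llvm.powi.f64.i32(double %x, i32 3)
        ret double %r
      }

      define <4 x float> @inv_square(<4 x float> %v) {
        ; A negative power; the scalar exponent applies to every element.
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 -2)
        ret <4 x float> %r
      }
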
14517 '``llvm.sin.*``' Intrinsic
14518 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14523 This is an overloaded intrinsic. You can use ``llvm.sin`` on any
14524 floating-point or vector of floating-point type. Not all targets support
14529 declare float @llvm.sin.f32(float %Val)
14530 declare double @llvm.sin.f64(double %Val)
14531 declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val)
14532 declare fp128 @llvm.sin.f128(fp128 %Val)
14533 declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)
14538 The '``llvm.sin.*``' intrinsics return the sine of the operand.
14543 The argument and return value are floating-point numbers of the same type.
14548 Return the same value as a corresponding libm '``sin``' function but without
14549 trapping or setting ``errno``.
14551 When specified with the fast-math-flag 'afn', the result may be approximated
14552 using a less accurate calculation.
14554 '``llvm.cos.*``' Intrinsic
14555 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14560 This is an overloaded intrinsic. You can use ``llvm.cos`` on any
14561 floating-point or vector of floating-point type. Not all targets support
14566 declare float @llvm.cos.f32(float %Val)
14567 declare double @llvm.cos.f64(double %Val)
14568 declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val)
14569 declare fp128 @llvm.cos.f128(fp128 %Val)
14570 declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)
14575 The '``llvm.cos.*``' intrinsics return the cosine of the operand.
14580 The argument and return value are floating-point numbers of the same type.
14585 Return the same value as a corresponding libm '``cos``' function but without
14586 trapping or setting ``errno``.
14588 When specified with the fast-math-flag 'afn', the result may be approximated
14589 using a less accurate calculation.
14591 '``llvm.pow.*``' Intrinsic
14592 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14597 This is an overloaded intrinsic. You can use ``llvm.pow`` on any
14598 floating-point or vector of floating-point type. Not all targets support
14603 declare float @llvm.pow.f32(float %Val, float %Power)
14604 declare double @llvm.pow.f64(double %Val, double %Power)
14605 declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
14606 declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power)
declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)
14612 The '``llvm.pow.*``' intrinsics return the first operand raised to the
14613 specified (positive or negative) power.
14618 The arguments and return value are floating-point numbers of the same type.
14623 Return the same value as a corresponding libm '``pow``' function but without
14624 trapping or setting ``errno``.
14626 When specified with the fast-math-flag 'afn', the result may be approximated
14627 using a less accurate calculation.
14629 '``llvm.exp.*``' Intrinsic
14630 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14635 This is an overloaded intrinsic. You can use ``llvm.exp`` on any
14636 floating-point or vector of floating-point type. Not all targets support
14641 declare float @llvm.exp.f32(float %Val)
14642 declare double @llvm.exp.f64(double %Val)
14643 declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val)
14644 declare fp128 @llvm.exp.f128(fp128 %Val)
14645 declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)
14650 The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
14656 The argument and return value are floating-point numbers of the same type.
14661 Return the same value as a corresponding libm '``exp``' function but without
14662 trapping or setting ``errno``.
14664 When specified with the fast-math-flag 'afn', the result may be approximated
14665 using a less accurate calculation.
14667 '``llvm.exp2.*``' Intrinsic
14668 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14673 This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
14674 floating-point or vector of floating-point type. Not all targets support
14679 declare float @llvm.exp2.f32(float %Val)
14680 declare double @llvm.exp2.f64(double %Val)
14681 declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val)
14682 declare fp128 @llvm.exp2.f128(fp128 %Val)
14683 declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)
14688 The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
14694 The argument and return value are floating-point numbers of the same type.
14699 Return the same value as a corresponding libm '``exp2``' function but without
14700 trapping or setting ``errno``.
14702 When specified with the fast-math-flag 'afn', the result may be approximated
14703 using a less accurate calculation.
14705 '``llvm.log.*``' Intrinsic
14706 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14711 This is an overloaded intrinsic. You can use ``llvm.log`` on any
14712 floating-point or vector of floating-point type. Not all targets support
14717 declare float @llvm.log.f32(float %Val)
14718 declare double @llvm.log.f64(double %Val)
14719 declare x86_fp80 @llvm.log.f80(x86_fp80 %Val)
14720 declare fp128 @llvm.log.f128(fp128 %Val)
14721 declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)
14726 The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
14732 The argument and return value are floating-point numbers of the same type.
14737 Return the same value as a corresponding libm '``log``' function but without
14738 trapping or setting ``errno``.
14740 When specified with the fast-math-flag 'afn', the result may be approximated
14741 using a less accurate calculation.
14743 '``llvm.log10.*``' Intrinsic
14744 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14749 This is an overloaded intrinsic. You can use ``llvm.log10`` on any
14750 floating-point or vector of floating-point type. Not all targets support
14755 declare float @llvm.log10.f32(float %Val)
14756 declare double @llvm.log10.f64(double %Val)
14757 declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val)
14758 declare fp128 @llvm.log10.f128(fp128 %Val)
14759 declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)
14764 The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
14770 The argument and return value are floating-point numbers of the same type.
14775 Return the same value as a corresponding libm '``log10``' function but without
14776 trapping or setting ``errno``.
14778 When specified with the fast-math-flag 'afn', the result may be approximated
14779 using a less accurate calculation.
14781 '``llvm.log2.*``' Intrinsic
14782 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14787 This is an overloaded intrinsic. You can use ``llvm.log2`` on any
14788 floating-point or vector of floating-point type. Not all targets support
14793 declare float @llvm.log2.f32(float %Val)
14794 declare double @llvm.log2.f64(double %Val)
14795 declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val)
14796 declare fp128 @llvm.log2.f128(fp128 %Val)
14797 declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)
14802 The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
14808 The argument and return value are floating-point numbers of the same type.
14813 Return the same value as a corresponding libm '``log2``' function but without
14814 trapping or setting ``errno``.
14816 When specified with the fast-math-flag 'afn', the result may be approximated
14817 using a less accurate calculation.
14821 '``llvm.fma.*``' Intrinsic
14822 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14827 This is an overloaded intrinsic. You can use ``llvm.fma`` on any
14828 floating-point or vector of floating-point type. Not all targets support
14833 declare float @llvm.fma.f32(float %a, float %b, float %c)
14834 declare double @llvm.fma.f64(double %a, double %b, double %c)
14835 declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
14836 declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
14837 declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
14842 The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
14847 The arguments and return value are floating-point numbers of the same type.
14852 Return the same value as a corresponding libm '``fma``' function but without
14853 trapping or setting ``errno``.
14855 When specified with the fast-math-flag 'afn', the result may be approximated
14856 using a less accurate calculation.
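
A minimal sketch of a call, assuming a hypothetical function ``@mul_add``:

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)

      define float @mul_add(float %a, float %b, float %c) {
        ; Computes (%a * %b) + %c as a single fused operation, as the libm
        ; 'fmaf' function would.
        %r = call float @llvm.fma.f32(float %a, float %b, float %c)
        ret float %r
      }
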
14860 '``llvm.fabs.*``' Intrinsic
14861 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14866 This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
14867 floating-point or vector of floating-point type. Not all targets support
14872 declare float @llvm.fabs.f32(float %Val)
14873 declare double @llvm.fabs.f64(double %Val)
14874 declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val)
14875 declare fp128 @llvm.fabs.f128(fp128 %Val)
14876 declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
14881 The '``llvm.fabs.*``' intrinsics return the absolute value of the
14887 The argument and return value are floating-point numbers of the same
14893 This function returns the same values as the libm ``fabs`` functions
14894 would, and handles error conditions in the same way.
14898 '``llvm.minnum.*``' Intrinsic
14899 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14904 This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
14905 floating-point or vector of floating-point type. Not all targets support
14910 declare float @llvm.minnum.f32(float %Val0, float %Val1)
14911 declare double @llvm.minnum.f64(double %Val0, double %Val1)
14912 declare x86_fp80 @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14913 declare fp128 @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
14914 declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14919 The '``llvm.minnum.*``' intrinsics return the minimum of the two
14926 The arguments and return value are floating-point numbers of the same
14932 Follows the IEEE-754 semantics for minNum, except for handling of
signaling NaNs. This matches the behavior of libm's fmin.
14935 If either operand is a NaN, returns the other non-NaN operand. Returns
14936 NaN only if both operands are NaN. If the operands compare equal,
14937 returns either one of the operands. For example, this means that
14938 fmin(+0.0, -0.0) returns either operand.
14940 Unlike the IEEE-754 2008 behavior, this does not distinguish between
14941 signaling and quiet NaN inputs. If a target's implementation follows
14942 the standard and returns a quiet NaN if either input is a signaling
14943 NaN, the intrinsic lowering is responsible for quieting the inputs to
14944 correctly return the non-NaN input (e.g. by using the equivalent of
14945 ``llvm.canonicalize``).
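
To illustrate the NaN handling described above, here is a small sketch (the
function name ``@min_with_nan`` is hypothetical) with one quiet NaN operand:

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      define float @min_with_nan() {
        ; 0x7FF8000000000000 is a quiet NaN. Because only one operand is a
        ; NaN, the other operand (1.0) is returned.
        %r = call float @llvm.minnum.f32(float 1.0, float 0x7FF8000000000000)
        ret float %r
      }
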
14949 '``llvm.maxnum.*``' Intrinsic
14950 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14955 This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
14956 floating-point or vector of floating-point type. Not all targets support
14961 declare float @llvm.maxnum.f32(float %Val0, float %Val1)
14962 declare double @llvm.maxnum.f64(double %Val0, double %Val1)
14963 declare x86_fp80 @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14964 declare fp128 @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
14965 declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14970 The '``llvm.maxnum.*``' intrinsics return the maximum of the two
14977 The arguments and return value are floating-point numbers of the same
14982 Follows the IEEE-754 semantics for maxNum except for the handling of
14983 signaling NaNs. This matches the behavior of libm's fmax.
14985 If either operand is a NaN, returns the other non-NaN operand. Returns
14986 NaN only if both operands are NaN. If the operands compare equal,
14987 returns either one of the operands. For example, this means that
14988 fmax(+0.0, -0.0) returns either -0.0 or 0.0.
14990 Unlike the IEEE-754 2008 behavior, this does not distinguish between
14991 signaling and quiet NaN inputs. If a target's implementation follows
14992 the standard and returns a quiet NaN if either input is a signaling
14993 NaN, the intrinsic lowering is responsible for quieting the inputs to
14994 correctly return the non-NaN input (e.g. by using the equivalent of
14995 ``llvm.canonicalize``).
14997 '``llvm.minimum.*``' Intrinsic
14998 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15003 This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
15004 floating-point or vector of floating-point type. Not all targets support
15009 declare float @llvm.minimum.f32(float %Val0, float %Val1)
15010 declare double @llvm.minimum.f64(double %Val0, double %Val1)
15011 declare x86_fp80 @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
15012 declare fp128 @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
15013 declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
15018 The '``llvm.minimum.*``' intrinsics return the minimum of the two
15019 arguments, propagating NaNs and treating -0.0 as less than +0.0.
15025 The arguments and return value are floating-point numbers of the same
15030 If either operand is a NaN, returns NaN. Otherwise returns the lesser
15031 of the two arguments. -0.0 is considered to be less than +0.0 for this
15032 intrinsic. Note that these are the semantics specified in the draft of
15035 '``llvm.maximum.*``' Intrinsic
15036 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15041 This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
15042 floating-point or vector of floating-point type. Not all targets support
15047 declare float @llvm.maximum.f32(float %Val0, float %Val1)
15048 declare double @llvm.maximum.f64(double %Val0, double %Val1)
15049 declare x86_fp80 @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
15050 declare fp128 @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
15051 declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
15056 The '``llvm.maximum.*``' intrinsics return the maximum of the two
15057 arguments, propagating NaNs and treating -0.0 as less than +0.0.
15063 The arguments and return value are floating-point numbers of the same
15068 If either operand is a NaN, returns NaN. Otherwise returns the greater
15069 of the two arguments. -0.0 is considered to be less than +0.0 for this
15070 intrinsic. Note that these are the semantics specified in the draft of
15075 '``llvm.copysign.*``' Intrinsic
15076 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15081 This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
15082 floating-point or vector of floating-point type. Not all targets support
15087 declare float @llvm.copysign.f32(float %Mag, float %Sgn)
15088 declare double @llvm.copysign.f64(double %Mag, double %Sgn)
15089 declare x86_fp80 @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
15090 declare fp128 @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
15091 declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)
15096 The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
15097 first operand and the sign of the second operand.
15102 The arguments and return value are floating-point numbers of the same
15108 This function returns the same values as the libm ``copysign``
15109 functions would, and handles error conditions in the same way.
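
A minimal sketch with constant operands (the function name ``@neg_three`` is
hypothetical):

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      define double @neg_three() {
        ; Magnitude of 3.0 combined with the sign of -1.0 gives -3.0.
        %r = call double @llvm.copysign.f64(double 3.0, double -1.0)
        ret double %r
      }
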
15113 '``llvm.floor.*``' Intrinsic
15114 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15119 This is an overloaded intrinsic. You can use ``llvm.floor`` on any
15120 floating-point or vector of floating-point type. Not all targets support
15125 declare float @llvm.floor.f32(float %Val)
15126 declare double @llvm.floor.f64(double %Val)
15127 declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val)
15128 declare fp128 @llvm.floor.f128(fp128 %Val)
15129 declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)
15134 The '``llvm.floor.*``' intrinsics return the floor of the operand.
15139 The argument and return value are floating-point numbers of the same
15145 This function returns the same values as the libm ``floor`` functions
15146 would, and handles error conditions in the same way.
15150 '``llvm.ceil.*``' Intrinsic
15151 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15156 This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
15157 floating-point or vector of floating-point type. Not all targets support
15162 declare float @llvm.ceil.f32(float %Val)
15163 declare double @llvm.ceil.f64(double %Val)
15164 declare x86_fp80 @llvm.ceil.f80(x86_fp80 %Val)
15165 declare fp128 @llvm.ceil.f128(fp128 %Val)
15166 declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)
15171 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
15176 The argument and return value are floating-point numbers of the same
15182 This function returns the same values as the libm ``ceil`` functions
15183 would, and handles error conditions in the same way.
15186 .. _int_llvm_trunc:
15188 '``llvm.trunc.*``' Intrinsic
15189 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15194 This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
15195 floating-point or vector of floating-point type. Not all targets support
15200 declare float @llvm.trunc.f32(float %Val)
15201 declare double @llvm.trunc.f64(double %Val)
15202 declare x86_fp80 @llvm.trunc.f80(x86_fp80 %Val)
15203 declare fp128 @llvm.trunc.f128(fp128 %Val)
15204 declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
15210 nearest integer not larger in magnitude than the operand.
15215 The argument and return value are floating-point numbers of the same
15221 This function returns the same values as the libm ``trunc`` functions
15222 would, and handles error conditions in the same way.
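
The following sketch (the function name ``@trunc_example`` is hypothetical)
contrasts ``llvm.trunc`` with ``llvm.floor`` and ``llvm.ceil`` on a negative
input, where rounding toward zero and rounding down differ:

.. code-block:: llvm

      declare double @llvm.floor.f64(double)
      declare double @llvm.ceil.f64(double)
      declare double @llvm.trunc.f64(double)

      define double @trunc_example() {
        %f = call double @llvm.floor.f64(double -2.5)   ; -3.0 (toward -inf)
        %c = call double @llvm.ceil.f64(double -2.5)    ; -2.0 (toward +inf)
        %t = call double @llvm.trunc.f64(double -2.5)   ; -2.0 (toward zero)
        ret double %t
      }
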
15226 '``llvm.rint.*``' Intrinsic
15227 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15232 This is an overloaded intrinsic. You can use ``llvm.rint`` on any
15233 floating-point or vector of floating-point type. Not all targets support
15238 declare float @llvm.rint.f32(float %Val)
15239 declare double @llvm.rint.f64(double %Val)
15240 declare x86_fp80 @llvm.rint.f80(x86_fp80 %Val)
15241 declare fp128 @llvm.rint.f128(fp128 %Val)
15242 declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)
The '``llvm.rint.*``' intrinsics return the operand rounded to the
15248 nearest integer. It may raise an inexact floating-point exception if the
15249 operand isn't an integer.
15254 The argument and return value are floating-point numbers of the same
15260 This function returns the same values as the libm ``rint`` functions
15261 would, and handles error conditions in the same way.
15265 '``llvm.nearbyint.*``' Intrinsic
15266 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15271 This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
15272 floating-point or vector of floating-point type. Not all targets support
15277 declare float @llvm.nearbyint.f32(float %Val)
15278 declare double @llvm.nearbyint.f64(double %Val)
15279 declare x86_fp80 @llvm.nearbyint.f80(x86_fp80 %Val)
15280 declare fp128 @llvm.nearbyint.f128(fp128 %Val)
15281 declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
15292 The argument and return value are floating-point numbers of the same
15298 This function returns the same values as the libm ``nearbyint``
15299 functions would, and handles error conditions in the same way.
15303 '``llvm.round.*``' Intrinsic
15304 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15309 This is an overloaded intrinsic. You can use ``llvm.round`` on any
15310 floating-point or vector of floating-point type. Not all targets support
15315 declare float @llvm.round.f32(float %Val)
15316 declare double @llvm.round.f64(double %Val)
15317 declare x86_fp80 @llvm.round.f80(x86_fp80 %Val)
15318 declare fp128 @llvm.round.f128(fp128 %Val)
15319 declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)
The '``llvm.round.*``' intrinsics return the operand rounded to the
15330 The argument and return value are floating-point numbers of the same
15336 This function returns the same values as the libm ``round``
15337 functions would, and handles error conditions in the same way.
15341 '``llvm.roundeven.*``' Intrinsic
15342 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15347 This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
15348 floating-point or vector of floating-point type. Not all targets support
15353 declare float @llvm.roundeven.f32(float %Val)
15354 declare double @llvm.roundeven.f64(double %Val)
15355 declare x86_fp80 @llvm.roundeven.f80(x86_fp80 %Val)
15356 declare fp128 @llvm.roundeven.f128(fp128 %Val)
15357 declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128 %Val)
The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
15363 integer in floating-point format rounding halfway cases to even (that is, to the
15364 nearest value that is an even integer).
15369 The argument and return value are floating-point numbers of the same type.
15374 This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
15375 also behaves in the same way as C standard function ``roundeven``, except that
it does not raise floating-point exceptions.
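
As an illustration of the tie-breaking rule (the function name ``@ties`` is
hypothetical), halfway cases round to the even neighbor:

.. code-block:: llvm

      declare float @llvm.roundeven.f32(float)

      define float @ties() {
        %a = call float @llvm.roundeven.f32(float 2.5)  ; 2.0 (even neighbor)
        %b = call float @llvm.roundeven.f32(float 3.5)  ; 4.0 (even neighbor)
        %s = fadd float %a, %b                          ; 6.0
        ret float %s
      }
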
15379 '``llvm.lround.*``' Intrinsic
15380 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15385 This is an overloaded intrinsic. You can use ``llvm.lround`` on any
15386 floating-point type. Not all targets support all types however.
15390 declare i32 @llvm.lround.i32.f32(float %Val)
15391 declare i32 @llvm.lround.i32.f64(double %Val)
declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
declare i32 @llvm.lround.i32.f128(fp128 %Val)
declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)
15396 declare i64 @llvm.lround.i64.f32(float %Val)
15397 declare i64 @llvm.lround.i64.f64(double %Val)
declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
declare i64 @llvm.lround.i64.f128(fp128 %Val)
declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
15405 The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
15406 integer with ties away from zero.
15412 The argument is a floating-point number and the return value is an integer
15418 This function returns the same values as the libm ``lround`` functions
15419 would, but without setting errno. If the rounded value is too large to
15420 be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).
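
A short sketch of the ties-away-from-zero behavior (the function name
``@away_from_zero`` is hypothetical):

.. code-block:: llvm

      declare i32 @llvm.lround.i32.f32(float)

      define i32 @away_from_zero() {
        %p = call i32 @llvm.lround.i32.f32(float 2.5)   ; 3
        %n = call i32 @llvm.lround.i32.f32(float -2.5)  ; -3
        %s = add i32 %p, %n                             ; 0
        ret i32 %s
      }
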
15423 '``llvm.llround.*``' Intrinsic
15424 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15429 This is an overloaded intrinsic. You can use ``llvm.llround`` on any
15430 floating-point type. Not all targets support all types however.
declare i64 @llvm.llround.i64.f32(float %Val)
declare i64 @llvm.llround.i64.f64(double %Val)
declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
declare i64 @llvm.llround.i64.f128(fp128 %Val)
declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
15443 The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
15444 integer with ties away from zero.
15449 The argument is a floating-point number and the return value is an integer
15455 This function returns the same values as the libm ``llround``
15456 functions would, but without setting errno. If the rounded value is
15457 too large to be stored in the result type, the return value is a
non-deterministic value (equivalent to ``freeze poison``).
15460 '``llvm.lrint.*``' Intrinsic
15461 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15466 This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
15467 floating-point type. Not all targets support all types however.
15471 declare i32 @llvm.lrint.i32.f32(float %Val)
15472 declare i32 @llvm.lrint.i32.f64(double %Val)
declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
declare i32 @llvm.lrint.i32.f128(fp128 %Val)
declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)
15477 declare i64 @llvm.lrint.i64.f32(float %Val)
15478 declare i64 @llvm.lrint.i64.f64(double %Val)
declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
declare i64 @llvm.lrint.i64.f128(fp128 %Val)
declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
15486 The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
15493 The argument is a floating-point number and the return value is an integer
15499 This function returns the same values as the libm ``lrint`` functions
15500 would, but without setting errno. If the rounded value is too large to
15501 be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).
15504 '``llvm.llrint.*``' Intrinsic
15505 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15510 This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
15511 floating-point type. Not all targets support all types however.
15515 declare i64 @llvm.llrint.i64.f32(float %Val)
15516 declare i64 @llvm.llrint.i64.f64(double %Val)
declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
declare i64 @llvm.llrint.i64.f128(fp128 %Val)
declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
15524 The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
15530 The argument is a floating-point number and the return value is an integer
15536 This function returns the same values as the libm ``llrint`` functions
15537 would, but without setting errno. If the rounded value is too large to
15538 be stored in the result type, the return value is a non-deterministic
value (equivalent to ``freeze poison``).
15541 Bit Manipulation Intrinsics
15542 ---------------------------
15544 LLVM provides intrinsics for a few important bit manipulation
15545 operations. These allow efficient code generation for some algorithms.
15547 .. _int_bitreverse:
15549 '``llvm.bitreverse.*``' Intrinsics
15550 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15555 This is an overloaded intrinsic function. You can use bitreverse on any
15560 declare i16 @llvm.bitreverse.i16(i16 <id>)
15561 declare i32 @llvm.bitreverse.i32(i32 <id>)
15562 declare i64 @llvm.bitreverse.i64(i64 <id>)
15563 declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
15568 The '``llvm.bitreverse``' family of intrinsics is used to reverse the
15569 bitpattern of an integer value or vector of integer values; for example
15570 ``0b10110110`` becomes ``0b01101101``.
15575 The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
15576 ``M`` in the input moved to bit ``N-M-1`` in the output. The vector
15577 intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
15578 basis and the element order is not affected.
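
The 8-bit example from the overview can be sketched as follows (the function
name ``@rev8`` is hypothetical; the input is written as a signed decimal
because it has its top bit set):

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      define i8 @rev8() {
        ; i8 -74 has the bit pattern 0b10110110. Reversing the bits gives
        ; 0b01101101, which is 109.
        %r = call i8 @llvm.bitreverse.i8(i8 -74)
        ret i8 %r
      }
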
15582 '``llvm.bswap.*``' Intrinsics
15583 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15588 This is an overloaded intrinsic function. You can use bswap on any
15589 integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
15593 declare i16 @llvm.bswap.i16(i16 <id>)
15594 declare i32 @llvm.bswap.i32(i32 <id>)
15595 declare i64 @llvm.bswap.i64(i64 <id>)
15596 declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
15601 The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
15602 value or vector of integer values with an even number of bytes (positive
15603 multiple of 16 bits).
15608 The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
15609 and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
15610 intrinsic returns an i32 value that has the four bytes of the input i32
15611 swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
15612 returned i32 will have its bytes in 3, 2, 1, 0 order. The
15613 ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
15614 concept to additional even-byte lengths (6 bytes, 8 bytes and more,
15615 respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
15616 operate on a per-element basis and the element order is not affected.
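The byte ordering described above can be illustrated with constant operands.
This fragment is a sketch, not a complete module: the value names are
illustrative, and the hexadecimal forms in the comments are only there to
make the byte positions visible.

.. code-block:: llvm

      %r0 = call i16 @llvm.bswap.i16(i16 4660)      ; %r0 = i16: 13330 (0x1234 -> 0x3412)
      %r1 = call i32 @llvm.bswap.i32(i32 305419896) ; %r1 = i32: 2018915346 (0x12345678 -> 0x78563412)
      ; Bytes are swapped within each element; the element order is unchanged.
      %r2 = call <2 x i16> @llvm.bswap.v2i16(<2 x i16> <i16 1, i16 4660>) ; %r2 = <2 x i16>: <256, 13330>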
15620 '``llvm.ctpop.*``' Intrinsic
15621 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
15627 bit width, or on any vector with integer elements. Not all targets
15628 support all bit widths or vector types, however.
15632 declare i8 @llvm.ctpop.i8(i8 <src>)
15633 declare i16 @llvm.ctpop.i16(i16 <src>)
15634 declare i32 @llvm.ctpop.i32(i32 <src>)
15635 declare i64 @llvm.ctpop.i64(i64 <src>)
15636 declare i256 @llvm.ctpop.i256(i256 <src>)
15637 declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.
15648 The only argument is the value to be counted. The argument may be of any
15649 integer type, or a vector with integer elements. The return type must
15650 match the argument type.
15655 The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
15656 each element of a vector.
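A few calls with constant operands illustrate the count. This fragment is a
sketch with illustrative value names; the results in the comments follow
directly from counting the set bits of each operand.

.. code-block:: llvm

      %r0 = call i8 @llvm.ctpop.i8(i8 182)    ; %r0 = i8: 5 (0b10110110 has five bits set)
      %r1 = call i32 @llvm.ctpop.i32(i32 0)   ; %r1 = i32: 0
      %r2 = call i32 @llvm.ctpop.i32(i32 -1)  ; %r2 = i32: 32 (all bits set)
      %r3 = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <i32 7, i32 255>) ; %r3 = <2 x i32>: <3, 8>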
15660 '``llvm.ctlz.*``' Intrinsic
15661 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15666 This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
15667 integer bit width, or any vector whose elements are integers. Not all
15668 targets support all bit widths or vector types, however.
15672 declare i8 @llvm.ctlz.i8 (i8 <src>, i1 <is_zero_poison>)
15673 declare <2 x i37> @llvm.ctlz.v2i37(<2 x i37> <src>, i1 <is_zero_poison>)
15678 The '``llvm.ctlz``' family of intrinsic functions counts the number of
15679 leading zeros in a variable.
15684 The first argument is the value to be counted. This argument may be of
15685 any integer type, or a vector with integer element type. The return
15686 type must match the first argument type.
15688 The second argument is a constant flag that indicates whether the intrinsic
15689 returns a valid result if the first argument is zero. If the first
15690 argument is zero and the second argument is true, the result is poison.
15691 Historically some architectures did not provide a defined result for zero
values as efficiently, and many algorithms are now predicated on avoiding
zero-value inputs.
15698 The '``llvm.ctlz``' intrinsic counts the leading (most significant)
15699 zeros in a variable, or within each element of the vector. If
15700 ``src == 0`` then the result is the size in bits of the type of ``src``
15701 if ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
15702 ``llvm.ctlz(i32 2) = 30``.
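The effect of the ``is_zero_poison`` flag is easiest to see with constant
operands. The fragment below is a sketch with illustrative value names; the
results in the comments are derived from the semantics stated above.

.. code-block:: llvm

      %r0 = call i32 @llvm.ctlz.i32(i32 2, i1 false) ; %r0 = i32: 30
      %r1 = call i8 @llvm.ctlz.i8(i8 1, i1 false)    ; %r1 = i8: 7
      %r2 = call i8 @llvm.ctlz.i8(i8 128, i1 false)  ; %r2 = i8: 0 (the top bit is set)
      %r3 = call i32 @llvm.ctlz.i32(i32 0, i1 false) ; %r3 = i32: 32 (the bit width)
      %r4 = call i32 @llvm.ctlz.i32(i32 0, i1 true)  ; %r4 = poison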
15706 '``llvm.cttz.*``' Intrinsic
15707 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15712 This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
15713 integer bit width, or any vector of integer elements. Not all targets
15714 support all bit widths or vector types, however.
15718 declare i42 @llvm.cttz.i42 (i42 <src>, i1 <is_zero_poison>)
15719 declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_poison>)
The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.
15730 The first argument is the value to be counted. This argument may be of
15731 any integer type, or a vector with integer element type. The return
15732 type must match the first argument type.
15734 The second argument is a constant flag that indicates whether the intrinsic
15735 returns a valid result if the first argument is zero. If the first
15736 argument is zero and the second argument is true, the result is poison.
15737 Historically some architectures did not provide a defined result for zero
values as efficiently, and many algorithms are now predicated on avoiding
zero-value inputs.
15744 The '``llvm.cttz``' intrinsic counts the trailing (least significant)
15745 zeros in a variable, or within each element of a vector. If ``src == 0``
15746 then the result is the size in bits of the type of ``src`` if
15747 ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
``llvm.cttz(i32 2) = 1``.
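As with ``llvm.ctlz``, the flag only matters for a zero input. The fragment
below is a sketch with illustrative value names; the results in the comments
are derived from the semantics stated above.

.. code-block:: llvm

      %r0 = call i32 @llvm.cttz.i32(i32 2, i1 false) ; %r0 = i32: 1
      %r1 = call i32 @llvm.cttz.i32(i32 8, i1 false) ; %r1 = i32: 3
      %r2 = call i32 @llvm.cttz.i32(i32 1, i1 true)  ; %r2 = i32: 0 (nonzero input, so the flag has no effect)
      %r3 = call i32 @llvm.cttz.i32(i32 0, i1 false) ; %r3 = i32: 32 (the bit width)
      %r4 = call i32 @llvm.cttz.i32(i32 0, i1 true)  ; %r4 = poison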
15754 '``llvm.fshl.*``' Intrinsic
15755 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15760 This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
15761 integer bit width or any vector of integer elements. Not all targets
15762 support all bit widths or vector types, however.
15766 declare i8 @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
15767 declare i64 @llvm.fshl.i64(i64 %a, i64 %b, i64 %c)
15768 declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15773 The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
15774 the first two values are concatenated as { %a : %b } (%a is the most significant
15775 bits of the wide value), the combined value is shifted left, and the most
15776 significant bits are extracted to produce a result that is the same size as the
15777 original arguments. If the first 2 arguments are identical, this is equivalent
15778 to a rotate left operation. For vector types, the operation occurs for each
15779 element of the vector. The shift argument is treated as an unsigned amount
15780 modulo the element size of the arguments.
15785 The first two arguments are the values to be concatenated. The third
15786 argument is the shift amount. The arguments may be any integer type or a
15787 vector with integer element type. All arguments and the return value must
15788 have the same type.
15793 .. code-block:: text
15795 %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z) ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
15796 %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15) ; %r = i8: 128 (0b10000000)
15797 %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11) ; %r = i8: 120 (0b01111000)
15798 %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8) ; %r = i8: 0 (0b00000000)
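The formula in the first example line can also be written out with ordinary
shifts. The function below is one possible expansion for ``i8``, given only
as a sketch: the name ``@fshl_i8_expanded`` is hypothetical, and this is not
necessarily the sequence any backend emits. The right shift is split in two
so that no single shift amount ever reaches the bit width, which would yield
poison when ``%z % 8 == 0``.

.. code-block:: llvm

      define i8 @fshl_i8_expanded(i8 %x, i8 %y, i8 %z) {
        %amt = and i8 %z, 7       ; z % 8, since 8 is a power of two
        %hi = shl i8 %x, %amt     ; bits of %x that stay in the result
        %y1 = lshr i8 %y, 1
        %inv = xor i8 %amt, 7     ; 7 - amt
        %lo = lshr i8 %y1, %inv   ; %y >> (8 - amt), computed as two shifts
        %r = or i8 %hi, %lo
        ret i8 %r
      }

For the arguments ``(15, 15, 11)`` this gives ``%amt = 3``, ``%hi = 120`` and
``%lo = 0``, so the result is 120, matching the example above.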
15802 '``llvm.fshr.*``' Intrinsic
15803 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15808 This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
15809 integer bit width or any vector of integer elements. Not all targets
15810 support all bit widths or vector types, however.
15814 declare i8 @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
15815 declare i64 @llvm.fshr.i64(i64 %a, i64 %b, i64 %c)
15816 declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15821 The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
15822 the first two values are concatenated as { %a : %b } (%a is the most significant
15823 bits of the wide value), the combined value is shifted right, and the least
15824 significant bits are extracted to produce a result that is the same size as the
15825 original arguments. If the first 2 arguments are identical, this is equivalent
15826 to a rotate right operation. For vector types, the operation occurs for each
15827 element of the vector. The shift argument is treated as an unsigned amount
15828 modulo the element size of the arguments.
15833 The first two arguments are the values to be concatenated. The third
15834 argument is the shift amount. The arguments may be any integer type or a
15835 vector with integer element type. All arguments and the return value must
15836 have the same type.
15841 .. code-block:: text
15843 %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z) ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
15844 %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15) ; %r = i8: 254 (0b11111110)
15845 %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11) ; %r = i8: 225 (0b11100001)
15846 %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8) ; %r = i8: 255 (0b11111111)
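The formula in the first example line can also be written out with ordinary
shifts. The function below is one possible expansion for ``i8``, given only
as a sketch: the name ``@fshr_i8_expanded`` is hypothetical, and this is not
necessarily the sequence any backend emits. The left shift is split in two
so that no single shift amount ever reaches the bit width, which would yield
poison when ``%z % 8 == 0``.

.. code-block:: llvm

      define i8 @fshr_i8_expanded(i8 %x, i8 %y, i8 %z) {
        %amt = and i8 %z, 7       ; z % 8, since 8 is a power of two
        %lo = lshr i8 %y, %amt    ; bits of %y that stay in the result
        %x1 = shl i8 %x, 1
        %inv = xor i8 %amt, 7     ; 7 - amt
        %hi = shl i8 %x1, %inv    ; %x << (8 - amt), computed as two shifts
        %r = or i8 %hi, %lo
        ret i8 %r
      }

For the arguments ``(15, 15, 11)`` this gives ``%amt = 3``, ``%lo = 1`` and
``%hi = 224``, so the result is 225, matching the example above.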
15848 Arithmetic with Overflow Intrinsics
15849 -----------------------------------
15851 LLVM provides intrinsics for fast arithmetic overflow checking.
15853 Each of these intrinsics returns a two-element struct. The first
15854 element of this struct contains the result of the corresponding
15855 arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
15856 the result. Therefore, for example, the first element of the struct
15857 returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
15858 result of a 32-bit ``add`` instruction with the same operands, where
15859 the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
15861 The second element of the result is an ``i1`` that is 1 if the
15862 arithmetic operation overflowed and 0 otherwise. An operation
15863 overflows if, for any values of its operands ``A`` and ``B`` and for
15864 any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
15865 not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
15866 ``sext`` for signed overflow and ``zext`` for unsigned overflow, and
15867 ``op`` is the underlying arithmetic operation.
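As a minimal sketch of this definition, the function below instantiates it
for a signed ``i8`` addition with ``N = 16``. The function name
``@sadd_overflows_i8`` is hypothetical and chosen only for illustration. It
compares the sign-extended narrow result with the result of the same addition
performed at the wider width.

.. code-block:: llvm

      define i1 @sadd_overflows_i8(i8 %a, i8 %b) {
        %narrow = add i8 %a, %b                ; A op B at the original width
        %narrow.ext = sext i8 %narrow to i16   ; ext(A op B) to iN
        %a.ext = sext i8 %a to i16
        %b.ext = sext i8 %b to i16
        %wide = add i16 %a.ext, %b.ext         ; (ext(A) to iN) op (ext(B) to iN)
        %ov = icmp ne i16 %narrow.ext, %wide   ; the two differ exactly on overflow
        ret i1 %ov
      }

Replacing ``sext`` with ``zext`` gives the corresponding check for unsigned
addition.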
The behavior of these intrinsics is well-defined for all argument
values.
15872 '``llvm.sadd.with.overflow.*``' Intrinsics
15873 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15878 This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
15879 on any integer bit width or vectors of integers.
15883 declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
15884 declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15885 declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
15886 declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15891 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
15892 a signed addition of the two arguments, and indicate whether an overflow
15893 occurred during the signed summation.
15898 The arguments (%a and %b) and the first element of the result structure
15899 may be of integer types of any bit width, but they must have the same
15900 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
addition.
15907 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
15908 a signed addition of the two variables. They return a structure --- the
15909 first element of which is the signed summation, and the second element
of which is a bit specifying if the signed summation resulted in an
overflow.
15916 .. code-block:: llvm
15918 %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15919 %sum = extractvalue {i32, i1} %res, 0
15920 %obit = extractvalue {i32, i1} %res, 1
15921 br i1 %obit, label %overflow, label %normal
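With constant operands the two struct elements can be read off directly. The
fragment below is a sketch using ``i8`` so that the boundary cases are small;
the value names are illustrative and the results in the comments are derived
from the semantics stated above.

.. code-block:: llvm

      %r0 = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 100, i8 27)  ; %r0 = {i8 127, i1 false}
      %r1 = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 127, i8 1)   ; %r1 = {i8 -128, i1 true}
      %r2 = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 -128, i8 -1) ; %r2 = {i8 127, i1 true}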
15923 '``llvm.uadd.with.overflow.*``' Intrinsics
15924 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15929 This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
15930 on any integer bit width or vectors of integers.
15934 declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
15935 declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15936 declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
15937 declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15942 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
15943 an unsigned addition of the two arguments, and indicate whether a carry
15944 occurred during the unsigned summation.
15949 The arguments (%a and %b) and the first element of the result structure
15950 may be of integer types of any bit width, but they must have the same
15951 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
addition.
15958 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
15959 an unsigned addition of the two arguments. They return a structure --- the
15960 first element of which is the sum, and the second element of which is a
15961 bit specifying if the unsigned summation resulted in a carry.
15966 .. code-block:: llvm
15968 %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15969 %sum = extractvalue {i32, i1} %res, 0
15970 %obit = extractvalue {i32, i1} %res, 1
15971 br i1 %obit, label %carry, label %normal
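The carry bit can also be computed with plain instructions, since an unsigned
sum wraps exactly when it ends up smaller than either operand. The function
below is a sketch of that equivalence; the name
``@uadd_with_overflow_i32_expanded`` is hypothetical, and this is not
necessarily how any target lowers the intrinsic.

.. code-block:: llvm

      define {i32, i1} @uadd_with_overflow_i32_expanded(i32 %a, i32 %b) {
        %sum = add i32 %a, %b            ; wrapping sum, no nuw/nsw flags
        %carry = icmp ult i32 %sum, %a   ; true exactly when the sum wrapped
        %r0 = insertvalue {i32, i1} poison, i32 %sum, 0
        %r1 = insertvalue {i32, i1} %r0, i1 %carry, 1
        ret {i32, i1} %r1
      }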
15973 '``llvm.ssub.with.overflow.*``' Intrinsics
15974 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15979 This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
15980 on any integer bit width or vectors of integers.
15984 declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
15985 declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
15986 declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
15987 declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15992 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
15993 a signed subtraction of the two arguments, and indicate whether an
15994 overflow occurred during the signed subtraction.
15999 The arguments (%a and %b) and the first element of the result structure
16000 may be of integer types of any bit width, but they must have the same
16001 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
subtraction.
16008 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
16009 a signed subtraction of the two arguments. They return a structure --- the
first element of which is the difference, and the second element of
which is a bit specifying if the signed subtraction resulted in an
overflow.
.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
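With constant operands the two struct elements can be read off directly. The
fragment below is a sketch using ``i8`` so that the boundary cases are small;
the value names are illustrative and the results in the comments are derived
from the semantics stated above.

.. code-block:: llvm

      %r0 = call {i8, i1} @llvm.ssub.with.overflow.i8(i8 5, i8 7)    ; %r0 = {i8 -2, i1 false}
      %r1 = call {i8, i1} @llvm.ssub.with.overflow.i8(i8 -128, i8 1) ; %r1 = {i8 127, i1 true}
      %r2 = call {i8, i1} @llvm.ssub.with.overflow.i8(i8 0, i8 -128) ; %r2 = {i8 -128, i1 true}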
16024 '``llvm.usub.with.overflow.*``' Intrinsics
16025 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16030 This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
16031 on any integer bit width or vectors of integers.
16035 declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
16036 declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
16037 declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
16038 declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
16043 The '``llvm.usub.with.overflow``' family of intrinsic functions perform
16044 an unsigned subtraction of the two arguments, and indicate whether an
16045 overflow occurred during the unsigned subtraction.
16050 The arguments (%a and %b) and the first element of the result structure
16051 may be of integer types of any bit width, but they must have the same
16052 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
subtraction.
16059 The '``llvm.usub.with.overflow``' family of intrinsic functions perform
16060 an unsigned subtraction of the two arguments. They return a structure ---
the first element of which is the difference, and the second element of
which is a bit specifying if the unsigned subtraction resulted in an
overflow.
.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
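The overflow bit can also be computed with plain instructions, since an
unsigned subtraction wraps exactly when the first operand is smaller than the
second. The function below is a sketch of that equivalence; the name
``@usub_with_overflow_i32_expanded`` is hypothetical, and this is not
necessarily how any target lowers the intrinsic.

.. code-block:: llvm

      define {i32, i1} @usub_with_overflow_i32_expanded(i32 %a, i32 %b) {
        %diff = sub i32 %a, %b           ; wrapping difference, no nuw/nsw flags
        %borrow = icmp ult i32 %a, %b    ; true exactly when the difference wrapped
        %r0 = insertvalue {i32, i1} poison, i32 %diff, 0
        %r1 = insertvalue {i32, i1} %r0, i1 %borrow, 1
        ret {i32, i1} %r1
      }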
16075 '``llvm.smul.with.overflow.*``' Intrinsics
16076 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16081 This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
16082 on any integer bit width or vectors of integers.
16086 declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
16087 declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
16088 declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
16089 declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
16094 The '``llvm.smul.with.overflow``' family of intrinsic functions perform
16095 a signed multiplication of the two arguments, and indicate whether an
16096 overflow occurred during the signed multiplication.
16101 The arguments (%a and %b) and the first element of the result structure
16102 may be of integer types of any bit width, but they must have the same
16103 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
multiplication.
16110 The '``llvm.smul.with.overflow``' family of intrinsic functions perform
16111 a signed multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second element
of which is a bit specifying if the signed multiplication resulted in an
overflow.
.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
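With constant operands the two struct elements can be read off directly. The
fragment below is a sketch using ``i8``; the value names are illustrative and
the results in the comments are derived from the semantics stated above. The
last three calls all produce the same low bits, but only the one whose exact
product (-128) is representable in ``i8`` reports no overflow.

.. code-block:: llvm

      %r0 = call {i8, i1} @llvm.smul.with.overflow.i8(i8 16, i8 7)   ; %r0 = {i8 112, i1 false}
      %r1 = call {i8, i1} @llvm.smul.with.overflow.i8(i8 16, i8 8)   ; %r1 = {i8 -128, i1 true}
      %r2 = call {i8, i1} @llvm.smul.with.overflow.i8(i8 -16, i8 8)  ; %r2 = {i8 -128, i1 false}
      %r3 = call {i8, i1} @llvm.smul.with.overflow.i8(i8 -16, i8 -8) ; %r3 = {i8 -128, i1 true}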
16126 '``llvm.umul.with.overflow.*``' Intrinsics
16127 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16132 This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
16133 on any integer bit width or vectors of integers.
16137 declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
16138 declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
16139 declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
16140 declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
overflow occurred during the unsigned multiplication.
16152 The arguments (%a and %b) and the first element of the result structure
16153 may be of integer types of any bit width, but they must have the same
16154 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
multiplication.
16161 The '``llvm.umul.with.overflow``' family of intrinsic functions perform
16162 an unsigned multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second
16164 element of which is a bit specifying if the unsigned multiplication
16165 resulted in an overflow.
.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
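One way to picture the overflow bit is to multiply at twice the width and
check whether the exact product still fits in 32 bits. The function below is
a sketch of that idea; the name ``@umul_with_overflow_i32_expanded`` is
hypothetical, and this is not necessarily how any target lowers the
intrinsic.

.. code-block:: llvm

      define {i32, i1} @umul_with_overflow_i32_expanded(i32 %a, i32 %b) {
        %a.ext = zext i32 %a to i64
        %b.ext = zext i32 %b to i64
        %wide = mul i64 %a.ext, %b.ext           ; exact: the product of two 32-bit values fits in 64 bits
        %lo = trunc i64 %wide to i32             ; product modulo 2^32
        %ov = icmp ugt i64 %wide, 4294967295     ; exact product exceeds the i32 range
        %r0 = insertvalue {i32, i1} poison, i32 %lo, 0
        %r1 = insertvalue {i32, i1} %r0, i1 %ov, 1
        ret {i32, i1} %r1
      }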
16177 Saturation Arithmetic Intrinsics
16178 ---------------------------------
16180 Saturation arithmetic is a version of arithmetic in which operations are
16181 limited to a fixed range between a minimum and maximum value. If the result of
an operation is greater than the maximum value, the result is set (or
"clamped") to this maximum. If it is below the minimum, it is clamped to this
minimum.
16187 '``llvm.sadd.sat.*``' Intrinsics
16188 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16193 This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
16194 on any integer bit width or vectors of integers.
16198 declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
16199 declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
16200 declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
16201 declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16206 The '``llvm.sadd.sat``' family of intrinsic functions perform signed
16207 saturating addition on the 2 arguments.
16212 The arguments (%a and %b) and the result may be of integer types of any bit
16213 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16214 values that will undergo signed addition.
16219 The maximum value this operation can clamp to is the largest signed value
16220 representable by the bit width of the arguments. The minimum value is the
16221 smallest signed value representable by this bit width.
16227 .. code-block:: llvm
16229 %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2) ; %res = 3
16230 %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6) ; %res = 7
16231 %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2) ; %res = -2
16232 %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5) ; %res = -8
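The clamping can be expressed in terms of ``llvm.sadd.with.overflow``. The
function below is a sketch for ``i32``; the name ``@sadd_sat_i32_expanded``
is hypothetical, and this is not necessarily how any target lowers the
intrinsic. It relies on the fact that, when a signed addition overflows, the
wrapped sum has the opposite sign of the exact result.

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      define i32 @sadd_sat_i32_expanded(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %ov = extractvalue {i32, i1} %res, 1
        ; A negative wrapped sum means the exact sum was too large, and vice versa.
        %neg = icmp slt i32 %sum, 0
        %sat = select i1 %neg, i32 2147483647, i32 -2147483648
        %r = select i1 %ov, i32 %sat, i32 %sum
        ret i32 %r
      }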
16235 '``llvm.uadd.sat.*``' Intrinsics
16236 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16241 This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
16242 on any integer bit width or vectors of integers.
16246 declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
16247 declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
16248 declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
16249 declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16254 The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
16255 saturating addition on the 2 arguments.
16260 The arguments (%a and %b) and the result may be of integer types of any bit
16261 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16262 values that will undergo unsigned addition.
16267 The maximum value this operation can clamp to is the largest unsigned value
16268 representable by the bit width of the arguments. Because this is an unsigned
16269 operation, the result will never saturate towards zero.
16275 .. code-block:: llvm
16277 %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2) ; %res = 3
16278 %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6) ; %res = 11
16279 %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8) ; %res = 15
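The clamping can be written with plain instructions. The function below is a
sketch for ``i32``; the name ``@uadd_sat_i32_expanded`` is hypothetical, and
this is not necessarily how any target lowers the intrinsic.

.. code-block:: llvm

      define i32 @uadd_sat_i32_expanded(i32 %a, i32 %b) {
        %sum = add i32 %a, %b                  ; wrapping sum
        %ov = icmp ult i32 %sum, %a            ; true exactly when the sum wrapped
        %r = select i1 %ov, i32 -1, i32 %sum   ; -1 is the all-ones (largest unsigned) value
        ret i32 %r
      }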
16282 '``llvm.ssub.sat.*``' Intrinsics
16283 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16288 This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
16289 on any integer bit width or vectors of integers.
16293 declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
16294 declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
16295 declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
16296 declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16301 The '``llvm.ssub.sat``' family of intrinsic functions perform signed
16302 saturating subtraction on the 2 arguments.
16307 The arguments (%a and %b) and the result may be of integer types of any bit
16308 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16309 values that will undergo signed subtraction.
16314 The maximum value this operation can clamp to is the largest signed value
16315 representable by the bit width of the arguments. The minimum value is the
16316 smallest signed value representable by this bit width.
16322 .. code-block:: llvm
16324 %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1) ; %res = 1
16325 %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6) ; %res = -4
16326 %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5) ; %res = -8
16327 %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5) ; %res = 7
16330 '``llvm.usub.sat.*``' Intrinsics
16331 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16336 This is an overloaded intrinsic. You can use ``llvm.usub.sat``
16337 on any integer bit width or vectors of integers.
16341 declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
16342 declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
16343 declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
16344 declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16349 The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
16350 saturating subtraction on the 2 arguments.
16355 The arguments (%a and %b) and the result may be of integer types of any bit
16356 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16357 values that will undergo unsigned subtraction.
16362 The minimum value this operation can clamp to is 0, which is the smallest
16363 unsigned value representable by the bit width of the unsigned arguments.
16364 Because this is an unsigned operation, the result will never saturate towards
16365 the largest possible value representable by this bit width.
16371 .. code-block:: llvm
16373 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1) ; %res = 1
16374 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6) ; %res = 0
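The clamping can be written with plain instructions. The function below is a
sketch for ``i32``; the name ``@usub_sat_i32_expanded`` is hypothetical, and
this is not necessarily how any target lowers the intrinsic.

.. code-block:: llvm

      define i32 @usub_sat_i32_expanded(i32 %a, i32 %b) {
        %diff = sub i32 %a, %b                   ; wrapping difference
        %fits = icmp ugt i32 %a, %b              ; no borrow is needed
        %r = select i1 %fits, i32 %diff, i32 0   ; clamp to zero otherwise
        ret i32 %r
      }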
16377 '``llvm.sshl.sat.*``' Intrinsics
16378 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16383 This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
16384 on integers or vectors of integers of any bit width.
16388 declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
16389 declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
16390 declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
16391 declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16396 The '``llvm.sshl.sat``' family of intrinsic functions perform signed
16397 saturating left shift on the first argument.
16402 The arguments (``%a`` and ``%b``) and the result may be of integer types of any
16403 bit width, but they must have the same bit width. ``%a`` is the value to be
16404 shifted, and ``%b`` is the amount to shift by. If ``b`` is (statically or
16405 dynamically) equal to or larger than the integer bit width of the arguments,
16406 the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``a`` is shifted by the corresponding shift
amount in ``b``.
16414 The maximum value this operation can clamp to is the largest signed value
16415 representable by the bit width of the arguments. The minimum value is the
16416 smallest signed value representable by this bit width.
16422 .. code-block:: llvm
16424 %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1) ; %res = 4
16425 %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2) ; %res = 7
16426 %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1) ; %res = -8
16427 %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1) ; %res = -2
16430 '``llvm.ushl.sat.*``' Intrinsics
16431 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16436 This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
16437 on integers or vectors of integers of any bit width.
16441 declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
16442 declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
16443 declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
16444 declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16449 The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
16450 saturating left shift on the first argument.
16455 The arguments (``%a`` and ``%b``) and the result may be of integer types of any
16456 bit width, but they must have the same bit width. ``%a`` is the value to be
16457 shifted, and ``%b`` is the amount to shift by. If ``b`` is (statically or
16458 dynamically) equal to or larger than the integer bit width of the arguments,
16459 the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``a`` is shifted by the corresponding shift
amount in ``b``.
16466 The maximum value this operation can clamp to is the largest unsigned value
16467 representable by the bit width of the arguments.
16473 .. code-block:: llvm
16475 %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1) ; %res = 4
16476 %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3) ; %res = 15
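The clamping can be written with plain instructions: shift left, shift back,
and saturate if any set bits were lost. The function below is a sketch for
``i32``; the name ``@ushl_sat_i32_expanded`` is hypothetical, and this is not
necessarily how any target lowers the intrinsic. Like the intrinsic, it
yields poison when ``%b`` is 32 or larger.

.. code-block:: llvm

      define i32 @ushl_sat_i32_expanded(i32 %a, i32 %b) {
        %shl = shl i32 %a, %b
        %back = lshr i32 %shl, %b                 ; undo the shift
        %lost = icmp ne i32 %back, %a             ; set bits were shifted out
        %r = select i1 %lost, i32 -1, i32 %shl    ; clamp to the largest unsigned value
        ret i32 %r
      }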
16479 Fixed Point Arithmetic Intrinsics
16480 ---------------------------------
16482 A fixed point number represents a real data type for a number that has a fixed
16483 number of digits after a radix point (equivalent to the decimal point '.').
The number of digits after the radix point is referred to as the `scale`. These
16485 are useful for representing fractional values to a specific precision. The
16486 following intrinsics perform fixed point arithmetic operations on 2 operands
16487 of the same scale, specified as the third argument.
16489 The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
16490 of fixed point numbers through scaled integers. Therefore, fixed point
16491 multiplication can be represented as
.. code-block:: llvm

      %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)

      ; Expands to
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %mul = mul nsw i8 %a2, %b2
      %scale2 = trunc i32 %scale to i8
      %r = ashr i8 %mul, %scale2 ; this is for a target rounding down towards negative infinity
      %result = trunc i8 %r to i4
16505 The ``llvm.*div.fix`` family of intrinsic functions represents a division of
fixed point numbers through scaled integers. Fixed point division can be
represented as:
.. code-block:: llvm

      %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)

      ; Expands to
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %scale2 = trunc i32 %scale to i8
      %a3 = shl i8 %a2, %scale2
      %r = sdiv i8 %a3, %b2 ; this is for a target rounding towards zero
      %result = trunc i8 %r to i4
For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is not fixed
by the IR, since the preferred rounding may vary between targets; it is
specified through a target hook. Different pipelines should legalize or optimize this
16525 using the rounding specified by this hook if it is provided. Operations like
16526 constant folding, instruction combining, KnownBits, and ValueTracking should
16527 also use this hook, if provided, and not assume the direction of rounding. A
16528 rounded result must always be within one unit of precision from the true
16529 result. That is, the error between the returned result and the true result must
16530 be less than 1/2^(scale).
16533 '``llvm.smul.fix.*``' Intrinsics
16534 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16539 This is an overloaded intrinsic. You can use ``llvm.smul.fix``
16540 on any integer bit width or vectors of integers.
16544 declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
16545 declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
16546 declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
16547 declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16552 The '``llvm.smul.fix``' family of intrinsic functions perform signed
16553 fixed point multiplication on 2 arguments of the same scale.
The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are
the two values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
16568 This operation performs fixed point multiplication on the 2 arguments of a
16569 specified scale. The result will also be returned in the same scale specified
16570 in the third argument.
16572 If the result value cannot be precisely represented in the given scale, the
16573 value is rounded up or down to the closest representable value. The rounding
16574 direction is unspecified.
16576 It is undefined behavior if the result value does not fit within the range of
16577 the fixed point type.
16583 .. code-block:: llvm
16585 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
16586 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
16587 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5)
16589 ; The result in the following could be rounded up to -2 or down to -2.5
16590 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
16593 '``llvm.umul.fix.*``' Intrinsics
16594 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16599 This is an overloaded intrinsic. You can use ``llvm.umul.fix``
16600 on any integer bit width or vectors of integers.
16604 declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
16605 declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
16606 declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
16607 declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16612 The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
16613 fixed point multiplication on 2 arguments of the same scale.
The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element width. ``%a`` and ``%b`` are
the two values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
16628 This operation performs unsigned fixed point multiplication on the 2 arguments of a
16629 specified scale. The result will also be returned in the same scale specified
16630 in the third argument.
16632 If the result value cannot be precisely represented in the given scale, the
16633 value is rounded up or down to the closest representable value. The rounding
16634 direction is unspecified.
16636 It is undefined behavior if the result value does not fit within the range of
16637 the fixed point type.
16643 .. code-block:: llvm
16645 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
16646 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
16648 ; The result in the following could be rounded down to 3.5 or up to 4
16649 %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1) ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
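The vector overload applies the same scalar scale to every lane. The sketch
below is illustrative only: the function name ``@scale_by_half`` is
hypothetical, and it assumes that the lanes of ``%v`` are already encoded at
scale 16.

.. code-block:: llvm

      declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32>, <4 x i32>, i32)

      ; Hypothetical helper: multiply each lane by 0.5 at scale 16.
      define <4 x i32> @scale_by_half(<4 x i32> %v) {
        ; 32768 encodes 0.5 at scale 16 (32768 / 65536)
        %res = call <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %v, <4 x i32> <i32 32768, i32 32768, i32 32768, i32 32768>, i32 16)
        ret <4 x i32> %res
      }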
16652 '``llvm.smul.fix.sat.*``' Intrinsics
16653 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16658 This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
16659 on any integer bit width or vectors of integers.
16663 declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16664 declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16665 declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16666 declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16671 The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
16672 fixed point saturating multiplication on 2 arguments of the same scale.
16677 The arguments (%a and %b) and the result may be of integer types of any bit
16678 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16679 values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant integer.
16686 This operation performs fixed point multiplication on the 2 arguments of a
16687 specified scale. The result will also be returned in the same scale specified
16688 in the third argument.
16690 If the result value cannot be precisely represented in the given scale, the
16691 value is rounded up or down to the closest representable value. The rounding
16692 direction is unspecified.
16694 The maximum value this operation can clamp to is the largest signed value
16695 representable by the bit width of the first 2 arguments. The minimum value is the
16696 smallest signed value representable by this bit width.
16702 .. code-block:: llvm
16704 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
16705 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
16706 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5)
16708 ; The result in the following could be rounded up to -2 or down to -2.5
16709 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
16712 %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0) ; %res = 7
16713 %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2) ; %res = 7
16714 %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2) ; %res = -8
16715 %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1) ; %res = 7
16717 ; Scale can affect the saturation result
16718 %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0) ; %res = 7 (2 x 4 -> clamped to 7)
16719 %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1) ; %res = 4 (1 x 2 = 2)
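The clamping described above can be sketched by hand by computing the product
in a wider type and clamping before truncating. This is a minimal sketch under
stated assumptions, not LLVM's actual lowering: it assumes ``i16`` operands, a
scale of 8, and the availability of the ``llvm.smax`` and ``llvm.smin``
intrinsics. The function name ``@smul_fix_sat_i16_s8`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare i32 @llvm.smin.i32(i32, i32)

      ; Hypothetical helper: saturating signed fixed point multiply, scale 8.
      define i16 @smul_fix_sat_i16_s8(i16 %a, i16 %b) {
        %a.ext = sext i16 %a to i32
        %b.ext = sext i16 %b to i32
        %prod = mul i32 %a.ext, %b.ext                                ; scale 16
        %shifted = ashr i32 %prod, 8                                  ; scale 8
        %lo = call i32 @llvm.smax.i32(i32 %shifted, i32 -32768)       ; clamp to i16 minimum
        %clamped = call i32 @llvm.smin.i32(i32 %lo, i32 32767)        ; clamp to i16 maximum
        %res = trunc i32 %clamped to i16
        ret i16 %res
      }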
16722 '``llvm.umul.fix.sat.*``' Intrinsics
16723 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16728 This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
16729 on any integer bit width or vectors of integers.
16733 declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16734 declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16735 declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16736 declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16741 The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
16742 fixed point saturating multiplication on 2 arguments of the same scale.
16747 The arguments (%a and %b) and the result may be of integer types of any bit
16748 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16749 values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant integer.
16756 This operation performs fixed point multiplication on the 2 arguments of a
16757 specified scale. The result will also be returned in the same scale specified
16758 in the third argument.
16760 If the result value cannot be precisely represented in the given scale, the
16761 value is rounded up or down to the closest representable value. The rounding
16762 direction is unspecified.
16764 The maximum value this operation can clamp to is the largest unsigned value
16765 representable by the bit width of the first 2 arguments. The minimum value is the
16766 smallest unsigned value representable by this bit width (zero).
16772 .. code-block:: llvm
16774 %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
16775 %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
16777 ; The result in the following could be rounded down to 2 or up to 2.5
16778 %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1) ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)
16781 %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0) ; %res = 15 (8 x 2 -> clamped to 15)
16782 %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2) ; %res = 15 (2 x 2 -> clamped to 3.75)
      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)
16789 '``llvm.sdiv.fix.*``' Intrinsics
16790 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16795 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
16796 on any integer bit width or vectors of integers.
16800 declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16801 declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16802 declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16803 declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16808 The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
16809 fixed point division on 2 arguments of the same scale.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. They may also be vectors of
integers, provided the vectors have the same length and element width. ``%a``
and ``%b`` are the two values that will undergo signed fixed point division.
The argument ``%scale`` represents the scale of both operands, and must be a
constant integer.
16824 This operation performs fixed point division on the 2 arguments of a
16825 specified scale. The result will also be returned in the same scale specified
16826 in the third argument.
16828 If the result value cannot be precisely represented in the given scale, the
16829 value is rounded up or down to the closest representable value. The rounding
16830 direction is unspecified.
16832 It is undefined behavior if the result value does not fit within the range of
16833 the fixed point type, or if the second argument is zero.
16839 .. code-block:: llvm
16841 %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
16842 %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
16843 %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16845 ; The result in the following could be rounded up to 1 or down to 0.5
16846 %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
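Fixed point division can be pictured as pre-scaling the dividend and then doing
an ordinary integer division. The following is a rough sketch rather than the
lowering LLVM uses. It assumes ``i16`` operands and a scale of 8, the name
``@sdiv_fix_i16_s8`` is hypothetical, and ``sdiv`` rounds toward zero, which is
one of the permitted roundings.

.. code-block:: llvm

      ; Hypothetical helper: signed fixed point divide, i16 operands, scale 8.
      define i16 @sdiv_fix_i16_s8(i16 %a, i16 %b) {
        %a.ext = sext i16 %a to i32
        %b.ext = sext i16 %b to i32
        %num = shl i32 %a.ext, 8            ; pre-scale the dividend by 2^8
        %quot = sdiv i32 %num, %b.ext       ; undefined if %b is zero, as for the intrinsic
        %res = trunc i32 %quot to i16       ; only meaningful if the value fits in i16
        ret i16 %res
      }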
16849 '``llvm.udiv.fix.*``' Intrinsics
16850 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16855 This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
16856 on any integer bit width or vectors of integers.
16860 declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16861 declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16862 declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16863 declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16868 The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
16869 fixed point division on 2 arguments of the same scale.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. They may also be vectors of
integers, provided the vectors have the same length and element width. ``%a``
and ``%b`` are the two values that will undergo unsigned fixed point division.
The argument ``%scale`` represents the scale of both operands, and must be a
constant integer.
16884 This operation performs fixed point division on the 2 arguments of a
16885 specified scale. The result will also be returned in the same scale specified
16886 in the third argument.
16888 If the result value cannot be precisely represented in the given scale, the
16889 value is rounded up or down to the closest representable value. The rounding
16890 direction is unspecified.
16892 It is undefined behavior if the result value does not fit within the range of
16893 the fixed point type, or if the second argument is zero.
16899 .. code-block:: llvm
16901 %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
16902 %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
16903 %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
16905 ; The result in the following could be rounded up to 1 or down to 0.5
16906 %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16909 '``llvm.sdiv.fix.sat.*``' Intrinsics
16910 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16915 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
16916 on any integer bit width or vectors of integers.
16920 declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16921 declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16922 declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16923 declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16928 The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
16929 fixed point saturating division on 2 arguments of the same scale.
16934 The arguments (%a and %b) and the result may be of integer types of any bit
16935 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16936 values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant integer.
16943 This operation performs fixed point division on the 2 arguments of a
16944 specified scale. The result will also be returned in the same scale specified
16945 in the third argument.
16947 If the result value cannot be precisely represented in the given scale, the
16948 value is rounded up or down to the closest representable value. The rounding
16949 direction is unspecified.
16951 The maximum value this operation can clamp to is the largest signed value
16952 representable by the bit width of the first 2 arguments. The minimum value is the
16953 smallest signed value representable by this bit width.
16955 It is undefined behavior if the second argument is zero.
16961 .. code-block:: llvm
16963 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
16964 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
16965 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16967 ; The result in the following could be rounded up to 1 or down to 0.5
16968 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16971 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0) ; %res = 7 (-8 / -1 = 8 => 7)
16972 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2) ; %res = 7 (1 / 0.5 = 2 => 1.75)
16973 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2) ; %res = -8 (-1 / 0.25 = -4 => -2)
16976 '``llvm.udiv.fix.sat.*``' Intrinsics
16977 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16982 This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
16983 on any integer bit width or vectors of integers.
16987 declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16988 declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16989 declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16990 declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16995 The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
16996 fixed point saturating division on 2 arguments of the same scale.
17001 The arguments (%a and %b) and the result may be of integer types of any bit
17002 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
17003 values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant integer.
17010 This operation performs fixed point division on the 2 arguments of a
17011 specified scale. The result will also be returned in the same scale specified
17012 in the third argument.
17014 If the result value cannot be precisely represented in the given scale, the
17015 value is rounded up or down to the closest representable value. The rounding
17016 direction is unspecified.
17018 The maximum value this operation can clamp to is the largest unsigned value
17019 representable by the bit width of the first 2 arguments. The minimum value is the
17020 smallest unsigned value representable by this bit width (zero).
17022 It is undefined behavior if the second argument is zero.
17027 .. code-block:: llvm
17029 %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
17030 %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
17032 ; The result in the following could be rounded down to 0.5 or up to 1
17033 %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1) ; %res = 1 (or 2) (1.5 / 2 = 0.75)
17036 %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2) ; %res = 15 (2 / 0.5 = 4 => 3.75)
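The saturating unsigned division can be sketched by hand in the same way as the
other fixed point operations: widen, pre-scale the dividend, divide, then
clamp. The sketch below is an assumption-laden illustration, not LLVM's
lowering. It assumes ``i16`` operands, a scale of 8, and the ``llvm.umin``
intrinsic. The name ``@udiv_fix_sat_i16_s8`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.umin.i32(i32, i32)

      ; Hypothetical helper: saturating unsigned fixed point divide, scale 8.
      define i16 @udiv_fix_sat_i16_s8(i16 %a, i16 %b) {
        %a.ext = zext i16 %a to i32
        %b.ext = zext i16 %b to i32
        %num = shl i32 %a.ext, 8                                   ; pre-scale the dividend
        %quot = udiv i32 %num, %b.ext                              ; undefined if %b is zero
        %clamped = call i32 @llvm.umin.i32(i32 %quot, i32 65535)   ; clamp to i16 maximum
        %res = trunc i32 %clamped to i16
        ret i16 %res
      }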
17039 Specialised Arithmetic Intrinsics
17040 ---------------------------------
17042 .. _i_intr_llvm_canonicalize:
17044 '``llvm.canonicalize.*``' Intrinsic
17045 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17052 declare float @llvm.canonicalize.f32(float %a)
17053 declare double @llvm.canonicalize.f64(double %b)
17058 The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
17059 encoding of a floating-point number. This canonicalization is useful for
17060 implementing certain numeric primitives such as frexp. The canonical encoding is
17061 defined by IEEE-754-2008 to be:
17065 2.1.8 canonical encoding: The preferred encoding of a floating-point
17066 representation in a format. Applied to declets, significands of finite
17067 numbers, infinities, and NaNs, especially in decimal formats.
17069 This operation can also be considered equivalent to the IEEE-754-2008
17070 conversion of a floating-point value to the same format. NaNs are handled
17071 according to section 6.2.
17073 Examples of non-canonical encodings:
17075 - x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
17076 converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating-point numbers have non-canonical alternative
  encodings.
17079 - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
17080 These are treated as non-canonical encodings of zero and will be flushed to
17081 a zero of the same sign by this operation.
Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.
17087 This function should always be implementable as multiplication by 1.0, provided
17088 that the compiler does not constant fold the operation. Likewise, division by
17089 1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
17090 -0.0 is also sufficient provided that the rounding mode is not -Infinity.
17092 ``@llvm.canonicalize`` must preserve the equality relation. That is:
17094 - ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent
  to ``(x == y)``
17098 Additionally, the sign of zero must be conserved:
17099 ``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.
17106 The canonicalization operation may be optimized away if:
17108 - The input is known to be canonical. For example, it was produced by a
17109 floating-point operation that is required by the standard to be canonical.
17110 - The result is consumed only by (or fused with) other floating-point
17111 operations. That is, the bits of the floating-point value are not examined.
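As a small illustration of the last point, the sketch below examines the bits
of the canonicalized value, so the call could not be removed on the grounds
that its result is consumed only by floating-point operations. The function
name ``@canonical_bits`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      ; Hypothetical helper: return the bit pattern of the canonical encoding.
      define i32 @canonical_bits(float %x) {
        %c = call float @llvm.canonicalize.f32(float %x)
        %bits = bitcast float %c to i32     ; the encoding itself is observed here
        ret i32 %bits
      }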
17115 '``llvm.fmuladd.*``' Intrinsic
17116 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17123 declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
17124 declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
17129 The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
17130 expressions that can be fused if the code generator determines that (a) the
17131 target instruction set has support for a fused operation, and (b) that the
17132 fused operation is more efficient than the equivalent, separate pair of mul
17133 and add instructions.
17138 The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
17139 multiplicands, a and b, and an addend c.
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
17150 is equivalent to the expression a \* b + c, except that it is unspecified
17151 whether rounding will be performed between the multiplication and addition
17152 steps. Fusion is not guaranteed, even if the target platform supports it.
17153 If a fused multiply-add is required, the corresponding
17154 :ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets ``errno``.
17160 .. code-block:: llvm
17162 %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
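To make the difference concrete, the sketch below places the three spellings
side by side. The function names are hypothetical. The first always rounds
after the multiplication, the second always uses a single rounding, and the
third may behave like either of the other two depending on the code
generator's choice.

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)
      declare float @llvm.fmuladd.f32(float, float, float)

      ; Separate multiply and add: the product is rounded before the addition.
      define float @mul_then_add(float %a, float %b, float %c) {
        %m = fmul float %a, %b
        %r = fadd float %m, %c
        ret float %r
      }

      ; Fused multiply-add: a single rounding of (a * b) + c.
      define float @always_fused(float %a, float %b, float %c) {
        %r = call float @llvm.fma.f32(float %a, float %b, float %c)
        ret float %r
      }

      ; Fusion is permitted but not guaranteed.
      define float @maybe_fused(float %a, float %b, float %c) {
        %r = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
        ret float %r
      }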
17165 Hardware-Loop Intrinsics
17166 ------------------------
LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required either to lower these intrinsics further
to target-specific instructions, or to revert the hardware-loop to a normal loop
if target-specific restrictions are not met and a hardware-loop cannot be
generated.
17173 These intrinsics may be modified in the future and are not intended to be used
17174 outside the backend. Thus, front-end and mid-level optimizations should not be
17175 generating these intrinsics.
17178 '``llvm.set.loop.iterations.*``' Intrinsic
17179 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17184 This is an overloaded intrinsic.
17188 declare void @llvm.set.loop.iterations.i32(i32)
17189 declare void @llvm.set.loop.iterations.i64(i64)
The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
hardware-loop trip count. They are placed in the loop preheader basic block and
are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
instructions.
17202 The integer operand is the loop trip count of the hardware-loop, and thus
17203 not e.g. the loop back-edge taken count.
17208 The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
17209 on their operand. It's a hint to the backend that can use this to set up the
17210 hardware-loop count with a target specific instruction, usually a move of this
17211 value to a special register or a hardware-loop instruction.
17214 '``llvm.start.loop.iterations.*``' Intrinsic
17215 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17220 This is an overloaded intrinsic.
17224 declare i32 @llvm.start.loop.iterations.i32(i32)
17225 declare i64 @llvm.start.loop.iterations.i64(i64)
The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics: they specify the hardware-loop
trip count, but they also produce a value identical to the input that can be
used as the input to the loop. They are placed in the loop preheader basic
block, and the output is expected to be the input to the phi for the induction
variable of the loop, which is decremented by
'``llvm.loop.decrement.reg.*``'.
17241 The integer operand is the loop trip count of the hardware-loop, and thus
17242 not e.g. the loop back-edge taken count.
17247 The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
17248 on their operand. It's a hint to the backend that can use this to set up the
17249 hardware-loop count with a target specific instruction, usually a move of this
17250 value to a special register or a hardware-loop instruction.
17252 '``llvm.test.set.loop.iterations.*``' Intrinsic
17253 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17258 This is an overloaded intrinsic.
17262 declare i1 @llvm.test.set.loop.iterations.i32(i32)
17263 declare i1 @llvm.test.set.loop.iterations.i64(i64)
The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
17270 it to control entry to a while-loop. They are placed in the loop preheader's
17271 predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
17272 optimizers duplicating these instructions.
17277 The integer operand is the loop trip count of the hardware-loop, and thus
17278 not e.g. the loop back-edge taken count.
17283 The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
17284 arithmetic on their operand. It's a hint to the backend that can use this to
17285 set up the hardware-loop count with a target specific instruction, usually a
17286 move of this value to a special register or a hardware-loop instruction.
17287 The result is the conditional value of whether the given count is not zero.
17290 '``llvm.test.start.loop.iterations.*``' Intrinsic
17291 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17296 This is an overloaded intrinsic.
17300 declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
17301 declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
intrinsics: they specify the hardware-loop trip count, and they also produce a
value identical to the input that can be used as the input to the loop. The
second output, of type ``i1``, controls entry to a while-loop.
17315 The integer operand is the loop trip count of the hardware-loop, and thus
17316 not e.g. the loop back-edge taken count.
17321 The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
17322 arithmetic on their operand. It's a hint to the backend that can use this to
17323 set up the hardware-loop count with a target specific instruction, usually a
17324 move of this value to a special register or a hardware-loop instruction.
17325 The result is a pair of the input and a conditional value of whether the
17326 given count is not zero.
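The shape of a guarded while-loop built on this intrinsic might look like the
sketch below. It is hand-written for illustration only, since these intrinsics
are not intended to be generated outside the backend, and whether it can be
lowered depends on the target. The function and block names are hypothetical,
and the loop body simply adds 3 on each iteration.

.. code-block:: llvm

      declare { i32, i1 } @llvm.test.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      ; Hypothetical example: returns 3 * %n, skipping the loop when %n is zero.
      define i32 @triple_or_zero(i32 %n) {
      entry:
        %start = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %n)
        %count = extractvalue { i32, i1 } %start, 0      ; same value as %n
        %nonzero = extractvalue { i32, i1 } %start, 1    ; true if the loop should be entered
        br i1 %nonzero, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        %remaining = phi i32 [ %count, %preheader ], [ %next, %loop ]
        %acc = phi i32 [ 0, %preheader ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, 3
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %remaining, i32 1)
        %continue = icmp ne i32 %next, 0
        br i1 %continue, label %loop, label %loopexit

      loopexit:
        br label %exit

      exit:
        %res = phi i32 [ 0, %entry ], [ %acc.next, %loopexit ]
        ret i32 %res
      }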
17329 '``llvm.loop.decrement.reg.*``' Intrinsic
17330 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17335 This is an overloaded intrinsic.
17339 declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
17340 declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
iteration counter and return an updated value that will be used in the next
loop test check.
17352 Both arguments must have identical integer types. The first operand is the
17353 loop iteration counter. The second operand is the maximum number of elements
17354 processed in an iteration.
The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisation is allowed to treat the intrinsic as a ``SUB``, and it is supported
by SCEV, so it is the backend's responsibility to handle cases where it may be
optimised. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
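The combination with a ``PHI``, ``ICMP`` and ``BR`` described above can be
sketched as follows. This is a hand-written illustration with hypothetical
names. It assumes that ``%n`` is greater than zero, so that the body runs at
least once and the subtraction never wraps.

.. code-block:: llvm

      declare i32 @llvm.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      ; Hypothetical example: returns 3 * %n for %n > 0.
      define i32 @triple(i32 %n) {
      entry:
        %count = call i32 @llvm.start.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %remaining = phi i32 [ %count, %entry ], [ %next, %loop ]
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, 3
        ; one iteration has been processed, so subtract 1 from the counter
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %remaining, i32 1)
        %continue = icmp ne i32 %next, 0
        br i1 %continue, label %loop, label %exit

      exit:
        ret i32 %acc.next
      }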
17369 '``llvm.loop.decrement.*``' Intrinsic
17370 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17375 This is an overloaded intrinsic.
17379 declare i1 @llvm.loop.decrement.i32(i32)
17380 declare i1 @llvm.loop.decrement.i64(i64)
17385 The HardwareLoops pass allows the loop decrement value to be specified with an
17386 option. It defaults to a loop decrement value of 1, but it can be an unsigned
17387 integer value provided by this option. The '``llvm.loop.decrement.*``'
17388 intrinsics decrement the loop iteration counter with this value, and return a
17389 false predicate if the loop should exit, and true otherwise.
17390 This is emitted if the loop counter is not updated via a ``PHI`` node, which
17391 can also be controlled with an option.
The integer argument is the loop decrement value used to decrement the loop
iteration counter.
The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.
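When the counter is not carried through a ``PHI``, the loop might take the
shape sketched below, where the counter is implicit state set up by
'``llvm.set.loop.iterations.*``'. The names are hypothetical, ``%n`` is assumed
to be greater than zero, and the example is for illustration only.

.. code-block:: llvm

      declare void @llvm.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      ; Hypothetical example: returns 3 * %n for %n > 0.
      define i32 @triple_implicit(i32 %n) {
      entry:
        call void @llvm.set.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, 3
        ; decrement the implicit counter by 1; true means keep looping
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret i32 %acc.next
      }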
17408 Vector Reduction Intrinsics
17409 ---------------------------
17411 Horizontal reductions of vectors can be expressed using the following
17412 intrinsics. Each one takes a vector operand as an input and applies its
17413 respective operation across all elements of the vector, returning a single
17414 scalar result of the same element type.
17416 .. _int_vector_reduce_add:
17418 '``llvm.vector.reduce.add.*``' Intrinsic
17419 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17426 declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
17427 declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
17432 The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
17433 reduction of a vector, returning the result as a scalar. The return type matches
17434 the element-type of the vector input.
17438 The argument to this intrinsic must be a vector of integer values.
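A minimal usage sketch, with a hypothetical function name:

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)

      ; Hypothetical helper: sum the four lanes of %v.
      define i32 @sum4(<4 x i32> %v) {
        %sum = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %v)
        ret i32 %sum        ; for %v = <1, 2, 3, 4> this is 10
      }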
17440 .. _int_vector_reduce_fadd:
17442 '``llvm.vector.reduce.fadd.*``' Intrinsic
17443 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17450 declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
17451 declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
17456 The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
17457 ``ADD`` reduction of a vector, returning the result as a scalar. The return type
17458 matches the element-type of the vector input.
17460 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
17461 preserve the associativity of an equivalent scalarized counterpart. Otherwise
17462 the reduction will be *sequential*, thus implying that the operation respects
17463 the associativity of a scalarized reduction. That is, the reduction begins with
17464 the start value and performs an fadd operation with consecutively increasing
17465 vector element indices. See the following pseudocode:
17469 float sequential_fadd(start_value, input_vector)
17470 result = start_value
17471 for i = 0 to length(input_vector)
17472 result = result + input_vector[i]
17478 The first argument to this intrinsic is a scalar start value for the reduction.
17479 The type of the start value matches the element-type of the vector input.
17480 The second argument must be a vector of floating-point values.
17482 To ignore the start value, negative zero (``-0.0``) can be used, as it is
17483 the neutral value of floating point addition.
17490 %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
17491 %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
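For a four-element vector, the sequential form can be pictured as the scalar
chain sketched below, which follows the pseudocode above element by element.
The function name is hypothetical, and the sketch only illustrates the
ordering. It is not a statement about how the intrinsic is lowered.

.. code-block:: llvm

      ; Hypothetical scalar expansion of the sequential (non-reassoc) reduction.
      define float @sequential_fadd_v4f32(float %start_value, <4 x float> %v) {
        %e0 = extractelement <4 x float> %v, i32 0
        %e1 = extractelement <4 x float> %v, i32 1
        %e2 = extractelement <4 x float> %v, i32 2
        %e3 = extractelement <4 x float> %v, i32 3
        %r0 = fadd float %start_value, %e0    ; the start value comes first
        %r1 = fadd float %r0, %e1
        %r2 = fadd float %r1, %e2
        %r3 = fadd float %r2, %e3
        ret float %r3
      }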
17494 .. _int_vector_reduce_mul:
17496 '``llvm.vector.reduce.mul.*``' Intrinsic
17497 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17504 declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
17505 declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
17510 The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
17511 reduction of a vector, returning the result as a scalar. The return type matches
17512 the element-type of the vector input.
17516 The argument to this intrinsic must be a vector of integer values.
17518 .. _int_vector_reduce_fmul:
17520 '``llvm.vector.reduce.fmul.*``' Intrinsic
17521 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17528 declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
17529 declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
17534 The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
17535 ``MUL`` reduction of a vector, returning the result as a scalar. The return type
17536 matches the element-type of the vector input.
17538 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
17539 preserve the associativity of an equivalent scalarized counterpart. Otherwise
17540 the reduction will be *sequential*, thus implying that the operation respects
17541 the associativity of a scalarized reduction. That is, the reduction begins with
17542 the start value and performs an fmul operation with consecutively increasing
17543 vector element indices. See the following pseudocode:
17547 float sequential_fmul(start_value, input_vector)
17548 result = start_value
17549 for i = 0 to length(input_vector)
17550 result = result * input_vector[i]
17556 The first argument to this intrinsic is a scalar start value for the reduction.
17557 The type of the start value matches the element-type of the vector input.
17558 The second argument must be a vector of floating-point values.
17560 To ignore the start value, one (``1.0``) can be used, as it is the neutral
17561 value of floating point multiplication.
17568 %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
17569 %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
17571 .. _int_vector_reduce_and:
17573 '``llvm.vector.reduce.and.*``' Intrinsic
17574 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17581 declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
17586 The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
17587 reduction of a vector, returning the result as a scalar. The return type matches
17588 the element-type of the vector input.
17592 The argument to this intrinsic must be a vector of integer values.
17594 .. _int_vector_reduce_or:
17596 '``llvm.vector.reduce.or.*``' Intrinsic
17597 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17604 declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
17609 The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
17610 of a vector, returning the result as a scalar. The return type matches the
17611 element-type of the vector input.
17615 The argument to this intrinsic must be a vector of integer values.
17617 .. _int_vector_reduce_xor:
17619 '``llvm.vector.reduce.xor.*``' Intrinsic
17620 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17627 declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
17632 The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
17633 reduction of a vector, returning the result as a scalar. The return type matches
17634 the element-type of the vector input.
17638 The argument to this intrinsic must be a vector of integer values.
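
The three bitwise reductions can be sketched as folds over the lanes. The
helper names below are hypothetical, and lanes are modeled as non-negative
Python integers holding the bit pattern of each element.

.. code-block:: python

    from functools import reduce
    import operator

    def reduce_and(lanes):
        """Bitwise AND of all lanes (assumes at least one lane)."""
        return reduce(operator.and_, lanes)

    def reduce_or(lanes):
        """Bitwise OR of all lanes (assumes at least one lane)."""
        return reduce(operator.or_, lanes)

    def reduce_xor(lanes):
        """Bitwise XOR of all lanes (assumes at least one lane)."""
        return reduce(operator.xor, lanes)

These operations are associative and commutative, so the order in which lanes
are combined does not affect the result.
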
17640 .. _int_vector_reduce_smax:
17642 '``llvm.vector.reduce.smax.*``' Intrinsic
17643 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17650 declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
17655 The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
17656 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
17657 matches the element-type of the vector input.
17661 The argument to this intrinsic must be a vector of integer values.
17663 .. _int_vector_reduce_smin:
17665 '``llvm.vector.reduce.smin.*``' Intrinsic
17666 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17673 declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
17678 The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
17679 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
17680 matches the element-type of the vector input.
17684 The argument to this intrinsic must be a vector of integer values.
17686 .. _int_vector_reduce_umax:
17688 '``llvm.vector.reduce.umax.*``' Intrinsic
17689 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17696 declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
17701 The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
17702 integer ``MAX`` reduction of a vector, returning the result as a scalar. The
17703 return type matches the element-type of the vector input.
17707 The argument to this intrinsic must be a vector of integer values.
17709 .. _int_vector_reduce_umin:
17711 '``llvm.vector.reduce.umin.*``' Intrinsic
17712 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17719 declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
17724 The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
17725 integer ``MIN`` reduction of a vector, returning the result as a scalar. The
17726 return type matches the element-type of the vector input.
17730 The argument to this intrinsic must be a vector of integer values.
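
The signed and unsigned variants differ only in how each lane's bit pattern is
interpreted. The following sketch uses hypothetical helper names and models
lanes as unsigned bit patterns of a given width.

.. code-block:: python

    def to_signed(value, bits):
        """Reinterpret an unsigned bit pattern as a two's-complement integer."""
        sign_bit = 1 << (bits - 1)
        return (value ^ sign_bit) - sign_bit

    def reduce_umax(lanes):
        return max(lanes)

    def reduce_umin(lanes):
        return min(lanes)

    def reduce_smax(lanes, bits):
        # Compare by signed value, but return the original bit pattern.
        return max(lanes, key=lambda v: to_signed(v, bits))

    def reduce_smin(lanes, bits):
        return min(lanes, key=lambda v: to_signed(v, bits))

For the ``i8`` lanes ``[0x01, 0xFF, 0x7F, 0x80]``, the unsigned maximum is
``0xFF`` (255) while the signed maximum is ``0x7F`` (127), because ``0xFF`` is
``-1`` when read as a signed value.
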
17732 .. _int_vector_reduce_fmax:
17734 '``llvm.vector.reduce.fmax.*``' Intrinsic
17735 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17742 declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
17743 declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
17748 The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
17749 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
17750 matches the element-type of the vector input.
This intrinsic has the same comparison semantics as the '``llvm.maxnum.*``'
17753 intrinsic. That is, the result will always be a number unless all elements of
17754 the vector are NaN. For a vector with maximum element magnitude 0.0 and
17755 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17757 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17758 assume that NaNs are not present in the input vector.
17762 The argument to this intrinsic must be a vector of floating-point values.
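
The NaN handling described above can be sketched as follows. The helper names
are hypothetical, signaling NaNs are not modeled, and the sign chosen for a mix
of ``+0.0`` and ``-0.0`` is an artifact of this sketch, not a guarantee.

.. code-block:: python

    import math
    from functools import reduce

    def maxnum(a, b):
        """Return the larger operand, preferring a number over a NaN."""
        if math.isnan(a):
            return b
        if math.isnan(b):
            return a
        return max(a, b)

    def reduce_fmax(lanes):
        """Fold maxnum over all lanes (assumes at least one lane)."""
        return reduce(maxnum, lanes)

The result is NaN only when every lane is NaN. A ``fmin`` model is the same
with ``min`` in place of ``max``.
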
17764 .. _int_vector_reduce_fmin:
17766 '``llvm.vector.reduce.fmin.*``' Intrinsic
17767 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17771 This is an overloaded intrinsic.
17775 declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
17776 declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
17781 The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
17782 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
17783 matches the element-type of the vector input.
This intrinsic has the same comparison semantics as the '``llvm.minnum.*``'
17786 intrinsic. That is, the result will always be a number unless all elements of
17787 the vector are NaN. For a vector with minimum element magnitude 0.0 and
17788 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17790 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17791 assume that NaNs are not present in the input vector.
17795 The argument to this intrinsic must be a vector of floating-point values.
17797 '``llvm.vector.insert``' Intrinsic
17798 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17802 This is an overloaded intrinsic.
17806 ; Insert fixed type into scalable type
17807 declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 <idx>)
17808 declare <vscale x 2 x double> @llvm.vector.insert.nxv2f64.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 <idx>)
17810 ; Insert scalable type into scalable type
      declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.nxv2f32(<vscale x 4 x float> %vec, <vscale x 2 x float> %subvec, i64 <idx>)
17813 ; Insert fixed type into fixed type
17814 declare <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %vec, <2 x double> %subvec, i64 <idx>)
17819 The '``llvm.vector.insert.*``' intrinsics insert a vector into another vector
17820 starting from a given index. The return type matches the type of the vector we
17821 insert into. Conceptually, this can be used to build a scalable vector out of
non-scalable vectors; however, this intrinsic can also be used on purely fixed
types.
17825 Scalable vectors can only be inserted into other scalable vectors.
17830 The ``vec`` is the vector which ``subvec`` will be inserted into.
17831 The ``subvec`` is the vector that will be inserted.
17833 ``idx`` represents the starting element number at which ``subvec`` will be
17834 inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
17835 vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
17836 the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
17837 ``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
17838 num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
17839 cannot be determined statically but is false at runtime, then the result vector
17840 is a :ref:`poison value <poisonvalues>`.
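
The index rules can be modeled with Python lists standing in for the runtime
lanes. This is a sketch under stated assumptions: ``vector_insert`` and its
``subvec_vscale`` parameter are hypothetical names, ``subvec_vscale`` is the
runtime scaling factor when ``subvec`` is scalable (``1`` for a fixed-width
``subvec``), and ``None`` stands in for a poison result.

.. code-block:: python

    def vector_insert(vec, subvec, idx, subvec_vscale=1):
        """Model llvm.vector.insert on runtime lane lists."""
        min_len = len(subvec) // subvec_vscale  # known minimum length
        if idx % min_len != 0:
            raise ValueError("idx must be a multiple of the minimum length")
        start = idx * subvec_vscale  # idx is scaled for a scalable subvec
        end = start + len(subvec)
        if end > len(vec):
            return None  # out-of-range lanes: models a poison result
        return vec[:start] + subvec + vec[end:]

For example, with ``vscale = 2`` a ``<vscale x 4 x float>`` has eight runtime
lanes, so inserting a ``<4 x float>`` at ``idx = 4`` overwrites lanes 4 to 7,
while ``idx = 8`` is out of range.
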
17843 '``llvm.vector.extract``' Intrinsic
17844 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17848 This is an overloaded intrinsic.
17852 ; Extract fixed type from scalable type
17853 declare <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
17854 declare <2 x double> @llvm.vector.extract.v2f64.nxv2f64(<vscale x 2 x double> %vec, i64 <idx>)
17856 ; Extract scalable type from scalable type
17857 declare <vscale x 2 x float> @llvm.vector.extract.nxv2f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
17859 ; Extract fixed type from fixed type
17860 declare <2 x double> @llvm.vector.extract.v2f64.v4f64(<4 x double> %vec, i64 <idx>)
17865 The '``llvm.vector.extract.*``' intrinsics extract a vector from within another
17866 vector starting from a given index. The return type must be explicitly
17867 specified. Conceptually, this can be used to decompose a scalable vector into
non-scalable parts; however, this intrinsic can also be used on purely fixed
types.
17871 Scalable vectors can only be extracted from other scalable vectors.
17876 The ``vec`` is the vector from which we will extract a subvector.
17878 The ``idx`` specifies the starting element number within ``vec`` from which a
17879 subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
17880 vector length of the result type. If the result type is a scalable vector,
17881 ``idx`` is first scaled by the result type's runtime scaling factor. Elements
17882 ``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
17883 indices. If this condition cannot be determined statically but is false at
17884 runtime, then the result vector is a :ref:`poison value <poisonvalues>`. The
17885 ``idx`` parameter must be a vector index constant type (for most targets this
17886 will be an integer pointer type).
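
Extraction follows the same index rules as insertion. In this sketch the
function name and the ``result_len`` and ``result_vscale`` parameters are
hypothetical: ``result_len`` is the runtime lane count of the result,
``result_vscale`` is ``1`` for a fixed-width result, and ``None`` stands in for
a poison result.

.. code-block:: python

    def vector_extract(vec, result_len, idx, result_vscale=1):
        """Model llvm.vector.extract on a runtime lane list."""
        min_len = result_len // result_vscale  # known minimum length
        if idx % min_len != 0:
            raise ValueError("idx must be a multiple of the minimum length")
        start = idx * result_vscale  # idx is scaled for a scalable result
        end = start + result_len
        if end > len(vec):
            return None  # out-of-range lanes: models a poison result
        return vec[start:end]
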
17888 '``llvm.experimental.vector.reverse``' Intrinsic
17889 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17893 This is an overloaded intrinsic.
17897 declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
17898 declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
17903 The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
17904 The intrinsic takes a single vector and returns a vector of matching type but
17905 with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
17907 recommended way to express reverse operations for fixed-width vectors is still
17908 to use a shufflevector, as that may allow for more optimization opportunities.
17913 The argument to this intrinsic must be a vector.
17915 '``llvm.experimental.vector.deinterleave2``' Intrinsic
17916 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17920 This is an overloaded intrinsic.
17924 declare {<2 x double>, <2 x double>} @llvm.experimental.vector.deinterleave2.v4f64(<4 x double> %vec1)
17925 declare {<vscale x 4 x i32>, <vscale x 4 x i32>} @llvm.experimental.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> %vec1)
17930 The '``llvm.experimental.vector.deinterleave2``' intrinsic constructs two
17931 vectors by deinterleaving the even and odd lanes of the input vector.
17933 This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
17935 fixed-width vectors is still to use a shufflevector, as that may allow for more
17936 optimization opportunities.
17940 .. code-block:: text
17942 {<2 x i64>, <2 x i64>} llvm.experimental.vector.deinterleave2.v4i64(<4 x i64> <i64 0, i64 1, i64 2, i64 3>); ==> {<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>}
17947 The argument is a vector whose type corresponds to the logical concatenation of
17948 the two result types.
17950 '``llvm.experimental.vector.interleave2``' Intrinsic
17951 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17955 This is an overloaded intrinsic.
17959 declare <4 x double> @llvm.experimental.vector.interleave2.v4f64(<2 x double> %vec1, <2 x double> %vec2)
17960 declare <vscale x 8 x i32> @llvm.experimental.vector.interleave2.nxv8i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2)
17965 The '``llvm.experimental.vector.interleave2``' intrinsic constructs a vector
17966 by interleaving two input vectors.
17968 This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
17970 fixed-width vectors is still to use a shufflevector, as that may allow for more
17971 optimization opportunities.
17975 .. code-block:: text
17977 <4 x i64> llvm.experimental.vector.interleave2.v4i64(<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>); ==> <4 x i64> <i64 0, i64 1, i64 2, i64 3>
17981 Both arguments must be vectors of the same type whereby their logical
17982 concatenation matches the result type.
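
Interleaving and deinterleaving are inverses of each other. This can be
sketched with Python lists as lanes (the function names are chosen here to
match the intrinsics and are not an LLVM API):

.. code-block:: python

    def deinterleave2(vec):
        """Split a vector into its even-indexed and odd-indexed lanes."""
        return vec[0::2], vec[1::2]

    def interleave2(vec1, vec2):
        """Alternate lanes from two equally sized vectors."""
        if len(vec1) != len(vec2):
            raise ValueError("both inputs must have the same length")
        result = []
        for a, b in zip(vec1, vec2):
            result.extend((a, b))
        return result

Applying ``interleave2`` to the two results of ``deinterleave2`` reproduces the
original vector.
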
17984 '``llvm.experimental.vector.splice``' Intrinsic
17985 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17989 This is an overloaded intrinsic.
17993 declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
17994 declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
17999 The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
18000 concatenating elements from the first input vector with elements of the second
18001 input vector, returning a vector of the same type as the input vectors. The
18002 signed immediate, modulo the number of elements in the vector, is the index
18003 into the first vector from which to extract the result value. This means
18004 conceptually that for a positive immediate, a vector is extracted from
18005 ``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
18006 immediate, it extracts ``-imm`` trailing elements from the first vector, and
18007 the remaining elements from ``%vec2``.
18009 These intrinsics work for both fixed and scalable vectors. While this intrinsic
18010 is marked as experimental, the recommended way to express this operation for
18011 fixed-width vectors is still to use a shufflevector, as that may allow for more
18012 optimization opportunities.
18016 .. code-block:: text
18018 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> ; index
18019 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing elements
The first two operands are vectors with the same type. The start index is
``imm`` modulo the runtime number of elements in the source vector. For a
fixed-width vector ``<N x eltty>``, ``imm`` is a signed integer constant in the
range ``-N <= imm < N``. For a scalable vector ``<vscale x N x eltty>``,
``imm`` is a signed integer constant in the range ``-X <= imm < X`` where
``X = vscale_range_min * N``.
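
The two cases of the immediate can be sketched for fixed-width vectors as
follows. ``vector_splice`` is a hypothetical helper name, and lanes are modeled
as Python list elements.

.. code-block:: python

    def vector_splice(vec1, vec2, imm):
        """Model llvm.experimental.vector.splice for fixed-width vectors."""
        n = len(vec1)
        if len(vec2) != n:
            raise ValueError("both inputs must have the same length")
        if not -n <= imm < n:
            raise ValueError("imm must satisfy -N <= imm < N")
        if imm >= 0:
            # Take N lanes of concat(vec1, vec2) starting at index imm.
            return (vec1 + vec2)[imm:imm + n]
        # Take the last -imm lanes of vec1, then fill up from vec2.
        trailing = -imm
        return vec1[n - trailing:] + vec2[:n - trailing]

With ``<A,B,C,D>`` and ``<E,F,G,H>``, both ``imm = 1`` and ``imm = -3`` select
``<B,C,D,E>``, matching the examples above.
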
18031 '``llvm.experimental.stepvector``' Intrinsic
18032 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18034 This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
18035 to generate a vector whose lane values comprise the linear sequence
18036 <0, 1, 2, ...>. It is primarily intended for scalable vectors.
18040 declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
18041 declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
18043 The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
18044 of integers whose elements contain a linear sequence of values starting from 0
18045 with a step of 1. This experimental intrinsic can only be used for vectors
18046 with integer elements that are at least 8 bits in size. If the sequence value
18047 exceeds the allowed limit for the element type then the result for that lane is
18050 These intrinsics work for both fixed and scalable vectors. While this intrinsic
18051 is marked as experimental, the recommended way to express this operation for
18052 fixed-width vectors is still to generate a constant vector instead.
Operations on matrices requiring shape information (like number of rows/columns
or the memory layout) can be expressed using the matrix intrinsics. These
intrinsics require matrix dimensions to be passed as immediate arguments, and
matrices are passed and returned as vectors. This means that for an ``R`` x
``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
corresponding vector, with indices starting at 0. Currently column-major layout
is assumed. The intrinsics support both integer and floating point matrices.
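
The column-major embedding can be sketched as follows. The helper names are
hypothetical; a matrix is modeled as a list of rows and the vector as a flat
list.

.. code-block:: python

    def matrix_index(row, col, rows):
        """Vector index of element `row` of column `col` in an R x C matrix."""
        return col * rows + row

    def flatten_column_major(matrix):
        """Lay out a list-of-rows matrix as a column-major flat vector."""
        rows, cols = len(matrix), len(matrix[0])
        return [matrix[i][j] for j in range(cols) for i in range(rows)]

For the 2 x 3 matrix with rows ``[1, 2, 3]`` and ``[4, 5, 6]``, the vector is
``[1, 4, 2, 5, 3, 6]``, and element 1 of column 2 (the value ``6``) is at index
``2 * 2 + 1 = 5``.
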
18073 '``llvm.matrix.transpose.*``' Intrinsic
18074 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18078 This is an overloaded intrinsic.
18082 declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
18087 The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
18088 <Cols>`` matrix and return the transposed matrix in the result vector.
18093 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
18094 <Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
18095 number of rows and columns, respectively, and must be positive, constant
18096 integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
18097 the same float or integer element type as ``%In``.
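
A minimal sketch of the transpose on column-major vectors, assuming flat Python
lists as vectors (``matrix_transpose`` is a hypothetical helper name):

.. code-block:: python

    def matrix_transpose(vec, rows, cols):
        """Transpose a rows x cols column-major matrix into cols x rows."""
        if len(vec) != rows * cols:
            raise ValueError("vector length must equal rows * cols")
        # Result element (r, c) is input element (c, r). The result has
        # `cols` rows, so it is emitted column by column over c, then r.
        return [vec[r * rows + c] for c in range(rows) for r in range(cols)]

Transposing the 2 x 3 matrix stored as ``[1, 4, 2, 5, 3, 6]`` gives the 3 x 2
matrix stored as ``[1, 2, 3, 4, 5, 6]``.
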
18099 '``llvm.matrix.multiply.*``' Intrinsic
18100 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18104 This is an overloaded intrinsic.
18108 declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
18113 The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
18114 <Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.
18120 The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
18121 <Inner>`` elements, and the second argument ``%B`` to a matrix with
18122 ``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
18123 ``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
18124 returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
18125 Vectors ``%A``, ``%B``, and the returned vector all have the same float or
18126 integer element type.
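
The shape arguments determine how the flat vectors are indexed. The following
sketch assumes column-major flat lists and a hypothetical helper name; it uses
a plain triple loop and does not model any floating-point reassociation a
target might apply.

.. code-block:: python

    def matrix_multiply(a, b, outer_rows, inner, outer_cols):
        """Multiply column-major A (outer_rows x inner) by B (inner x outer_cols)."""
        if len(a) != outer_rows * inner or len(b) != inner * outer_cols:
            raise ValueError("vector lengths do not match the given shapes")
        result = []
        for j in range(outer_cols):        # result column
            for i in range(outer_rows):    # result row
                acc = 0
                for k in range(inner):
                    acc += a[k * outer_rows + i] * b[j * inner + k]
                result.append(acc)
        return result

The result has ``outer_rows * outer_cols`` elements and is again column-major.
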
18129 '``llvm.matrix.column.major.load.*``' Intrinsic
18130 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18134 This is an overloaded intrinsic.
18138 declare vectorty @llvm.matrix.column.major.load.*(
18139 ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
18144 The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
18145 matrix using a stride of ``%Stride`` to compute the start address of the
18146 different columns. The offset is computed using ``%Stride``'s bitwidth. This
allows for convenient loading of submatrices. If ``<IsVolatile>`` is true, the
intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
matrix is returned in the result vector. If the ``%Ptr`` argument is known to
be aligned to some boundary, this can be specified as an attribute on the
argument.
18156 The first argument ``%Ptr`` is a pointer type to the returned vector type, and
18157 corresponds to the start address to load from. The second argument ``%Stride``
18158 is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value. The fourth and fifth arguments,
18162 ``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
18163 respectively, and must be positive, constant integers. The returned vector must
18164 have ``<Rows> * <Cols>`` elements.
18166 The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
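
The role of the stride can be sketched with a flat list standing in for memory.
This is an illustration under stated assumptions: ``column_major_load`` is a
hypothetical name, and ``base`` and ``stride`` are measured in elements.

.. code-block:: python

    def column_major_load(memory, base, stride, rows, cols):
        """Load a rows x cols matrix whose columns start `stride` elements apart."""
        if stride < rows:
            raise ValueError("stride must be at least the number of rows")
        result = []
        for c in range(cols):
            start = base + c * stride  # start address of column c
            result.extend(memory[start:start + rows])
        return result

If ``memory`` holds a 4 x 3 column-major matrix, a stride of 4 with
``rows = 2`` and ``cols = 2`` loads a 2 x 2 submatrix starting at ``base``.
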
18170 '``llvm.matrix.column.major.store.*``' Intrinsic
18171 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18178 declare void @llvm.matrix.column.major.store.*(
18179 vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
18184 The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
18185 <Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
18186 columns. The offset is computed using ``%Stride``'s bitwidth. If
18187 ``<IsVolatile>`` is true, the intrinsic is considered a
18188 :ref:`volatile memory access <volatile>`.
18190 If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
18191 specified as an attribute on the argument.
18196 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
18197 <Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
18198 pointer to the vector type of ``%In``, and is the start address of the matrix
18199 in memory. The third argument ``%Stride`` is a positive, constant integer with
18200 ``%Stride >= <Rows>``. ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
18202 with ``%Ptr + C * %Stride``. The fourth argument ``<IsVolatile>`` is a boolean
18203 value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
18204 and columns, respectively, and must be positive, constant integers.
18206 The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.
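
A strided store can be sketched in the same way, with a flat list standing in
for memory. ``column_major_store`` is a hypothetical name, and ``base`` and
``stride`` are measured in elements.

.. code-block:: python

    def column_major_store(memory, matrix, base, stride, rows, cols):
        """Store a column-major rows x cols matrix with columns `stride` apart."""
        if stride < rows:
            raise ValueError("stride must be at least the number of rows")
        if base + (cols - 1) * stride + rows > len(memory):
            raise ValueError("store would run past the end of memory")
        for c in range(cols):
            start = base + c * stride  # start address of column c
            memory[start:start + rows] = matrix[c * rows:(c + 1) * rows]

Elements between the end of one column and the start of the next are left
untouched.
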
18210 Half Precision Floating-Point Intrinsics
18211 ----------------------------------------
18213 For most target platforms, half precision floating-point is a
18214 storage-only format. This means that it is a dense encoding (in memory)
18215 but does not support computation in the format.
18217 This means that code must first load the half-precision floating-point
18218 value as an i16, then convert it to float with
18219 :ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
18220 then be performed on the float value (including extending to double
18221 etc). To store the value back to memory, it is first converted to float
18222 if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.
18226 .. _int_convert_to_fp16:
18228 '``llvm.convert.to.fp16``' Intrinsic
18229 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18236 declare i16 @llvm.convert.to.fp16.f32(float %a)
18237 declare i16 @llvm.convert.to.fp16.f64(double %a)
18242 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
18243 conventional floating-point type to half precision floating-point format.
The intrinsic function takes a single argument: the value to be
converted.
18254 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
18255 conventional floating-point format to half precision floating-point format. The
18256 return value is an ``i16`` which contains the converted number.
18261 .. code-block:: llvm
18263 %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, ptr @x, align 2
18266 .. _int_convert_from_fp16:
18268 '``llvm.convert.from.fp16``' Intrinsic
18269 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18276 declare float @llvm.convert.from.fp16.f32(i16 %a)
18277 declare double @llvm.convert.from.fp16.f64(i16 %a)
18282 The '``llvm.convert.from.fp16``' intrinsic function performs a
18283 conversion from half precision floating-point format to single precision
18284 floating-point format.
The intrinsic function takes a single argument: the value to be
converted.
18295 The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
18297 precision floating-point format. The input half-float value is
18298 represented by an ``i16`` value.
18303 .. code-block:: llvm
18305 %a = load i16, ptr @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
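
The ``i16`` encoding can be explored with Python's ``struct`` module, whose
``e`` format is IEEE 754 binary16. This sketch only illustrates the bit
patterns; the function names are hypothetical, and rounding or overflow
behavior of a real target is not modeled.

.. code-block:: python

    import struct

    def convert_to_fp16(value):
        """Return the 16-bit half-precision pattern of a Python float."""
        return struct.unpack("<H", struct.pack("<e", value))[0]

    def convert_from_fp16(bits):
        """Return the float value encoded by a 16-bit half-precision pattern."""
        return struct.unpack("<e", struct.pack("<H", bits))[0]

For example, ``1.0`` is encoded as ``0x3C00`` and ``0x7BFF`` decodes to
``65504.0``, the largest finite half-precision value.
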
18308 Saturating floating-point to integer conversions
18309 ------------------------------------------------
18311 The ``fptoui`` and ``fptosi`` instructions return a
18312 :ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
18313 representable by the result type. These intrinsics provide an alternative
18314 conversion, which will saturate towards the smallest and largest representable
18315 integer values instead.
18317 '``llvm.fptoui.sat.*``' Intrinsic
18318 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18323 This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
18324 floating-point argument type and any integer result type, or vectors thereof.
18325 Not all targets may support all types, however.
18329 declare i32 @llvm.fptoui.sat.i32.f32(float %f)
18330 declare i19 @llvm.fptoui.sat.i19.f64(double %f)
18331 declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
18336 This intrinsic converts the argument into an unsigned integer using saturating
18342 The argument may be any floating-point or vector of floating-point type. The
18343 return value may be any integer or vector of integer type. The number of vector
18344 elements in argument and return must be the same.
18349 The conversion to integer is performed subject to the following rules:
18351 - If the argument is any NaN, zero is returned.
- If the argument is smaller than zero (this includes negative infinity),
  zero is returned.
18354 - If the argument is larger than the largest representable unsigned integer of
18355 the result type (this includes positive infinity), the largest representable
18356 unsigned integer is returned.
18357 - Otherwise, the result of rounding the argument towards zero is returned.
18362 .. code-block:: text
18364 %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9) ; yields i8: 123
18365 %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7) ; yields i8: 0
18366 %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0) ; yields i8: 255
18367 %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0
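
The rules above can be sketched as a scalar reference model. ``fptoui_sat`` and
its ``bits`` parameter (the width of the result type) are hypothetical names,
and Python floats are doubles, so this does not model ``float`` rounding of the
input.

.. code-block:: python

    import math

    def fptoui_sat(value, bits):
        """Saturating float-to-unsigned conversion for a `bits`-wide result."""
        if math.isnan(value):
            return 0
        largest = (1 << bits) - 1
        if value <= 0.0:
            return 0            # includes negative infinity
        if value >= largest:
            return largest      # includes positive infinity
        return math.trunc(value)  # round towards zero
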
18369 '``llvm.fptosi.sat.*``' Intrinsic
18370 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18375 This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
18376 floating-point argument type and any integer result type, or vectors thereof.
18377 Not all targets may support all types, however.
18381 declare i32 @llvm.fptosi.sat.i32.f32(float %f)
18382 declare i19 @llvm.fptosi.sat.i19.f64(double %f)
18383 declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
18388 This intrinsic converts the argument into a signed integer using saturating
18394 The argument may be any floating-point or vector of floating-point type. The
18395 return value may be any integer or vector of integer type. The number of vector
18396 elements in argument and return must be the same.
18401 The conversion to integer is performed subject to the following rules:
18403 - If the argument is any NaN, zero is returned.
18404 - If the argument is smaller than the smallest representable signed integer of
18405 the result type (this includes negative infinity), the smallest
18406 representable signed integer is returned.
18407 - If the argument is larger than the largest representable signed integer of
18408 the result type (this includes positive infinity), the largest representable
18409 signed integer is returned.
18410 - Otherwise, the result of rounding the argument towards zero is returned.
18415 .. code-block:: text
18417 %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9) ; yields i8: 23
18418 %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8) ; yields i8: -128
18419 %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0) ; yields i8: 127
18420 %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0
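
The signed rules can be sketched the same way. ``fptosi_sat`` and its ``bits``
parameter (the width of the result type) are hypothetical names, and Python
floats are doubles, so ``float`` rounding of the input is not modeled.

.. code-block:: python

    import math

    def fptosi_sat(value, bits):
        """Saturating float-to-signed conversion for a `bits`-wide result."""
        if math.isnan(value):
            return 0
        smallest = -(1 << (bits - 1))
        largest = (1 << (bits - 1)) - 1
        if value <= smallest:
            return smallest     # includes negative infinity
        if value >= largest:
            return largest      # includes positive infinity
        return math.trunc(value)  # round towards zero
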
18422 .. _dbg_intrinsics:
18424 Debugger Intrinsics
18425 -------------------
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
18432 Exception Handling Intrinsics
18433 -----------------------------
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
18437 Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
18439 Pointer Authentication Intrinsics
18440 ---------------------------------
The LLVM pointer authentication intrinsics (which all start with the
``llvm.ptrauth.`` prefix) are described in the `Pointer Authentication
18444 <PointerAuth.html#intrinsics>`_ document.
18446 .. _int_trampoline:
18448 Trampoline Intrinsics
18449 ---------------------
18451 These intrinsics make it possible to excise one parameter, marked with
18452 the :ref:`nest <nest>` attribute, from a function. The result is a
18453 callable function pointer lacking the nest parameter - the caller does
18454 not need to provide a value for it. Instead, the value to use is stored
18455 in advance in a "trampoline", a block of memory usually allocated on the
18456 stack, which also contains code to splice the nest value into the
18457 argument list. This is used to implement the GCC nested function address
18460 For example, if the function is ``i32 f(ptr nest %c, i32 %x, i32 %y)``
18461 then the resulting function pointer has signature ``i32 (i32, i32)``.
18462 It can be created as follows:
18464 .. code-block:: llvm
18466 %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
18468 %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)
18470 The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(ptr %nval, i32 %x, i32 %y)``.
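
As a conceptual analogy only, the effect of a trampoline resembles partial
application: the ``nest`` value is bound in advance and the caller supplies
only the remaining arguments. The Python sketch below uses hypothetical names
and says nothing about how a target actually lays out trampoline code.

.. code-block:: python

    from functools import partial

    def f(nest, x, y):
        """Stand-in for a function whose first parameter is marked nest."""
        return nest["scale"] * (x + y)

    def make_trampoline(func, nval):
        """Bind the nest value so callers pass only the remaining arguments."""
        return partial(func, nval)

Calling ``make_trampoline(f, nval)(x, y)`` gives the same result as
``f(nval, x, y)``.
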
18475 '``llvm.init.trampoline``' Intrinsic
18476 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18483 declare void @llvm.init.trampoline(ptr <tramp>, ptr <func>, ptr <nval>)
18488 This fills the memory pointed to by ``tramp`` with executable code,
18489 turning it into a trampoline.
18494 The ``llvm.init.trampoline`` intrinsic takes three arguments, all
18495 pointers. The ``tramp`` argument must point to a sufficiently large and
18496 sufficiently aligned block of memory; this memory is written to by the
18497 intrinsic. Note that the size and the alignment are target-specific -
18498 LLVM currently provides no portable way of determining them, so a
18499 front-end that generates this intrinsic needs to have some
18500 target-specific knowledge. The ``func`` argument must hold a function.
18505 The block of memory pointed to by ``tramp`` is filled with target
18506 dependent code, turning it into a function. Then ``tramp`` needs to be
18507 passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
18508 be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
18509 function's signature is the same as that of ``func`` with any arguments
18510 marked with the ``nest`` attribute removed. At most one such ``nest``
18511 argument is allowed, and it must be of pointer type. Calling the new
18512 function is equivalent to calling ``func`` with the same argument list,
18513 but with ``nval`` used for the missing ``nest`` argument. If, after
18514 calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
18515 modified, then the effect of any later call to the returned function
18516 pointer is undefined.
18520 '``llvm.adjust.trampoline``' Intrinsic
18521 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18528 declare ptr @llvm.adjust.trampoline(ptr <tramp>)
18533 This performs any required machine-specific adjustment to the address of
18534 a trampoline (passed as ``tramp``).
18539 ``tramp`` must point to a block of memory which already has trampoline
18540 code filled in by a previous call to
18541 :ref:`llvm.init.trampoline <int_it>`.
18546 On some architectures the address of the code to be executed needs to be
18547 different than the address where the trampoline is actually stored. This
18548 intrinsic returns the executable address corresponding to ``tramp``
18549 after performing the required machine specific adjustments. The pointer
18550 returned can then be :ref:`bitcast and executed <int_trampoline>`.
18555 Vector Predication Intrinsics
18556 -----------------------------
18557 VP intrinsics are intended for predicated SIMD/vector code. A typical VP
18558 operation takes a vector mask and an explicit vector length parameter as in:
18562 <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
The vector mask parameter (``%mask``) always has a vector of ``i1`` type, for
example ``<32 x i1>``. The explicit vector length parameter always has the type
``i32`` and is an unsigned integer value. The explicit vector length parameter
(``%evl``) is in the range:
18571 0 <= %evl <= W, where W is the number of vector elements
18573 Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
18574 length of the vector.
18576 The VP intrinsic has undefined behavior if ``%evl > W``. The explicit vector
18577 length (%evl) creates a mask, %EVLmask, with all elements ``0 <= i < %evl`` set
18578 to True, and all other lanes ``%evl <= i < W`` to False. A new mask %M is
18579 calculated with an element-wise AND from %mask and %EVLmask:
18583 M = %mask AND %EVLmask
18585 A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
18589 A <opcode> B = { A[i] <opcode> B[i] M[i] = True, and
18595 Some targets, such as AVX512, do not support the %evl parameter in hardware.
18596 The use of an effective %evl is discouraged for those targets. The function
18597 ``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
18598 has native support for %evl.
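
On a target without native support for %evl, one option is to pass the full
vector width as %evl and express all predication through %mask. The sketch
below assumes a 16-element vector and a hypothetical function name.

.. code-block:: llvm

      declare <16 x i32> @llvm.vp.add.v16i32(<16 x i32>, <16 x i32>, <16 x i1>, i32)

      define <16 x i32> @add_mask_only(<16 x i32> %a, <16 x i32> %b, <16 x i1> %mask) {
        ; %evl equals W (16), so it disables no lanes; only %mask does.
        %r = call <16 x i32> @llvm.vp.add.v16i32(<16 x i32> %a, <16 x i32> %b, <16 x i1> %mask, i32 16)
        ret <16 x i32> %r
      }
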
18602 '``llvm.vp.select.*``' Intrinsics
18603 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18607 This is an overloaded intrinsic.
18611 declare <16 x i32> @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
18612 declare <vscale x 4 x i64> @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)
18617 The '``llvm.vp.select``' intrinsic is used to choose one value based on a
18618 condition vector, without IR-level branching.
18623 The first operand is a vector of ``i1`` and indicates the condition. The
18624 second operand is the value that is selected where the condition vector is
18625 true. The third operand is the value that is selected where the condition
18626 vector is false. The vectors must be of the same size. The fourth operand is
18627 the explicit vector length.
18629 #. The optional ``fast-math flags`` marker indicates that the select has one or
18630 more :ref:`fast-math flags <fastmath>`. These are optimization hints to
18631 enable otherwise unsafe floating-point optimizations. Fast-math flags are
18632 only valid for selects that return a floating-point scalar or vector type,
18633 or an array (nested to any depth) of floating-point scalar or vector types.
The intrinsic selects lanes from the second and third operand depending on a
condition vector.

All result lanes at positions greater than or equal to ``%evl`` are undefined.
For all lanes below ``%evl`` where the condition vector is true the lane is
taken from the second operand. Otherwise, the lane is taken from the third
operand.
18649 .. code-block:: llvm
18651 %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)
18654 ;; Any result is legal on lanes at and above %evl.
18655 %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false
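
A small worked case with constant operands may make the lane rules concrete.
The function name and the constants are chosen for illustration only.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.select.v4i32(<4 x i1>, <4 x i32>, <4 x i32>, i32)

      define <4 x i32> @select_first_two() {
        %r = call <4 x i32> @llvm.vp.select.v4i32(
                 <4 x i1> <i1 true, i1 false, i1 true, i1 false>,
                 <4 x i32> <i32 10, i32 11, i32 12, i32 13>,
                 <4 x i32> <i32 20, i32 21, i32 22, i32 23>,
                 i32 2)
        ; Lane 0 is 10 (condition true), lane 1 is 21 (condition false).
        ; Lanes 2 and 3 are at or above %evl, so their values are undefined.
        ret <4 x i32> %r
      }
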
18660 '``llvm.vp.merge.*``' Intrinsics
18661 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18665 This is an overloaded intrinsic.
18669 declare <16 x i32> @llvm.vp.merge.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <pivot>)
18670 declare <vscale x 4 x i64> @llvm.vp.merge.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <pivot>)
18675 The '``llvm.vp.merge``' intrinsic is used to choose one value based on a
18676 condition vector and an index operand, without IR-level branching.
18681 The first operand is a vector of ``i1`` and indicates the condition. The
18682 second operand is the value that is merged where the condition vector is true.
The third operand is the value that is selected where the condition vector is
false or the lane position is greater than or equal to the pivot. The fourth
operand is the pivot.
18687 #. The optional ``fast-math flags`` marker indicates that the merge has one or
18688 more :ref:`fast-math flags <fastmath>`. These are optimization hints to
18689 enable otherwise unsafe floating-point optimizations. Fast-math flags are
18690 only valid for merges that return a floating-point scalar or vector type,
18691 or an array (nested to any depth) of floating-point scalar or vector types.
18696 The intrinsic selects lanes from the second and third operand depending on a
18697 condition vector and pivot value.
18699 For all lanes where the condition vector is true and the lane position is less
18700 than ``%pivot`` the lane is taken from the second operand. Otherwise, the lane
18701 is taken from the third operand.
18706 .. code-block:: llvm
18708 %r = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %pivot)
18711 ;; Lanes at and above %pivot are taken from %on_false
      %atfirst = insertelement <4 x i32> poison, i32 %pivot, i32 0
18713 %splat = shufflevector <4 x i32> %atfirst, <4 x i32> poison, <4 x i32> zeroinitializer
18714 %pivotmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32> %splat
18715 %mergemask = and <4 x i1> %cond, <4 x i1> %pivotmask
18716 %also.r = select <4 x i1> %mergemask, <4 x i32> %on_true, <4 x i32> %on_false
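
Unlike ``llvm.vp.select``, every result lane of ``llvm.vp.merge`` is defined.
The following worked case uses constants chosen for illustration; the function
name is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.merge.v4i32(<4 x i1>, <4 x i32>, <4 x i32>, i32)

      define <4 x i32> @merge_below_pivot() {
        %r = call <4 x i32> @llvm.vp.merge.v4i32(
                 <4 x i1> <i1 true, i1 false, i1 true, i1 true>,
                 <4 x i32> <i32 10, i32 11, i32 12, i32 13>,
                 <4 x i32> <i32 20, i32 21, i32 22, i32 23>,
                 i32 3)
        ; Lanes 0 and 2: condition true and below the pivot, so 10 and 12.
        ; Lane 1: condition false, so 21.
        ; Lane 3: condition true but at the pivot, so 23.
        ; %r is <10, 21, 12, 23>.
        ret <4 x i32> %r
      }
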
18722 '``llvm.vp.add.*``' Intrinsics
18723 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18727 This is an overloaded intrinsic.
18731 declare <16 x i32> @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18732 declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18733 declare <256 x i64> @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18738 Predicated integer addition of two vectors of integers.
18744 The first two operands and the result have the same vector of integer type. The
18745 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18752 The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
18753 of the first and second vector operand on each enabled lane. The result on
18754 disabled lanes is a :ref:`poison value <poisonvalues>`.
18759 .. code-block:: llvm
18761 %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18762 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18764 %t = add <4 x i32> %a, %b
18765 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
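
Because disabled lanes are poison, code that needs a defined value in those
lanes has to supply one. The sketch below is one possible pattern, not the
only one: it rebuilds the effective mask and selects a pass-through value on
disabled lanes. The function name and ``%passthru`` are hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.add.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @add_with_passthru(<4 x i32> %a, <4 x i32> %b,
                                          <4 x i32> %passthru,
                                          <4 x i1> %mask, i32 %evl) {
        %sum = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
        ; A lane is enabled when its mask bit is set and its index is below %evl.
        %head = insertelement <4 x i32> poison, i32 %evl, i64 0
        %splat = shufflevector <4 x i32> %head, <4 x i32> poison, <4 x i32> zeroinitializer
        %evlmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %splat
        %enabled = and <4 x i1> %mask, %evlmask
        ; Disabled lanes of %sum are poison; take %passthru there instead.
        %r = select <4 x i1> %enabled, <4 x i32> %sum, <4 x i32> %passthru
        ret <4 x i32> %r
      }
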
18769 '``llvm.vp.sub.*``' Intrinsics
18770 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18774 This is an overloaded intrinsic.
18778 declare <16 x i32> @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18779 declare <vscale x 4 x i32> @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18780 declare <256 x i64> @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18785 Predicated integer subtraction of two vectors of integers.
18791 The first two operands and the result have the same vector of integer type. The
18792 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18799 The '``llvm.vp.sub``' intrinsic performs integer subtraction
18800 (:ref:`sub <i_sub>`) of the first and second vector operand on each enabled
18801 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
18806 .. code-block:: llvm
18808 %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18809 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18811 %t = sub <4 x i32> %a, %b
18812 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
18818 '``llvm.vp.mul.*``' Intrinsics
18819 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18823 This is an overloaded intrinsic.
18827 declare <16 x i32> @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18829 declare <256 x i64> @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18834 Predicated integer multiplication of two vectors of integers.
18840 The first two operands and the result have the same vector of integer type. The
18841 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18847 The '``llvm.vp.mul``' intrinsic performs integer multiplication
18848 (:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
18849 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
18854 .. code-block:: llvm
18856 %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18857 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18859 %t = mul <4 x i32> %a, %b
18860 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
18865 '``llvm.vp.sdiv.*``' Intrinsics
18866 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18870 This is an overloaded intrinsic.
18874 declare <16 x i32> @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18875 declare <vscale x 4 x i32> @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18876 declare <256 x i64> @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18881 Predicated, signed division of two vectors of integers.
18887 The first two operands and the result have the same vector of integer type. The
18888 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18895 The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
18896 of the first and second vector operand on each enabled lane. The result on
18897 disabled lanes is a :ref:`poison value <poisonvalues>`.
18902 .. code-block:: llvm
18904 %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18905 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18907 %t = sdiv <4 x i32> %a, %b
18908 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
18913 '``llvm.vp.udiv.*``' Intrinsics
18914 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18918 This is an overloaded intrinsic.
18922 declare <16 x i32> @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18923 declare <vscale x 4 x i32> @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18924 declare <256 x i64> @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18929 Predicated, unsigned division of two vectors of integers.
18935 The first two operands and the result have the same vector of integer type. The third operand is the vector mask and has the same number of elements as the result vector type. The fourth operand is the explicit vector length of the operation.
18940 The '``llvm.vp.udiv``' intrinsic performs unsigned division
18941 (:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
18942 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
18947 .. code-block:: llvm
18949 %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18950 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18952 %t = udiv <4 x i32> %a, %b
18953 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
18959 '``llvm.vp.srem.*``' Intrinsics
18960 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18964 This is an overloaded intrinsic.
18968 declare <16 x i32> @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18969 declare <vscale x 4 x i32> @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18970 declare <256 x i64> @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
Predicated computation of the signed remainder of two integer vectors.
18981 The first two operands and the result have the same vector of integer type. The
18982 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18989 The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
18990 (:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
18991 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
18996 .. code-block:: llvm
18998 %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18999 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19001 %t = srem <4 x i32> %a, %b
19002 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19008 '``llvm.vp.urem.*``' Intrinsics
19009 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19013 This is an overloaded intrinsic.
19017 declare <16 x i32> @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19018 declare <vscale x 4 x i32> @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19019 declare <256 x i64> @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19024 Predicated computation of the unsigned remainder of two integer vectors.
19030 The first two operands and the result have the same vector of integer type. The
19031 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19038 The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
19039 (:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
19040 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
19045 .. code-block:: llvm
19047 %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19048 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19050 %t = urem <4 x i32> %a, %b
19051 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19056 '``llvm.vp.ashr.*``' Intrinsics
19057 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19061 This is an overloaded intrinsic.
19065 declare <16 x i32> @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19066 declare <vscale x 4 x i32> @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19067 declare <256 x i64> @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19072 Vector-predicated arithmetic right-shift.
19078 The first two operands and the result have the same vector of integer type. The
19079 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19086 The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
19087 (:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
19088 enabled lane. The result on disabled lanes is a
19089 :ref:`poison value <poisonvalues>`.
19094 .. code-block:: llvm
19096 %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19097 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19099 %t = ashr <4 x i32> %a, %b
19100 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19106 '``llvm.vp.lshr.*``' Intrinsics
19107 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19111 This is an overloaded intrinsic.
19115 declare <16 x i32> @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19116 declare <vscale x 4 x i32> @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19117 declare <256 x i64> @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19122 Vector-predicated logical right-shift.
19128 The first two operands and the result have the same vector of integer type. The
19129 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19136 The '``llvm.vp.lshr``' intrinsic computes the logical right shift
19137 (:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
19138 enabled lane. The result on disabled lanes is a
19139 :ref:`poison value <poisonvalues>`.
19144 .. code-block:: llvm
19146 %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19147 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19149 %t = lshr <4 x i32> %a, %b
19150 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19155 '``llvm.vp.shl.*``' Intrinsics
19156 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19160 This is an overloaded intrinsic.
19164 declare <16 x i32> @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19165 declare <vscale x 4 x i32> @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19166 declare <256 x i64> @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19171 Vector-predicated left shift.
19177 The first two operands and the result have the same vector of integer type. The
19178 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19185 The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
19186 the first operand by the second operand on each enabled lane. The result on
19187 disabled lanes is a :ref:`poison value <poisonvalues>`.
19192 .. code-block:: llvm
19194 %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19195 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19197 %t = shl <4 x i32> %a, %b
19198 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19203 '``llvm.vp.or.*``' Intrinsics
19204 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19208 This is an overloaded intrinsic.
19212 declare <16 x i32> @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19213 declare <vscale x 4 x i32> @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19214 declare <256 x i64> @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
Vector-predicated, bitwise or.
19225 The first two operands and the result have the same vector of integer type. The
19226 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19233 The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
19234 first two operands on each enabled lane. The result on disabled lanes is
19235 a :ref:`poison value <poisonvalues>`.
19240 .. code-block:: llvm
19242 %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19243 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19245 %t = or <4 x i32> %a, %b
19246 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19251 '``llvm.vp.and.*``' Intrinsics
19252 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19256 This is an overloaded intrinsic.
19260 declare <16 x i32> @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19261 declare <vscale x 4 x i32> @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19262 declare <256 x i64> @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
Vector-predicated, bitwise and.
19273 The first two operands and the result have the same vector of integer type. The
19274 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
19282 the first two operands on each enabled lane. The result on disabled lanes is
19283 a :ref:`poison value <poisonvalues>`.
19288 .. code-block:: llvm
19290 %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19291 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19293 %t = and <4 x i32> %a, %b
19294 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19299 '``llvm.vp.xor.*``' Intrinsics
19300 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19304 This is an overloaded intrinsic.
19308 declare <16 x i32> @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19309 declare <vscale x 4 x i32> @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19310 declare <256 x i64> @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19315 Vector-predicated, bitwise xor.
19321 The first two operands and the result have the same vector of integer type. The
19322 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19329 The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
19330 the first two operands on each enabled lane.
19331 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
19336 .. code-block:: llvm
19338 %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19339 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19341 %t = xor <4 x i32> %a, %b
19342 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
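
The predicated bitwise intrinsics compose like their unpredicated
counterparts. As an illustrative sketch, the function below computes the
bitwise select ``(a & c) | (b & ~c)`` under a single mask and vector length.
The function name and value names are hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.and.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.or.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.xor.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @bitselect(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl) {
        ; ~c, written as c xor all-ones.
        %notc = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %c, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i1> %mask, i32 %evl)
        %lhs = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
        %rhs = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %b, <4 x i32> %notc, <4 x i1> %mask, i32 %evl)
        ; Disabled lanes stay poison through every step.
        %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %lhs, <4 x i32> %rhs, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }
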
19346 '``llvm.vp.abs.*``' Intrinsics
19347 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19351 This is an overloaded intrinsic.
19355 declare <16 x i32> @llvm.vp.abs.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
19356 declare <vscale x 4 x i32> @llvm.vp.abs.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
19357 declare <256 x i64> @llvm.vp.abs.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
Predicated absolute value of a vector of integers.
19368 The first operand and the result have the same vector of integer type. The
19369 second operand is the vector mask and has the same number of elements as the
19370 result vector type. The third operand is the explicit vector length of the
operation. The fourth operand must be a constant and is a flag to indicate
whether the result value of the '``llvm.vp.abs``' intrinsic is a
:ref:`poison value <poisonvalues>` if the first operand is statically or
dynamically an ``INT_MIN`` value.
The '``llvm.vp.abs``' intrinsic computes the absolute value
(:ref:`abs <int_abs>`) of the first operand on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.
19385 .. code-block:: llvm
19387 %r = call <4 x i32> @llvm.vp.abs.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
19388 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19390 %t = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %a, i1 false)
19391 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19397 '``llvm.vp.smax.*``' Intrinsics
19398 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19402 This is an overloaded intrinsic.
19406 declare <16 x i32> @llvm.vp.smax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19407 declare <vscale x 4 x i32> @llvm.vp.smax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19408 declare <256 x i64> @llvm.vp.smax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19413 Predicated integer signed maximum of two vectors of integers.
19419 The first two operands and the result have the same vector of integer type. The
19420 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19427 The '``llvm.vp.smax``' intrinsic performs integer signed maximum (:ref:`smax <int_smax>`)
19428 of the first and second vector operand on each enabled lane. The result on
19429 disabled lanes is a :ref:`poison value <poisonvalues>`.
19434 .. code-block:: llvm
19436 %r = call <4 x i32> @llvm.vp.smax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19437 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19439 %t = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
19440 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19445 '``llvm.vp.smin.*``' Intrinsics
19446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19450 This is an overloaded intrinsic.
19454 declare <16 x i32> @llvm.vp.smin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19455 declare <vscale x 4 x i32> @llvm.vp.smin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19456 declare <256 x i64> @llvm.vp.smin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19461 Predicated integer signed minimum of two vectors of integers.
19467 The first two operands and the result have the same vector of integer type. The
19468 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19475 The '``llvm.vp.smin``' intrinsic performs integer signed minimum (:ref:`smin <int_smin>`)
19476 of the first and second vector operand on each enabled lane. The result on
19477 disabled lanes is a :ref:`poison value <poisonvalues>`.
19482 .. code-block:: llvm
19484 %r = call <4 x i32> @llvm.vp.smin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19485 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19487 %t = call <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
19488 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
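
The signed minimum and maximum intrinsics can be chained. The sketch below
clamps each enabled lane of ``%x`` to the range ``[%lo, %hi]``, assuming
``%lo <= %hi`` holds in every lane; the function name and operand names are
hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.smax.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.smin.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @clamp(<4 x i32> %x, <4 x i32> %lo, <4 x i32> %hi, <4 x i1> %mask, i32 %evl) {
        ; Raise values below %lo up to %lo.
        %atleast = call <4 x i32> @llvm.vp.smax.v4i32(<4 x i32> %x, <4 x i32> %lo, <4 x i1> %mask, i32 %evl)
        ; Lower values above %hi down to %hi.
        %r = call <4 x i32> @llvm.vp.smin.v4i32(<4 x i32> %atleast, <4 x i32> %hi, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }
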
19493 '``llvm.vp.umax.*``' Intrinsics
19494 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19498 This is an overloaded intrinsic.
19502 declare <16 x i32> @llvm.vp.umax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19503 declare <vscale x 4 x i32> @llvm.vp.umax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19504 declare <256 x i64> @llvm.vp.umax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19509 Predicated integer unsigned maximum of two vectors of integers.
19515 The first two operands and the result have the same vector of integer type. The
19516 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19523 The '``llvm.vp.umax``' intrinsic performs integer unsigned maximum (:ref:`umax <int_umax>`)
19524 of the first and second vector operand on each enabled lane. The result on
19525 disabled lanes is a :ref:`poison value <poisonvalues>`.
19530 .. code-block:: llvm
19532 %r = call <4 x i32> @llvm.vp.umax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19533 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19535 %t = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
19536 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19541 '``llvm.vp.umin.*``' Intrinsics
19542 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19546 This is an overloaded intrinsic.
19550 declare <16 x i32> @llvm.vp.umin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19551 declare <vscale x 4 x i32> @llvm.vp.umin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19552 declare <256 x i64> @llvm.vp.umin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19557 Predicated integer unsigned minimum of two vectors of integers.
19563 The first two operands and the result have the same vector of integer type. The
19564 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19571 The '``llvm.vp.umin``' intrinsic performs integer unsigned minimum (:ref:`umin <int_umin>`)
19572 of the first and second vector operand on each enabled lane. The result on
19573 disabled lanes is a :ref:`poison value <poisonvalues>`.
19578 .. code-block:: llvm
19580 %r = call <4 x i32> @llvm.vp.umin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19581 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19583 %t = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
19584 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19587 .. _int_vp_copysign:
19589 '``llvm.vp.copysign.*``' Intrinsics
19590 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19594 This is an overloaded intrinsic.
19598 declare <16 x float> @llvm.vp.copysign.v16f32 (<16 x float> <mag_op>, <16 x float> <sign_op>, <16 x i1> <mask>, i32 <vector_length>)
19599 declare <vscale x 4 x float> @llvm.vp.copysign.nxv4f32 (<vscale x 4 x float> <mag_op>, <vscale x 4 x float> <sign_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19600 declare <256 x double> @llvm.vp.copysign.v256f64 (<256 x double> <mag_op>, <256 x double> <sign_op>, <256 x i1> <mask>, i32 <vector_length>)
19605 Predicated floating-point copysign of two vectors of floating-point values.
19611 The first two operands and the result have the same vector of floating-point type. The
19612 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19619 The '``llvm.vp.copysign``' intrinsic performs floating-point copysign (:ref:`copysign <int_copysign>`)
19620 of the first and second vector operand on each enabled lane. The result on
19621 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
19622 performed in the default floating-point environment.
19627 .. code-block:: llvm
19629 %r = call <4 x float> @llvm.vp.copysign.v4f32(<4 x float> %mag, <4 x float> %sign, <4 x i1> %mask, i32 %evl)
19630 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19632 %t = call <4 x float> @llvm.copysign.v4f32(<4 x float> %mag, <4 x float> %sign)
19633 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
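
A worked case with constant operands, all lanes enabled, shows that only the
sign bit of the second operand is used. The function name and the constants
are chosen for illustration.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.copysign.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      define <4 x float> @copysign_example() {
        %r = call <4 x float> @llvm.vp.copysign.v4f32(
                 <4 x float> <float 1.0, float -2.0, float 3.0, float -4.0>,
                 <4 x float> <float -1.0, float 1.0, float -0.0, float 0.0>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4)
        ; Each lane keeps the magnitude of the first operand and takes the
        ; sign of the second: %r is <-1.0, 2.0, -3.0, 4.0>.
        ret <4 x float> %r
      }
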
19638 '``llvm.vp.minnum.*``' Intrinsics
19639 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19643 This is an overloaded intrinsic.
19647 declare <16 x float> @llvm.vp.minnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19648 declare <vscale x 4 x float> @llvm.vp.minnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19649 declare <256 x double> @llvm.vp.minnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19654 Predicated floating-point IEEE-754 minNum of two vectors of floating-point values.
19660 The first two operands and the result have the same vector of floating-point type. The
19661 third operand is the vector mask and has the same number of elements as the
19662 result vector type. The fourth operand is the explicit vector length of the operation.
19668 The '``llvm.vp.minnum``' intrinsic performs floating-point minimum (:ref:`minnum <i_minnum>`)
19669 of the first and second vector operand on each enabled lane. The result on
19670 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
19671 performed in the default floating-point environment.
19676 .. code-block:: llvm
19678 %r = call <4 x float> @llvm.vp.minnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
19679 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19681 %t = call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %b)
19682 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
19687 '``llvm.vp.maxnum.*``' Intrinsics
19688 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19692 This is an overloaded intrinsic.
19696 declare <16 x float> @llvm.vp.maxnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19697 declare <vscale x 4 x float> @llvm.vp.maxnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19698 declare <256 x double> @llvm.vp.maxnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19703 Predicated floating-point IEEE-754 maxNum of two vectors of floating-point values.
19709 The first two operands and the result have the same vector of floating-point type. The
19710 third operand is the vector mask and has the same number of elements as the
19711 result vector type. The fourth operand is the explicit vector length of the operation.
19717 The '``llvm.vp.maxnum``' intrinsic performs floating-point maximum (:ref:`maxnum <i_maxnum>`)
19718 of the first and second vector operand on each enabled lane. The result on
19719 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
19720 performed in the default floating-point environment.
19725 .. code-block:: llvm
19727 %r = call <4 x float> @llvm.vp.maxnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
19728 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19730 %t = call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %b)
19731 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
19736 '``llvm.vp.fadd.*``' Intrinsics
19737 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19741 This is an overloaded intrinsic.
19745 declare <16 x float> @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19746 declare <vscale x 4 x float> @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19747 declare <256 x double> @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19752 Predicated floating-point addition of two vectors of floating-point values.
19758 The first two operands and the result have the same vector of floating-point type. The
19759 third operand is the vector mask and has the same number of elements as the
19760 result vector type. The fourth operand is the explicit vector length of the operation.
19766 The '``llvm.vp.fadd``' intrinsic performs floating-point addition (:ref:`fadd <i_fadd>`)
19767 of the first and second vector operand on each enabled lane. The result on
19768 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
19769 performed in the default floating-point environment.
19774 .. code-block:: llvm
19776 %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
19777 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19779 %t = fadd <4 x float> %a, %b
19780 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
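The comment "for all lanes below %evl" can be made concrete by folding the
explicit vector length into the mask. The sketch below is one possible manual
expansion, not necessarily the one LLVM performs; the function name
``@fadd_expanded`` is hypothetical and ``%evl`` is assumed not to exceed 4.

.. code-block:: llvm

      define <4 x float> @fadd_expanded(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl) {
        ; Broadcast %evl to all lanes.
        %evl.head = insertelement <4 x i32> poison, i32 %evl, i64 0
        %evl.splat = shufflevector <4 x i32> %evl.head, <4 x i32> poison, <4 x i32> zeroinitializer
        ; Lane i is below %evl when i < %evl (unsigned comparison).
        %below.evl = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        ; A lane is enabled only if it is below %evl and its mask bit is set.
        %enabled = and <4 x i1> %mask, %below.evl
        %t = fadd <4 x float> %a, %b
        %r = select <4 x i1> %enabled, <4 x float> %t, <4 x float> poison
        ret <4 x float> %r
      }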
19785 '``llvm.vp.fsub.*``' Intrinsics
19786 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19790 This is an overloaded intrinsic.
19794 declare <16 x float> @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19795 declare <vscale x 4 x float> @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19796 declare <256 x double> @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19801 Predicated floating-point subtraction of two vectors of floating-point values.
19807 The first two operands and the result have the same vector of floating-point type. The
19808 third operand is the vector mask and has the same number of elements as the
19809 result vector type. The fourth operand is the explicit vector length of the operation.
19815 The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
19816 of the first and second vector operand on each enabled lane. The result on
19817 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
19818 performed in the default floating-point environment.
19823 .. code-block:: llvm
19825 %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
19826 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19828 %t = fsub <4 x float> %a, %b
19829 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
19834 '``llvm.vp.fmul.*``' Intrinsics
19835 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19839 This is an overloaded intrinsic.
19843 declare <16 x float> @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19844 declare <vscale x 4 x float> @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19845 declare <256 x double> @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19850 Predicated floating-point multiplication of two vectors of floating-point values.
19856 The first two operands and the result have the same vector of floating-point type. The
19857 third operand is the vector mask and has the same number of elements as the
19858 result vector type. The fourth operand is the explicit vector length of the operation.
19864 The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
19865 of the first and second vector operand on each enabled lane. The result on
19866 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
19867 performed in the default floating-point environment.
19872 .. code-block:: llvm
19874 %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
19875 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19877 %t = fmul <4 x float> %a, %b
19878 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
19883 '``llvm.vp.fdiv.*``' Intrinsics
19884 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19888 This is an overloaded intrinsic.
19892 declare <16 x float> @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19893 declare <vscale x 4 x float> @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19894 declare <256 x double> @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19899 Predicated floating-point division of two vectors of floating-point values.
19905 The first two operands and the result have the same vector of floating-point type. The
19906 third operand is the vector mask and has the same number of elements as the
19907 result vector type. The fourth operand is the explicit vector length of the operation.
19913 The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`)
19914 of the first and second vector operand on each enabled lane. The result on
19915 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
19916 performed in the default floating-point environment.
19921 .. code-block:: llvm
19923 %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
19924 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19926 %t = fdiv <4 x float> %a, %b
19927 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
19932 '``llvm.vp.frem.*``' Intrinsics
19933 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19937 This is an overloaded intrinsic.
19941 declare <16 x float> @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19942 declare <vscale x 4 x float> @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19943 declare <256 x double> @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19948 Predicated floating-point remainder of two vectors of floating-point values.
19954 The first two operands and the result have the same vector of floating-point type. The
19955 third operand is the vector mask and has the same number of elements as the
19956 result vector type. The fourth operand is the explicit vector length of the operation.
19962 The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`)
19963 of the first and second vector operand on each enabled lane. The result on
19964 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
19965 performed in the default floating-point environment.
19970 .. code-block:: llvm
19972 %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
19973 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19975 %t = frem <4 x float> %a, %b
19976 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
19981 '``llvm.vp.fneg.*``' Intrinsics
19982 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19986 This is an overloaded intrinsic.
19990 declare <16 x float> @llvm.vp.fneg.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
19991 declare <vscale x 4 x float> @llvm.vp.fneg.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19992 declare <256 x double> @llvm.vp.fneg.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
19997 Predicated floating-point negation of a vector of floating-point values.
20003 The first operand and the result have the same vector of floating-point type.
20004 The second operand is the vector mask and has the same number of elements as the
20005 result vector type. The third operand is the explicit vector length of the operation.
20011 The '``llvm.vp.fneg``' intrinsic performs floating-point negation (:ref:`fneg <i_fneg>`)
20012 of the first vector operand on each enabled lane. The result on disabled lanes
20013 is a :ref:`poison value <poisonvalues>`.
20018 .. code-block:: llvm
20020 %r = call <4 x float> @llvm.vp.fneg.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20021 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20023 %t = fneg <4 x float> %a
20024 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20029 '``llvm.vp.fabs.*``' Intrinsics
20030 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20034 This is an overloaded intrinsic.
20038 declare <16 x float> @llvm.vp.fabs.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
20039 declare <vscale x 4 x float> @llvm.vp.fabs.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20040 declare <256 x double> @llvm.vp.fabs.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
20045 Predicated floating-point absolute value of a vector of floating-point values.
20051 The first operand and the result have the same vector of floating-point type.
20052 The second operand is the vector mask and has the same number of elements as the
20053 result vector type. The third operand is the explicit vector length of the operation.
20059 The '``llvm.vp.fabs``' intrinsic performs floating-point absolute value
20060 (:ref:`fabs <int_fabs>`) of the first vector operand on each enabled lane. The
20061 result on disabled lanes is a :ref:`poison value <poisonvalues>`.
20066 .. code-block:: llvm
20068 %r = call <4 x float> @llvm.vp.fabs.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20069 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20071 %t = call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
20072 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20077 '``llvm.vp.sqrt.*``' Intrinsics
20078 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20082 This is an overloaded intrinsic.
20086 declare <16 x float> @llvm.vp.sqrt.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
20087 declare <vscale x 4 x float> @llvm.vp.sqrt.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20088 declare <256 x double> @llvm.vp.sqrt.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
20093 Predicated floating-point square root of a vector of floating-point values.
20099 The first operand and the result have the same vector of floating-point type.
20100 The second operand is the vector mask and has the same number of elements as the
20101 result vector type. The third operand is the explicit vector length of the operation.
20107 The '``llvm.vp.sqrt``' intrinsic performs floating-point square root (:ref:`sqrt <int_sqrt>`) of
20108 the first vector operand on each enabled lane. The result on disabled lanes is
20109 a :ref:`poison value <poisonvalues>`. The operation is performed in the default
20110 floating-point environment.
20115 .. code-block:: llvm
20117 %r = call <4 x float> @llvm.vp.sqrt.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20118 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20120 %t = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %a)
20121 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
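Because disabled lanes yield poison, a caller that goes on to use every lane
usually needs to merge in a fallback value. The following sketch (the function
name ``@sqrt_nonneg_or_zero`` and the choice of ``0.0`` as fallback are
assumptions for illustration) takes the square root only of non-negative lanes:

.. code-block:: llvm

      declare <4 x float> @llvm.vp.sqrt.v4f32(<4 x float>, <4 x i1>, i32)

      define <4 x float> @sqrt_nonneg_or_zero(<4 x float> %a) {
        ; Enable only the lanes that hold a non-negative value.
        %mask = fcmp oge <4 x float> %a, zeroinitializer
        ; An explicit vector length of 4 leaves the decision entirely to %mask.
        %r = call <4 x float> @llvm.vp.sqrt.v4f32(<4 x float> %a, <4 x i1> %mask, i32 4)
        ; Disabled lanes of %r are poison, so replace them with 0.0 before use.
        %safe = select <4 x i1> %mask, <4 x float> %r, <4 x float> zeroinitializer
        ret <4 x float> %safe
      }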
20126 '``llvm.vp.fma.*``' Intrinsics
20127 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20131 This is an overloaded intrinsic.
20135 declare <16 x float> @llvm.vp.fma.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20136 declare <vscale x 4 x float> @llvm.vp.fma.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20137 declare <256 x double> @llvm.vp.fma.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20142 Predicated floating-point fused multiply-add of three vectors of floating-point values.
20148 The first three operands and the result have the same vector of floating-point type. The
20149 fourth operand is the vector mask and has the same number of elements as the
20150 result vector type. The fifth operand is the explicit vector length of the operation.
20156 The '``llvm.vp.fma``' intrinsic performs floating-point fused multiply-add (:ref:`llvm.fma <int_fma>`)
20157 of the first, second, and third vector operand on each enabled lane. The result on
20158 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20159 performed in the default floating-point environment.
20164 .. code-block:: llvm
20166 %r = call <4 x float> @llvm.vp.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
20167 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20169 %t = call <4 x float> @llvm.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
20170 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20173 .. _int_vp_fmuladd:
20175 '``llvm.vp.fmuladd.*``' Intrinsics
20176 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20180 This is an overloaded intrinsic.
20184 declare <16 x float> @llvm.vp.fmuladd.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20185 declare <vscale x 4 x float> @llvm.vp.fmuladd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20186 declare <256 x double> @llvm.vp.fmuladd.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20191 Predicated floating-point multiply-add of three vectors of floating-point values
20192 that can be fused if the code generator determines that (a) the target instruction
20193 set has support for a fused operation, and (b) that the fused operation is more
20194 efficient than the equivalent, separate pair of mul and add instructions.
20199 The first three operands and the result have the same vector of floating-point
20200 type. The fourth operand is the vector mask and has the same number of elements
20201 as the result vector type. The fifth operand is the explicit vector length of
the operation.
20207 The '``llvm.vp.fmuladd``' intrinsic performs floating-point multiply-add (:ref:`llvm.fmuladd <int_fmuladd>`)
20208 of the first, second, and third vector operand on each enabled lane. The result
20209 on disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20210 performed in the default floating-point environment.
20215 .. code-block:: llvm
20217 %r = call <4 x float> @llvm.vp.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
20218 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20220 %t = call <4 x float> @llvm.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
20221 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
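For comparison, the unfused alternative that the overview refers to can be
written as a predicated multiply followed by a predicated add under the same
mask and vector length. This is only a sketch of that separate pair; the
function name ``@fmuladd_unfused`` is hypothetical, and whether a target
actually prefers this form is up to the code generator.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.fmul.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.fadd.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      define <4 x float> @fmuladd_unfused(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl) {
        ; Separate multiply and add, each rounded on its own, under the same predicate.
        %mul = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
        %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %mul, <4 x float> %c, <4 x i1> %mask, i32 %evl)
        ret <4 x float> %r
      }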
20224 .. _int_vp_reduce_add:
20226 '``llvm.vp.reduce.add.*``' Intrinsics
20227 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20231 This is an overloaded intrinsic.
20235 declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20236 declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20241 Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
20242 returning the result as a scalar.
20247 The first operand is the start value of the reduction, which must be a scalar
20248 integer type equal to the result type. The second operand is the vector on
20249 which the reduction is performed and must be a vector of integer values whose
20250 element type is the result/start type. The third operand is the vector mask and
20251 is a vector of boolean values with the same number of elements as the vector
20252 operand. The fourth operand is the explicit vector length of the operation.
20257 The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
20258 (:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector operand
20259 ``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
20260 lanes are treated as containing the neutral value ``0`` (i.e. having no effect
20261 on the reduction operation). If the vector length is zero, the result is equal
20262 to ``start_value``.
20264 To ignore the start value, the neutral value can be used.
20269 .. code-block:: llvm
20271 %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
20272 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20273 ; are treated as though %mask were false for those lanes.
20275 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
20276 %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
20277 %also.r = add i32 %reduction, %start
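A minimal sketch of ignoring the start value, assuming a hypothetical wrapper
named ``@sum_enabled``: passing the neutral value ``0`` makes the result depend
only on the enabled lanes of ``%a``.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.add.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @sum_enabled(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ; Start value 0 is neutral for ADD, so only enabled lanes contribute.
        %sum = call i32 @llvm.vp.reduce.add.v4i32(i32 0, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ret i32 %sum
      }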
20280 .. _int_vp_reduce_fadd:
20282 '``llvm.vp.reduce.fadd.*``' Intrinsics
20283 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20287 This is an overloaded intrinsic.
20291 declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
20292 declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20297 Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
20298 value, returning the result as a scalar.
20303 The first operand is the start value of the reduction, which must be a scalar
20304 floating-point type equal to the result type. The second operand is the vector
20305 on which the reduction is performed and must be a vector of floating-point
20306 values whose element type is the result/start type. The third operand is the
20307 vector mask and is a vector of boolean values with the same number of elements
20308 as the vector operand. The fourth operand is the explicit vector length of the operation.
20314 The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
20315 reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
20316 vector operand ``val`` on each enabled lane, adding it to the scalar
20317 ``start_value``. Disabled lanes are treated as containing the neutral value
20318 ``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
20319 enabled, the resulting value will be equal to ``start_value``.
20321 To ignore the start value, the neutral value can be used.
20323 See the unpredicated version (:ref:`llvm.vector.reduce.fadd
20324 <int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.
20329 .. code-block:: llvm
20331 %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
20332 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20333 ; are treated as though %mask were false for those lanes.
20335 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
20336 %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)
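The neutral value here is ``-0.0`` rather than ``+0.0``. A sketch of a
reduction that ignores the start value, using a hypothetical wrapper named
``@fsum_enabled``, might look as follows:

.. code-block:: llvm

      declare float @llvm.vp.reduce.fadd.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @fsum_enabled(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; -0.0 + x yields x for every non-NaN x, including x = -0.0, whereas
        ; +0.0 + -0.0 yields +0.0 in the default rounding mode.
        %sum = call float @llvm.vp.reduce.fadd.v4f32(float -0.0, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %sum
      }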
20339 .. _int_vp_reduce_mul:
20341 '``llvm.vp.reduce.mul.*``' Intrinsics
20342 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20346 This is an overloaded intrinsic.
20350 declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20351 declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20356 Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
20357 returning the result as a scalar.
20363 The first operand is the start value of the reduction, which must be a scalar
20364 integer type equal to the result type. The second operand is the vector on
20365 which the reduction is performed and must be a vector of integer values whose
20366 element type is the result/start type. The third operand is the vector mask and
20367 is a vector of boolean values with the same number of elements as the vector
20368 operand. The fourth operand is the explicit vector length of the operation.
20373 The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
20374 (:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector operand ``val``
20375 on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
20376 lanes are treated as containing the neutral value ``1`` (i.e. having no effect
20377 on the reduction operation). If the vector length is zero, the result is the
start value.
20380 To ignore the start value, the neutral value can be used.
20385 .. code-block:: llvm
20387 %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
20388 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20389 ; are treated as though %mask were false for those lanes.
20391 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
20392 %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
20393 %also.r = mul i32 %reduction, %start
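The interaction of mask, vector length, and start value can be traced with
constant operands. The values below and the function name
``@mul_worked_example`` are chosen purely for illustration:

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.mul.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @mul_worked_example() {
        ; The vector length 3 disables lane 3; the mask additionally disables lane 1.
        ; Enabled lanes are 0 and 2, so the result is 10 * 2 * 4 = 80.
        %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 10, <4 x i32> <i32 2, i32 3, i32 4, i32 5>, <4 x i1> <i1 true, i1 false, i1 true, i1 true>, i32 3)
        ret i32 %r
      }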
20395 .. _int_vp_reduce_fmul:
20397 '``llvm.vp.reduce.fmul.*``' Intrinsics
20398 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20402 This is an overloaded intrinsic.
20406 declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
20407 declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20412 Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
20413 value, returning the result as a scalar.
20419 The first operand is the start value of the reduction, which must be a scalar
20420 floating-point type equal to the result type. The second operand is the vector
20421 on which the reduction is performed and must be a vector of floating-point
20422 values whose element type is the result/start type. The third operand is the
20423 vector mask and is a vector of boolean values with the same number of elements
20424 as the vector operand. The fourth operand is the explicit vector length of the operation.
20430 The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
20431 reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
20432 vector operand ``val`` on each enabled lane, multiplying it by the scalar
20433 ``start_value``. Disabled lanes are treated as containing the neutral value
20434 ``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
20435 enabled, the resulting value will be equal to the starting value.
20437 To ignore the start value, the neutral value can be used.
20439 See the unpredicated version (:ref:`llvm.vector.reduce.fmul
20440 <int_vector_reduce_fmul>`) for more detail on the semantics.
20445 .. code-block:: llvm
20447 %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
20448 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20449 ; are treated as though %mask were false for those lanes.
20451 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
20452 %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)
20455 .. _int_vp_reduce_and:
20457 '``llvm.vp.reduce.and.*``' Intrinsics
20458 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20462 This is an overloaded intrinsic.
20466 declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20467 declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20472 Predicated integer ``AND`` reduction of a vector and a scalar starting value,
20473 returning the result as a scalar.
20479 The first operand is the start value of the reduction, which must be a scalar
20480 integer type equal to the result type. The second operand is the vector on
20481 which the reduction is performed and must be a vector of integer values whose
20482 element type is the result/start type. The third operand is the vector mask and
20483 is a vector of boolean values with the same number of elements as the vector
20484 operand. The fourth operand is the explicit vector length of the operation.
20489 The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
20490 (:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector operand
20491 ``val`` on each enabled lane, performing an '``and``' of that with the
20492 scalar ``start_value``. Disabled lanes are treated as containing the neutral
20493 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
20494 operation). If the vector length is zero, the result is the start value.
20496 To ignore the start value, the neutral value can be used.
20501 .. code-block:: llvm
20503 %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
20504 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20505 ; are treated as though %mask were false for those lanes.
20507 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
20508 %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
20509 %also.r = and i32 %reduction, %start
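One possible use of the neutral start value ``-1`` is an "all enabled lanes"
test. The sketch below (the function name ``@all_enabled_lanes_odd`` is
hypothetical) checks whether bit 0 is set in every enabled lane; with no
enabled lanes the reduction returns ``-1`` and the test is vacuously true.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.and.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i1 @all_enabled_lanes_odd(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ; Start value -1 (all bits set) is neutral for AND.
        %bits = call i32 @llvm.vp.reduce.and.v4i32(i32 -1, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ; Bit 0 survives the AND only if it is set in every enabled lane.
        %low = and i32 %bits, 1
        %all.odd = icmp eq i32 %low, 1
        ret i1 %all.odd
      }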
20512 .. _int_vp_reduce_or:
20514 '``llvm.vp.reduce.or.*``' Intrinsics
20515 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20519 This is an overloaded intrinsic.
20523 declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20524 declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20529 Predicated integer ``OR`` reduction of a vector and a scalar starting value,
20530 returning the result as a scalar.
20536 The first operand is the start value of the reduction, which must be a scalar
20537 integer type equal to the result type. The second operand is the vector on
20538 which the reduction is performed and must be a vector of integer values whose
20539 element type is the result/start type. The third operand is the vector mask and
20540 is a vector of boolean values with the same number of elements as the vector
20541 operand. The fourth operand is the explicit vector length of the operation.
20546 The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
20547 (:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector operand
20548 ``val`` on each enabled lane, performing an '``or``' of that with the scalar
20549 ``start_value``. Disabled lanes are treated as containing the neutral value
20550 ``0`` (i.e. having no effect on the reduction operation). If the vector length
20551 is zero, the result is the start value.
20553 To ignore the start value, the neutral value can be used.
20558 .. code-block:: llvm
20560 %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
20561 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20562 ; are treated as though %mask were false for those lanes.
20564 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
20565 %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
20566 %also.r = or i32 %reduction, %start
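Conversely, an ``OR`` reduction with the neutral start value ``0`` can serve as
an "any enabled lane" test. In this sketch the function name
``@any_enabled_lane_nonzero`` is an assumption made for illustration:

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.or.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i1 @any_enabled_lane_nonzero(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ; Start value 0 is neutral for OR, so %bits collects the bits of enabled lanes only.
        %bits = call i32 @llvm.vp.reduce.or.v4i32(i32 0, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ; The result is nonzero exactly when some enabled lane is nonzero.
        %any = icmp ne i32 %bits, 0
        ret i1 %any
      }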
20568 .. _int_vp_reduce_xor:
20570 '``llvm.vp.reduce.xor.*``' Intrinsics
20571 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20575 This is an overloaded intrinsic.
20579 declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20580 declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20585 Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
20586 returning the result as a scalar.
20592 The first operand is the start value of the reduction, which must be a scalar
20593 integer type equal to the result type. The second operand is the vector on
20594 which the reduction is performed and must be a vector of integer values whose
20595 element type is the result/start type. The third operand is the vector mask and
20596 is a vector of boolean values with the same number of elements as the vector
20597 operand. The fourth operand is the explicit vector length of the operation.
20602 The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
20603 (:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector operand
20604 ``val`` on each enabled lane, performing an '``xor``' of that with the scalar
20605 ``start_value``. Disabled lanes are treated as containing the neutral value
20606 ``0`` (i.e. having no effect on the reduction operation). If the vector length
20607 is zero, the result is the start value.
20609 To ignore the start value, the neutral value can be used.
20614 .. code-block:: llvm
20616 %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
20617 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20618 ; are treated as though %mask were false for those lanes.
20620 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
20621 %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
20622 %also.r = xor i32 %reduction, %start
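As a minimal sketch of ignoring the start value, the function below passes
``0``, the neutral value of ``xor``, as the start value, so the result depends
only on the enabled lanes of ``%a``. The function name ``@xor_enabled_lanes``
is hypothetical and chosen only for illustration.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.xor.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: XOR of the enabled lanes of %a only.
      define i32 @xor_enabled_lanes(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        ; 0 is the neutral value of xor, so the start value has no effect.
        %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 0, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ret i32 %r
      }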
20625 .. _int_vp_reduce_smax:
20627 '``llvm.vp.reduce.smax.*``' Intrinsics
20628 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20632 This is an overloaded intrinsic.
20636 declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20637 declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20642 Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
20643 value, returning the result as a scalar.
20649 The first operand is the start value of the reduction, which must be a scalar
20650 integer type equal to the result type. The second operand is the vector on
20651 which the reduction is performed and must be a vector of integer values whose
20652 element type is the result/start type. The third operand is the vector mask and
20653 is a vector of boolean values with the same number of elements as the vector
20654 operand. The fourth operand is the explicit vector length of the operation.
20659 The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
20660 reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
20662 the scalar ``start_value``. Disabled lanes are treated as containing the
20663 neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
20664 If the vector length is zero, the result is the start value.
20666 To ignore the start value, the neutral value can be used.
20671 .. code-block:: llvm
20673 %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
20674 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20675 ; are treated as though %mask were false for those lanes.
20677 %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
20678 %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
20679 %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
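The explicit vector length can also act as the only predicate. The sketch
below, which assumes that ``%n`` does not exceed the number of lanes (4), uses
an all-true mask so that only lanes at or beyond ``%n`` are disabled. If ``%n``
is zero, the result is ``%start``, as described above. The function name
``@smax_first_n`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.smax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: signed maximum of %start and the first %n lanes of %a.
      define i32 @smax_first_n(i32 %start, <4 x i32> %a, i32 %n) {
        ; With an all-true mask, only the explicit vector length disables lanes.
        %r = call i32 @llvm.vp.reduce.smax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %n)
        ret i32 %r
      }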
20682 .. _int_vp_reduce_smin:
20684 '``llvm.vp.reduce.smin.*``' Intrinsics
20685 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20689 This is an overloaded intrinsic.
20693 declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20694 declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20699 Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
20700 value, returning the result as a scalar.
20706 The first operand is the start value of the reduction, which must be a scalar
20707 integer type equal to the result type. The second operand is the vector on
20708 which the reduction is performed and must be a vector of integer values whose
20709 element type is the result/start type. The third operand is the vector mask and
20710 is a vector of boolean values with the same number of elements as the vector
20711 operand. The fourth operand is the explicit vector length of the operation.
20716 The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
20717 reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and
20719 the scalar ``start_value``. Disabled lanes are treated as containing the
20720 neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
20721 If the vector length is zero, the result is the start value.
20723 To ignore the start value, the neutral value can be used.
20728 .. code-block:: llvm
20730 %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
20731 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20732 ; are treated as though %mask were false for those lanes.
20734 %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
20735 %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
20736 %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
20739 .. _int_vp_reduce_umax:
20741 '``llvm.vp.reduce.umax.*``' Intrinsics
20742 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20746 This is an overloaded intrinsic.
20750 declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20751 declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20756 Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
20757 value, returning the result as a scalar.
20763 The first operand is the start value of the reduction, which must be a scalar
20764 integer type equal to the result type. The second operand is the vector on
20765 which the reduction is performed and must be a vector of integer values whose
20766 element type is the result/start type. The third operand is the vector mask and
20767 is a vector of boolean values with the same number of elements as the vector
20768 operand. The fourth operand is the explicit vector length of the operation.
20773 The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
20774 reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
20776 the scalar ``start_value``. Disabled lanes are treated as containing the
20777 neutral value ``0`` (i.e. having no effect on the reduction operation). If the
20778 vector length is zero, the result is the start value.
20780 To ignore the start value, the neutral value can be used.
20785 .. code-block:: llvm
20787 %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
20788 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20789 ; are treated as though %mask were false for those lanes.
20791 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
20792 %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
20793 %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
20796 .. _int_vp_reduce_umin:
20798 '``llvm.vp.reduce.umin.*``' Intrinsics
20799 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20803 This is an overloaded intrinsic.
20807 declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20808 declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20813 Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
20814 value, returning the result as a scalar.
20820 The first operand is the start value of the reduction, which must be a scalar
20821 integer type equal to the result type. The second operand is the vector on
20822 which the reduction is performed and must be a vector of integer values whose
20823 element type is the result/start type. The third operand is the vector mask and
20824 is a vector of boolean values with the same number of elements as the vector
20825 operand. The fourth operand is the explicit vector length of the operation.
20830 The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
20831 reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
20832 vector operand ``val`` on each enabled lane, taking the minimum of that and the
20833 scalar ``start_value``. Disabled lanes are treated as containing the neutral
20834 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
20835 operation). If the vector length is zero, the result is the start value.
20837 To ignore the start value, the neutral value can be used.
20842 .. code-block:: llvm
20844 %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
20845 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20846 ; are treated as though %mask were false for those lanes.
20848 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
20849 %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
20850 %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
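The start value also allows reductions to be chained. In the sketch below, the
first call uses the neutral value ``-1`` so that its result depends only on
``%a``, and the second call takes that partial result as its start value,
folding ``%b`` into the same reduction. The function name
``@umin_two_vectors`` is hypothetical, and using the same mask and vector
length for both calls is an assumption of this example, not a requirement.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umin.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: unsigned minimum over the enabled lanes of %a and %b.
      define i32 @umin_two_vectors(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) {
        ; -1 (UINT_MAX) is the neutral value, so %partial depends only on %a.
        %partial = call i32 @llvm.vp.reduce.umin.v4i32(i32 -1, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ; Using %partial as the start value folds %b into the same reduction.
        %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %partial, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
        ret i32 %r
      }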
20853 .. _int_vp_reduce_fmax:
20855 '``llvm.vp.reduce.fmax.*``' Intrinsics
20856 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20860 This is an overloaded intrinsic.
declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
20865 declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20870 Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
20871 value, returning the result as a scalar.
20877 The first operand is the start value of the reduction, which must be a scalar
20878 floating-point type equal to the result type. The second operand is the vector
20879 on which the reduction is performed and must be a vector of floating-point
20880 values whose element type is the result/start type. The third operand is the
20881 vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.
20888 The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
20889 reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
20890 vector operand ``val`` on each enabled lane, taking the maximum of that and the
20891 scalar ``start_value``. Disabled lanes are treated as containing the neutral
20892 value (i.e. having no effect on the reduction operation). If the vector length
20893 is zero, the result is the start value.
20895 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
20896 flags are set, the neutral value is ``-QNAN``. If ``nnan`` and ``ninf`` are
20897 both set, then the neutral value is the smallest floating-point value for the
20898 result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
20900 This instruction has the same comparison semantics as the
20901 :ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
20902 '``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
20903 unless all elements of the vector and the starting value are ``NaN``. For a
20904 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
20905 ``-0.0`` elements, the sign of the result is unspecified.
20907 To ignore the start value, the neutral value can be used.
20912 .. code-block:: llvm
%r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
20915 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20916 ; are treated as though %mask were false for those lanes.
20918 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
20919 %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
20920 %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
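Because the neutral value depends on the fast-math flags, ignoring the start
value requires choosing the constant that matches the flags on the call. The
sketch below assumes that only ``nnan`` is set, in which case the neutral value
is ``-Infinity`` as stated above. The function name ``@fmax_enabled_lanes`` is
hypothetical.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmax.v4f32(float, <4 x float>, <4 x i1>, i32)

      ; Hypothetical helper: maximum of the enabled lanes of %a only.
      define float @fmax_enabled_lanes(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; 0xFFF0000000000000 is -Infinity, the neutral value when only nnan is set.
        %r = call nnan float @llvm.vp.reduce.fmax.v4f32(float 0xFFF0000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }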
20923 .. _int_vp_reduce_fmin:
20925 '``llvm.vp.reduce.fmin.*``' Intrinsics
20926 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20930 This is an overloaded intrinsic.
declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
20935 declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20940 Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
20941 value, returning the result as a scalar.
20947 The first operand is the start value of the reduction, which must be a scalar
20948 floating-point type equal to the result type. The second operand is the vector
20949 on which the reduction is performed and must be a vector of floating-point
20950 values whose element type is the result/start type. The third operand is the
20951 vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.
20958 The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
20959 reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
20960 vector operand ``val`` on each enabled lane, taking the minimum of that and the
20961 scalar ``start_value``. Disabled lanes are treated as containing the neutral
20962 value (i.e. having no effect on the reduction operation). If the vector length
20963 is zero, the result is the start value.
20965 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
20966 flags are set, the neutral value is ``+QNAN``. If ``nnan`` and ``ninf`` are
20967 both set, then the neutral value is the largest floating-point value for the
20968 result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
20970 This instruction has the same comparison semantics as the
20971 :ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
20972 '``llvm.minnum.*``' intrinsic). That is, the result will always be a number
20973 unless all elements of the vector and the starting value are ``NaN``. For a
vector with minimum element magnitude ``0.0`` and containing both ``+0.0`` and
20975 ``-0.0`` elements, the sign of the result is unspecified.
20977 To ignore the start value, the neutral value can be used.
20982 .. code-block:: llvm
20984 %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
20985 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20986 ; are treated as though %mask were false for those lanes.
20988 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
20989 %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
20990 %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
20993 .. _int_get_active_lane_mask:
20995 '``llvm.get.active.lane.mask.*``' Intrinsics
20996 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21000 This is an overloaded intrinsic.
21004 declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
21005 declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
21006 declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
21007 declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
21013 Create a mask representing active and inactive vector lanes.
21019 Both operands have the same scalar integer type. The result is a vector with
21020 the i1 element type.
The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
to:
21030 %m[i] = icmp ult (%base + i), %n
21032 where ``%m`` is a vector (mask) of active/inactive lanes with its elements
21033 indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
21035 the unsigned less-than comparison operator. Overflow cannot occur in
21036 ``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
21037 numbers and not in machine numbers. If ``%n`` is ``0``, then the result is a
21038 poison value. The above is equivalent to:
21042 %m = @llvm.get.active.lane.mask(%base, %n)
21044 This can, for example, be emitted by the loop vectorizer in which case
21045 ``%base`` is the first element of the vector induction variable (VIV) and
21046 ``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
21047 less than comparison of VIV with the loop tripcount, producing a mask of
21048 true/false values representing active/inactive vector lanes, except if the VIV
21049 overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what type the step vector that enumerates the lanes would need
in order to avoid overflow.
21054 This mask ``%m`` can e.g. be used in masked load/store instructions. These
21055 intrinsics provide a hint to the backend. I.e., for a vector loop, the
back-edge taken count of the original scalar loop is explicit as the second
argument.
21063 .. code-block:: llvm
21065 %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
%wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> poison)
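The lane-wise definition ``%m[i] = icmp ult (%base + i), %n`` can be sketched
with ordinary instructions for a fixed four-lane vector. This is only an
illustration under stated assumptions: the operands are widened from ``i32`` to
``i64`` so that ``%base + i`` cannot wrap, and the function name
``@active_lane_mask_v4_i32`` is hypothetical. For ``%n`` equal to ``0`` the
sketch returns an all-false mask, which is one permitted refinement of the
poison result described above.

.. code-block:: llvm

      ; Hypothetical expansion for 4 lanes with i32 operands.
      define <4 x i1> @active_lane_mask_v4_i32(i32 %base, i32 %n) {
        ; Widen to i64 so that %base + i is computed without wrapping.
        %base.ext = zext i32 %base to i64
        %n.ext = zext i32 %n to i64
        ; Splat both scalars across the four lanes.
        %base.ins = insertelement <4 x i64> poison, i64 %base.ext, i64 0
        %base.splat = shufflevector <4 x i64> %base.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        %n.ins = insertelement <4 x i64> poison, i64 %n.ext, i64 0
        %n.splat = shufflevector <4 x i64> %n.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        ; Lane i holds %base + i.
        %idx = add <4 x i64> %base.splat, <i64 0, i64 1, i64 2, i64 3>
        %m = icmp ult <4 x i64> %idx, %n.splat
        ret <4 x i1> %m
      }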
21069 .. _int_experimental_vp_splice:
21071 '``llvm.experimental.vp.splice``' Intrinsic
21072 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21076 This is an overloaded intrinsic.
21080 declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
21081 declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)
21086 The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
21087 predicated version of the '``llvm.experimental.vector.splice.*``' intrinsic.
21092 The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
21093 the same type. The third argument ``imm`` is an immediate signed integer that
21094 indicates the offset index. The fourth argument ``mask`` is a vector mask and
21095 has the same number of elements as the result. The last two arguments ``evl1``
21096 and ``evl2`` are unsigned integers indicating the explicit vector lengths of
21097 ``vec1`` and ``vec2`` respectively. ``imm``, ``evl1`` and ``evl2`` should
21098 respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
21099 and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
21100 constraints are not satisfied the intrinsic has undefined behaviour.
21105 Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
21106 ``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
21107 window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
21108 the concatenated vector. Elements in the result vector beyond ``evl2`` are
21109 ``undef``. If ``imm`` is negative the starting index is ``evl1 + imm``. The result
21110 vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
21111 negative ``imm``) elements from indices ``[imm..evl1 - 1]``
21112 (``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the
21113 first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
21114 ``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
21115 elements are considered and the remaining are ``undef``. The lanes in the result
21116 vector disabled by ``mask`` are ``poison``.
21121 .. code-block:: text
21123 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3) ==> <B, E, F, poison> ; index
21124 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2) ==> <B, C, poison, poison> ; trailing elements
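The shorthand above omits the ``mask`` operand. As a sketch of the first case
written as a complete call, the function below uses an all-true mask with
``imm = 1``, ``evl1 = 2`` and ``evl2 = 3``, so the first three result lanes are
``vec1[1]``, ``vec2[0]`` and ``vec2[1]``. The element type ``i32`` and the
function name ``@splice_example`` are assumptions made for illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32>, <4 x i32>, i32, <4 x i1>, i32, i32)

      ; Hypothetical wrapper around the first shorthand example.
      define <4 x i32> @splice_example(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; imm = 1, evl1 = 2, evl2 = 3; the constraint -evl1 <= imm < evl1 holds.
        %r = call <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2, i32 3)
        ret <4 x i32> %r
      }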
21129 '``llvm.vp.load``' Intrinsic
21130 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21134 This is an overloaded intrinsic.
21138 declare <4 x float> @llvm.vp.load.v4f32.p0(ptr %ptr, <4 x i1> %mask, i32 %evl)
21139 declare <vscale x 2 x i16> @llvm.vp.load.nxv2i16.p0(ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
21140 declare <8 x float> @llvm.vp.load.v8f32.p1(ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
21141 declare <vscale x 1 x i64> @llvm.vp.load.nxv1i64.p6(ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
21146 The '``llvm.vp.load.*``' intrinsic is the vector length predicated version of
21147 the :ref:`llvm.masked.load <int_mload>` intrinsic.
21152 The first operand is the base pointer for the load. The second operand is a
21153 vector of boolean values with the same number of elements as the return type.
21154 The third is the explicit vector length of the operation. The return type and
21155 underlying type of the base pointer are the same vector types.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
operand.
21163 The '``llvm.vp.load``' intrinsic reads a vector from memory in the same way as
21164 the '``llvm.masked.load``' intrinsic, where the mask is taken from the
21165 combination of the '``mask``' and '``evl``' operands in the usual VP way.
21166 Certain '``llvm.masked.load``' operands do not have corresponding operands in
21167 '``llvm.vp.load``': the '``passthru``' operand is implicitly ``poison``; the
21168 '``alignment``' operand is taken as the ``align`` parameter attribute, if
21169 provided. The default alignment is taken as the ABI alignment of the return
21170 type as specified by the :ref:`datalayout string<langref_datalayout>`.
21175 .. code-block:: text
21177 %r = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 2 %ptr, <8 x i1> %mask, i32 %evl)
21178 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21180 %also.r = call <8 x i8> @llvm.masked.load.v8i8.p0(ptr %ptr, i32 2, <8 x i1> %mask, <8 x i8> poison)
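Because the '``passthru``' operand is implicitly ``poison``, a caller that
needs defined values in the disabled lanes has to merge them in separately. One
possible sketch, assuming the ``llvm.vp.merge`` intrinsic documented elsewhere
in this manual, is shown below. The function name ``@vp_load_with_passthru`` is
hypothetical.

.. code-block:: llvm

      declare <8 x i8> @llvm.vp.load.v8i8.p0(ptr, <8 x i1>, i32)
      declare <8 x i8> @llvm.vp.merge.v8i8(<8 x i1>, <8 x i8>, <8 x i8>, i32)

      ; Hypothetical helper: predicated load with an explicit pass-through value.
      define <8 x i8> @vp_load_with_passthru(ptr %ptr, <8 x i1> %mask, i32 %evl, <8 x i8> %passthru) {
        %loaded = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 2 %ptr, <8 x i1> %mask, i32 %evl)
        ; Lanes that are masked off, or at or beyond %evl, are taken from %passthru.
        %r = call <8 x i8> @llvm.vp.merge.v8i8(<8 x i1> %mask, <8 x i8> %loaded, <8 x i8> %passthru, i32 %evl)
        ret <8 x i8> %r
      }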
21185 '``llvm.vp.store``' Intrinsic
21186 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21190 This is an overloaded intrinsic.
21194 declare void @llvm.vp.store.v4f32.p0(<4 x float> %val, ptr %ptr, <4 x i1> %mask, i32 %evl)
21195 declare void @llvm.vp.store.nxv2i16.p0(<vscale x 2 x i16> %val, ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
21196 declare void @llvm.vp.store.v8f32.p1(<8 x float> %val, ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
21197 declare void @llvm.vp.store.nxv1i64.p6(<vscale x 1 x i64> %val, ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
21202 The '``llvm.vp.store.*``' intrinsic is the vector length predicated version of
21203 the :ref:`llvm.masked.store <int_mstore>` intrinsic.
21208 The first operand is the vector value to be written to memory. The second
21209 operand is the base pointer for the store. It has the same underlying type as
21210 the value operand. The third operand is a vector of boolean values with the
same number of elements as the value operand. The fourth is the explicit vector
21212 length of the operation.
The :ref:`align <attr_align>` parameter attribute can be provided for the
second operand.
The '``llvm.vp.store``' intrinsic writes a vector to memory in the same way as
21221 the '``llvm.masked.store``' intrinsic, where the mask is taken from the
21222 combination of the '``mask``' and '``evl``' operands in the usual VP way. The
21223 alignment of the operation (corresponding to the '``alignment``' operand of
21224 '``llvm.masked.store``') is specified by the ``align`` parameter attribute (see
21225 above). If it is not provided then the ABI alignment of the type of the
21226 '``value``' operand as specified by the :ref:`datalayout
21227 string<langref_datalayout>` is used instead.
21232 .. code-block:: text
21234 call void @llvm.vp.store.v8i8.p0(<8 x i8> %val, ptr align 4 %ptr, <8 x i1> %mask, i32 %evl)
21235 ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
21237 call void @llvm.masked.store.v8i8.p0(<8 x i8> %val, ptr %ptr, i32 4, <8 x i1> %mask)
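A predicated load and store are often paired under the same mask and vector
length. The following sketch copies the enabled lanes from ``%src`` to
``%dst``. The function name ``@vp_copy`` and the choice of ``align 1`` are
assumptions made for illustration.

.. code-block:: llvm

      declare <8 x i8> @llvm.vp.load.v8i8.p0(ptr, <8 x i1>, i32)
      declare void @llvm.vp.store.v8i8.p0(<8 x i8>, ptr, <8 x i1>, i32)

      ; Hypothetical helper: copy the enabled lanes from %src to %dst.
      define void @vp_copy(ptr %dst, ptr %src, <8 x i1> %mask, i32 %evl) {
        %v = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 1 %src, <8 x i1> %mask, i32 %evl)
        ; Disabled lanes of %v are poison, but the same lanes are disabled in
        ; the store, so they are never written to memory.
        call void @llvm.vp.store.v8i8.p0(<8 x i8> %v, ptr align 1 %dst, <8 x i1> %mask, i32 %evl)
        ret void
      }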
21240 .. _int_experimental_vp_strided_load:
21242 '``llvm.experimental.vp.strided.load``' Intrinsic
21243 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21247 This is an overloaded intrinsic.
21251 declare <4 x float> @llvm.experimental.vp.strided.load.v4f32.i64(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
21252 declare <vscale x 2 x i16> @llvm.experimental.vp.strided.load.nxv2i16.i64(ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
21257 The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, scalar values from
21258 memory locations evenly spaced apart by '``stride``' number of bytes, starting from '``ptr``'.
21263 The first operand is the base pointer for the load. The second operand is the stride
21264 value expressed in bytes. The third operand is a vector of boolean values
21265 with the same number of elements as the return type. The fourth is the explicit
vector length of the operation. The underlying type of the base pointer matches the type of the scalar
elements of the return type.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
operand.
21275 The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, multiple scalar
21276 values from memory in the same way as the :ref:`llvm.vp.gather <int_vp_gather>` intrinsic,
21277 where the vector of pointers is in the form:
21279 ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
21282 integer and all arithmetic occurring in the pointer type.
21287 .. code-block:: text
%r = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.i64(i64* %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
21290 ;; The operation can also be expressed like this:
21292 %addr = bitcast i64* %ptr to i8*
21293 ;; Create a vector of pointers %addrs in the form:
21294 ;; %addrs = <%addr, %addr + %stride, %addr + 2 * %stride, ...>
21295 %ptrs = bitcast <8 x i8* > %addrs to <8 x i64* >
%also.r = call <8 x i64> @llvm.vp.gather.v8i64.v8p0i64(<8 x i64* > %ptrs, <8 x i1> %mask, i32 %evl)
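The construction of the pointer vector can be made concrete for a fixed number
of lanes. The sketch below, written with opaque pointers and four lanes, builds
the byte offsets ``<0, stride, 2 * stride, 3 * stride>`` and feeds the resulting
pointers to ``llvm.vp.gather``. The function name ``@strided_load_via_gather``
is hypothetical, and the sketch is only an illustration of the equivalence, not
the way the intrinsic is lowered.

.. code-block:: llvm

      declare <4 x i64> @llvm.vp.gather.v4i64.v4p0(<4 x ptr>, <4 x i1>, i32)

      ; Hypothetical helper: 4-lane strided load expressed as a gather.
      define <4 x i64> @strided_load_via_gather(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl) {
        ; Byte offsets <0, stride, 2 * stride, 3 * stride>.
        %stride.ins = insertelement <4 x i64> poison, i64 %stride, i64 0
        %stride.splat = shufflevector <4 x i64> %stride.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        %offsets = mul <4 x i64> %stride.splat, <i64 0, i64 1, i64 2, i64 3>
        ; A getelementptr with a scalar base and a vector index yields a vector
        ; of pointers; the i8 element type makes the offsets byte offsets.
        %ptrs = getelementptr i8, ptr %ptr, <4 x i64> %offsets
        %r = call <4 x i64> @llvm.vp.gather.v4i64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
        ret <4 x i64> %r
      }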
21299 .. _int_experimental_vp_strided_store:
21301 '``llvm.experimental.vp.strided.store``' Intrinsic
21302 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21306 This is an overloaded intrinsic.
21310 declare void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float> %val, ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
21311 declare void @llvm.experimental.vp.strided.store.nxv2i16.i64(<vscale x 2 x i16> %val, ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
21317 '``val``' into memory locations evenly spaced apart by '``stride``' number of
21318 bytes, starting from '``ptr``'.
21323 The first operand is the vector value to be written to memory. The second
21324 operand is the base pointer for the store. Its underlying type matches the
21325 scalar element type of the value operand. The third operand is the stride value
21326 expressed in bytes. The fourth operand is a vector of boolean values with the
same number of elements as the value operand. The fifth is the explicit vector
21328 length of the operation.
The :ref:`align <attr_align>` parameter attribute can be provided for the
second operand.
21336 The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
21337 '``val``' in the same way as the :ref:`llvm.vp.scatter <int_vp_scatter>` intrinsic,
21338 where the vector of pointers is in the form:
21340 ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
21343 integer and all arithmetic occurring in the pointer type.
21348 .. code-block:: text
21350 call void @llvm.experimental.vp.strided.store.v8i64.i64(<8 x i64> %val, i64* %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
21351 ;; The operation can also be expressed like this:
21353 %addr = bitcast i64* %ptr to i8*
21354 ;; Create a vector of pointers %addrs in the form:
21355 ;; %addrs = <%addr, %addr + %stride, %addr + 2 * %stride, ...>
21356 %ptrs = bitcast <8 x i8* > %addrs to <8 x i64* >
21357 call void @llvm.vp.scatter.v8i64.v8p0i64(<8 x i64> %val, <8 x i64*> %ptrs, <8 x i1> %mask, i32 %evl)
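Because the stride is expressed in bytes, it usually has to be derived from an
element count. As a sketch, the function below stores a four-element column
into a row-major matrix of ``float`` with ``%ncols`` columns, so consecutive
column elements are ``%ncols * 4`` bytes apart. The function name
``@store_column`` and its parameters are hypothetical, and the intrinsic name
follows the declarations given above.

.. code-block:: llvm

      declare void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float>, ptr, i64, <4 x i1>, i32)

      ; Hypothetical helper: %first points at the first element of the column.
      define void @store_column(<4 x float> %col, ptr %first, i64 %ncols, <4 x i1> %mask, i32 %evl) {
        ; Each float occupies 4 bytes, so one row is %ncols * 4 bytes long.
        %stride = mul i64 %ncols, 4
        call void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float> %col, ptr %first, i64 %stride, <4 x i1> %mask, i32 %evl)
        ret void
      }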
21362 '``llvm.vp.gather``' Intrinsic
21363 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21367 This is an overloaded intrinsic.
21371 declare <4 x double> @llvm.vp.gather.v4f64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
21372 declare <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0(<vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
21373 declare <2 x float> @llvm.vp.gather.v2f32.v2p2(<2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
21374 declare <vscale x 4 x i32> @llvm.vp.gather.nxv4i32.nxv4p4(<vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
21379 The '``llvm.vp.gather.*``' intrinsic is the vector length predicated version of
21380 the :ref:`llvm.masked.gather <int_mgather>` intrinsic.
21385 The first operand is a vector of pointers which holds all memory addresses to
21386 read. The second operand is a vector of boolean values with the same number of
21387 elements as the return type. The third is the explicit vector length of the
21388 operation. The return type and underlying type of the vector of pointers are
21389 the same vector types.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
operand.
21397 The '``llvm.vp.gather``' intrinsic reads multiple scalar values from memory in
21398 the same way as the '``llvm.masked.gather``' intrinsic, where the mask is taken
21399 from the combination of the '``mask``' and '``evl``' operands in the usual VP
21400 way. Certain '``llvm.masked.gather``' operands do not have corresponding
21401 operands in '``llvm.vp.gather``': the '``passthru``' operand is implicitly
21402 ``poison``; the '``alignment``' operand is taken as the ``align`` parameter, if
21403 provided. The default alignment is taken as the ABI alignment of the source
21404 addresses as specified by the :ref:`datalayout string<langref_datalayout>`.
21409 .. code-block:: text
21411 %r = call <8 x i8> @llvm.vp.gather.v8i8.v8p0(<8 x ptr> align 8 %ptrs, <8 x i1> %mask, i32 %evl)
21412 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21414 %also.r = call <8 x i8> @llvm.masked.gather.v8i8.v8p0(<8 x ptr> %ptrs, i32 8, <8 x i1> %mask, <8 x i8> poison)
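The vector of pointers is typically produced by a ``getelementptr`` with a
vector of indices. The following sketch performs the indexed load
``r[i] = base[idx[i]]`` for the enabled lanes. The function name
``@indexed_load``, the ``i32`` element type and the ``align 4`` attribute are
assumptions made for illustration.

.. code-block:: llvm

      declare <8 x i32> @llvm.vp.gather.v8i32.v8p0(<8 x ptr>, <8 x i1>, i32)

      ; Hypothetical helper: gather base[idx[i]] for each enabled lane i.
      define <8 x i32> @indexed_load(ptr %base, <8 x i64> %idx, <8 x i1> %mask, i32 %evl) {
        ; A scalar base with a vector index yields one pointer per lane.
        %ptrs = getelementptr i32, ptr %base, <8 x i64> %idx
        %r = call <8 x i32> @llvm.vp.gather.v8i32.v8p0(<8 x ptr> align 4 %ptrs, <8 x i1> %mask, i32 %evl)
        ret <8 x i32> %r
      }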
21417 .. _int_vp_scatter:
21419 '``llvm.vp.scatter``' Intrinsic
21420 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21424 This is an overloaded intrinsic.
21428 declare void @llvm.vp.scatter.v4f64.v4p0(<4 x double> %val, <4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
21429 declare void @llvm.vp.scatter.nxv2i8.nxv2p0(<vscale x 2 x i8> %val, <vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
21430 declare void @llvm.vp.scatter.v2f32.v2p2(<2 x float> %val, <2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
21431 declare void @llvm.vp.scatter.nxv4i32.nxv4p4(<vscale x 4 x i32> %val, <vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
21436 The '``llvm.vp.scatter.*``' intrinsic is the vector length predicated version of
21437 the :ref:`llvm.masked.scatter <int_mscatter>` intrinsic.
The first operand is a vector value to be written to memory. The second operand
is a vector of pointers, pointing to where the value elements should be stored.
The third operand is a vector of boolean values with the same number of
elements as the value operand. The fourth is the explicit vector length of the
operation.

The :ref:`align <attr_align>` parameter attribute can be provided for the
second operand.
21454 The '``llvm.vp.scatter``' intrinsic writes multiple scalar values to memory in
21455 the same way as the '``llvm.masked.scatter``' intrinsic, where the mask is
21456 taken from the combination of the '``mask``' and '``evl``' operands in the
21457 usual VP way. The '``alignment``' operand of the '``llvm.masked.scatter``' does
21458 not have a corresponding operand in '``llvm.vp.scatter``': it is instead
21459 provided via the optional ``align`` parameter attribute on the
21460 vector-of-pointers operand. Otherwise it is taken as the ABI alignment of the
21461 destination addresses as specified by the :ref:`datalayout
21462 string<langref_datalayout>`.
21467 .. code-block:: text
21469 call void @llvm.vp.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> align 1 %ptrs, <8 x i1> %mask, i32 %evl)
21470 ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
21472 call void @llvm.masked.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> %ptrs, i32 1, <8 x i1> %mask)
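
The sketch below omits the ``align`` attribute, so the alignment falls back to
the ABI alignment from the datalayout string as described above. The function
name ``@scatter_example`` and the constant vector length are hypothetical.

.. code-block:: llvm

   declare void @llvm.vp.scatter.v4i32.v4p0(<4 x i32>, <4 x ptr>, <4 x i1>, i32)

   define void @scatter_example(<4 x i32> %val, <4 x ptr> %ptrs, <4 x i1> %mask) {
     ; evl = 2, so only lanes 0 and 1 may be stored, and only where %mask is true.
     ; Lanes 2 and 3 are never written, whatever %mask holds.
     call void @llvm.vp.scatter.v4i32.v4p0(<4 x i32> %val, <4 x ptr> %ptrs, <4 x i1> %mask, i32 2)
     ret void
   }
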
21477 '``llvm.vp.trunc.*``' Intrinsics
21478 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21482 This is an overloaded intrinsic.
21486 declare <16 x i16> @llvm.vp.trunc.v16i16.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
21487 declare <vscale x 4 x i16> @llvm.vp.trunc.nxv4i16.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21492 The '``llvm.vp.trunc``' intrinsic truncates its first operand to the return
21493 type. The operation has a mask and an explicit vector length parameter.
21499 The '``llvm.vp.trunc``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
21501 :ref:`integer <t_integer>` type. The bit size of the value must be larger than
21502 the bit size of the return type. The second operand is the vector mask. The
21503 return type, the value to cast, and the vector mask have the same number of
21504 elements. The third operand is the explicit vector length of the operation.
21509 The '``llvm.vp.trunc``' intrinsic truncates the high order bits in value and
21510 converts the remaining bits to return type. Since the source size must be larger
21511 than the destination size, '``llvm.vp.trunc``' cannot be a *no-op cast*. It will
21512 always truncate bits. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
21519 .. code-block:: llvm
21521 %r = call <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
21522 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21524 %t = trunc <4 x i32> %a to <4 x i16>
21525 %also.r = select <4 x i1> %mask, <4 x i16> %t, <4 x i16> poison
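
A small worked example with constant operands may help make the truncation
concrete. The function name ``@trunc_example`` and the input values are
hypothetical.

.. code-block:: llvm

   declare <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32>, <4 x i1>, i32)

   define <4 x i16> @trunc_example() {
     ; All four lanes are enabled. 65537 is 0x00010001, so its low 16 bits are 1.
     ; Result: <i16 1, i16 -1, i16 256, i16 7>
     %r = call <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32> <i32 65537, i32 -1, i32 256, i32 7>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
     ret <4 x i16> %r
   }
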
21530 '``llvm.vp.zext.*``' Intrinsics
21531 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21535 This is an overloaded intrinsic.
21539 declare <16 x i32> @llvm.vp.zext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
21540 declare <vscale x 4 x i32> @llvm.vp.zext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21545 The '``llvm.vp.zext``' intrinsic zero extends its first operand to the return
21546 type. The operation has a mask and an explicit vector length parameter.
21552 The '``llvm.vp.zext``' intrinsic takes a value to cast as its first operand.
21553 The return type is the type to cast the value to. Both types must be vectors of
21554 :ref:`integer <t_integer>` type. The bit size of the value must be smaller than
21555 the bit size of the return type. The second operand is the vector mask. The
21556 return type, the value to cast, and the vector mask have the same number of
21557 elements. The third operand is the explicit vector length of the operation.
The '``llvm.vp.zext``' intrinsic fills the high order bits of the value with zero
21563 bits until it reaches the size of the return type. When zero extending from i1,
21564 the result will always be either 0 or 1. The conversion is performed on lane
21565 positions below the explicit vector length and where the vector mask is true.
21566 Masked-off lanes are ``poison``.
21571 .. code-block:: llvm
21573 %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
21574 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21576 %t = zext <4 x i16> %a to <4 x i32>
21577 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
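
The ``i1`` case mentioned above can be sketched as follows. The function name
``@zext_bool_example`` is hypothetical.

.. code-block:: llvm

   declare <4 x i32> @llvm.vp.zext.v4i32.v4i1(<4 x i1>, <4 x i1>, i32)

   define <4 x i32> @zext_bool_example(<4 x i1> %flags, <4 x i1> %mask, i32 %evl) {
     ; On every enabled lane, %r is 0 where %flags is false and 1 where it is true.
     ; Disabled lanes of %r are poison.
     %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i1(<4 x i1> %flags, <4 x i1> %mask, i32 %evl)
     ret <4 x i32> %r
   }
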
21582 '``llvm.vp.sext.*``' Intrinsics
21583 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21587 This is an overloaded intrinsic.
21591 declare <16 x i32> @llvm.vp.sext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
21592 declare <vscale x 4 x i32> @llvm.vp.sext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21597 The '``llvm.vp.sext``' intrinsic sign extends its first operand to the return
21598 type. The operation has a mask and an explicit vector length parameter.
21604 The '``llvm.vp.sext``' intrinsic takes a value to cast as its first operand.
21605 The return type is the type to cast the value to. Both types must be vectors of
21606 :ref:`integer <t_integer>` type. The bit size of the value must be smaller than
21607 the bit size of the return type. The second operand is the vector mask. The
21608 return type, the value to cast, and the vector mask have the same number of
21609 elements. The third operand is the explicit vector length of the operation.
21614 The '``llvm.vp.sext``' intrinsic performs a sign extension by copying the sign
21615 bit (highest order bit) of the value until it reaches the size of the return
21616 type. When sign extending from i1, the result will always be either -1 or 0.
21617 The conversion is performed on lane positions below the explicit vector length
21618 and where the vector mask is true. Masked-off lanes are ``poison``.
21623 .. code-block:: llvm
21625 %r = call <4 x i32> @llvm.vp.sext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
21626 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21628 %t = sext <4 x i16> %a to <4 x i32>
21629 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
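
A small worked example with constant operands shows the sign bit being
replicated. The function name ``@sext_example`` and the input values are
hypothetical.

.. code-block:: llvm

   declare <4 x i32> @llvm.vp.sext.v4i32.v4i16(<4 x i16>, <4 x i1>, i32)

   define <4 x i32> @sext_example() {
     ; All four lanes are enabled, and negative inputs stay negative.
     ; Result: <i32 -2, i32 3, i32 -32768, i32 0>
     %r = call <4 x i32> @llvm.vp.sext.v4i32.v4i16(<4 x i16> <i16 -2, i16 3, i16 -32768, i16 0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
     ret <4 x i32> %r
   }
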
21632 .. _int_vp_fptrunc:
21634 '``llvm.vp.fptrunc.*``' Intrinsics
21635 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21639 This is an overloaded intrinsic.
21643 declare <16 x float> @llvm.vp.fptrunc.v16f32.v16f64 (<16 x double> <op>, <16 x i1> <mask>, i32 <vector_length>)
declare <vscale x 4 x float> @llvm.vp.fptrunc.nxv4f32.nxv4f64 (<vscale x 4 x double> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21649 The '``llvm.vp.fptrunc``' intrinsic truncates its first operand to the return
21650 type. The operation has a mask and an explicit vector length parameter.
21656 The '``llvm.vp.fptrunc``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
21658 :ref:`floating-point <t_floating>` type. The bit size of the value must be
21659 larger than the bit size of the return type. This implies that
21660 '``llvm.vp.fptrunc``' cannot be used to make a *no-op cast*. The second operand
21661 is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements. The third operand is the explicit vector length of
the operation.
21668 The '``llvm.vp.fptrunc``' intrinsic casts a ``value`` from a larger
21669 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
21670 <t_floating>` type.
This intrinsic is assumed to execute in the default :ref:`floating-point
environment <floatenv>`. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
21679 .. code-block:: llvm
21681 %r = call <4 x float> @llvm.vp.fptrunc.v4f32.v4f64(<4 x double> %a, <4 x i1> %mask, i32 %evl)
21682 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21684 %t = fptrunc <4 x double> %a to <4 x float>
21685 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
21690 '``llvm.vp.fpext.*``' Intrinsics
21691 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21695 This is an overloaded intrinsic.
21699 declare <16 x double> @llvm.vp.fpext.v16f64.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
21700 declare <vscale x 4 x double> @llvm.vp.fpext.nxv4f64.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21705 The '``llvm.vp.fpext``' intrinsic extends its first operand to the return
21706 type. The operation has a mask and an explicit vector length parameter.
21712 The '``llvm.vp.fpext``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
21714 :ref:`floating-point <t_floating>` type. The bit size of the value must be
21715 smaller than the bit size of the return type. This implies that
21716 '``llvm.vp.fpext``' cannot be used to make a *no-op cast*. The second operand
21717 is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements. The third operand is the explicit vector length of
the operation.
21724 The '``llvm.vp.fpext``' intrinsic extends the ``value`` from a smaller
21725 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
21726 <t_floating>` type. The '``llvm.vp.fpext``' cannot be used to make a
21727 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
21728 *no-op cast* for a floating-point cast.
21729 The conversion is performed on lane positions below the explicit vector length
21730 and where the vector mask is true. Masked-off lanes are ``poison``.
21735 .. code-block:: llvm
21737 %r = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
21738 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21740 %t = fpext <4 x float> %a to <4 x double>
21741 %also.r = select <4 x i1> %mask, <4 x double> %t, <4 x double> poison
21746 '``llvm.vp.fptoui.*``' Intrinsics
21747 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21751 This is an overloaded intrinsic.
21755 declare <16 x i32> @llvm.vp.fptoui.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
21756 declare <vscale x 4 x i32> @llvm.vp.fptoui.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21757 declare <256 x i64> @llvm.vp.fptoui.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
21762 The '``llvm.vp.fptoui``' intrinsic converts the :ref:`floating-point
21763 <t_floating>` operand to the unsigned integer return type.
21764 The operation has a mask and an explicit vector length parameter.
21770 The '``llvm.vp.fptoui``' intrinsic takes a value to cast as its first operand.
21771 The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type. The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third operand is the explicit vector length of the
operation.
21781 The '``llvm.vp.fptoui``' intrinsic converts its :ref:`floating-point
21782 <t_floating>` operand into the nearest (rounding towards zero) unsigned integer
21783 value where the lane position is below the explicit vector length and the
21784 vector mask is true. Masked-off lanes are ``poison``. On enabled lanes where
21785 conversion takes place and the value cannot fit in the return type, the result
21786 on that lane is a :ref:`poison value <poisonvalues>`.
21791 .. code-block:: llvm
21793 %r = call <4 x i32> @llvm.vp.fptoui.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
21794 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21796 %t = fptoui <4 x float> %a to <4 x i32>
21797 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
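
The two sources of ``poison`` described above (a disabled lane and a value that
does not fit) can be sketched with constant operands. The function name
``@fptoui_example`` and the input values are hypothetical.

.. code-block:: llvm

   declare <4 x i8> @llvm.vp.fptoui.v4i8.v4f32(<4 x float>, <4 x i1>, i32)

   define <4 x i8> @fptoui_example() {
     ; Lane 0: 3.5 rounds towards zero to 3.
     ; Lane 1: 255.0 is the largest value that fits in i8 when read as unsigned.
     ; Lane 2: 256.0 does not fit in i8, so the lane is poison.
     ; Lane 3: disabled by the mask, so the lane is poison.
     %r = call <4 x i8> @llvm.vp.fptoui.v4i8.v4f32(<4 x float> <float 3.5, float 255.0, float 256.0, float 1.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, i32 4)
     ret <4 x i8> %r
   }
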
21802 '``llvm.vp.fptosi.*``' Intrinsics
21803 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21807 This is an overloaded intrinsic.
21811 declare <16 x i32> @llvm.vp.fptosi.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
21812 declare <vscale x 4 x i32> @llvm.vp.fptosi.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21813 declare <256 x i64> @llvm.vp.fptosi.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
21818 The '``llvm.vp.fptosi``' intrinsic converts the :ref:`floating-point
21819 <t_floating>` operand to the signed integer return type.
21820 The operation has a mask and an explicit vector length parameter.
21826 The '``llvm.vp.fptosi``' intrinsic takes a value to cast as its first operand.
21827 The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type. The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third operand is the explicit vector length of the
operation.
21837 The '``llvm.vp.fptosi``' intrinsic converts its :ref:`floating-point
21838 <t_floating>` operand into the nearest (rounding towards zero) signed integer
21839 value where the lane position is below the explicit vector length and the
21840 vector mask is true. Masked-off lanes are ``poison``. On enabled lanes where
21841 conversion takes place and the value cannot fit in the return type, the result
21842 on that lane is a :ref:`poison value <poisonvalues>`.
21847 .. code-block:: llvm
21849 %r = call <4 x i32> @llvm.vp.fptosi.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
21850 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21852 %t = fptosi <4 x float> %a to <4 x i32>
21853 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21858 '``llvm.vp.uitofp.*``' Intrinsics
21859 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21863 This is an overloaded intrinsic.
21867 declare <16 x float> @llvm.vp.uitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
21868 declare <vscale x 4 x float> @llvm.vp.uitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21869 declare <256 x double> @llvm.vp.uitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
21874 The '``llvm.vp.uitofp``' intrinsic converts its unsigned integer operand to the
21875 :ref:`floating-point <t_floating>` return type. The operation has a mask and
21876 an explicit vector length parameter.
21882 The '``llvm.vp.uitofp``' intrinsic takes a value to cast as its first operand.
The value to cast must be a vector of :ref:`integer <t_integer>` type. The
return type is the type to cast the value to. The return type must be a vector
of :ref:`floating-point <t_floating>` type. The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third operand is the explicit vector length of the
operation.
21893 The '``llvm.vp.uitofp``' intrinsic interprets its first operand as an unsigned
21894 integer quantity and converts it to the corresponding floating-point value. If
21895 the value cannot be exactly represented, it is rounded using the default
21896 rounding mode. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
21903 .. code-block:: llvm
21905 %r = call <4 x float> @llvm.vp.uitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
21906 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21908 %t = uitofp <4 x i32> %a to <4 x float>
21909 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
21914 '``llvm.vp.sitofp.*``' Intrinsics
21915 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21919 This is an overloaded intrinsic.
21923 declare <16 x float> @llvm.vp.sitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
21924 declare <vscale x 4 x float> @llvm.vp.sitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21925 declare <256 x double> @llvm.vp.sitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
21930 The '``llvm.vp.sitofp``' intrinsic converts its signed integer operand to the
21931 :ref:`floating-point <t_floating>` return type. The operation has a mask and
21932 an explicit vector length parameter.
21938 The '``llvm.vp.sitofp``' intrinsic takes a value to cast as its first operand.
The value to cast must be a vector of :ref:`integer <t_integer>` type. The
return type is the type to cast the value to. The return type must be a vector
of :ref:`floating-point <t_floating>` type. The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third operand is the explicit vector length of the
operation.
21949 The '``llvm.vp.sitofp``' intrinsic interprets its first operand as a signed
21950 integer quantity and converts it to the corresponding floating-point value. If
21951 the value cannot be exactly represented, it is rounded using the default
21952 rounding mode. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
21959 .. code-block:: llvm
21961 %r = call <4 x float> @llvm.vp.sitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
21962 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21964 %t = sitofp <4 x i32> %a to <4 x float>
21965 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
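
The difference between the signed and unsigned interpretations can be sketched
by converting the same bit patterns with '``llvm.vp.sitofp``' and
'``llvm.vp.uitofp``'. The function names and input values are hypothetical.

.. code-block:: llvm

   declare <2 x float> @llvm.vp.sitofp.v2f32.v2i8(<2 x i8>, <2 x i1>, i32)
   declare <2 x float> @llvm.vp.uitofp.v2f32.v2i8(<2 x i8>, <2 x i1>, i32)

   define <2 x float> @sitofp_example() {
     ; Signed interpretation: i8 -1 converts to -1.0.
     ; Result: <float -1.0, float 100.0>
     %r = call <2 x float> @llvm.vp.sitofp.v2f32.v2i8(<2 x i8> <i8 -1, i8 100>, <2 x i1> <i1 true, i1 true>, i32 2)
     ret <2 x float> %r
   }

   define <2 x float> @uitofp_example() {
     ; Unsigned interpretation: the same bits (0xFF) convert to 255.0.
     ; Result: <float 255.0, float 100.0>
     %r = call <2 x float> @llvm.vp.uitofp.v2f32.v2i8(<2 x i8> <i8 -1, i8 100>, <2 x i1> <i1 true, i1 true>, i32 2)
     ret <2 x float> %r
   }
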
21968 .. _int_vp_ptrtoint:
21970 '``llvm.vp.ptrtoint.*``' Intrinsics
21971 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21975 This is an overloaded intrinsic.
21979 declare <16 x i8> @llvm.vp.ptrtoint.v16i8.v16p0(<16 x ptr> <op>, <16 x i1> <mask>, i32 <vector_length>)
21980 declare <vscale x 4 x i8> @llvm.vp.ptrtoint.nxv4i8.nxv4p0(<vscale x 4 x ptr> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
declare <256 x i64> @llvm.vp.ptrtoint.v256i64.v256p0(<256 x ptr> <op>, <256 x i1> <mask>, i32 <vector_length>)
The '``llvm.vp.ptrtoint``' intrinsic converts its pointer operand to the integer
return type. The operation has a mask and an explicit vector length parameter.
The '``llvm.vp.ptrtoint``' intrinsic takes a value to cast as its first operand,
which must be a vector of pointers. The return type is the type to cast the
value to, which must be a vector of :ref:`integer <t_integer>` type.
21996 The second operand is the vector mask. The return type, the value to cast, and
21997 the vector mask have the same number of elements.
21998 The third operand is the explicit vector length of the operation.
The '``llvm.vp.ptrtoint``' intrinsic converts ``value`` to the return type by
interpreting the pointer value as an integer and either truncating or zero
extending that value to the size of the integer type.
If ``value`` is smaller than the return type, then a zero extension is done. If
``value`` is larger than the return type, then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.
22010 The conversion is performed on lane positions below the explicit vector length
22011 and where the vector mask is true. Masked-off lanes are ``poison``.
22016 .. code-block:: llvm
%r = call <4 x i8> @llvm.vp.ptrtoint.v4i8.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
22019 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22021 %t = ptrtoint <4 x ptr> %a to <4 x i8>
22022 %also.r = select <4 x i1> %mask, <4 x i8> %t, <4 x i8> poison
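
The example above truncates to ``i8``. As a complementary sketch, the following
converts to ``i64``; whether this is a zero extension or a *no-op cast* depends
on the pointer size in the datalayout, which is an assumption of the example.
The function name ``@ptrtoint_example`` is hypothetical.

.. code-block:: llvm

   declare <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr>, <4 x i1>, i32)

   define <4 x i64> @ptrtoint_example(<4 x ptr> %p, <4 x i1> %mask, i32 %evl) {
     ; With 64-bit pointers in address space 0, each enabled lane is a no-op cast.
     ; With 32-bit pointers, each enabled lane is zero extended to 64 bits.
     %r = call <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr> %p, <4 x i1> %mask, i32 %evl)
     ret <4 x i64> %r
   }
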
22025 .. _int_vp_inttoptr:
22027 '``llvm.vp.inttoptr.*``' Intrinsics
22028 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22032 This is an overloaded intrinsic.
22036 declare <16 x ptr> @llvm.vp.inttoptr.v16p0.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
22037 declare <vscale x 4 x ptr> @llvm.vp.inttoptr.nxv4p0.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22038 declare <256 x ptr> @llvm.vp.inttoptr.v256p0.v256i32 (<256 x i32> <op>, <256 x i1> <mask>, i32 <vector_length>)
The '``llvm.vp.inttoptr``' intrinsic converts its integer value to the pointer
return type. The operation has a mask and an explicit vector length parameter.
The '``llvm.vp.inttoptr``' intrinsic takes a value to cast as its first operand,
which must be a vector of :ref:`integer <t_integer>` type. The return type is
the type to cast the value to, which must be a vector of pointers.
22053 The second operand is the vector mask. The return type, the value to cast, and
22054 the vector mask have the same number of elements.
22055 The third operand is the explicit vector length of the operation.
22060 The '``llvm.vp.inttoptr``' intrinsic converts ``value`` to return type by
22061 applying either a zero extension or a truncation depending on the size of the
22062 integer ``value``. If ``value`` is larger than the size of a pointer, then a
22063 truncation is done. If ``value`` is smaller than the size of a pointer, then a
22064 zero extension is done. If they are the same size, nothing is done (*no-op cast*).
22065 The conversion is performed on lane positions below the explicit vector length
22066 and where the vector mask is true. Masked-off lanes are ``poison``.
22071 .. code-block:: llvm
%r = call <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
22074 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22076 %t = inttoptr <4 x i32> %a to <4 x ptr>
22077 %also.r = select <4 x i1> %mask, <4 x ptr> %t, <4 x ptr> poison
22082 '``llvm.vp.fcmp.*``' Intrinsics
22083 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22087 This is an overloaded intrinsic.
22091 declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float> <left_op>, <16 x float> <right_op>, metadata <condition code>, <16 x i1> <mask>, i32 <vector_length>)
22092 declare <vscale x 4 x i1> @llvm.vp.fcmp.nxv4f32(<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, metadata <condition code>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22093 declare <256 x i1> @llvm.vp.fcmp.v256f64(<256 x double> <left_op>, <256 x double> <right_op>, metadata <condition code>, <256 x i1> <mask>, i32 <vector_length>)
22098 The '``llvm.vp.fcmp``' intrinsic returns a vector of boolean values based on
the comparison of its operands. The operation has a mask and an explicit vector
length parameter.
22106 The '``llvm.vp.fcmp``' intrinsic takes the two values to compare as its first
22107 and second operands. These two values must be vectors of :ref:`floating-point
22108 <t_floating>` types.
22109 The return type is the result of the comparison. The return type must be a
22110 vector of :ref:`i1 <t_integer>` type. The fourth operand is the vector mask.
22111 The return type, the values to compare, and the vector mask have the same
22112 number of elements. The third operand is the condition code indicating the kind
22113 of comparison to perform. It must be a metadata string with :ref:`one of the
22114 supported floating-point condition code values <fcmp_md_cc>`. The fifth operand
22115 is the explicit vector length of the operation.
22120 The '``llvm.vp.fcmp``' compares its first two operands according to the
22121 condition code given as the third operand. The operands are compared element by
element on each enabled lane, where the semantics of the comparison are
22123 defined :ref:`according to the condition code <fcmp_md_cc_sem>`. Masked-off
22124 lanes are ``poison``.
22129 .. code-block:: llvm
22131 %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %b, metadata !"oeq", <4 x i1> %mask, i32 %evl)
22132 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22134 %t = fcmp oeq <4 x float> %a, %b
22135 %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison
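
As a sketch of a different condition code, the following uses ``"uno"`` to test
each enabled lane for NaN by comparing a vector with itself. The function name
``@isnan_example`` is hypothetical.

.. code-block:: llvm

   declare <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float>, <4 x float>, metadata, <4 x i1>, i32)

   define <4 x i1> @isnan_example(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
     ; "uno" yields true if either operand is a NaN, so on enabled lanes
     ; %r is true exactly where %a is a NaN. Disabled lanes are poison.
     %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %a, metadata !"uno", <4 x i1> %mask, i32 %evl)
     ret <4 x i1> %r
   }
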
22140 '``llvm.vp.icmp.*``' Intrinsics
22141 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22145 This is an overloaded intrinsic.
22149 declare <32 x i1> @llvm.vp.icmp.v32i32(<32 x i32> <left_op>, <32 x i32> <right_op>, metadata <condition code>, <32 x i1> <mask>, i32 <vector_length>)
22150 declare <vscale x 2 x i1> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32> <left_op>, <vscale x 2 x i32> <right_op>, metadata <condition code>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
22151 declare <128 x i1> @llvm.vp.icmp.v128i8(<128 x i8> <left_op>, <128 x i8> <right_op>, metadata <condition code>, <128 x i1> <mask>, i32 <vector_length>)
22156 The '``llvm.vp.icmp``' intrinsic returns a vector of boolean values based on
the comparison of its operands. The operation has a mask and an explicit vector
length parameter.
22164 The '``llvm.vp.icmp``' intrinsic takes the two values to compare as its first
22165 and second operands. These two values must be vectors of :ref:`integer
22166 <t_integer>` types.
22167 The return type is the result of the comparison. The return type must be a
22168 vector of :ref:`i1 <t_integer>` type. The fourth operand is the vector mask.
22169 The return type, the values to compare, and the vector mask have the same
22170 number of elements. The third operand is the condition code indicating the kind
22171 of comparison to perform. It must be a metadata string with :ref:`one of the
22172 supported integer condition code values <icmp_md_cc>`. The fifth operand is the
22173 explicit vector length of the operation.
22178 The '``llvm.vp.icmp``' compares its first two operands according to the
22179 condition code given as the third operand. The operands are compared element by
element on each enabled lane, where the semantics of the comparison are
22181 defined :ref:`according to the condition code <icmp_md_cc_sem>`. Masked-off
22182 lanes are ``poison``.
22187 .. code-block:: llvm
22189 %r = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"ne", <4 x i1> %mask, i32 %evl)
22190 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22192 %t = icmp ne <4 x i32> %a, %b
22193 %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison
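
The effect of the condition code can be sketched by comparing the same constant
operands with a signed and an unsigned predicate. The function names and input
values are hypothetical.

.. code-block:: llvm

   declare <2 x i1> @llvm.vp.icmp.v2i8(<2 x i8>, <2 x i8>, metadata, <2 x i1>, i32)

   define <2 x i1> @icmp_signed_example() {
     ; Signed: -1 < 1 is true and 1 < -1 is false.
     ; Result: <i1 true, i1 false>
     %r = call <2 x i1> @llvm.vp.icmp.v2i8(<2 x i8> <i8 -1, i8 1>, <2 x i8> <i8 1, i8 -1>, metadata !"slt", <2 x i1> <i1 true, i1 true>, i32 2)
     ret <2 x i1> %r
   }

   define <2 x i1> @icmp_unsigned_example() {
     ; Unsigned: i8 -1 is 255, so 255 < 1 is false and 1 < 255 is true.
     ; Result: <i1 false, i1 true>
     %r = call <2 x i1> @llvm.vp.icmp.v2i8(<2 x i8> <i8 -1, i8 1>, <2 x i8> <i8 1, i8 -1>, metadata !"ult", <2 x i1> <i1 true, i1 true>, i32 2)
     ret <2 x i1> %r
   }
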
22197 '``llvm.vp.ceil.*``' Intrinsics
22198 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22202 This is an overloaded intrinsic.
22206 declare <16 x float> @llvm.vp.ceil.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22207 declare <vscale x 4 x float> @llvm.vp.ceil.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22208 declare <256 x double> @llvm.vp.ceil.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22213 Predicated floating-point ceiling of a vector of floating-point values.
22219 The first operand and the result have the same vector of floating-point type.
22220 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
22227 The '``llvm.vp.ceil``' intrinsic performs floating-point ceiling
22228 (:ref:`ceil <int_ceil>`) of the first vector operand on each enabled lane. The
22229 result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22234 .. code-block:: llvm
22236 %r = call <4 x float> @llvm.vp.ceil.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22237 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22239 %t = call <4 x float> @llvm.ceil.v4f32(<4 x float> %a)
22240 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
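
A small worked example with constant operands, including one disabled lane, is
sketched below. The function name ``@ceil_example`` and the input values are
hypothetical.

.. code-block:: llvm

   declare <4 x float> @llvm.vp.ceil.v4f32(<4 x float>, <4 x i1>, i32)

   define <4 x float> @ceil_example() {
     ; Lanes 0 to 2 are enabled; lane 3 is disabled by the mask.
     ; Result: <float 2.0, float -1.0, float 2.0, float poison>
     %r = call <4 x float> @llvm.vp.ceil.v4f32(<4 x float> <float 1.5, float -1.5, float 2.0, float 0.25>, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, i32 4)
     ret <4 x float> %r
   }
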
22244 '``llvm.vp.floor.*``' Intrinsics
22245 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22249 This is an overloaded intrinsic.
22253 declare <16 x float> @llvm.vp.floor.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22254 declare <vscale x 4 x float> @llvm.vp.floor.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22255 declare <256 x double> @llvm.vp.floor.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22260 Predicated floating-point floor of a vector of floating-point values.
22266 The first operand and the result have the same vector of floating-point type.
22267 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
22274 The '``llvm.vp.floor``' intrinsic performs floating-point floor
22275 (:ref:`floor <int_floor>`) of the first vector operand on each enabled lane.
22276 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22281 .. code-block:: llvm
22283 %r = call <4 x float> @llvm.vp.floor.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22284 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22286 %t = call <4 x float> @llvm.floor.v4f32(<4 x float> %a)
22287 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22291 '``llvm.vp.rint.*``' Intrinsics
22292 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22296 This is an overloaded intrinsic.
22300 declare <16 x float> @llvm.vp.rint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22301 declare <vscale x 4 x float> @llvm.vp.rint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22302 declare <256 x double> @llvm.vp.rint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22307 Predicated floating-point rint of a vector of floating-point values.
22313 The first operand and the result have the same vector of floating-point type.
22314 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
22321 The '``llvm.vp.rint``' intrinsic performs floating-point rint
22322 (:ref:`rint <int_rint>`) of the first vector operand on each enabled lane.
22323 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22328 .. code-block:: llvm
22330 %r = call <4 x float> @llvm.vp.rint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22331 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22333 %t = call <4 x float> @llvm.rint.v4f32(<4 x float> %a)
22334 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
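
The sketch below assumes the default floating-point environment, in which the
rounding mode is round-to-nearest with ties to even; under a different rounding
mode the halfway cases would differ. The function name ``@rint_example`` and
the input values are hypothetical.

.. code-block:: llvm

   declare <4 x float> @llvm.vp.rint.v4f32(<4 x float>, <4 x i1>, i32)

   define <4 x float> @rint_example() {
     ; All four lanes are enabled. The halfway cases 2.5 and 3.5 round to the
     ; nearest even integer under the assumed default rounding mode.
     ; Result: <float 2.0, float 4.0, float 1.0, float 2.0>
     %r = call <4 x float> @llvm.vp.rint.v4f32(<4 x float> <float 2.5, float 3.5, float 1.25, float 1.75>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
     ret <4 x float> %r
   }
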
22336 .. _int_vp_nearbyint:
22338 '``llvm.vp.nearbyint.*``' Intrinsics
22339 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22343 This is an overloaded intrinsic.
22347 declare <16 x float> @llvm.vp.nearbyint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22348 declare <vscale x 4 x float> @llvm.vp.nearbyint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22349 declare <256 x double> @llvm.vp.nearbyint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22354 Predicated floating-point nearbyint of a vector of floating-point values.
22360 The first operand and the result have the same vector of floating-point type.
22361 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
22368 The '``llvm.vp.nearbyint``' intrinsic performs floating-point nearbyint
22369 (:ref:`nearbyint <int_nearbyint>`) of the first vector operand on each enabled lane.
22370 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22375 .. code-block:: llvm
22377 %r = call <4 x float> @llvm.vp.nearbyint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22378 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22380 %t = call <4 x float> @llvm.nearbyint.v4f32(<4 x float> %a)
22381 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22385 '``llvm.vp.round.*``' Intrinsics
22386 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22390 This is an overloaded intrinsic.
22394 declare <16 x float> @llvm.vp.round.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22395 declare <vscale x 4 x float> @llvm.vp.round.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22396 declare <256 x double> @llvm.vp.round.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22401 Predicated floating-point round of a vector of floating-point values.
22407 The first operand and the result have the same vector of floating-point type.
22408 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the operation.
22415 The '``llvm.vp.round``' intrinsic performs floating-point round
22416 (:ref:`round <int_round>`) of the first vector operand on each enabled lane.
22417 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22422 .. code-block:: llvm
22424 %r = call <4 x float> @llvm.vp.round.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22425 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22427 %t = call <4 x float> @llvm.round.v4f32(<4 x float> %a)
22428 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22430 .. _int_vp_roundeven:
22432 '``llvm.vp.roundeven.*``' Intrinsics
22433 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22437 This is an overloaded intrinsic.
22441 declare <16 x float> @llvm.vp.roundeven.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22442 declare <vscale x 4 x float> @llvm.vp.roundeven.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22443 declare <256 x double> @llvm.vp.roundeven.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22448 Predicated floating-point roundeven of a vector of floating-point values.
22454 The first operand and the result have the same vector of floating-point type.
22455 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the operation.
22462 The '``llvm.vp.roundeven``' intrinsic performs floating-point roundeven
22463 (:ref:`roundeven <int_roundeven>`) of the first vector operand on each enabled
22464 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22469 .. code-block:: llvm
22471 %r = call <4 x float> @llvm.vp.roundeven.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22472 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22474 %t = call <4 x float> @llvm.roundeven.v4f32(<4 x float> %a)
22475 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22477 .. _int_vp_roundtozero:
22479 '``llvm.vp.roundtozero.*``' Intrinsics
22480 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22484 This is an overloaded intrinsic.
22488 declare <16 x float> @llvm.vp.roundtozero.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22489 declare <vscale x 4 x float> @llvm.vp.roundtozero.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22490 declare <256 x double> @llvm.vp.roundtozero.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22495 Predicated floating-point round-to-zero of a vector of floating-point values.
22501 The first operand and the result have the same vector of floating-point type.
22502 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the operation.
The '``llvm.vp.roundtozero``' intrinsic performs floating-point round-to-zero
(:ref:`llvm.trunc <int_llvm_trunc>`) of the first vector operand on each enabled lane. The
result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22516 .. code-block:: llvm
22518 %r = call <4 x float> @llvm.vp.roundtozero.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22519 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22521 %t = call <4 x float> @llvm.trunc.v4f32(<4 x float> %a)
22522 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22524 .. _int_vp_bitreverse:
22526 '``llvm.vp.bitreverse.*``' Intrinsics
22527 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22531 This is an overloaded intrinsic.
22535 declare <16 x i32> @llvm.vp.bitreverse.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
22536 declare <vscale x 4 x i32> @llvm.vp.bitreverse.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22537 declare <256 x i64> @llvm.vp.bitreverse.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
22542 Predicated bitreverse of a vector of integers.
22548 The first operand and the result have the same vector of integer type. The
22549 second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the operation.
22556 The '``llvm.vp.bitreverse``' intrinsic performs bitreverse (:ref:`bitreverse <int_bitreverse>`) of the first operand on each
22557 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22562 .. code-block:: llvm
22564 %r = call <4 x i32> @llvm.vp.bitreverse.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
22565 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22567 %t = call <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> %a)
22568 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
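
The lane-enabling rule used by the predicated intrinsics can be modelled in scalar C. The following is only an illustrative sketch, not LLVM's implementation: the names ``bitreverse32`` and ``vp_bitreverse_v4i32`` are hypothetical, and poison is approximated by leaving disabled lanes of the output untouched.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Reverse the bit order of one 32-bit lane. */
    static uint32_t bitreverse32(uint32_t x) {
        uint32_t r = 0;
        for (int bit = 0; bit < 32; ++bit) {
            r = (r << 1) | (x & 1u);
            x >>= 1;
        }
        return r;
    }

    /* Hypothetical scalar model of llvm.vp.bitreverse.v4i32.
       A lane is enabled when its mask bit is set and its index is below evl.
       Disabled lanes of r are left untouched, standing in for poison:
       callers must not rely on their contents. */
    static void vp_bitreverse_v4i32(uint32_t r[4], const uint32_t a[4],
                                    const bool mask[4], uint32_t evl) {
        for (uint32_t i = 0; i < 4; ++i) {
            if (mask[i] && i < evl)
                r[i] = bitreverse32(a[i]);
        }
    }

The same two-part test (mask bit set, lane index below the explicit vector length) applies to the other ``llvm.vp.*`` intrinsics in this section.
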
22573 '``llvm.vp.bswap.*``' Intrinsics
22574 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22578 This is an overloaded intrinsic.
22582 declare <16 x i32> @llvm.vp.bswap.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
22583 declare <vscale x 4 x i32> @llvm.vp.bswap.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22584 declare <256 x i64> @llvm.vp.bswap.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
22589 Predicated bswap of a vector of integers.
22595 The first operand and the result have the same vector of integer type. The
22596 second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the operation.
22603 The '``llvm.vp.bswap``' intrinsic performs bswap (:ref:`bswap <int_bswap>`) of the first operand on each
22604 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22609 .. code-block:: llvm
22611 %r = call <4 x i32> @llvm.vp.bswap.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
22612 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22614 %t = call <4 x i32> @llvm.bswap.v4i32(<4 x i32> %a)
22615 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
22620 '``llvm.vp.ctpop.*``' Intrinsics
22621 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22625 This is an overloaded intrinsic.
22629 declare <16 x i32> @llvm.vp.ctpop.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
22630 declare <vscale x 4 x i32> @llvm.vp.ctpop.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22631 declare <256 x i64> @llvm.vp.ctpop.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
22636 Predicated ctpop of a vector of integers.
22642 The first operand and the result have the same vector of integer type. The
22643 second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the operation.
22650 The '``llvm.vp.ctpop``' intrinsic performs ctpop (:ref:`ctpop <int_ctpop>`) of the first operand on each
22651 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22656 .. code-block:: llvm
22658 %r = call <4 x i32> @llvm.vp.ctpop.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
22659 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22661 %t = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> %a)
22662 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
22667 '``llvm.vp.ctlz.*``' Intrinsics
22668 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22672 This is an overloaded intrinsic.
22676 declare <16 x i32> @llvm.vp.ctlz.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
22677 declare <vscale x 4 x i32> @llvm.vp.ctlz.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
22678 declare <256 x i64> @llvm.vp.ctlz.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
22683 Predicated ctlz of a vector of integers.
22689 The first operand and the result have the same vector of integer type. The
22690 second operand is the vector mask and has the same number of elements as the
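result vector type. The third operand is the explicit vector length of the
operation. The fourth operand is a constant flag that indicates whether the
intrinsic returns a valid result if the first operand is zero. If the first
operand is zero and the fourth operand is true, the result is poison.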
22697 The '``llvm.vp.ctlz``' intrinsic performs ctlz (:ref:`ctlz <int_ctlz>`) of the first operand on each
22698 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22703 .. code-block:: llvm
22705 %r = call <4 x i32> @llvm.vp.ctlz.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
22706 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22708 %t = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %a, i1 false)
22709 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
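
How the mask, the explicit vector length and the ``is_zero_poison`` flag interact can be sketched in scalar C. This is a hedged illustration only: ``ctlz32`` and ``vp_ctlz_v4i32`` are hypothetical names, and poison is modelled with a separate ``valid`` array rather than an actual poison value.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Count leading zero bits of a 32-bit lane. Returns 32 for x == 0,
       which is the defined result when is_zero_poison is false. */
    static uint32_t ctlz32(uint32_t x) {
        uint32_t n = 0;
        while (n < 32 && (x & 0x80000000u) == 0) {
            x <<= 1;
            ++n;
        }
        return n;
    }

    /* Hypothetical scalar model of llvm.vp.ctlz.v4i32.
       valid[i] is set only for lanes whose result is not poison. */
    static void vp_ctlz_v4i32(uint32_t r[4], bool valid[4],
                              const uint32_t a[4], const bool mask[4],
                              uint32_t evl, bool is_zero_poison) {
        for (uint32_t i = 0; i < 4; ++i) {
            bool enabled = mask[i] && i < evl;
            valid[i] = enabled && !(is_zero_poison && a[i] == 0);
            if (valid[i])
                r[i] = ctlz32(a[i]);
        }
    }
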
22714 '``llvm.vp.cttz.*``' Intrinsics
22715 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22719 This is an overloaded intrinsic.
22723 declare <16 x i32> @llvm.vp.cttz.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
22724 declare <vscale x 4 x i32> @llvm.vp.cttz.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
22725 declare <256 x i64> @llvm.vp.cttz.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
22730 Predicated cttz of a vector of integers.
22736 The first operand and the result have the same vector of integer type. The
22737 second operand is the vector mask and has the same number of elements as the
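result vector type. The third operand is the explicit vector length of the
operation. The fourth operand is a constant flag that indicates whether the
intrinsic returns a valid result if the first operand is zero. If the first
operand is zero and the fourth operand is true, the result is poison.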
22744 The '``llvm.vp.cttz``' intrinsic performs cttz (:ref:`cttz <int_cttz>`) of the first operand on each
22745 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22750 .. code-block:: llvm
22752 %r = call <4 x i32> @llvm.vp.cttz.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
22753 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22755 %t = call <4 x i32> @llvm.cttz.v4i32(<4 x i32> %a, i1 false)
22756 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
22761 '``llvm.vp.fshl.*``' Intrinsics
22762 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22766 This is an overloaded intrinsic.
22770 declare <16 x i32> @llvm.vp.fshl.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22771 declare <vscale x 4 x i32> @llvm.vp.fshl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22772 declare <256 x i64> @llvm.vp.fshl.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22777 Predicated fshl of three vectors of integers.
The first three operands and the result have the same vector of integer type. The
fourth operand is the vector mask and has the same number of elements as the
result vector type. The fifth operand is the explicit vector length of the operation.
The '``llvm.vp.fshl``' intrinsic performs fshl (:ref:`fshl <int_fshl>`) of the first, second, and third
vector operands on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22798 .. code-block:: llvm
22800 %r = call <4 x i32> @llvm.vp.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
22801 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22803 %t = call <4 x i32> @llvm.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
22804 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
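
The per-lane funnel shift can be written as a short scalar model. The sketch below assumes 32-bit lanes; ``fshl32`` and ``vp_fshl_v4i32`` are hypothetical helper names, and disabled lanes are simply left untouched to stand in for poison.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Funnel shift left: concatenate a (high half) and b (low half), shift
       the 64-bit value left by c modulo 32, and keep the high 32 bits. */
    static uint32_t fshl32(uint32_t a, uint32_t b, uint32_t c) {
        uint64_t wide = ((uint64_t)a << 32) | b;
        return (uint32_t)((wide << (c % 32u)) >> 32);
    }

    /* Hypothetical scalar model of llvm.vp.fshl.v4i32. */
    static void vp_fshl_v4i32(uint32_t r[4], const uint32_t a[4],
                              const uint32_t b[4], const uint32_t c[4],
                              const bool mask[4], uint32_t evl) {
        for (uint32_t i = 0; i < 4; ++i) {
            if (mask[i] && i < evl)
                r[i] = fshl32(a[i], b[i], c[i]);
        }
    }

When the first two operands of a lane are equal, the model reduces to a rotate left by the shift amount.
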
22807 '``llvm.vp.fshr.*``' Intrinsics
22808 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22812 This is an overloaded intrinsic.
22816 declare <16 x i32> @llvm.vp.fshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22817 declare <vscale x 4 x i32> @llvm.vp.fshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22818 declare <256 x i64> @llvm.vp.fshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22823 Predicated fshr of three vectors of integers.
The first three operands and the result have the same vector of integer type. The
fourth operand is the vector mask and has the same number of elements as the
result vector type. The fifth operand is the explicit vector length of the operation.
The '``llvm.vp.fshr``' intrinsic performs fshr (:ref:`fshr <int_fshr>`) of the first, second, and third
vector operands on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22844 .. code-block:: llvm
22846 %r = call <4 x i32> @llvm.vp.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
22847 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22849 %t = call <4 x i32> @llvm.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
22850 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
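
A scalar model of the right funnel shift, under the same assumptions as before (32-bit lanes, hypothetical names ``fshr32`` and ``vp_fshr_v4i32``, disabled lanes left untouched in place of poison), might look like this:

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Funnel shift right: concatenate a (high half) and b (low half), shift
       the 64-bit value right by c modulo 32, and keep the low 32 bits. */
    static uint32_t fshr32(uint32_t a, uint32_t b, uint32_t c) {
        uint64_t wide = ((uint64_t)a << 32) | b;
        return (uint32_t)(wide >> (c % 32u));
    }

    /* Hypothetical scalar model of llvm.vp.fshr.v4i32. */
    static void vp_fshr_v4i32(uint32_t r[4], const uint32_t a[4],
                              const uint32_t b[4], const uint32_t c[4],
                              const bool mask[4], uint32_t evl) {
        for (uint32_t i = 0; i < 4; ++i) {
            if (mask[i] && i < evl)
                r[i] = fshr32(a[i], b[i], c[i]);
        }
    }
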
22853 .. _int_mload_mstore:
22855 Masked Vector Load and Store Intrinsics
22856 ---------------------------------------
22858 LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
22862 '``llvm.masked.load.*``' Intrinsics
22863 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22867 This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
22871 declare <16 x float> @llvm.masked.load.v16f32.p0(ptr <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
22872 declare <2 x double> @llvm.masked.load.v2f64.p0(ptr <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
22873 ;; The data is a vector of pointers
22874 declare <8 x ptr> @llvm.masked.load.v8p0.p0(ptr <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)
22879 Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
22885 The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
22890 The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
22891 The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
      %res = call <16 x float> @llvm.masked.load.v16f32.p0(ptr %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)
22898 ;; The result of the two following instructions is identical aside from potential memory access exception
22899 %loadlal = load <16 x float>, ptr %ptr, align 4
22900 %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru
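
The difference from a plain load followed by a select is that masked-off lanes never touch memory. A minimal scalar sketch of that behaviour, with the hypothetical name ``masked_load_f32`` and ignoring alignment:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.load for n float lanes.
       ptr[i] is only read when mask[i] is set; otherwise the lane is
       filled from passthru and memory is not accessed. */
    static void masked_load_f32(float *res, const float *ptr,
                                const bool *mask, const float *passthru,
                                size_t n) {
        for (size_t i = 0; i < n; ++i)
            res[i] = mask[i] ? ptr[i] : passthru[i];
    }

Because disabled lanes are skipped, the model can be called on a buffer shorter than the vector as long as every lane past the end of the buffer is masked off.
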
22904 '``llvm.masked.store.*``' Intrinsics
22905 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22909 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
22913 declare void @llvm.masked.store.v8i32.p0 (<8 x i32> <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
22914 declare void @llvm.masked.store.v16f32.p0(<16 x float> <value>, ptr <ptr>, i32 <alignment>, <16 x i1> <mask>)
22915 ;; The data is a vector of pointers
22916 declare void @llvm.masked.store.v8p0.p0 (<8 x ptr> <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
22921 Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
22933 The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
22937 call void @llvm.masked.store.v16f32.p0(<16 x float> %value, ptr %ptr, i32 4, <16 x i1> %mask)
22939 ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
22940 %oldval = load <16 x float>, ptr %ptr, align 4
22941 %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
22942 store <16 x float> %res, ptr %ptr, align 4
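
Unlike the load-select-store sequence, a masked store writes only the enabled lanes and never reads the destination. A hedged scalar sketch, with the hypothetical name ``masked_store_f32`` and ignoring alignment:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.store for n float lanes.
       Memory at ptr[i] is written only when mask[i] is set; masked-off
       locations are neither read nor written. */
    static void masked_store_f32(const float *value, float *ptr,
                                 const bool *mask, size_t n) {
        for (size_t i = 0; i < n; ++i) {
            if (mask[i])
                ptr[i] = value[i];
        }
    }
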
22945 Masked Vector Gather and Scatter Intrinsics
22946 -------------------------------------------
22948 LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
22952 '``llvm.masked.gather.*``' Intrinsics
22953 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22957 This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
22961 declare <16 x float> @llvm.masked.gather.v16f32.v16p0(<16 x ptr> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
22962 declare <2 x double> @llvm.masked.gather.v2f64.v2p1(<2 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
22963 declare <8 x ptr> @llvm.masked.gather.v8p0.v8p0(<8 x ptr> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)
22968 Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
22974 The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.
22979 The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
22985 %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0(<4 x ptr> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> poison)
22987 ;; The gather with all-true mask is equivalent to the following instruction sequence
22988 %ptr0 = extractelement <4 x ptr> %ptrs, i32 0
22989 %ptr1 = extractelement <4 x ptr> %ptrs, i32 1
22990 %ptr2 = extractelement <4 x ptr> %ptrs, i32 2
22991 %ptr3 = extractelement <4 x ptr> %ptrs, i32 3
22993 %val0 = load double, ptr %ptr0, align 8
22994 %val1 = load double, ptr %ptr1, align 8
22995 %val2 = load double, ptr %ptr2, align 8
22996 %val3 = load double, ptr %ptr3, align 8
      %vec0    = insertelement <4 x double> poison, double %val0, i32 0
      %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
      %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
      %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
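
The example above covers only an all-true mask. With an arbitrary mask, the behaviour can be sketched in scalar C as follows. The name ``masked_gather_f64`` is hypothetical and alignment is ignored.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.gather for n double lanes.
       Each enabled lane loads through its own pointer; disabled lanes take
       the passthru value and their pointer is never dereferenced. */
    static void masked_gather_f64(double *res, double *const *ptrs,
                                  const bool *mask, const double *passthru,
                                  size_t n) {
        for (size_t i = 0; i < n; ++i)
            res[i] = mask[i] ? *ptrs[i] : passthru[i];
    }

Since the pointer of a disabled lane is not dereferenced, it may be null or otherwise invalid in this model.
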
23005 '``llvm.masked.scatter.*``' Intrinsics
23006 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23010 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
23014 declare void @llvm.masked.scatter.v8i32.v8p0 (<8 x i32> <value>, <8 x ptr> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
23015 declare void @llvm.masked.scatter.v16f32.v16p1(<16 x float> <value>, <16 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
23016 declare void @llvm.masked.scatter.v4p0.v4p0 (<4 x ptr> <value>, <4 x ptr> <ptrs>, i32 <alignment>, <4 x i1> <mask>)
23021 Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
23026 The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
23035 ;; This instruction unconditionally stores data vector in multiple addresses
      call void @llvm.masked.scatter.v8i32.v8p0(<8 x i32> %value, <8 x ptr> %ptrs, i32 4, <8 x i1> <true, true, .. true>)
23038 ;; It is equivalent to a list of scalar stores
23039 %val0 = extractelement <8 x i32> %value, i32 0
23040 %val1 = extractelement <8 x i32> %value, i32 1
23042 %val7 = extractelement <8 x i32> %value, i32 7
23043 %ptr0 = extractelement <8 x ptr> %ptrs, i32 0
23044 %ptr1 = extractelement <8 x ptr> %ptrs, i32 1
23046 %ptr7 = extractelement <8 x ptr> %ptrs, i32 7
23047 ;; Note: the order of the following stores is important when they overlap:
23048 store i32 %val0, ptr %ptr0, align 4
23049 store i32 %val1, ptr %ptr1, align 4
23051 store i32 %val7, ptr %ptr7, align 4
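
The ordering guarantee for overlapping addresses follows naturally from a lane-by-lane scalar model. The sketch below uses the hypothetical name ``masked_scatter_i32`` and ignores alignment.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical scalar model of llvm.masked.scatter for n i32 lanes.
       Lanes are processed from lowest to highest index, so when several
       enabled lanes target the same address the highest enabled lane wins. */
    static void masked_scatter_i32(const int32_t *value, int32_t *const *ptrs,
                                   const bool *mask, size_t n) {
        for (size_t i = 0; i < n; ++i) {
            if (mask[i])
                *ptrs[i] = value[i];
        }
    }

For instance, if lanes 0 and 1 are both enabled and point at the same location, that location ends up holding the value from lane 1.
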
23054 Masked Vector Expanding Load and Compressing Store Intrinsics
23055 -------------------------------------------------------------
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
23059 .. _int_expandload:
23061 '``llvm.masked.expandload.*``' Intrinsics
23062 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23066 This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
23070 declare <16 x float> @llvm.masked.expandload.v16f32 (ptr <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
23071 declare <2 x i64> @llvm.masked.expandload.v2i64 (ptr <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)
23076 Reads a number of scalar values sequentially from memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
23082 The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:

.. code-block:: c

    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        A[i] = B[j++];
    }

23099 .. code-block:: llvm
23101 ; Load several elements from array B and expand them in a vector.
23102 ; The number of loaded elements is equal to the number of '1' elements in the Mask.
23103 %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(ptr %Bptr, <8 x i1> %Mask, <8 x double> poison)
23104 ; Store the result in A
23105 call void @llvm.masked.store.v8f64.p0(<8 x double> %Tmp, ptr %Aptr, i32 8, <8 x i1> %Mask)
23107 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
23108 %MaskI = bitcast <8 x i1> %Mask to i8
23109 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
23110 %MaskI64 = zext i8 %MaskIPopcnt to i64
23111 %BNextInd = add i64 %BInd, %MaskI64
23114 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
23115 If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
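
The expanding behaviour can be summarised by a small scalar model. This is a sketch under stated assumptions rather than the actual lowering: the name ``expandload_f64`` is hypothetical, and the returned count is an addition of this example, used to show how far the source pointer should advance.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.expandload for n double lanes.
       Consecutive memory elements are placed into the enabled lanes in
       order; disabled lanes are filled from passthru. Returns the number of
       elements read, which equals the number of set mask bits. */
    static size_t expandload_f64(double *res, const double *ptr,
                                 const bool *mask, const double *passthru,
                                 size_t n) {
        size_t j = 0;
        for (size_t i = 0; i < n; ++i)
            res[i] = mask[i] ? ptr[j++] : passthru[i];
        return j;
    }

With lanes 0, 3 and 7 enabled, the model reads three consecutive elements and places them in those lanes, matching the '10010001' example in the overview.
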
23117 .. _int_compressstore:
23119 '``llvm.masked.compressstore.*``' Intrinsics
23120 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23124 This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
23128 declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, ptr <ptr>, <8 x i1> <mask>)
23129 declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, ptr <ptr>, <16 x i1> <mask>)
Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependencies, as in the following example:

.. code-block:: c

    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
    }

23157 .. code-block:: llvm
23159 ; Load elements from A.
23160 %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0(ptr %Aptr, i32 8, <8 x i1> %Mask, <8 x double> poison)
23161 ; Store all selected elements consecutively in array B
      call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, ptr %Bptr, <8 x i1> %Mask)
23164 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
23165 %MaskI = bitcast <8 x i1> %Mask to i8
23166 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
23167 %MaskI64 = zext i8 %MaskIPopcnt to i64
23168 %BNextInd = add i64 %BInd, %MaskI64
23171 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
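
A scalar model makes the compressing behaviour concrete. As with the other sketches, this is illustrative only: ``compressstore_f64`` is a hypothetical name, and the returned count is an addition of this example that corresponds to the amount by which ``%Bptr`` advances.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.compressstore for n double
       lanes. Enabled lanes are written to consecutive memory locations in
       lane order. Returns the number of elements stored, which equals the
       number of set mask bits. */
    static size_t compressstore_f64(const double *value, double *ptr,
                                    const bool *mask, size_t n) {
        size_t j = 0;
        for (size_t i = 0; i < n; ++i) {
            if (mask[i])
                ptr[j++] = value[i];
        }
        return j;
    }
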
23177 This class of intrinsics provides information about the
23178 :ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
23183 '``llvm.lifetime.start``' Intrinsic
23184 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23191 declare void @llvm.lifetime.start(i64 <size>, ptr nocapture <ptr>)
23196 The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
23202 The first argument is a constant integer representing the size of the
23203 object, or -1 if it is variable sized. The second argument is a pointer
23209 If ``ptr`` is a stack-allocated object and it points to the first byte of
23210 the object, the object is initially marked as dead.
23211 ``ptr`` is conservatively considered as a non-stack-allocated object if
23212 the stack coloring algorithm that is used in the optimization pipeline cannot
23213 conclude that ``ptr`` is a stack-allocated object.
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is marked
23216 as alive and has an uninitialized value.
23217 The stack object is marked as dead when either
23218 :ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
23221 After :ref:`llvm.lifetime.end <int_lifeend>` is called,
23222 '``llvm.lifetime.start``' on the stack object can be called again.
23223 The second '``llvm.lifetime.start``' call marks the object as alive, but it
23224 does not change the address of the object.
If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.
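
A minimal sketch of how a lifetime start is typically paired with a lifetime
end around the uses of a stack object. The function name ``@scoped`` is
hypothetical, and the ``.p0`` suffix is assumed here to name the overload for
the default address space.

.. code-block:: llvm

    declare void @llvm.lifetime.start.p0(i64, ptr nocapture)
    declare void @llvm.lifetime.end.p0(i64, ptr nocapture)

    define i32 @scoped() {
    entry:
      %buf = alloca i32
      ; %buf points to the first byte of a 4-byte stack object, so this call
      ; marks the object as alive with an uninitialized value.
      call void @llvm.lifetime.start.p0(i64 4, ptr %buf)
      store i32 42, ptr %buf
      %v = load i32, ptr %buf
      ; All accesses happen before the object is marked dead again.
      call void @llvm.lifetime.end.p0(i64 4, ptr %buf)
      ret i32 %v
    }
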
23233 '``llvm.lifetime.end``' Intrinsic
23234 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23241 declare void @llvm.lifetime.end(i64 <size>, ptr nocapture <ptr>)
23246 The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
23252 The first argument is a constant integer representing the size of the
23253 object, or -1 if it is variable sized. The second argument is a pointer
23259 If ``ptr`` is a stack-allocated object and it points to the first byte of the
23260 object, the object is dead.
23261 ``ptr`` is conservatively considered as a non-stack-allocated object if
23262 the stack coloring algorithm that is used in the optimization pipeline cannot
23263 conclude that ``ptr`` is a stack-allocated object.
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
23267 If ``ptr`` is a non-stack-allocated object or it does not point to the first
23268 byte of the object, it is equivalent to simply filling all bytes of the object
23272 '``llvm.invariant.start``' Intrinsic
23273 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23277 This is an overloaded intrinsic. The memory object can belong to any address space.
23281 declare ptr @llvm.invariant.start.p0(i64 <size>, ptr nocapture <ptr>)
23286 The '``llvm.invariant.start``' intrinsic specifies that the contents of
23287 a memory object will not change.
23292 The first argument is a constant integer representing the size of the
23293 object, or -1 if it is variable sized. The second argument is a pointer
23299 This intrinsic indicates that until an ``llvm.invariant.end`` that uses
23300 the return value, the referenced memory location is constant and
23303 '``llvm.invariant.end``' Intrinsic
23304 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23308 This is an overloaded intrinsic. The memory object can belong to any address space.
23312 declare void @llvm.invariant.end.p0(ptr <start>, i64 <size>, ptr nocapture <ptr>)
23317 The '``llvm.invariant.end``' intrinsic specifies that the contents of a
23318 memory object are mutable.
The first argument is the value returned by the matching
``llvm.invariant.start`` intrinsic. The second argument is a constant integer
representing the size of the object, or -1 if it is variable sized, and the
third argument is a pointer to the object.
23331 This intrinsic indicates that the memory is mutable again.
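
The pairing of the two intrinsics can be sketched as follows. The function name
``@sum_twice`` and the 4-byte object are assumptions for illustration; the
token returned by ``llvm.invariant.start`` is passed as the first argument of
``llvm.invariant.end``.

.. code-block:: llvm

    declare ptr @llvm.invariant.start.p0(i64, ptr nocapture)
    declare void @llvm.invariant.end.p0(ptr, i64, ptr nocapture)

    define i32 @sum_twice(ptr %p) {
      store i32 7, ptr %p
      ; From here until the matching end, the 4 bytes at %p are invariant.
      %inv = call ptr @llvm.invariant.start.p0(i64 4, ptr %p)
      %a = load i32, ptr %p
      %b = load i32, ptr %p
      call void @llvm.invariant.end.p0(ptr %inv, i64 4, ptr %p)
      ; The memory is mutable again, so this store is allowed.
      store i32 9, ptr %p
      %sum = add i32 %a, %b
      ret i32 %sum
    }
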
23333 '``llvm.launder.invariant.group``' Intrinsic
23334 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23338 This is an overloaded intrinsic. The memory object can belong to any address
23339 space. The returned pointer must belong to the same address space as the
23344 declare ptr @llvm.launder.invariant.group.p0(ptr <ptr>)
23349 The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
23350 established by ``invariant.group`` metadata no longer holds, to obtain a new
23351 pointer value that carries fresh invariant group information. It is an
23352 experimental intrinsic, which means that its semantics might change in the
23359 The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
23365 Returns another pointer that aliases its argument but which is considered different
23366 for the purposes of ``load``/``store`` ``invariant.group`` metadata.
23367 It does not read any accessible memory and the execution can be speculated.
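
A hedged sketch of one possible use: after a call that may replace the object,
the pointer is laundered before it is used for another ``invariant.group``
load. The callee ``@replace_object`` and the function name ``@reload`` are
hypothetical.

.. code-block:: llvm

    declare ptr @llvm.launder.invariant.group.p0(ptr)
    declare void @replace_object(ptr)   ; hypothetical callee

    define i8 @reload(ptr %p) {
      %old = load i8, ptr %p, !invariant.group !0
      call void @replace_object(ptr %p)
      ; %q aliases %p but is treated as a different pointer for the purposes
      ; of invariant.group metadata.
      %q = call ptr @llvm.launder.invariant.group.p0(ptr %p)
      %new = load i8, ptr %q, !invariant.group !0
      %r = add i8 %old, %new
      ret i8 %r
    }

    !0 = !{}
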
23369 '``llvm.strip.invariant.group``' Intrinsic
23370 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23374 This is an overloaded intrinsic. The memory object can belong to any address
23375 space. The returned pointer must belong to the same address space as the
23380 declare ptr @llvm.strip.invariant.group.p0(ptr <ptr>)
23385 The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
23386 established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
23387 value that does not carry the invariant information. It is an experimental
23388 intrinsic, which means that its semantics might change in the future.
23394 The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
23400 Returns another pointer that aliases its argument but which has no associated
23401 ``invariant.group`` metadata.
23402 It does not read any memory and can be speculated.
23408 Constrained Floating-Point Intrinsics
23409 -------------------------------------
23411 These intrinsics are used to provide special handling of floating-point
23412 operations when specific rounding mode or floating-point exception behavior is
23413 required. By default, LLVM optimization passes assume that the rounding mode is
23414 round-to-nearest and that floating-point exceptions will not be monitored.
23415 Constrained FP intrinsics are used to support non-default rounding modes and
23416 accurately preserve exception behavior without compromising LLVM's ability to
23417 optimize FP code when the default behavior is used.
If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can cause miscompiles if constrained and normal
operations are mixed. The correct way to mix constrained and less constrained
operations is to use the rounding mode and exception handling metadata to
mark constrained intrinsics as having LLVM's default behavior.
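
A minimal sketch of such mixing, assuming a hypothetical function ``@mixed``.
The multiplication is expressed as a constrained intrinsic whose metadata
arguments match LLVM's default assumptions, while the addition is fully
constrained. The ``strictfp`` attributes follow the requirements described
later in this section.

.. code-block:: llvm

    declare float @llvm.experimental.constrained.fmul.f32(float, float, metadata, metadata)
    declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata)

    define float @mixed(float %a, float %b, float %c) strictfp {
      ; Less constrained: round-to-nearest, exceptions not monitored.
      %p = call float @llvm.experimental.constrained.fmul.f32(
               float %a, float %b,
               metadata !"round.tonearest",
               metadata !"fpexcept.ignore") strictfp
      ; Fully constrained: unknown rounding mode, exceptions preserved.
      %s = call float @llvm.experimental.constrained.fadd.f32(
               float %p, float %c,
               metadata !"round.dynamic",
               metadata !"fpexcept.strict") strictfp
      ret float %s
    }
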
23426 Each of these intrinsics corresponds to a normal floating-point operation. The
23427 data arguments and the return value are the same as the corresponding FP
23430 The rounding mode argument is a metadata string specifying what
23431 assumptions, if any, the optimizer can make when transforming constant
23432 values. Some constrained FP intrinsics omit this argument. If required
23433 by the intrinsic, this argument must be one of the following strings:
23442 "round.tonearestaway"
23444 If this argument is "round.dynamic" optimization passes must assume that the
23445 rounding mode is unknown and may change at runtime. No transformations that
23446 depend on rounding mode may be performed in this case.
23448 The other possible values for the rounding mode argument correspond to the
23449 similarly named IEEE rounding modes. If the argument is any of these values
23450 optimization passes may perform transformations as long as they are consistent
23451 with the specified rounding mode.
23453 For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
23454 "round.downward" or "round.dynamic" because if the value of 'x' is +0 then
23455 'x-0' should evaluate to '-0' when rounding downward. However, this
23456 transformation is legal for all other rounding modes.
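
The 'x-0' case can be written out as follows. The function name ``@sub_zero``
is hypothetical.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)

    define double @sub_zero(double %x) strictfp {
      ; This must not be folded to %x: if %x is +0.0, the subtraction
      ; evaluates to -0.0 when rounding downward.
      %r = call double @llvm.experimental.constrained.fsub.f64(
               double %x, double 0.0,
               metadata !"round.downward",
               metadata !"fpexcept.ignore") strictfp
      ret double %r
    }
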
23458 For values other than "round.dynamic" optimization passes may assume that the
23459 actual runtime rounding mode (as defined in a target-specific manner) matches
23460 the specified rounding mode, but this is not guaranteed. Using a specific
23461 non-dynamic rounding mode which does not match the actual rounding mode at
23462 runtime results in undefined behavior.
The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This argument
23466 must be one of the following strings:
23474 If this argument is "fpexcept.ignore" optimization passes may assume that the
23475 exception status flags will not be read and that floating-point exceptions will
23476 be masked. This allows transformations to be performed that may change the
23477 exception semantics of the original code. For example, FP operations may be
23478 speculatively executed in this case whereas they must not be for either of the
23479 other possible values of this argument.
23481 If the exception behavior argument is "fpexcept.maytrap" optimization passes
23482 must avoid transformations that may raise exceptions that would not have been
23483 raised by the original code (such as speculatively executing FP operations), but
23484 passes are not required to preserve all exceptions that are implied by the
23485 original code. For example, exceptions may be potentially hidden by constant
23488 If the exception behavior argument is "fpexcept.strict" all transformations must
23489 strictly preserve the floating-point exception semantics of the original code.
23490 Any FP exception that would have been raised by the original code must be raised
23491 by the transformed code, and the transformed code must not raise any FP
23492 exceptions that would not have been raised by the original code. This is the
23493 exception behavior argument that will be used if the code being compiled reads
23494 the FP exception status flags, but this mode can also be used with code that
23495 unmasks FP exceptions.
The number and order of floating-point exceptions are NOT guaranteed. For
23498 example, a series of FP operations that each may raise exceptions may be
23499 vectorized into a single instruction that raises each unique exception a single
23502 Proper :ref:`function attributes <fnattrs>` usage is required for the
23503 constrained intrinsics to function correctly.
23505 All function *calls* done in a function that uses constrained floating
23506 point intrinsics must have the ``strictfp`` attribute either on the
23507 calling instruction or on the declaration or definition of the function
23510 All function *definitions* that use constrained floating point intrinsics
23511 must have the ``strictfp`` attribute.
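
A sketch of the attribute placement described above, assuming a hypothetical
external function ``@helper``. The definition carries ``strictfp``, and so does
every call inside it, including the call to the ordinary function.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)
    declare double @helper(double)   ; hypothetical callee

    define double @caller(double %x) strictfp {
      ; strictfp on the call site of a non-intrinsic function.
      %h = call double @helper(double %x) strictfp
      %r = call double @llvm.experimental.constrained.sqrt.f64(
               double %h,
               metadata !"round.dynamic",
               metadata !"fpexcept.strict") strictfp
      ret double %r
    }
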
23513 '``llvm.experimental.constrained.fadd``' Intrinsic
23514 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23522 @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
23523 metadata <rounding mode>,
23524 metadata <exception behavior>)
23529 The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
23536 The first two arguments to the '``llvm.experimental.constrained.fadd``'
23537 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
23538 of floating-point values. Both arguments must have identical types.
23540 The third and fourth arguments specify the rounding mode and exception
23541 behavior as described above.
23546 The value produced is the floating-point sum of the two value operands and has
23547 the same type as the operands.
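
For illustration, a vector instance of the intrinsic might look like the
following sketch. The function name ``@vadd`` and the particular metadata
values are arbitrary choices.

.. code-block:: llvm

    declare <4 x float> @llvm.experimental.constrained.fadd.v4f32(<4 x float>, <4 x float>, metadata, metadata)

    define <4 x float> @vadd(<4 x float> %a, <4 x float> %b) strictfp {
      ; Both value operands and the result share the type <4 x float>.
      %sum = call <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                 <4 x float> %a, <4 x float> %b,
                 metadata !"round.upward",
                 metadata !"fpexcept.maytrap") strictfp
      ret <4 x float> %sum
    }
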
23550 '``llvm.experimental.constrained.fsub``' Intrinsic
23551 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23559 @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
23560 metadata <rounding mode>,
23561 metadata <exception behavior>)
23566 The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
23567 of its two operands.
23573 The first two arguments to the '``llvm.experimental.constrained.fsub``'
23574 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
23575 of floating-point values. Both arguments must have identical types.
23577 The third and fourth arguments specify the rounding mode and exception
23578 behavior as described above.
23583 The value produced is the floating-point difference of the two value operands
23584 and has the same type as the operands.
23587 '``llvm.experimental.constrained.fmul``' Intrinsic
23588 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23596 @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
23597 metadata <rounding mode>,
23598 metadata <exception behavior>)
23603 The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
23610 The first two arguments to the '``llvm.experimental.constrained.fmul``'
23611 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
23612 of floating-point values. Both arguments must have identical types.
23614 The third and fourth arguments specify the rounding mode and exception
23615 behavior as described above.
23620 The value produced is the floating-point product of the two value operands and
23621 has the same type as the operands.
23624 '``llvm.experimental.constrained.fdiv``' Intrinsic
23625 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23633 @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
23634 metadata <rounding mode>,
23635 metadata <exception behavior>)
23640 The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
23647 The first two arguments to the '``llvm.experimental.constrained.fdiv``'
23648 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
23649 of floating-point values. Both arguments must have identical types.
23651 The third and fourth arguments specify the rounding mode and exception
23652 behavior as described above.
23657 The value produced is the floating-point quotient of the two value operands and
23658 has the same type as the operands.
23661 '``llvm.experimental.constrained.frem``' Intrinsic
23662 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23670 @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
23671 metadata <rounding mode>,
23672 metadata <exception behavior>)
23677 The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
23678 from the division of its two operands.
23684 The first two arguments to the '``llvm.experimental.constrained.frem``'
23685 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
23686 of floating-point values. Both arguments must have identical types.
23688 The third and fourth arguments specify the rounding mode and exception
23689 behavior as described above. The rounding mode argument has no effect, since
23690 the result of frem is never rounded, but the argument is included for
23691 consistency with the other constrained floating-point intrinsics.
23696 The value produced is the floating-point remainder from the division of the two
23697 value operands and has the same type as the operands. The remainder has the
23698 same sign as the dividend.
23700 '``llvm.experimental.constrained.fma``' Intrinsic
23701 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23709 @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
23710 metadata <rounding mode>,
23711 metadata <exception behavior>)
23716 The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
23717 fused-multiply-add operation on its operands.
23722 The first three arguments to the '``llvm.experimental.constrained.fma``'
23723 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
23724 <t_vector>` of floating-point values. All arguments must have identical types.
23726 The fourth and fifth arguments specify the rounding mode and exception behavior
23727 as described above.
23732 The result produced is the product of the first two operands added to the third
23733 operand computed with infinite precision, and then rounded to the target
23736 '``llvm.experimental.constrained.fptoui``' Intrinsic
23737 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23745 @llvm.experimental.constrained.fptoui(<type> <value>,
23746 metadata <exception behavior>)
23751 The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
23752 floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
23757 The first argument to the '``llvm.experimental.constrained.fptoui``'
23758 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
23759 <t_vector>` of floating point values.
23761 The second argument specifies the exception behavior as described above.
23766 The result produced is an unsigned integer converted from the floating
23767 point operand. The value is truncated, so it is rounded towards zero.
23769 '``llvm.experimental.constrained.fptosi``' Intrinsic
23770 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23778 @llvm.experimental.constrained.fptosi(<type> <value>,
23779 metadata <exception behavior>)
The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
:ref:`floating-point <t_floating>` ``value`` to a signed integer of type ``ty2``.
23790 The first argument to the '``llvm.experimental.constrained.fptosi``'
23791 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
23792 <t_vector>` of floating point values.
23794 The second argument specifies the exception behavior as described above.
23799 The result produced is a signed integer converted from the floating
23800 point operand. The value is truncated, so it is rounded towards zero.
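
A minimal sketch of a scalar conversion, with the hypothetical function name
``@to_int``. The overload suffix names the result type followed by the operand
type, and there is no rounding mode argument because the value is always
truncated.

.. code-block:: llvm

    declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)

    define i32 @to_int(double %x) strictfp {
      %i = call i32 @llvm.experimental.constrained.fptosi.i32.f64(
               double %x,
               metadata !"fpexcept.strict") strictfp
      ret i32 %i
    }
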
23802 '``llvm.experimental.constrained.uitofp``' Intrinsic
23803 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23811 @llvm.experimental.constrained.uitofp(<type> <value>,
23812 metadata <rounding mode>,
23813 metadata <exception behavior>)
23818 The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
unsigned integer ``value`` to a floating-point value of type ``ty2``.
23824 The first argument to the '``llvm.experimental.constrained.uitofp``'
23825 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
23826 <t_vector>` of integer values.
23828 The second and third arguments specify the rounding mode and exception
23829 behavior as described above.
23834 An inexact floating-point exception will be raised if rounding is required.
23835 Any result produced is a floating point value converted from the input
23838 '``llvm.experimental.constrained.sitofp``' Intrinsic
23839 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23847 @llvm.experimental.constrained.sitofp(<type> <value>,
23848 metadata <rounding mode>,
23849 metadata <exception behavior>)
23854 The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
signed integer ``value`` to a floating-point value of type ``ty2``.
23860 The first argument to the '``llvm.experimental.constrained.sitofp``'
23861 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
23862 <t_vector>` of integer values.
23864 The second and third arguments specify the rounding mode and exception
23865 behavior as described above.
23870 An inexact floating-point exception will be raised if rounding is required.
23871 Any result produced is a floating point value converted from the input
23874 '``llvm.experimental.constrained.fptrunc``' Intrinsic
23875 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23883 @llvm.experimental.constrained.fptrunc(<type> <value>,
23884 metadata <rounding mode>,
23885 metadata <exception behavior>)
23890 The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
23896 The first argument to the '``llvm.experimental.constrained.fptrunc``'
23897 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
23898 <t_vector>` of floating point values. This argument must be larger in size
23901 The second and third arguments specify the rounding mode and exception
23902 behavior as described above.
23907 The result produced is a floating point value truncated to be smaller in size
23910 '``llvm.experimental.constrained.fpext``' Intrinsic
23911 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23919 @llvm.experimental.constrained.fpext(<type> <value>,
23920 metadata <exception behavior>)
23925 The '``llvm.experimental.constrained.fpext``' intrinsic extends a
23926 floating-point ``value`` to a larger floating-point value.
23931 The first argument to the '``llvm.experimental.constrained.fpext``'
23932 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
23933 <t_vector>` of floating point values. This argument must be smaller in size
23936 The second argument specifies the exception behavior as described above.
23941 The result produced is a floating point value extended to be larger in size
23942 than the operand. All restrictions that apply to the fpext instruction also
23943 apply to this intrinsic.
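
The difference in argument lists between extension and truncation can be
sketched as follows; the function name ``@round_trip`` is hypothetical.
Extension takes only an exception behavior argument, while truncation also
takes a rounding mode.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)
    declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)

    define float @round_trip(float %x) strictfp {
      %wide = call double @llvm.experimental.constrained.fpext.f64.f32(
                  float %x,
                  metadata !"fpexcept.strict") strictfp
      %narrow = call float @llvm.experimental.constrained.fptrunc.f32.f64(
                    double %wide,
                    metadata !"round.dynamic",
                    metadata !"fpexcept.strict") strictfp
      ret float %narrow
    }
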
23945 '``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
23946 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23954 @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
23955 metadata <condition code>,
23956 metadata <exception behavior>)
23958 @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
23959 metadata <condition code>,
23960 metadata <exception behavior>)
The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on a comparison of their operands.
23969 If the operands are floating-point scalars, then the result type is a
23970 boolean (:ref:`i1 <t_integer>`).
23972 If the operands are floating-point vectors, then the result type is a
23973 vector of boolean with the same number of elements as the operands being
23976 The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
23977 comparison operation while the '``llvm.experimental.constrained.fcmps``'
23978 intrinsic performs a signaling comparison operation.
23983 The first two arguments to the '``llvm.experimental.constrained.fcmp``'
23984 and '``llvm.experimental.constrained.fcmps``' intrinsics must be
23985 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
23986 of floating-point values. Both arguments must have identical types.
23988 The third argument is the condition code indicating the kind of comparison
23989 to perform. It must be a metadata string with one of the following values:
23993 - "``oeq``": ordered and equal
23994 - "``ogt``": ordered and greater than
23995 - "``oge``": ordered and greater than or equal
23996 - "``olt``": ordered and less than
23997 - "``ole``": ordered and less than or equal
23998 - "``one``": ordered and not equal
23999 - "``ord``": ordered (no nans)
24000 - "``ueq``": unordered or equal
24001 - "``ugt``": unordered or greater than
24002 - "``uge``": unordered or greater than or equal
24003 - "``ult``": unordered or less than
24004 - "``ule``": unordered or less than or equal
24005 - "``une``": unordered or not equal
24006 - "``uno``": unordered (either nans)
24008 *Ordered* means that neither operand is a NAN while *unordered* means
24009 that either operand may be a NAN.
24011 The fourth argument specifies the exception behavior as described above.
24016 ``op1`` and ``op2`` are compared according to the condition code given
24017 as the third argument. If the operands are vectors, then the
24018 vectors are compared element by element. Each comparison performed
24019 always yields an :ref:`i1 <t_integer>` result, as follows:
24021 .. _fcmp_md_cc_sem:
24023 - "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
24024 is equal to ``op2``.
24025 - "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
24026 is greater than ``op2``.
24027 - "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
24028 is greater than or equal to ``op2``.
24029 - "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
24030 is less than ``op2``.
24031 - "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
24032 is less than or equal to ``op2``.
24033 - "``one``": yields ``true`` if both operands are not a NAN and ``op1``
24034 is not equal to ``op2``.
24035 - "``ord``": yields ``true`` if both operands are not a NAN.
24036 - "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
24038 - "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
24039 greater than ``op2``.
24040 - "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
24041 greater than or equal to ``op2``.
24042 - "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
24044 - "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
24045 less than or equal to ``op2``.
24046 - "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
24047 not equal to ``op2``.
24048 - "``uno``": yields ``true`` if either operand is a NAN.
24050 The quiet comparison operation performed by
24051 '``llvm.experimental.constrained.fcmp``' will only raise an exception
24052 if either operand is a SNAN. The signaling comparison operation
24053 performed by '``llvm.experimental.constrained.fcmps``' will raise an
24054 exception if either operand is a NAN (QNAN or SNAN). Such an exception
does not preclude a result being produced (e.g., the exception might only
24056 set a flag), therefore the distinction between ordered and unordered
24057 comparisons is also relevant for the
24058 '``llvm.experimental.constrained.fcmps``' intrinsic.
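
A sketch contrasting the two intrinsics on the same operands, using the
hypothetical function name ``@both``. The two calls use the same condition
code and differ only in whether a quiet NAN operand raises an exception.

.. code-block:: llvm

    declare i1 @llvm.experimental.constrained.fcmp.f64(double, double, metadata, metadata)
    declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)

    define i1 @both(double %a, double %b) strictfp {
      ; Quiet comparison: raises an exception only for an SNAN operand.
      %quiet = call i1 @llvm.experimental.constrained.fcmp.f64(
                   double %a, double %b,
                   metadata !"olt",
                   metadata !"fpexcept.strict") strictfp
      ; Signaling comparison: raises an exception for any NAN operand.
      %signaling = call i1 @llvm.experimental.constrained.fcmps.f64(
                       double %a, double %b,
                       metadata !"olt",
                       metadata !"fpexcept.strict") strictfp
      %r = and i1 %quiet, %signaling
      ret i1 %r
    }
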
24060 '``llvm.experimental.constrained.fmuladd``' Intrinsic
24061 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24069 @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
24071 metadata <rounding mode>,
24072 metadata <exception behavior>)
24077 The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
24078 multiply-add expressions that can be fused if the code generator determines
24079 that (a) the target instruction set has support for a fused operation,
24080 and (b) that the fused operation is more efficient than the equivalent,
24081 separate pair of mul and add instructions.
24086 The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
24087 intrinsic must be floating-point or vector of floating-point values.
24088 All three arguments must have identical types.
24090 The fourth and fifth arguments specify the rounding mode and exception behavior
24091 as described above.
24100 %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c,
24101 metadata <rounding mode>,
24102 metadata <exception behavior>)
24104 is equivalent to the expression:
24108 %0 = call float @llvm.experimental.constrained.fmul.f32(%a, %b,
24109 metadata <rounding mode>,
24110 metadata <exception behavior>)
24111 %1 = call float @llvm.experimental.constrained.fadd.f32(%0, %c,
24112 metadata <rounding mode>,
24113 metadata <exception behavior>)
24115 except that it is unspecified whether rounding will be performed between the
24116 multiplication and addition steps. Fusion is not guaranteed, even if the target
24117 platform supports it.
24118 If a fused multiply-add is required, the corresponding
24119 :ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets ``errno``.
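
A fully typed instance of the call, given as a sketch with the hypothetical
function name ``@axpy`` and arbitrarily chosen metadata values:

.. code-block:: llvm

    declare float @llvm.experimental.constrained.fmuladd.f32(float, float, float, metadata, metadata)

    define float @axpy(float %a, float %x, float %y) strictfp {
      ; Computes %a * %x + %y. Whether the product is rounded before the
      ; addition is unspecified.
      %r = call float @llvm.experimental.constrained.fmuladd.f32(
               float %a, float %x, float %y,
               metadata !"round.tonearest",
               metadata !"fpexcept.strict") strictfp
      ret float %r
    }
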
24123 Constrained libm-equivalent Intrinsics
24124 --------------------------------------
24126 In addition to the basic floating-point operations for which constrained
24127 intrinsics are described above, there are constrained versions of various
24128 operations which provide equivalent behavior to a corresponding libm function.
24129 These intrinsics allow the precise behavior of these operations with respect to
24130 rounding mode and exception behavior to be controlled.
24132 As with the basic constrained floating-point intrinsics, the rounding mode
24133 and exception behavior arguments only control the behavior of the optimizer.
24134 They do not change the runtime floating-point environment.
24137 '``llvm.experimental.constrained.sqrt``' Intrinsic
24138 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24146 @llvm.experimental.constrained.sqrt(<type> <op1>,
24147 metadata <rounding mode>,
24148 metadata <exception behavior>)
24153 The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
24154 of the specified value, returning the same value as the libm '``sqrt``'
24155 functions would, but without setting ``errno``.
24160 The first argument and the return type are floating-point numbers of the same
24163 The second and third arguments specify the rounding mode and exception
24164 behavior as described above.
24169 This function returns the nonnegative square root of the specified value.
24170 If the value is less than negative zero, a floating-point exception occurs
24171 and the return value is architecture specific.
24174 '``llvm.experimental.constrained.pow``' Intrinsic
24175 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24183 @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
24184 metadata <rounding mode>,
24185 metadata <exception behavior>)
24190 The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
24191 raised to the (positive or negative) power specified by the second operand.
24196 The first two arguments and the return value are floating-point numbers of the
24197 same type. The second argument specifies the power to which the first argument
24200 The third and fourth arguments specify the rounding mode and exception
24201 behavior as described above.
This function returns the first value raised to the power given by the
second value, returning the same values as the libm ``pow`` functions would,
and handles error conditions in the same way.
24211 '``llvm.experimental.constrained.powi``' Intrinsic
24212 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24220 @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
24221 metadata <rounding mode>,
24222 metadata <exception behavior>)
24227 The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
24228 raised to the (positive or negative) power specified by the second operand. The
24229 order of evaluation of multiplications is not defined. When a vector of
24230 floating-point type is used, the second argument remains a scalar integer value.
24236 The first argument and the return value are floating-point numbers of the same
24237 type. The second argument is a 32-bit signed integer specifying the power to
24238 which the first argument should be raised.
24240 The third and fourth arguments specify the rounding mode and exception
24241 behavior as described above.
This function returns the first value raised to the power given by the
second value, with an unspecified sequence of rounding operations.
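
A minimal sketch of a scalar call, assuming the ``.f64`` overload name and the
hypothetical function name ``@cube``. The exponent is a plain ``i32`` value.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.powi.f64(double, i32, metadata, metadata)

    define double @cube(double %x) strictfp {
      ; Raises %x to the third power. The order in which the
      ; multiplications are evaluated is not defined.
      %r = call double @llvm.experimental.constrained.powi.f64(
               double %x, i32 3,
               metadata !"round.dynamic",
               metadata !"fpexcept.strict") strictfp
      ret double %r
    }
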
24250 '``llvm.experimental.constrained.sin``' Intrinsic
24251 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24259 @llvm.experimental.constrained.sin(<type> <op1>,
24260 metadata <rounding mode>,
24261 metadata <exception behavior>)
24266 The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
24272 The first argument and the return type are floating-point numbers of the same
24275 The second and third arguments specify the rounding mode and exception
24276 behavior as described above.
24281 This function returns the sine of the specified operand, returning the
24282 same values as the libm ``sin`` functions would, and handles error
24283 conditions in the same way.
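
The unary constrained math intrinsics (``sin``, ``cos``, ``exp``, ``log`` and
so on) all share the same three-argument shape. The sketch below uses ``sin``
on ``float``; the function name ``@sine`` and the metadata values chosen are
assumptions made for illustration.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.sin.f32(float, metadata, metadata)

      ; Hypothetical helper: sine of %x, assuming round-to-nearest and
      ; telling the optimizer that exceptions may be ignored.
      define float @sine(float %x) strictfp {
        %r = call float @llvm.experimental.constrained.sin.f32(
                             float %x,
                             metadata !"round.tonearest",
                             metadata !"fpexcept.ignore") strictfp
        ret float %r
      }
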
24286 '``llvm.experimental.constrained.cos``' Intrinsic
24287 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24295 @llvm.experimental.constrained.cos(<type> <op1>,
24296 metadata <rounding mode>,
24297 metadata <exception behavior>)
24302 The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
24308 The first argument and the return type are floating-point numbers of the same
24311 The second and third arguments specify the rounding mode and exception
24312 behavior as described above.
24317 This function returns the cosine of the specified operand, returning the
24318 same values as the libm ``cos`` functions would, and handles error
24319 conditions in the same way.
24322 '``llvm.experimental.constrained.exp``' Intrinsic
24323 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24331 @llvm.experimental.constrained.exp(<type> <op1>,
24332 metadata <rounding mode>,
24333 metadata <exception behavior>)
24338 The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
24339 exponential of the specified value.
24344 The first argument and the return value are floating-point numbers of the same
24347 The second and third arguments specify the rounding mode and exception
24348 behavior as described above.
24353 This function returns the same values as the libm ``exp`` functions
24354 would, and handles error conditions in the same way.
24357 '``llvm.experimental.constrained.exp2``' Intrinsic
24358 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24366 @llvm.experimental.constrained.exp2(<type> <op1>,
24367 metadata <rounding mode>,
24368 metadata <exception behavior>)
24373 The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
24374 exponential of the specified value.
24380 The first argument and the return value are floating-point numbers of the same
24383 The second and third arguments specify the rounding mode and exception
24384 behavior as described above.
24389 This function returns the same values as the libm ``exp2`` functions
24390 would, and handles error conditions in the same way.
24393 '``llvm.experimental.constrained.log``' Intrinsic
24394 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24402 @llvm.experimental.constrained.log(<type> <op1>,
24403 metadata <rounding mode>,
24404 metadata <exception behavior>)
24409 The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
24410 logarithm of the specified value.
24415 The first argument and the return value are floating-point numbers of the same
24418 The second and third arguments specify the rounding mode and exception
24419 behavior as described above.
24425 This function returns the same values as the libm ``log`` functions
24426 would, and handles error conditions in the same way.
24429 '``llvm.experimental.constrained.log10``' Intrinsic
24430 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24438 @llvm.experimental.constrained.log10(<type> <op1>,
24439 metadata <rounding mode>,
24440 metadata <exception behavior>)
24445 The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
24446 logarithm of the specified value.
24451 The first argument and the return value are floating-point numbers of the same
24454 The second and third arguments specify the rounding mode and exception
24455 behavior as described above.
24460 This function returns the same values as the libm ``log10`` functions
24461 would, and handles error conditions in the same way.
24464 '``llvm.experimental.constrained.log2``' Intrinsic
24465 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24473 @llvm.experimental.constrained.log2(<type> <op1>,
24474 metadata <rounding mode>,
24475 metadata <exception behavior>)
24480 The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
24481 logarithm of the specified value.
24486 The first argument and the return value are floating-point numbers of the same
24489 The second and third arguments specify the rounding mode and exception
24490 behavior as described above.
24495 This function returns the same values as the libm ``log2`` functions
24496 would, and handles error conditions in the same way.
24499 '``llvm.experimental.constrained.rint``' Intrinsic
24500 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24508 @llvm.experimental.constrained.rint(<type> <op1>,
24509 metadata <rounding mode>,
24510 metadata <exception behavior>)
24515 The '``llvm.experimental.constrained.rint``' intrinsic returns the first
24516 operand rounded to the nearest integer. It may raise an inexact floating-point
24517 exception if the operand is not an integer.
24522 The first argument and the return value are floating-point numbers of the same
24525 The second and third arguments specify the rounding mode and exception
24526 behavior as described above.
24531 This function returns the same values as the libm ``rint`` functions
24532 would, and handles error conditions in the same way. The rounding mode is
24533 described, not determined, by the rounding mode argument. The actual rounding
24534 mode is determined by the runtime floating-point environment. The rounding
24535 mode argument is only intended as information to the compiler.
24538 '``llvm.experimental.constrained.lrint``' Intrinsic
24539 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24547 @llvm.experimental.constrained.lrint(<fptype> <op1>,
24548 metadata <rounding mode>,
24549 metadata <exception behavior>)
24554 The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
24555 operand rounded to the nearest integer. An inexact floating-point exception
24556 will be raised if the operand is not an integer. An invalid exception is
24557 raised if the result is too large to fit into a supported integer type,
24558 and in this case the result is undefined.
24563 The first argument is a floating-point number. The return value is an
24564 integer type. Not all types are supported on all targets. The supported
24565 types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
24568 The second and third arguments specify the rounding mode and exception
24569 behavior as described above.
24574 This function returns the same values as the libm ``lrint`` functions
24575 would, and handles error conditions in the same way.
24577 The rounding mode is described, not determined, by the rounding mode
24578 argument. The actual rounding mode is determined by the runtime floating-point
24579 environment. The rounding mode argument is only intended as information
24582 If the runtime floating-point environment is using the default rounding mode
24583 then the results will be the same as those of the ``llvm.lrint`` intrinsic.
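
The following sketch shows one possible overload, converting a ``double`` to
``i64``. The overloaded name carries both the result type and the operand
type; the function name ``@to_long`` is hypothetical.

.. code-block:: llvm

      declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

      ; Hypothetical helper: rounds %x to an integer using whatever rounding
      ; mode is active at run time ("round.dynamic" makes no promise about it).
      define i64 @to_long(double %x) strictfp {
        %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                           double %x,
                           metadata !"round.dynamic",
                           metadata !"fpexcept.strict") strictfp
        ret i64 %r
      }
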
24586 '``llvm.experimental.constrained.llrint``' Intrinsic
24587 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24595 @llvm.experimental.constrained.llrint(<fptype> <op1>,
24596 metadata <rounding mode>,
24597 metadata <exception behavior>)
24602 The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
24603 operand rounded to the nearest integer. An inexact floating-point exception
24604 will be raised if the operand is not an integer. An invalid exception is
24605 raised if the result is too large to fit into a supported integer type,
24606 and in this case the result is undefined.
24611 The first argument is a floating-point number. The return value is an
24612 integer type. Not all types are supported on all targets. The supported
24613 types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
24616 The second and third arguments specify the rounding mode and exception
24617 behavior as described above.
24622 This function returns the same values as the libm ``llrint`` functions
24623 would, and handles error conditions in the same way.
24625 The rounding mode is described, not determined, by the rounding mode
24626 argument. The actual rounding mode is determined by the runtime floating-point
24627 environment. The rounding mode argument is only intended as information
24630 If the runtime floating-point environment is using the default rounding mode
24631 then the results will be the same as those of the ``llvm.llrint`` intrinsic.
24634 '``llvm.experimental.constrained.nearbyint``' Intrinsic
24635 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24643 @llvm.experimental.constrained.nearbyint(<type> <op1>,
24644 metadata <rounding mode>,
24645 metadata <exception behavior>)
24650 The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
24651 operand rounded to the nearest integer. It will not raise an inexact
24652 floating-point exception if the operand is not an integer.
24658 The first argument and the return value are floating-point numbers of the same
24661 The second and third arguments specify the rounding mode and exception
24662 behavior as described above.
24667 This function returns the same values as the libm ``nearbyint`` functions
24668 would, and handles error conditions in the same way. The rounding mode is
24669 described, not determined, by the rounding mode argument. The actual rounding
24670 mode is determined by the runtime floating-point environment. The rounding
24671 mode argument is only intended as information to the compiler.
24674 '``llvm.experimental.constrained.maxnum``' Intrinsic
24675 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24683 @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
24684 metadata <exception behavior>)
24689 The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
24690 of the two arguments.
24695 The first two arguments and the return value are floating-point numbers
24698 The third argument specifies the exception behavior as described above.
24703 This function follows the IEEE-754 semantics for maxNum.
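
Unlike the arithmetic constrained intrinsics, ``maxnum`` takes no rounding
mode argument, only the exception behavior. A minimal sketch, with the
hypothetical function name ``@fmax_strict``:

.. code-block:: llvm

      declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)

      ; Hypothetical helper: maximum of %a and %b under strict exception semantics.
      define float @fmax_strict(float %a, float %b) strictfp {
        %r = call float @llvm.experimental.constrained.maxnum.f32(
                             float %a, float %b,
                             metadata !"fpexcept.strict") strictfp
        ret float %r
      }

The same call shape applies to ``minnum``, ``maximum`` and ``minimum``.
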
24706 '``llvm.experimental.constrained.minnum``' Intrinsic
24707 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24715 @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
24716 metadata <exception behavior>)
24721 The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
24722 of the two arguments.
24727 The first two arguments and the return value are floating-point numbers
24730 The third argument specifies the exception behavior as described above.
24735 This function follows the IEEE-754 semantics for minNum.
24738 '``llvm.experimental.constrained.maximum``' Intrinsic
24739 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24747 @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
24748 metadata <exception behavior>)
24753 The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
24754 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
24759 The first two arguments and the return value are floating-point numbers
24762 The third argument specifies the exception behavior as described above.
24767 This function follows semantics specified in the draft of IEEE 754-2018.
24770 '``llvm.experimental.constrained.minimum``' Intrinsic
24771 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24779 @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
24780 metadata <exception behavior>)
24785 The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
24786 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
24791 The first two arguments and the return value are floating-point numbers
24794 The third argument specifies the exception behavior as described above.
24799 This function follows semantics specified in the draft of IEEE 754-2018.
24802 '``llvm.experimental.constrained.ceil``' Intrinsic
24803 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24811 @llvm.experimental.constrained.ceil(<type> <op1>,
24812 metadata <exception behavior>)
24817 The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
24823 The first argument and the return value are floating-point numbers of the same
24826 The second argument specifies the exception behavior as described above.
24831 This function returns the same values as the libm ``ceil`` functions
24832 would and handles error conditions in the same way.
24835 '``llvm.experimental.constrained.floor``' Intrinsic
24836 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24844 @llvm.experimental.constrained.floor(<type> <op1>,
24845 metadata <exception behavior>)
24850 The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
24856 The first argument and the return value are floating-point numbers of the same
24859 The second argument specifies the exception behavior as described above.
24864 This function returns the same values as the libm ``floor`` functions
24865 would and handles error conditions in the same way.
24868 '``llvm.experimental.constrained.round``' Intrinsic
24869 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24877 @llvm.experimental.constrained.round(<type> <op1>,
24878 metadata <exception behavior>)
24883 The '``llvm.experimental.constrained.round``' intrinsic returns the first
24884 operand rounded to the nearest integer.
24889 The first argument and the return value are floating-point numbers of the same
24892 The second argument specifies the exception behavior as described above.
24897 This function returns the same values as the libm ``round`` functions
24898 would and handles error conditions in the same way.
24901 '``llvm.experimental.constrained.roundeven``' Intrinsic
24902 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24910 @llvm.experimental.constrained.roundeven(<type> <op1>,
24911 metadata <exception behavior>)
24916 The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
24917 operand rounded to the nearest integer in floating-point format, rounding
24918 halfway cases to even (that is, to the nearest value that is an even integer),
24919 regardless of the current rounding direction.
24924 The first argument and the return value are floating-point numbers of the same
24927 The second argument specifies the exception behavior as described above.
24932 This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
24933 also behaves in the same way as C standard function ``roundeven`` and can signal
24934 the invalid operation exception for a SNAN operand.
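
The rounding intrinsics that fix their own rounding direction (``ceil``,
``floor``, ``round``, ``roundeven``, ``trunc``) take only the exception
behavior as metadata. The sketch below applies ``roundeven`` to a vector; the
function name ``@round_even_v4`` is an assumption for illustration.

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.constrained.roundeven.v4f32(<4 x float>, metadata)

      ; Hypothetical helper: rounds each lane to the nearest integer value,
      ; ties to even, independent of the dynamic rounding mode.
      define <4 x float> @round_even_v4(<4 x float> %v) strictfp {
        %r = call <4 x float> @llvm.experimental.constrained.roundeven.v4f32(
                                   <4 x float> %v,
                                   metadata !"fpexcept.strict") strictfp
        ret <4 x float> %r
      }
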
24937 '``llvm.experimental.constrained.lround``' Intrinsic
24938 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24946 @llvm.experimental.constrained.lround(<fptype> <op1>,
24947 metadata <exception behavior>)
24952 The '``llvm.experimental.constrained.lround``' intrinsic returns the first
24953 operand rounded to the nearest integer with ties away from zero. It will
24954 raise an inexact floating-point exception if the operand is not an integer.
24955 An invalid exception is raised if the result is too large to fit into a
24956 supported integer type, and in this case the result is undefined.
24961 The first argument is a floating-point number. The return value is an
24962 integer type. Not all types are supported on all targets. The supported
24963 types are the same as the ``llvm.lround`` intrinsic and the ``lround``
24966 The second argument specifies the exception behavior as described above.
24971 This function returns the same values as the libm ``lround`` functions
24972 would and handles error conditions in the same way.
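
A minimal sketch of one overload, converting ``float`` to ``i32``. As with
``lrint``, the overloaded name carries the result type followed by the operand
type; ``@to_int_away`` is a hypothetical name.

.. code-block:: llvm

      declare i32 @llvm.experimental.constrained.lround.i32.f32(float, metadata)

      ; Hypothetical helper: rounds %x to the nearest integer, ties away from zero.
      define i32 @to_int_away(float %x) strictfp {
        %r = call i32 @llvm.experimental.constrained.lround.i32.f32(
                           float %x,
                           metadata !"fpexcept.strict") strictfp
        ret i32 %r
      }
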
24975 '``llvm.experimental.constrained.llround``' Intrinsic
24976 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24984 @llvm.experimental.constrained.llround(<fptype> <op1>,
24985 metadata <exception behavior>)
24990 The '``llvm.experimental.constrained.llround``' intrinsic returns the first
24991 operand rounded to the nearest integer with ties away from zero. It will
24992 raise an inexact floating-point exception if the operand is not an integer.
24993 An invalid exception is raised if the result is too large to fit into a
24994 supported integer type, and in this case the result is undefined.
24999 The first argument is a floating-point number. The return value is an
25000 integer type. Not all types are supported on all targets. The supported
25001 types are the same as the ``llvm.llround`` intrinsic and the ``llround``
25004 The second argument specifies the exception behavior as described above.
25009 This function returns the same values as the libm ``llround`` functions
25010 would and handles error conditions in the same way.
25013 '``llvm.experimental.constrained.trunc``' Intrinsic
25014 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25022 @llvm.experimental.constrained.trunc(<type> <op1>,
25023 metadata <exception behavior>)
25028 The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
25029 operand rounded to the nearest integer not larger in magnitude than the
25035 The first argument and the return value are floating-point numbers of the same
25038 The second argument specifies the exception behavior as described above.
25043 This function returns the same values as the libm ``trunc`` functions
25044 would and handles error conditions in the same way.
25046 .. _int_experimental_noalias_scope_decl:
25048 '``llvm.experimental.noalias.scope.decl``' Intrinsic
25049 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25057 declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
25062 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
25063 noalias scope is declared. When the intrinsic is duplicated, a decision must
25064 also be made about the scope: depending on the reason for the duplication,
25065 the scope might need to be duplicated as well.
25071 The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
25072 metadata references. The format is identical to that required for ``noalias``
25073 metadata. This list must have exactly one element.
25078 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
25079 noalias scope is declared. When the intrinsic is duplicated, a decision must
25080 also be made about the scope: depending on the reason for the duplication,
25081 the scope might need to be duplicated as well.
25083 For example, when the intrinsic is used inside a loop body, and that loop is
25084 unrolled, the associated noalias scope must also be duplicated. Otherwise, the
25085 noalias property it signifies would spill across loop iterations, whereas it
25086 was only valid within a single iteration.
25088 .. code-block:: llvm
25090 ; This example shows two possible positions for noalias.decl and how they impact the semantics:
25091 ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
25092 ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
25093 define void @decl_in_loop(ptr %a.base, ptr %b.base) {
25095 ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
25099 %a = phi ptr [ %a.base, %entry ], [ %a.inc, %loop ]
25100 %b = phi ptr [ %b.base, %entry ], [ %b.inc, %loop ]
25101 ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
25102 %val = load i8, ptr %a, !alias.scope !2
25103 store i8 %val, ptr %b, !noalias !2
25104 %a.inc = getelementptr inbounds i8, ptr %a, i64 1
25105 %b.inc = getelementptr inbounds i8, ptr %b, i64 1
25106 %cond = call i1 @cond()
25107 br i1 %cond, label %loop, label %exit
25113 !0 = !{!0} ; domain
25114 !1 = !{!1, !0} ; scope
25115 !2 = !{!1} ; scope list
25117 Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
25118 are possible, but one should never dominate another. Violations are pointed out
25119 by the verifier as they indicate a problem in either a transformation pass or
25123 Floating Point Environment Manipulation intrinsics
25124 --------------------------------------------------
25126 These functions read or write the floating-point environment, such as the
25127 rounding mode or the state of floating-point exceptions. Altering the floating-point
25128 environment requires special care. See :ref:`Floating Point Environment <floatenv>`.
25130 '``llvm.get.rounding``' Intrinsic
25131 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25138 declare i32 @llvm.get.rounding()
25143 The '``llvm.get.rounding``' intrinsic reads the current rounding mode.
25148 The '``llvm.get.rounding``' intrinsic returns the current rounding mode.
25149 The encoding of the returned values is the same as the result of ``FLT_ROUNDS``,
25150 specified by the C standard:
25154 0 - toward zero
25155 1 - to nearest, ties to even
25156 2 - toward positive infinity
25157 3 - toward negative infinity
25158 4 - to nearest, ties away from zero
25160 Other values may be used to represent additional rounding modes supported by a
25161 target. These values are target-specific.
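
As a minimal sketch, the function below tests whether the current mode is
"to nearest, ties to even" (encoding 1 in the list above). The function name
``@is_round_to_nearest`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.get.rounding()

      ; Hypothetical helper: returns true when the dynamic rounding mode is
      ; "to nearest, ties to even".
      define i1 @is_round_to_nearest() {
        %mode = call i32 @llvm.get.rounding()
        %is.nearest = icmp eq i32 %mode, 1
        ret i1 %is.nearest
      }
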
25163 '``llvm.set.rounding``' Intrinsic
25164 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25171 declare void @llvm.set.rounding(i32 <val>)
25176 The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
25181 The argument is the required rounding mode. The encoding of the rounding mode
25182 is the same as that used by '``llvm.get.rounding``'.
25187 The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
25188 similar to the C library function ``fesetround``; however, this intrinsic does not
25189 return any value and uses a platform-independent representation of IEEE rounding
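
The sketch below shows one way the two rounding intrinsics might be combined:
it saves the current mode, switches to "toward positive infinity" (encoding
2), performs a constrained addition, and restores the saved mode. The function
name ``@add_round_up`` is an assumption for illustration; the addition uses
``llvm.experimental.constrained.fadd`` so that it is not moved across the
mode changes.

.. code-block:: llvm

      declare i32 @llvm.get.rounding()
      declare void @llvm.set.rounding(i32)
      declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

      ; Hypothetical helper: computes %a + %b rounded toward positive infinity.
      define double @add_round_up(double %a, double %b) strictfp {
        %old = call i32 @llvm.get.rounding() strictfp
        call void @llvm.set.rounding(i32 2) strictfp
        %sum = call double @llvm.experimental.constrained.fadd.f64(
                                double %a, double %b,
                                metadata !"round.upward",
                                metadata !"fpexcept.strict") strictfp
        ; Restore the mode that was active on entry.
        call void @llvm.set.rounding(i32 %old) strictfp
        ret double %sum
      }
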
25193 Floating-Point Test Intrinsics
25194 ------------------------------
25196 These functions get properties of floating-point values.
25199 .. _llvm.is.fpclass:
25201 '``llvm.is.fpclass``' Intrinsic
25202 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25209 declare i1 @llvm.is.fpclass(<fptype> <op>, i32 <test>)
25210 declare <N x i1> @llvm.is.fpclass(<vector-fptype> <op>, i32 <test>)
25215 The '``llvm.is.fpclass``' intrinsic returns a boolean value or vector of boolean
25216 values depending on whether the first argument satisfies the test specified by
25217 the second argument.
25219 If the first argument is a floating-point scalar, then the result type is a
25220 boolean (:ref:`i1 <t_integer>`).
25222 If the first argument is a floating-point vector, then the result type is a
25223 vector of boolean with the same number of elements as the first argument.
25228 The first argument to the '``llvm.is.fpclass``' intrinsic must be
25229 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
25230 of floating-point values.
25232 The second argument specifies which tests to perform. It must be a compile-time
25233 integer constant in which each bit specifies a floating-point class:
25235 +-------+----------------------+
25236 | Bit # | floating-point class |
25237 +=======+======================+
25238 | 0     | Signaling NaN        |
25239 +-------+----------------------+
25240 | 1     | Quiet NaN            |
25241 +-------+----------------------+
25242 | 2     | Negative infinity    |
25243 +-------+----------------------+
25244 | 3     | Negative normal      |
25245 +-------+----------------------+
25246 | 4     | Negative subnormal   |
25247 +-------+----------------------+
25248 | 5     | Negative zero        |
25249 +-------+----------------------+
25250 | 6     | Positive zero        |
25251 +-------+----------------------+
25252 | 7     | Positive subnormal   |
25253 +-------+----------------------+
25254 | 8     | Positive normal      |
25255 +-------+----------------------+
25256 | 9     | Positive infinity    |
25257 +-------+----------------------+
25262 The function checks if ``op`` belongs to any of the floating-point classes
25263 specified by ``test``. If ``op`` is a vector, then the check is made element by
25264 element. Each check yields an :ref:`i1 <t_integer>` result, which is ``true``
25265 if the element value satisfies the specified test. The argument ``test`` is a
25266 bit mask where each bit specifies a floating-point class to test. For example, the
25267 value 0x108 tests for a normal value: bits 3 and 8 are set, which
25268 means that the function returns ``true`` if ``op`` is a positive or negative
25269 normal value. The function never raises floating-point exceptions. The
25270 function does not canonicalize its input value and does not depend
25271 on the floating-point environment. If the floating-point environment
25272 has a zeroing treatment of subnormal input values (such as indicated
25273 by the ``"denormal-fp-math"`` attribute), a subnormal value will be
25274 observed (will not be implicitly treated as zero).
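
The bit-mask encoding can be sketched as follows. The first function uses the
mask 0x108 discussed above, written in decimal as 264; the second tests each
vector element for either zero (bits 5 and 6, decimal 96). The function names
``@is_normal`` and ``@is_zero`` are hypothetical.

.. code-block:: llvm

      declare i1 @llvm.is.fpclass.f32(float, i32 immarg)
      declare <4 x i1> @llvm.is.fpclass.v4f32(<4 x float>, i32 immarg)

      ; 264 = (1 << 3) | (1 << 8): negative normal or positive normal.
      define i1 @is_normal(float %x) {
        %r = call i1 @llvm.is.fpclass.f32(float %x, i32 264)
        ret i1 %r
      }

      ; 96 = (1 << 5) | (1 << 6): negative zero or positive zero, per element.
      define <4 x i1> @is_zero(<4 x float> %v) {
        %r = call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> %v, i32 96)
        ret <4 x i1> %r
      }
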
25280 This class of intrinsics is designed to be generic and has no specific
25283 '``llvm.var.annotation``' Intrinsic
25284 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25291 declare void @llvm.var.annotation(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
25296 The '``llvm.var.annotation``' intrinsic.
25301 The first argument is a pointer to a value, the second is a pointer to a
25302 global string, the third is a pointer to a global string which is the
25303 source file name, and the last argument is the line number.
25308 This intrinsic allows annotation of local variables with arbitrary
25309 strings. This can be useful for special purpose optimizations that want
25310 to look for these annotations. These have no other defined use; they are
25311 ignored by code generation and optimization.
25313 '``llvm.ptr.annotation.*``' Intrinsic
25314 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25319 This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
25320 pointer to an integer of any width. *NOTE* you must specify an address space for
25321 the pointer. The identifier for the default address space is the integer
25326 declare ptr @llvm.ptr.annotation.p0(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
25327 declare ptr @llvm.ptr.annotation.p1(ptr addrspace(1) <val>, ptr <str>, ptr <str>, i32 <int>)
25332 The '``llvm.ptr.annotation``' intrinsic.
25337 The first argument is a pointer to an integer value of arbitrary bitwidth
25338 (result of some expression), the second is a pointer to a global string, the
25339 third is a pointer to a global string which is the source file name, and the
25340 last argument is the line number. It returns the value of the first argument.
25345 This intrinsic allows annotation of a pointer to an integer with arbitrary
25346 strings. This can be useful for special purpose optimizations that want to look
25347 for these annotations. These have no other defined use; transformations preserve
25348 annotations on a best-effort basis but are allowed to replace the intrinsic with
25349 its first argument without breaking semantics and the intrinsic is completely
25350 dropped during instruction selection.
25352 '``llvm.annotation.*``' Intrinsic
25353 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25358 This is an overloaded intrinsic. You can use '``llvm.annotation``' on
25359 any integer bit width.
25363 declare i8 @llvm.annotation.i8(i8 <val>, ptr <str>, ptr <str>, i32 <int>)
25364 declare i16 @llvm.annotation.i16(i16 <val>, ptr <str>, ptr <str>, i32 <int>)
25365 declare i32 @llvm.annotation.i32(i32 <val>, ptr <str>, ptr <str>, i32 <int>)
25366 declare i64 @llvm.annotation.i64(i64 <val>, ptr <str>, ptr <str>, i32 <int>)
25367 declare i256 @llvm.annotation.i256(i256 <val>, ptr <str>, ptr <str>, i32 <int>)
25372 The '``llvm.annotation``' intrinsic.
25377 The first argument is an integer value (result of some expression), the
25378 second is a pointer to a global string, the third is a pointer to a
25379 global string which is the source file name, and the last argument is
25380 the line number. It returns the value of the first argument.
25385 This intrinsic allows annotations to be put on arbitrary expressions with
25386 arbitrary strings. This can be useful for special purpose optimizations that
25387 want to look for these annotations. These have no other defined use;
25388 transformations preserve annotations on a best-effort basis but are allowed to
25389 replace the intrinsic with its first argument without breaking semantics and the
25390 intrinsic is completely dropped during instruction selection.
25392 '``llvm.codeview.annotation``' Intrinsic
25393 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25398 This annotation emits a label at its program point and an associated
25399 ``S_ANNOTATION`` codeview record with some additional string metadata. This is
25400 used to implement MSVC's ``__annotation`` intrinsic. It is marked
25401 ``noduplicate``, so calls to this intrinsic prevent inlining and should be
25402 considered expensive.
25406 declare void @llvm.codeview.annotation(metadata)
25411 The argument should be an MDTuple containing any number of MDStrings.
25413 '``llvm.trap``' Intrinsic
25414 ^^^^^^^^^^^^^^^^^^^^^^^^^
25421 declare void @llvm.trap() cold noreturn nounwind
25426 The '``llvm.trap``' intrinsic.
25436 This intrinsic is lowered to the target dependent trap instruction. If
25437 the target does not have a trap instruction, this intrinsic will be
25438 lowered to a call of the ``abort()`` function.
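
A typical use is to terminate execution on a path that must never continue.
The sketch below guards an unsigned division against a zero divisor; the
function name ``@checked_udiv`` is an assumption for illustration.

.. code-block:: llvm

      declare void @llvm.trap() cold noreturn nounwind

      ; Hypothetical helper: traps instead of dividing by zero.
      define i32 @checked_udiv(i32 %a, i32 %b) {
      entry:
        %is.zero = icmp eq i32 %b, 0
        br i1 %is.zero, label %fail, label %ok

      fail:
        call void @llvm.trap()
        unreachable            ; llvm.trap is noreturn

      ok:
        %q = udiv i32 %a, %b
        ret i32 %q
      }
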
25440 '``llvm.debugtrap``' Intrinsic
25441 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25448 declare void @llvm.debugtrap() nounwind
25453 The '``llvm.debugtrap``' intrinsic.
25463 This intrinsic is lowered to code which is intended to cause an
25464 execution trap with the intention of requesting the attention of a
25467 '``llvm.ubsantrap``' Intrinsic
25468 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25475 declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
25480 The '``llvm.ubsantrap``' intrinsic.
25485 An integer describing the kind of failure detected.
25490 This intrinsic is lowered to code which is intended to cause an execution trap,
25491 embedding the argument into encoding of that trap somehow to discriminate
25492 crashes if possible.
25494 Equivalent to ``@llvm.trap`` for targets that do not support this behaviour.
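
As a sketch of how a sanitizer-style check might use this intrinsic, the
function below traps on signed addition overflow. The failure kind ``1`` and
the function name ``@checked_add`` are arbitrary values chosen for
illustration; this document does not define the meaning of particular kinds.

.. code-block:: llvm

      declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
      declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

      ; Hypothetical helper: returns %a + %b, trapping on signed overflow.
      define i32 @checked_add(i32 %a, i32 %b) {
      entry:
        %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue { i32, i1 } %res, 0
        %ovf = extractvalue { i32, i1 } %res, 1
        br i1 %ovf, label %fail, label %ok

      fail:
        ; The immediate must be a constant; 1 is an assumed, illustrative kind.
        call void @llvm.ubsantrap(i8 1)
        unreachable

      ok:
        ret i32 %sum
      }
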
25496 '``llvm.stackprotector``' Intrinsic
25497 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25504 declare void @llvm.stackprotector(ptr <guard>, ptr <slot>)
25509 The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
25510 onto the stack at ``slot``. The stack slot is adjusted to ensure that it
25511 is placed on the stack before local variables.
25516 The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
25517 The first argument is the value loaded from the stack guard
25518 ``@__stack_chk_guard``. The second variable is an ``alloca`` that has
25519 enough space to hold the value of the guard.
25524 This intrinsic causes the prologue/epilogue inserter to force the position of
25525 the ``AllocaInst`` stack slot to be before local variables on the stack. This is
25526 to ensure that if a local variable on the stack is overwritten, it will destroy
25527 the value of the guard. When the function exits, the guard on the stack is
25528 checked against the original guard by ``llvm.stackprotectorcheck``. If they are
25529 different, then ``llvm.stackprotectorcheck`` causes the program to abort by
25530 calling the ``__stack_chk_fail()`` function.
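
The argument description above can be sketched as the following prologue.
This is only an illustration of the call shape, assuming a target that
exposes the guard as the global ``@__stack_chk_guard``; the names
``@protected``, ``@use`` and the buffer size are hypothetical, and the
epilogue check is omitted.

.. code-block:: llvm

      @__stack_chk_guard = external global ptr

      declare void @llvm.stackprotector(ptr, ptr)
      declare void @use(ptr)

      define void @protected() {
      entry:
        ; Slot that will hold the copy of the guard value.
        %guard.slot = alloca ptr
        ; A local buffer that the guard is meant to protect against overflowing.
        %buf = alloca [64 x i8]
        %guard = load volatile ptr, ptr @__stack_chk_guard
        call void @llvm.stackprotector(ptr %guard, ptr %guard.slot)
        call void @use(ptr %buf)
        ret void
      }
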
25532 '``llvm.stackguard``' Intrinsic
25533 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25540 declare ptr @llvm.stackguard()
25545 The ``llvm.stackguard`` intrinsic returns the system stack guard value.
25547 It should not be generated by frontends, since it is only for internal usage.
25548 The reason why we create this intrinsic is that we still support IR form Stack
25549 Protector in FastISel.
25559 On some platforms, the value returned by this intrinsic remains unchanged
25560 between loads in the same thread. On other platforms, it returns the same
25561 global variable value, if any, e.g. ``@__stack_chk_guard``.
25563 Currently some platforms have IR-level customized stack guard loading (e.g.
25564 X86 Linux) that is not handled by ``llvm.stackguard()``, while they should be
25567 '``llvm.objectsize``' Intrinsic
25568 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25575 declare i32 @llvm.objectsize.i32(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
25576 declare i64 @llvm.objectsize.i64(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
25581 The ``llvm.objectsize`` intrinsic is designed to provide information to the
25582 optimizer to determine whether a) an operation (like memcpy) will overflow a
25583 buffer that corresponds to an object, or b) that a runtime check for overflow
25584 isn't necessary. An object in this context means an allocation of a specific
25585 class, structure, array, or other object.
25590 The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
25591 pointer to or into the ``object``. The second argument determines whether
25592 ``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
25593 unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
25594 in address space 0 is used as its pointer argument. If it's ``false``,
25595 ``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
25596 the ``null`` is in a non-zero address space or if ``true`` is given for the
25597 third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
25598 argument to ``llvm.objectsize`` determines if the value should be evaluated at
25601 The second, third, and fourth arguments only accept constants.
25606 The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
25607 the object concerned. If the size cannot be determined, ``llvm.objectsize``
25608 returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
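
The following sketch queries the number of bytes remaining after a pointer
into a 16-byte stack buffer. The function name is hypothetical, and the
``.p0`` suffix is the mangled form assumed here for an opaque pointer in
address space 0. Since the allocation is visible, an optimizer would be
expected to fold the call to ``12``.

.. code-block:: llvm

    declare i64 @llvm.objectsize.i64.p0(ptr, i1, i1, i1)

    define i64 @remaining_bytes() {
    entry:
      %buf = alloca [16 x i8], align 1
      %p = getelementptr inbounds i8, ptr %buf, i64 4
      ; min = false (return -1 if unknown), nullunknown = true, dynamic = false
      %size = call i64 @llvm.objectsize.i64.p0(ptr %p, i1 false, i1 true, i1 false)
      ret i64 %size
    }
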
25610 '``llvm.expect``' Intrinsic
25611 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
25616 This is an overloaded intrinsic. You can use ``llvm.expect`` on any
25621 declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
25622 declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
25623 declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
25628 The ``llvm.expect`` intrinsic provides information about the expected (most
25629 probable) value of ``val``, which can be used by optimizers.
25634 The ``llvm.expect`` intrinsic takes two arguments. The first argument is
25635 a value. The second argument is an expected value.
25640 This intrinsic is lowered to the ``val``.
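
A small sketch of a typical use, with a hypothetical function
``@checked_div``: the comparison result is passed through ``llvm.expect`` to
tell the optimizer that the zero-divisor path is unlikely.

.. code-block:: llvm

    declare i1 @llvm.expect.i1(i1, i1)

    define i32 @checked_div(i32 %a, i32 %b) {
    entry:
      %is_zero = icmp eq i32 %b, 0
      ; Hint: %is_zero is expected to be false.
      %hint = call i1 @llvm.expect.i1(i1 %is_zero, i1 false)
      br i1 %hint, label %slow, label %fast

    fast:
      %q = udiv i32 %a, %b
      ret i32 %q

    slow:
      ret i32 0
    }
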
25642 '``llvm.expect.with.probability``' Intrinsic
25643 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25648 This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
25649 You can use ``llvm.expect.with.probability`` on any integer bit width.
25653 declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
25654 declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
25655 declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
25660 The ``llvm.expect.with.probability`` intrinsic provides information about the
25661 expected value of ``val`` with probability (or confidence) ``prob``, which can
25662 be used by optimizers.
25667 The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
25668 argument is a value. The second argument is an expected value. The third
25669 argument is a probability.
25674 This intrinsic is lowered to the ``val``.
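
As an illustration (the function ``@dispatch`` and the probability of 0.75
are assumptions chosen for the example), the hint below states that ``%tag``
equals 0 about three times out of four:

.. code-block:: llvm

    declare i32 @llvm.expect.with.probability.i32(i32, i32, double)

    define i32 @dispatch(i32 %tag) {
    entry:
      ; The third argument must be a constant in the range [0.0, 1.0].
      %tag.hint = call i32 @llvm.expect.with.probability.i32(i32 %tag, i32 0, double 7.500000e-01)
      %is_common = icmp eq i32 %tag.hint, 0
      br i1 %is_common, label %common, label %rare

    common:
      ret i32 1

    rare:
      ret i32 2
    }
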
25678 '``llvm.assume``' Intrinsic
25679 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25686 declare void @llvm.assume(i1 %cond)
25691 The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
25692 condition is true. This information can then be used in simplifying other parts
25695 More complex assumptions can be encoded as
25696 :ref:`assume operand bundles <assume_opbundles>`.
25701 The argument of the call is the condition which the optimizer may assume is
25707 The intrinsic allows the optimizer to assume that the provided condition is
25708 always true whenever the control flow reaches the intrinsic call. No code is
25709 generated for this intrinsic, and instructions that contribute only to the
25710 provided condition are not used for code generation. If the condition is
25711 violated during execution, the behavior is undefined.
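
For instance, a frontend could state that a pointer is 16-byte aligned. This
is only a sketch with a hypothetical function name; if ``%p`` is not actually
aligned at run time, the behavior is undefined.

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @load_aligned(ptr %p) {
    entry:
      %addr = ptrtoint ptr %p to i64
      %low = and i64 %addr, 15
      %is_aligned = icmp eq i64 %low, 0
      ; From here on the optimizer may treat the low four address bits as zero.
      call void @llvm.assume(i1 %is_aligned)
      %v = load i32, ptr %p, align 4
      ret i32 %v
    }
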
25713 Note that the optimizer might limit the transformations performed on values
25714 used by the ``llvm.assume`` intrinsic in order to preserve the instructions
25715 only used to form the intrinsic's input argument. This might prove undesirable
25716 if the extra information provided by the ``llvm.assume`` intrinsic does not cause
25717 sufficient overall improvement in code quality. For this reason,
25718 ``llvm.assume`` should not be used to document basic mathematical invariants
25719 that the optimizer can otherwise deduce or facts that are of little use to the
25724 '``llvm.ssa.copy``' Intrinsic
25725 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25732 declare type @llvm.ssa.copy(type returned %operand) memory(none)
25737 The first argument is an operand which is used as the returned value.
25742 The ``llvm.ssa.copy`` intrinsic can be used to attach information to
25743 operations by copying them and giving them new names. For example,
25744 the PredicateInfo utility uses it to build Extended SSA form, and
25745 attach various forms of information to operands that dominate specific
25746 uses. It is not meant for general use, only for building temporary
25747 renaming forms that require value splits at certain points.
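
The sketch below shows the kind of renaming such a utility might produce (the
function and value names are illustrative, not output copied from
PredicateInfo): ``%x.nonneg`` is a copy of ``%x`` that exists only where the
branch has established that ``%x`` is non-negative.

.. code-block:: llvm

    declare i32 @llvm.ssa.copy.i32(i32)

    define i32 @clamp_low(i32 %x) {
    entry:
      %is_neg = icmp slt i32 %x, 0
      br i1 %is_neg, label %neg, label %nonneg

    neg:
      ret i32 0

    nonneg:
      ; Facts about %x on this path can be attached to the new name.
      %x.nonneg = call i32 @llvm.ssa.copy.i32(i32 %x)
      ret i32 %x.nonneg
    }
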
25751 '``llvm.type.test``' Intrinsic
25752 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25759 declare i1 @llvm.type.test(ptr %ptr, metadata %type) nounwind memory(none)
25765 The first argument is a pointer to be tested. The second argument is a
25766 metadata object representing a :doc:`type identifier <TypeMetadata>`.
25771 The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
25772 with the given type identifier.
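
A minimal sketch of a checked indirect call follows. The type identifier
``!"typeid1"`` and the function name are assumptions for illustration; a real
module would also need matching ``!type`` metadata on the candidate targets.

.. code-block:: llvm

    declare i1 @llvm.type.test(ptr, metadata)
    declare void @llvm.trap()

    define void @checked_call(ptr %fn) {
    entry:
      %ok = call i1 @llvm.type.test(ptr %fn, metadata !"typeid1")
      br i1 %ok, label %cont, label %fail

    cont:
      call void %fn()
      ret void

    fail:
      call void @llvm.trap()
      unreachable
    }
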
25774 .. _type.checked.load:
25776 '``llvm.type.checked.load``' Intrinsic
25777 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25784 declare {ptr, i1} @llvm.type.checked.load(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)
25790 The first argument is a pointer from which to load a function pointer. The
25791 second argument is the byte offset from which to load the function pointer. The
25792 third argument is a metadata object representing a :doc:`type identifier
25798 The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
25799 virtual table pointer using type metadata. This intrinsic is used to implement
25800 control flow integrity in conjunction with virtual call optimization. The
25801 virtual call optimization pass will optimize away ``llvm.type.checked.load``
25802 intrinsics associated with devirtualized calls, thereby removing the type
25803 check in cases where it is not needed to enforce the control flow integrity
25806 If the given pointer is associated with a type metadata identifier, this
25807 function returns true as the second element of its return value. (Note that
25808 the function may also return true if the given pointer is not associated
25809 with a type metadata identifier.) If the function's return value's second
25810 element is true, the following rules apply to the first element:
25812 - If the given pointer is associated with the given type metadata identifier,
25813 it is the function pointer loaded from the given byte offset from the given
25816 - If the given pointer is not associated with the given type metadata
25817 identifier, it is one of the following (the choice of which is unspecified):
25819 1. The function pointer that would have been loaded from an arbitrarily chosen
25820 (through an unspecified mechanism) pointer associated with the type
25823 2. If the function has a non-void return type, a pointer to a function that
25824 returns an unspecified value without causing side effects.
25826 If the function's return value's second element is false, the value of the
25827 first element is undefined.
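
The following sketch shows one way the two result elements might be consumed.
The type identifier ``!"typeid1"``, the offset of 0, and the object layout
(vtable pointer stored at the start of ``%obj``) are assumptions made for the
example.

.. code-block:: llvm

    declare {ptr, i1} @llvm.type.checked.load(ptr, i32, metadata)
    declare void @llvm.trap()

    define void @virtual_call(ptr %obj) {
    entry:
      %vtable = load ptr, ptr %obj
      %pair = call {ptr, i1} @llvm.type.checked.load(ptr %vtable, i32 0, metadata !"typeid1")
      %ok = extractvalue {ptr, i1} %pair, 1
      br i1 %ok, label %cont, label %fail

    cont:
      ; The first element is only meaningful when the second element is true.
      %fptr = extractvalue {ptr, i1} %pair, 0
      call void %fptr(ptr %obj)
      ret void

    fail:
      call void @llvm.trap()
      unreachable
    }
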
25830 '``llvm.arithmetic.fence``' Intrinsic
25831 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25839 @llvm.arithmetic.fence(<type> <op>)
25844 The purpose of the ``llvm.arithmetic.fence`` intrinsic
25845 is to prevent the optimizer from performing fast-math optimizations,
25846 particularly reassociation,
25847 between the argument and the expression that contains the argument.
25848 It can be used to preserve the parentheses in the source language.
25853 The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
25854 The argument and the return value are floating-point numbers,
25855 or vector floating-point numbers, of the same type.
25860 This intrinsic returns the value of its operand. The optimizer can optimize
25861 the argument, but the optimizer cannot hoist any component of the operand
25862 to the containing context, and the optimizer cannot move the calculation of
25863 any expression in the containing context into the operand.
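
As a sketch, consider the source expression ``(a + b) + c`` compiled with
reassociation enabled. Without the fence, the ``reassoc`` flag would permit
rewriting it as ``a + (b + c)``; the fence keeps ``a + b`` as a unit. The
function name is hypothetical.

.. code-block:: llvm

    declare float @llvm.arithmetic.fence.f32(float)

    define float @sum(float %a, float %b, float %c) {
    entry:
      %ab = fadd reassoc float %a, %b
      ; %ab may not be merged into the surrounding expression.
      %ab.fenced = call float @llvm.arithmetic.fence.f32(float %ab)
      %r = fadd reassoc float %ab.fenced, %c
      ret float %r
    }
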
25866 '``llvm.donothing``' Intrinsic
25867 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25874 declare void @llvm.donothing() nounwind memory(none)
25879 The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
25880 three intrinsics (besides ``llvm.experimental.patchpoint`` and
25881 ``llvm.experimental.gc.statepoint``) that can be called with an invoke
25892 This intrinsic does nothing, and it's removed by optimizers and ignored
25895 '``llvm.experimental.deoptimize``' Intrinsic
25896 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25903 declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
25908 This intrinsic, together with :ref:`deoptimization operand bundles
25909 <deopt_opbundles>`, allows frontends to express transfer of control and
25910 frame-local state from the currently executing (typically more specialized,
25911 hence faster) version of a function into another (typically more generic, hence
25914 In languages with a fully integrated managed runtime like Java and JavaScript
25915 this intrinsic can be used to implement "uncommon trap" or "side exit" like
25916 functionality. In unmanaged languages like C and C++, this intrinsic can be
25917 used to represent the slow paths of specialized functions.
25923 The intrinsic takes an arbitrary number of arguments, whose meaning is
25924 decided by the :ref:`lowering strategy<deoptimize_lowering>`.
25929 The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
25930 deoptimization continuation (denoted using a :ref:`deoptimization
25931 operand bundle <deopt_opbundles>`) and returns the value returned by
25932 the deoptimization continuation. Defining the semantic properties of
25933 the continuation itself is out of scope of the language reference --
25934 as far as LLVM is concerned, the deoptimization continuation can
25935 invoke arbitrary side effects, including reading from and writing to
25938 Deoptimization continuations expressed using ``"deopt"`` operand bundles always
25939 continue execution to the end of the physical frame containing them, so all
25940 calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
25942 - ``@llvm.experimental.deoptimize`` cannot be invoked.
25943 - The call must immediately precede a :ref:`ret <i_ret>` instruction.
25944 - The ``ret`` instruction must return the value produced by the
25945 ``@llvm.experimental.deoptimize`` call if there is one, or void.
25947 Note that the above restrictions imply that the return type for a call to
25948 ``@llvm.experimental.deoptimize`` will match the return type of its immediate
25951 The inliner composes the ``"deopt"`` continuations of the caller into the
25952 ``"deopt"`` continuations present in the inlinee, and also updates calls to this
25953 intrinsic to return directly from the frame of the function it inlined into.
25955 All declarations of ``@llvm.experimental.deoptimize`` must share the
25956 same calling convention.
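
A minimal sketch of a call in tail position is shown below. The function
name, the bound of 100, and the contents of the ``"deopt"`` bundle are
assumptions for illustration; what the bundle operands mean is up to the
frontend and its runtime.

.. code-block:: llvm

    declare i32 @llvm.experimental.deoptimize.i32(...)

    define i32 @specialized(i32 %x) {
    entry:
      %fits = icmp ult i32 %x, 100
      br i1 %fits, label %fast, label %slow

    fast:
      %r = add i32 %x, 1
      ret i32 %r

    slow:
      ; The call is immediately followed by a ret of its result.
      %v = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %x) ]
      ret i32 %v
    }
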
25958 .. _deoptimize_lowering:
25963 Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
25964 symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
25965 ensure that this symbol is defined). The call arguments to
25966 ``@llvm.experimental.deoptimize`` are lowered as if they were formal
25967 arguments of the specified types, and not as varargs.
25970 '``llvm.experimental.guard``' Intrinsic
25971 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25978 declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
25983 This intrinsic, together with :ref:`deoptimization operand bundles
25984 <deopt_opbundles>`, allows frontends to express guards or checks on
25985 optimistic assumptions made during compilation. The semantics of
25986 ``@llvm.experimental.guard`` is defined in terms of
25987 ``@llvm.experimental.deoptimize`` -- its body is defined to be
25990 .. code-block:: text
25992 define void @llvm.experimental.guard(i1 %pred, <args...>) {
25993 %realPred = and i1 %pred, undef
25994 br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
25997 call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
26005 with the optional ``[, !make.implicit !{}]`` present if and only if it
26006 is present on the call site. For more details on ``!make.implicit``,
26007 see :doc:`FaultMaps`.
26009 In words, ``@llvm.experimental.guard`` executes the attached
26010 ``"deopt"`` continuation if (but **not** only if) its first argument
26011 is ``false``. Since the optimizer is allowed to replace the ``undef``
26012 with an arbitrary value, it can optimize guard to fail "spuriously",
26013 i.e. without the original condition being false (hence the "not only
26014 if"); and this allows for "check widening" type optimizations.
26016 ``@llvm.experimental.guard`` cannot be invoked.
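
For example, a bounds check could be expressed as a guard. This is only a
sketch: the function name and the state recorded in the ``"deopt"`` bundle
are hypothetical.

.. code-block:: llvm

    declare void @llvm.experimental.guard(i1, ...)

    define i32 @load_checked(ptr %arr, i32 %i, i32 %len) {
    entry:
      %in_bounds = icmp ult i32 %i, %len
      ; Deoptimizes if %in_bounds is false (and possibly spuriously).
      call void (i1, ...) @llvm.experimental.guard(i1 %in_bounds) [ "deopt"(i32 %i) ]
      %idx = zext i32 %i to i64
      %p = getelementptr inbounds i32, ptr %arr, i64 %idx
      %v = load i32, ptr %p
      ret i32 %v
    }
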
26018 After ``@llvm.experimental.guard`` was first added, a more general
26019 formulation was found in ``@llvm.experimental.widenable.condition``.
26020 Support for ``@llvm.experimental.guard`` is slowly being rephrased in
26021 terms of this alternate.
26023 '``llvm.experimental.widenable.condition``' Intrinsic
26024 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26031 declare i1 @llvm.experimental.widenable.condition()
26036 This intrinsic represents a "widenable condition", which is a
26037 boolean expression with the following property: whether this
26038 expression is `true` or `false`, the program is correct and
26041 Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
26042 ``@llvm.experimental.widenable.condition`` allows frontends to
26043 express guards or checks on optimistic assumptions made during
26044 compilation and represent them as branch instructions on special
26047 While this may appear similar in semantics to `undef`, it is very
26048 different in that an invocation produces a particular, singular
26049 value. It is also intended to be lowered late, and remain available
26050 for specific optimizations and transforms that can benefit from its
26051 special properties.
26061 The intrinsic ``@llvm.experimental.widenable.condition()``
26062 returns either `true` or `false`. For each evaluation of a call
26063 to this intrinsic, the program must be valid and correct both if
26064 it returns `true` and if it returns `false`. This allows
26065 transformation passes to replace evaluations of this intrinsic
26066 with either value whenever one is beneficial.
26068 When used in a branch condition, it allows us to choose between
26069 two alternative correct solutions for the same problem, like
26072 .. code-block:: text
26074 %cond = call i1 @llvm.experimental.widenable.condition()
26075 br i1 %cond, label %solution_1, label %solution_2
26078 ; Apply memory-consuming but fast solution for a task.
26081 ; Cheap in memory but slow solution.
26083 Whether the result of the intrinsic's call is `true` or `false`,
26084 it should be correct to pick either solution. We can switch
26085 between them by replacing the result of
26086 ``@llvm.experimental.widenable.condition`` with different
26089 This is how it can be used to represent guards as widenable branches:
26091 .. code-block:: text
26094 ; Unguarded instructions
26095 call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
26096 ; Guarded instructions
26098 This can be expressed in an alternative, equivalent form as an explicit branch using
26099 ``@llvm.experimental.widenable.condition``:
26101 .. code-block:: text
26104 ; Unguarded instructions
26105 %widenable_condition = call i1 @llvm.experimental.widenable.condition()
26106 %guard_condition = and i1 %cond, %widenable_condition
26107 br i1 %guard_condition, label %guarded, label %deopt
26110 ; Guarded instructions
26113 call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
26115 So the block `guarded` is only reachable when `%cond` is `true`,
26116 and it should be valid to go to the block `deopt` whenever `%cond`
26117 is `true` or `false`.
26119 ``@llvm.experimental.widenable.condition`` will never throw, thus
26120 it cannot be invoked.
26125 When ``@llvm.experimental.widenable.condition()`` is used in the
26126 condition of a guard represented as an explicit branch, it is
26127 legal to widen the guard's condition with any additional
26130 Guard widening looks like replacement of
26132 .. code-block:: text
26134 %widenable_cond = call i1 @llvm.experimental.widenable.condition()
26135 %guard_cond = and i1 %cond, %widenable_cond
26136 br i1 %guard_cond, label %guarded, label %deopt
26140 .. code-block:: text
26142 %widenable_cond = call i1 @llvm.experimental.widenable.condition()
26143 %new_cond = and i1 %any_other_cond, %widenable_cond
26144 %new_guard_cond = and i1 %cond, %new_cond
26145 br i1 %new_guard_cond, label %guarded, label %deopt
26147 for this branch. Here `%any_other_cond` is an arbitrarily chosen
26148 well-defined `i1` value. By widening the guard, we may
26149 impose stricter conditions on the `guarded` block and bail to the
26150 `deopt` block when the new condition is not met.
26155 The default lowering strategy is to replace the result of a
26156 call to ``@llvm.experimental.widenable.condition`` with the
26157 constant `true`. However, it is always correct to replace
26158 it with any other `i1` value. Any pass can
26159 freely do so if it can benefit from non-default lowering.
26162 '``llvm.load.relative``' Intrinsic
26163 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26170 declare ptr @llvm.load.relative.iN(ptr %ptr, iN %offset) nounwind memory(argmem: read)
26175 This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
26176 adds ``%ptr`` to that value and returns it. The constant folder specifically
26177 recognizes the form of this intrinsic and the constant initializers it may
26178 load from; if a loaded constant initializer is known to have the form
26179 ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
26181 LLVM provides that the calculation of such a constant initializer will
26182 not overflow at link time under the medium code model if ``x`` is an
26183 ``unnamed_addr`` function. However, it does not provide this guarantee for
26184 a constant initializer folded into a function body. This intrinsic can be
26185 used to avoid the possibility of overflows when loading from such a constant.
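
As a sketch, assume ``%table`` points to an array of 32-bit entries, each
holding the distance from ``%table`` to some target. The function name and
table layout are assumptions for this example.

.. code-block:: llvm

    declare ptr @llvm.load.relative.i32(ptr, i32)

    define ptr @entry_at(ptr %table, i32 %index) {
    entry:
      ; Each entry is 4 bytes wide, so scale the index to a byte offset.
      %offset = shl i32 %index, 2
      ; Loads the i32 at %table + %offset and adds %table to it.
      %target = call ptr @llvm.load.relative.i32(ptr %table, i32 %offset)
      ret ptr %target
    }
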
26187 .. _llvm_sideeffect:
26189 '``llvm.sideeffect``' Intrinsic
26190 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26197 declare void @llvm.sideeffect() inaccessiblememonly nounwind willreturn
26202 The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
26203 treat it as having side effects, so it can be inserted into a loop to
26204 indicate that the loop shouldn't be assumed to terminate (which could
26205 potentially lead to the loop being optimized away entirely), even if it's
26206 an infinite loop with no other side effects.
26216 This intrinsic actually does nothing, but optimizers must assume that it
26217 has externally observable side effects.
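
A minimal sketch of the infinite-loop case mentioned above (the function name
is illustrative): the call inside the loop body is what tells optimizers that
the loop must be kept.

.. code-block:: llvm

    declare void @llvm.sideeffect()

    define void @spin() {
    entry:
      br label %loop

    loop:
      ; No code is generated for this call, but the loop is not removed.
      call void @llvm.sideeffect()
      br label %loop
    }
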
26219 '``llvm.is.constant.*``' Intrinsic
26220 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26225 This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.
26229 declare i1 @llvm.is.constant.i32(i32 %operand) nounwind memory(none)
26230 declare i1 @llvm.is.constant.f32(float %operand) nounwind memory(none)
26231 declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind memory(none)
26236 The '``llvm.is.constant``' intrinsic will return true if the argument
26237 is known to be a manifest compile-time constant. It is guaranteed to
26238 fold to either true or false before generating machine code.
26243 This intrinsic generates no code. If its argument is known to be a
26244 manifest compile-time constant value, then the intrinsic will be
26245 converted to a constant true value. Otherwise, it will be converted to
26246 a constant false value.
26248 In particular, note that if the argument is a constant expression
26249 which refers to a global (the address of which *is* a constant, but
26250 not manifest during the compile), then the intrinsic evaluates to
26253 The result also intentionally depends on the result of optimization
26254 passes -- e.g., the result can change depending on whether a
26255 function gets inlined or not. A function's parameters are
26256 obviously not constant. However, a call like
26257 ``llvm.is.constant.i32(i32 %param)`` *can* return true after the
26258 function is inlined, if the value passed to the function parameter was
26261 On the other hand, if constant folding is not run, it will never
26262 evaluate to true, even in simple cases.
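
The sketch below selects between two code paths based on the result. The
function name and the two paths are hypothetical. As written, ``%n`` is a
parameter, so the call folds to false unless ``@scale`` is first inlined into
a caller that passes a constant.

.. code-block:: llvm

    declare i1 @llvm.is.constant.i32(i32)

    define i32 @scale(i32 %n) {
    entry:
      %known = call i1 @llvm.is.constant.i32(i32 %n)
      br i1 %known, label %const_path, label %generic_path

    const_path:
      ; Reached only if %n became a manifest constant.
      %a = mul i32 %n, 8
      ret i32 %a

    generic_path:
      %b = shl i32 %n, 3
      ret i32 %b
    }
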
26266 '``llvm.ptrmask``' Intrinsic
26267 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26274 declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) speculatable memory(none)
26279 The first argument is a pointer. The second argument is an integer.
26284 The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
26285 This allows stripping data from tagged pointers without converting them to an
26286 integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
26287 to facilitate alias analysis and underlying-object detection.
26292 The result of ``ptrmask(ptr, mask)`` is equivalent to
26293 ``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``. Both the returned
26294 pointer and the first argument are based on the same underlying object (for more
26295 information on the *based on* terminology see
26296 :ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
26297 mask argument does not match the pointer size of the target, the mask is
26298 zero-extended or truncated accordingly.
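
For example, assuming 64-bit pointers whose low three bits carry a tag, the
tag can be cleared with a mask of ``-8``. The function name and the tagging
scheme are assumptions for illustration.

.. code-block:: llvm

    declare ptr @llvm.ptrmask.p0.i64(ptr, i64)

    define ptr @strip_tag(ptr %tagged) {
    entry:
      ; -8 is 0xFFFFFFFFFFFFFFF8, which clears the low three bits.
      %clean = call ptr @llvm.ptrmask.p0.i64(ptr %tagged, i64 -8)
      ret ptr %clean
    }
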
26300 .. _int_threadlocal_address:
26302 '``llvm.threadlocal.address``' Intrinsic
26303 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26310 declare ptr @llvm.threadlocal.address(ptr) nounwind willreturn memory(none)
26315 The first argument is a pointer, which refers to a thread local global.
26320 The address of a thread local global is not a constant, since it depends on
26321 the calling thread. The ``llvm.threadlocal.address`` intrinsic returns the
26322 address of the given thread local global in the calling thread.
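
A short sketch, with a hypothetical thread-local global ``@counter``; the
``.p0`` suffix is the mangled form assumed here for a pointer in address
space 0.

.. code-block:: llvm

    @counter = thread_local global i32 0, align 4

    declare ptr @llvm.threadlocal.address.p0(ptr)

    define i32 @read_counter() {
    entry:
      ; Resolve the address of @counter for the calling thread.
      %addr = call ptr @llvm.threadlocal.address.p0(ptr @counter)
      %v = load i32, ptr %addr, align 4
      ret i32 %v
    }
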
26326 '``llvm.vscale``' Intrinsic
26327 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
26334 declare i32 @llvm.vscale.i32()
26335 declare i64 @llvm.vscale.i64()
26340 The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
26341 vectors such as ``<vscale x 16 x i8>``.
26346 ``vscale`` is a positive value that is constant throughout program
26347 execution, but is unknown at compile time.
26348 If the result value does not fit in the result type, then the result is
26349 a :ref:`poison value <poisonvalues>`.
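
As an illustration, the size in bytes of a ``<vscale x 16 x i8>`` vector can
be computed at run time by scaling the known minimum size of 16 bytes. The
function name is hypothetical.

.. code-block:: llvm

    declare i64 @llvm.vscale.i64()

    define i64 @bytes_per_vector() {
    entry:
      %vscale = call i64 @llvm.vscale.i64()
      ; <vscale x 16 x i8> holds vscale * 16 one-byte elements.
      %bytes = mul i64 %vscale, 16
      ret i64 %bytes
    }
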
26352 Stack Map Intrinsics
26353 --------------------
26355 LLVM provides experimental intrinsics to support runtime patching
26356 mechanisms commonly desired in dynamic language JITs. These intrinsics
26357 are described in :doc:`StackMaps`.
26359 Element Wise Atomic Memory Intrinsics
26360 -------------------------------------
26362 These intrinsics are similar to the standard library memory intrinsics except
26363 that they perform memory transfer as a sequence of atomic memory accesses.
26365 .. _int_memcpy_element_unordered_atomic:
26367 '``llvm.memcpy.element.unordered.atomic``' Intrinsic
26368 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26373 This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
26374 any integer bit width and for different address spaces. Not all targets
26375 support all bit widths, however.
26379 declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i32(ptr <dest>,
26382 i32 <element_size>)
26383 declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr <dest>,
26386 i32 <element_size>)
26391 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
26392 '``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
26393 as arrays with elements that are exactly ``element_size`` bytes, and the copy between
26394 buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
26395 that are a positive integer multiple of the ``element_size`` in size.
26400 The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
26401 intrinsic, with the added constraint that ``len`` is required to be a positive integer
26402 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
26403 ``element_size``, then the behaviour of the intrinsic is undefined.
26405 ``element_size`` must be a compile-time constant positive power of two no greater than
26406 a target-specific atomic access size limit.
26408 For each of the input pointers, the ``align`` parameter attribute must be specified. It
26409 must be a power of two no less than the ``element_size``. The caller guarantees that
26410 both the source and destination pointers are aligned to that boundary.
26415 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
26416 memory from the source location to the destination location. These locations are not
26417 allowed to overlap. The memory copy is performed as a sequence of load/store operations
26418 where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
26419 aligned at an ``element_size`` boundary.
26421 The order of the copy is unspecified. The same value may be read from the source
26422 buffer many times, but only one write is issued to the destination buffer per
26423 element. It is well defined to have concurrent reads and writes to both source and
26424 destination provided those reads and writes are unordered atomic when specified.
26426 This intrinsic does not provide any additional ordering guarantees over those
26427 provided by a set of unordered loads from the source location and stores to the
26433 In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is
26434 lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
26435 is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
26436 lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
26439 The optimizer is allowed to inline the memory copy when it's profitable to do so.
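
A minimal sketch of a call that copies sixteen 4-byte elements (64 bytes in
total). The function name is hypothetical, and the caller is assumed to
guarantee that both buffers are 4-byte aligned and do not overlap.

.. code-block:: llvm

    declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr, ptr, i64, i32)

    define void @copy_words(ptr %dst, ptr %src) {
    entry:
      ; len = 64 is a multiple of element_size = 4; align >= element_size.
      call void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(
          ptr align 4 %dst, ptr align 4 %src, i64 64, i32 4)
      ret void
    }
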
26441 '``llvm.memmove.element.unordered.atomic``' Intrinsic
26442 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26447 This is an overloaded intrinsic. You can use
26448 ``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
26449 different address spaces. Not all targets support all bit widths, however.
26453 declare void @llvm.memmove.element.unordered.atomic.p0.p0.i32(ptr <dest>,
26456 i32 <element_size>)
26457 declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr <dest>,
26460 i32 <element_size>)
26465 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
26466 of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
26467 ``src`` are treated as arrays with elements that are exactly ``element_size``
26468 bytes, and the copy between buffers uses a sequence of
26469 :ref:`unordered atomic <ordering>` load/store operations that are a positive
26470 integer multiple of the ``element_size`` in size.
26475 The first three arguments are the same as they are in the
26476 :ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
26477 ``len`` is required to be a positive integer multiple of the ``element_size``.
26478 If ``len`` is not a positive integer multiple of ``element_size``, then the
26479 behaviour of the intrinsic is undefined.
26481 ``element_size`` must be a compile-time constant positive power of two no
26482 greater than a target-specific atomic access size limit.
26484 For each of the input pointers the ``align`` parameter attribute must be
26485 specified. It must be a power of two no less than the ``element_size``. The caller
26486 guarantees that both the source and destination pointers are aligned to that
26492 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
26493 of memory from the source location to the destination location. These locations
26494 are allowed to overlap. The memory copy is performed as a sequence of load/store
26495 operations where each access is guaranteed to be a multiple of ``element_size``
26496 bytes wide and aligned at an ``element_size`` boundary.
26498 The order of the copy is unspecified. The same value may be read from the source
26499 buffer many times, but only one write is issued to the destination buffer per
26500 element. It is well defined to have concurrent reads and writes to both source
26501 and destination provided those reads and writes are unordered atomic when
26504 This intrinsic does not provide any additional ordering guarantees over those
26505 provided by a set of unordered loads from the source location and stores to the
26511 In the most general case, a call to the
26512 '``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to the symbol
26513 ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced with the
26514 actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
26515 <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
26518 The optimizer is allowed to inline the memory copy when it's profitable to do so.
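
The sketch below moves seven 8-byte elements down by one element within the
same buffer, so the source and destination ranges overlap. The function name
is hypothetical, and ``%buf`` is assumed to be 8-byte aligned and at least 64
bytes long.

.. code-block:: llvm

    declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr, ptr, i64, i32)

    define void @shift_down(ptr %buf) {
    entry:
      ; The source starts one element (8 bytes) past the destination.
      %src = getelementptr inbounds i8, ptr %buf, i64 8
      call void @llvm.memmove.element.unordered.atomic.p0.p0.i64(
          ptr align 8 %buf, ptr align 8 %src, i64 56, i32 8)
      ret void
    }
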
26520 .. _int_memset_element_unordered_atomic:
26522 '``llvm.memset.element.unordered.atomic``' Intrinsic
26523 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26528 This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
26529 any integer bit width and for different address spaces. Not all targets
26530 support all bit widths, however.
26534 declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr <dest>,
26537 i32 <element_size>)
26538 declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr <dest>,
26541 i32 <element_size>)
26546 The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
26547 '``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
26548 with elements that are exactly ``element_size`` bytes, and the assignment to that array
26549 uses a sequence of :ref:`unordered atomic <ordering>` store operations
26550 that are a positive integer multiple of the ``element_size`` in size.
26555 The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
26556 intrinsic, with the added constraint that ``len`` is required to be a positive integer
26557 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
26558 ``element_size``, then the behaviour of the intrinsic is undefined.
26560 ``element_size`` must be a compile-time constant positive power of two no greater than
26561 a target-specific atomic access size limit.
26563 The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
26564 must be a power of two no less than the ``element_size``. The caller guarantees that
26565 the destination pointer is aligned to that boundary.
26570 The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
26571 memory starting at the destination location to the given ``value``. The memory is
26572 set with a sequence of store operations where each access is guaranteed to be a
26573 multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
26575 The order of the assignment is unspecified. Only one write is issued to the
26576 destination buffer per element. It is well defined to have concurrent reads and
26577 writes to the destination provided those reads and writes are unordered atomic
26578 when specified.
26580 This intrinsic does not provide any additional ordering guarantees over those
26581 provided by a set of unordered stores to the destination.
26586 In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``' intrinsic is
26587 lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*'
26588 is replaced with the actual element size.
26590 The optimizer is allowed to inline the memory assignment when it's profitable to do so.
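As an illustration, the following minimal sketch zeroes a 64-byte buffer as
sixteen 4-byte elements. The function name ``@zero_words`` and the chosen sizes
are assumptions made for this example only; the declaration uses the ``i64``
length overload.

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr, i8, i64, i32)

      define void @zero_words(ptr %buf) {
        ; len = 64 is a positive multiple of element_size = 4, and the
        ; align attribute (4) is no less than element_size.
        call void @llvm.memset.element.unordered.atomic.p0.i64(ptr align 4 %buf, i8 0, i64 64, i32 4)
        ret void
      }

If the call is not inlined, the naming rule described under lowering suggests
it would become a call to ``__llvm_memset_element_unordered_atomic_4``.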
26592 Objective-C ARC Runtime Intrinsics
26593 ----------------------------------
26595 LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
26596 LLVM is aware of the semantics of these functions, and optimizes based on that
26597 knowledge. You can read more about the details of Objective-C ARC `here
26598 <https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
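A minimal sketch of how these intrinsics typically appear in IR is shown here:
a retain paired with a release around a use of the object. The names
``@use_object`` and ``@consume`` are hypothetical and exist only for this
example.

.. code-block:: llvm

      declare ptr @llvm.objc.retain(ptr)
      declare void @llvm.objc.release(ptr)
      declare void @consume(ptr)        ; hypothetical external user of the object

      define void @use_object(ptr %obj) {
        ; Take a +1 reference for the duration of the use.
        %retained = call ptr @llvm.objc.retain(ptr %obj)
        call void @consume(ptr %retained)
        ; Balance the retain above.
        call void @llvm.objc.release(ptr %retained)
        ret void
      }

Because LLVM understands the semantics of the pair, an optimizer may remove a
retain/release pair like this one when it can prove that doing so is safe.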
26600 '``llvm.objc.autorelease``' Intrinsic
26601 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26607 declare ptr @llvm.objc.autorelease(ptr)
26612 Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
26614 '``llvm.objc.autoreleasePoolPop``' Intrinsic
26615 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26621 declare void @llvm.objc.autoreleasePoolPop(ptr)
26626 Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
26628 '``llvm.objc.autoreleasePoolPush``' Intrinsic
26629 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26635 declare ptr @llvm.objc.autoreleasePoolPush()
26640 Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
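The pool intrinsics are normally used as a matched pair. The following sketch,
with the hypothetical function name ``@scoped``, pushes a pool, autoreleases a
retained object into it, and pops the pool again using the token returned by
the push.

.. code-block:: llvm

      declare ptr @llvm.objc.autoreleasePoolPush()
      declare void @llvm.objc.autoreleasePoolPop(ptr)
      declare ptr @llvm.objc.retain(ptr)
      declare ptr @llvm.objc.autorelease(ptr)

      define void @scoped(ptr %obj) {
        %pool = call ptr @llvm.objc.autoreleasePoolPush()
        ; Retain first so that the release performed when the pool is
        ; popped balances a reference owned by this function.
        %retained = call ptr @llvm.objc.retain(ptr %obj)
        %tmp = call ptr @llvm.objc.autorelease(ptr %retained)
        call void @llvm.objc.autoreleasePoolPop(ptr %pool)
        ret void
      }

The token passed to the pop is the value returned by the matching push.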
26642 '``llvm.objc.autoreleaseReturnValue``' Intrinsic
26643 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26649 declare ptr @llvm.objc.autoreleaseReturnValue(ptr)
26654 Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
26656 '``llvm.objc.copyWeak``' Intrinsic
26657 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26663 declare void @llvm.objc.copyWeak(ptr, ptr)
26668 Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
26670 '``llvm.objc.destroyWeak``' Intrinsic
26671 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26677 declare void @llvm.objc.destroyWeak(ptr)
26682 Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
26684 '``llvm.objc.initWeak``' Intrinsic
26685 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26691 declare ptr @llvm.objc.initWeak(ptr, ptr)
26696 Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
26698 '``llvm.objc.loadWeak``' Intrinsic
26699 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26705 declare ptr @llvm.objc.loadWeak(ptr)
26710 Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
26712 '``llvm.objc.loadWeakRetained``' Intrinsic
26713 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26719 declare ptr @llvm.objc.loadWeakRetained(ptr)
26724 Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
26726 '``llvm.objc.moveWeak``' Intrinsic
26727 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26733 declare void @llvm.objc.moveWeak(ptr, ptr)
26738 Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
26740 '``llvm.objc.release``' Intrinsic
26741 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26747 declare void @llvm.objc.release(ptr)
26752 Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
26754 '``llvm.objc.retain``' Intrinsic
26755 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26761 declare ptr @llvm.objc.retain(ptr)
26766 Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
26768 '``llvm.objc.retainAutorelease``' Intrinsic
26769 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26775 declare ptr @llvm.objc.retainAutorelease(ptr)
26780 Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
26782 '``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
26783 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26789 declare ptr @llvm.objc.retainAutoreleaseReturnValue(ptr)
26794 Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
26796 '``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
26797 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26803 declare ptr @llvm.objc.retainAutoreleasedReturnValue(ptr)
26808 Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
26810 '``llvm.objc.retainBlock``' Intrinsic
26811 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26817 declare ptr @llvm.objc.retainBlock(ptr)
26822 Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
26824 '``llvm.objc.storeStrong``' Intrinsic
26825 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26831 declare void @llvm.objc.storeStrong(ptr, ptr)
26836 Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
26838 '``llvm.objc.storeWeak``' Intrinsic
26839 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26845 declare ptr @llvm.objc.storeWeak(ptr, ptr)
26850 Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
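The weak-reference intrinsics operate on the address of a weak slot rather than
on the object itself. The sketch below, using the hypothetical function name
``@weak_slot``, walks one slot through its lifetime: initialization, a store,
a retained load, and destruction.

.. code-block:: llvm

      declare ptr @llvm.objc.initWeak(ptr, ptr)
      declare ptr @llvm.objc.storeWeak(ptr, ptr)
      declare ptr @llvm.objc.loadWeakRetained(ptr)
      declare void @llvm.objc.destroyWeak(ptr)
      declare void @llvm.objc.release(ptr)

      define void @weak_slot(ptr %a, ptr %b) {
      entry:
        %slot = alloca ptr
        ; Initialize the slot with a weak reference to %a.
        %0 = call ptr @llvm.objc.initWeak(ptr %slot, ptr %a)
        ; Re-point the already initialized slot at %b.
        %1 = call ptr @llvm.objc.storeWeak(ptr %slot, ptr %b)
        ; Obtain a strong (+1) reference, or null if the object is gone.
        %strong = call ptr @llvm.objc.loadWeakRetained(ptr %slot)
        call void @llvm.objc.release(ptr %strong)
        ; Unregister the slot before its storage goes away.
        call void @llvm.objc.destroyWeak(ptr %slot)
        ret void
      }

The first argument of each call is the slot address; the object pointer, where
there is one, is the second argument.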
26852 Preserving Debug Information Intrinsics
26853 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26855 These intrinsics carry certain debuginfo together with
26856 IR-level operations. For example, it may be desirable to
26857 know the structure/union name and the original user-level field
26858 indices. Such information is lost in the IR GetElementPtr instruction
26859 because the IR types differ from the debuginfo types and unions
26860 are converted to structs in IR.
26862 '``llvm.preserve.array.access.index``' Intrinsic
26863 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26869 declare <ret_type>
26870 @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
26871 i32 dim,
26872 i32 index)
26877 The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
26878 based on array base ``base``, array dimension ``dim`` and the last access index ``index``
26879 into the array. The return type ``ret_type`` is a pointer type to the array element.
26880 The array ``dim`` and ``index`` are preserved, which is more robust than a
26881 getelementptr instruction, which may be subject to compiler transformation.
26882 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
26883 to provide array or pointer debuginfo type.
26884 The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
26885 debuginfo version of ``type``.
26890 The ``base`` is the array base address. The ``dim`` is the array dimension.
26891 The ``base`` is a pointer if ``dim`` equals 0.
26892 The ``index`` is the last access index into the array or pointer.
26894 The ``base`` argument must be annotated with an :ref:`elementtype
26895 <attr_elementtype>` attribute at the call-site. This attribute specifies the
26896 getelementptr element type.
26901 The '``llvm.preserve.array.access.index``' intrinsic produces the same result
26902 as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``.
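As a hedged illustration, the sketch below accesses element 3 of a
``[10 x i32]`` array with ``dim`` equal to 1. The function name ``@elem3`` is
hypothetical, and the ``.p0.p0`` suffix assumes opaque pointers in the default
address space. The ``llvm.preserve.access.index`` metadata that a frontend
would attach is omitted to keep the example short.

.. code-block:: llvm

      declare ptr @llvm.preserve.array.access.index.p0.p0(ptr, i32, i32)

      define ptr @elem3(ptr %arr) {
        ; dim = 1 contributes one leading zero index, so this yields the same
        ; address as: getelementptr [10 x i32], ptr %arr, i32 0, i32 3
        %p = call ptr @llvm.preserve.array.access.index.p0.p0(ptr elementtype([10 x i32]) %arr, i32 1, i32 3)
        ret ptr %p
      }

With ``dim`` equal to 0 the base is treated as a plain pointer and only the
final index is applied.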
26904 '``llvm.preserve.union.access.index``' Intrinsic
26905 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26911 declare <type>
26912 @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
26913 i32 di_index)
26918 The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
26919 ``di_index`` and returns the ``base`` address.
26920 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
26921 to provide union debuginfo type.
26922 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
26923 The return type ``type`` is the same as the ``base`` type.
26928 The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
26933 The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
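A minimal sketch of a call follows. The function name ``@union_field`` is
hypothetical, the ``.p0.p0`` suffix assumes opaque pointers in the default
address space, and the ``llvm.preserve.access.index`` metadata is omitted for
brevity.

.. code-block:: llvm

      declare ptr @llvm.preserve.union.access.index.p0.p0(ptr, i32)

      define ptr @union_field(ptr %u) {
        ; Records that debuginfo field 1 of the union is being accessed;
        ; the returned pointer is the same address as %u.
        %p = call ptr @llvm.preserve.union.access.index.p0.p0(ptr %u, i32 1)
        ret ptr %p
      }

No address arithmetic takes place; the call only carries the field index.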
26935 '``llvm.preserve.struct.access.index``' Intrinsic
26936 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26942 declare <ret_type>
26943 @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
26944 i32 gep_index,
26945 i32 di_index)
26950 The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
26951 based on struct base ``base`` and IR struct member index ``gep_index``.
26952 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
26953 to provide struct debuginfo type.
26954 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
26955 The return type ``ret_type`` is a pointer type to the structure member.
26960 The ``base`` is the structure base address. The ``gep_index`` is the struct member index
26961 based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
26963 The ``base`` argument must be annotated with an :ref:`elementtype
26964 <attr_elementtype>` attribute at the call-site. This attribute specifies the
26965 getelementptr element type.
26970 The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
26971 as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
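The following sketch shows a call for the second member of a two-field struct.
The type ``%struct.S`` and the function name ``@field_b`` are hypothetical, the
``.p0.p0`` suffix assumes opaque pointers in the default address space, and the
``llvm.preserve.access.index`` metadata is omitted for brevity.

.. code-block:: llvm

      %struct.S = type { i32, i64 }

      declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32, i32)

      define ptr @field_b(ptr %s) {
        ; gep_index = 1 and di_index = 1. Same address as:
        ;   getelementptr %struct.S, ptr %s, i32 0, i32 1
        %p = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.S) %s, i32 1, i32 1)
        ret ptr %p
      }

The two indices are equal here, but they can differ when the IR struct layout
does not match the debuginfo member order one to one.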
26973 '``llvm.fptrunc.round``' Intrinsic
26974 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26981 declare <ty2>
26982 @llvm.fptrunc.round(<type> <value>, metadata <rounding mode>)
26987 The '``llvm.fptrunc.round``' intrinsic truncates
26988 :ref:`floating-point <t_floating>` ``value`` to type ``ty2``
26989 with a specified rounding mode.
26994 The '``llvm.fptrunc.round``' intrinsic takes a :ref:`floating-point
26995 <t_floating>` value to cast and a :ref:`floating-point <t_floating>` type
26996 to cast it to. The value's type must be larger in size than the result type.
26998 The second argument specifies the rounding mode as described in the constrained
26999 intrinsics section.
27000 For this intrinsic, the "round.dynamic" mode is not supported.
27005 The '``llvm.fptrunc.round``' intrinsic casts a ``value`` from a larger
27006 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
27007 <t_floating>` type.
27008 This intrinsic is assumed to execute in the default :ref:`floating-point
27009 environment <floatenv>` *except* for the rounding mode.
27010 This intrinsic is not supported on all targets. Some targets may not support
27011 all rounding modes.
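For example, a ``float`` can be truncated to ``half`` while rounding toward
positive infinity. This is a minimal sketch: the function name ``@trunc_up`` is
hypothetical, and whether this particular type and rounding-mode combination is
supported depends on the target.

.. code-block:: llvm

      declare half @llvm.fptrunc.round.f16.f32(float, metadata)

      define half @trunc_up(float %x) {
        ; The rounding mode is passed as a metadata string, using the same
        ; spelling as the constrained floating-point intrinsics.
        %r = call half @llvm.fptrunc.round.f16.f32(float %x, metadata !"round.upward")
        ret half %r
      }

Passing ``!"round.dynamic"`` here would not be valid, since that mode is not
supported by this intrinsic.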