==============================
LLVM Language Reference Manual
==============================
This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.
22 The LLVM code representation is designed to be used in three different
23 forms: as an in-memory compiler IR, as an on-disk bitcode representation
24 (suitable for fast loading by a Just-In-Time compiler), and as a human
25 readable assembly language representation. This allows LLVM to provide a
26 powerful intermediate representation for efficient compiler
27 transformations and analysis, while providing a natural means to debug
28 and visualize the transformations. The three different forms of LLVM are
29 all equivalent. This document describes the human readable
30 representation and notation.
32 The LLVM representation aims to be light-weight and low-level while
33 being expressive, typed, and extensible at the same time. It aims to be
34 a "universal IR" of sorts, by being at a low enough level that
35 high-level ideas may be cleanly mapped to it (similar to how
36 microprocessors are "universal IR's", allowing many source languages to
37 be mapped to them). By providing type information, LLVM can be used as
38 the target of optimizations: for example, through pointer analysis, it
39 can be proven that a C automatic variable is never accessed outside of
40 the current function, allowing it to be promoted to a simple SSA value
41 instead of a memory location.
48 It is important to note that this document describes 'well formed' LLVM
49 assembly language. There is a difference between what the parser accepts
50 and what is considered 'well formed'. For example, the following
51 instruction is syntactically okay, but not well formed:
57 because the definition of ``%x`` does not dominate all of its uses. The
58 LLVM infrastructure provides a verification pass that may be used to
59 verify that an LLVM module is well formed. This pass is automatically
60 run by the parser after parsing input assembly and by the optimizer
61 before it outputs bitcode. The violations pointed out by the verifier
62 pass indicate bugs in transformation passes or input to the parser.
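
As a minimal sketch of the distinction described above (the function names
``@ill_formed`` and ``@well_formed`` are hypothetical), the first function
below is accepted by the parser but should be rejected by the verifier,
because ``%x`` is used in its own definition. The second function is well
formed.

.. code-block:: llvm

    ; Hypothetical example: syntactically okay, but not well formed,
    ; because the definition of %x does not dominate its use.
    define i32 @ill_formed() {
      %x = add i32 1, %x
      ret i32 %x
    }

    ; Hypothetical example: well formed, every use is dominated by
    ; its definition.
    define i32 @well_formed(i32 %a) {
      %x = add i32 1, %a
      ret i32 %x
    }

A module containing ``@ill_formed`` is expected to fail verification as a
whole, so the two functions are best tried in separate files.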
72 LLVM identifiers come in two basic types: global and local. Global
73 identifiers (functions, global variables) begin with the ``'@'``
74 character. Local identifiers (register names, types) begin with the
75 ``'%'`` character. Additionally, there are three different formats for
76 identifiers, for different purposes:
78 #. Named values are represented as a string of characters with their
79 prefix. For example, ``%foo``, ``@DivisionByZero``,
80 ``%a.really.long.identifier``. The actual regular expression used is
81 '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
82 characters in their names can be surrounded with quotes. Special
83 characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
84 code for the character in hexadecimal. In this way, any character can
85 be used in a name value, even quotes themselves. The ``"\01"`` prefix
86 can be used on global values to suppress mangling.
87 #. Unnamed values are represented as an unsigned numeric value with
88 their prefix. For example, ``%12``, ``@2``, ``%44``.
89 #. Constants, which are described in the section Constants_ below.
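
The three identifier formats can be seen side by side in a short sketch.
All names below are hypothetical and chosen only for illustration.

.. code-block:: llvm

    @DivisionByZero = global i32 0       ; named global
    @"name with spaces" = global i32 1   ; quoted name using other characters
    @"\01raw_symbol" = global i32 2      ; "\01" prefix suppresses mangling

    define i32 @identifier_demo(i32 %a.really.long.identifier) {
      ; The unlabeled entry block takes number 0, so the first unnamed
      ; value in this function is %1. The operand 7 is a constant.
      %1 = add i32 %a.really.long.identifier, 7
      ret i32 %1
    }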
LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.
98 Reserved words in LLVM are very similar to reserved words in other
99 languages. There are keywords for different opcodes ('``add``',
100 '``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
101 '``i32``', etc...), and others. These reserved words cannot conflict
102 with variable names, because none of them start with a prefix character
103 (``'%'`` or ``'@'``).
Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           /* yields i32:%1 */
    %result = add i32 %1, %1
128 This last way of multiplying ``%X`` by 8 illustrates several important
129 lexical features of LLVM:
131 #. Comments are delimited with a '``;``' and go until the end of line.
132 Alternatively, comments can start with ``/*`` and terminate with ``*/``.
133 #. Unnamed temporaries are created when the result of a computation is
134 not assigned to a named value.
135 #. By default, unnamed temporaries are numbered sequentially (using a
136 per-function incrementing counter, starting with 0). However, when explicitly
137 specifying temporary numbers, it is allowed to skip over numbers.
139 Note that basic blocks and unnamed function parameters are included in this
140 numbering. For example, if the entry basic block is not given a label name
141 and all function parameters are named, then it will get number 0.
143 It also shows a convention that we follow in this document. When
144 demonstrating instructions, we will follow an instruction with a comment
145 that defines the type and name of value produced.
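
The numbering rules above can be sketched with a small function. The name
``@numbering_demo`` is hypothetical. The unnamed parameter takes number 0,
the unlabeled entry block takes number 1, and the instructions continue
from 2.

.. code-block:: llvm

    define i32 @numbering_demo(i32, i32 %named) {
      ; %0 is the unnamed first parameter; the entry block is %1.
      %2 = add i32 %0, %named        ; yields i32:%2
      %3 = mul i32 %2, 8             ; yields i32:%3
      ret i32 %3
    }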
147 .. _string_constants:
Strings in LLVM programs are delimited by ``"`` characters. Within a
string, all bytes are treated literally with the exception of ``\``
characters, which start escapes, and the first ``"`` character, which
ends the string.

There are two kinds of escapes.

* ``\\`` represents a single ``\`` character.

* ``\`` followed by two hexadecimal characters (0-9, a-f, or A-F)
  represents the byte with the given value (e.g. ``\00`` represents a
  null byte).

To represent a ``"`` character, use ``\22``. (``\"`` will end the string
with a trailing ``\``.)
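
A minimal sketch of both escape kinds in a character array constant follows.
The global name ``@escapes`` is hypothetical.

.. code-block:: llvm

    ; Bytes, in order: 'a', '\', 'b', '"', 'c', NUL (6 bytes in total).
    @escapes = private constant [6 x i8] c"a\\b\22c\00"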
Newlines do not terminate string constants; strings can span multiple
lines.

The interpretation of string constants (e.g. their character encoding)
depends on context.
181 LLVM programs are composed of ``Module``'s, each of which is a
182 translation unit of the input programs. Each module consists of
183 functions, global variables, and symbol table entries. Modules may be
184 combined together with the LLVM linker, which merges function (and
185 global variable) definitions, resolves forward declarations, and merges
186 symbol table entries. Here is an example of the "hello world" module:
.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(ptr nocapture) nounwind

    ; Definition of main function
    define i32 @main() {
      ; Call puts function to write out the string to stdout.
      call i32 @puts(ptr @.str)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}
207 This example is made up of a :ref:`global variable <globalvars>` named
208 "``.str``", an external declaration of the "``puts``" function, a
209 :ref:`function definition <functionstructure>` for "``main``" and
210 :ref:`named metadata <namedmetadatastructure>` "``foo``".
212 In general, a module is made up of a list of global values (where both
213 functions and global variables are global values). Global values are
214 represented by a pointer to a memory location (in this case, a pointer
215 to an array of char, and a pointer to a function), and have one of the
216 following :ref:`linkage types <linkage>`.
All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private global to be renamed as necessary to avoid collisions. Because
    the symbol is private to the module, all references can be updated. A
    private symbol doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
237 ``available_externally``
238 Globals with "``available_externally``" linkage are never emitted into
239 the object file corresponding to the LLVM module. From the linker's
240 perspective, an ``available_externally`` global is equivalent to
241 an external declaration. They exist to allow inlining and other
242 optimizations to take place given knowledge of the definition of the
243 global, which is known to be somewhere outside the module. Globals
244 with ``available_externally`` linkage are allowed to be discarded at
245 will, and allow inlining and other optimizations. This linkage type is
246 only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.
275 .. _linkage_appending:
``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.
``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if it is not linked, the symbol becomes
    null instead of being an undefined reference.
293 ``linkonce_odr``, ``weak_odr``
294 The ``odr`` suffix indicates that all globals defined with the given name
295 are equivalent, along the lines of the C++ "one definition rule" ("ODR").
296 Informally, this means we can inline functions and fold loads of constants.
298 Formally, use the following definition: when an ``odr`` function is
299 called, one of the definitions is non-deterministically chosen to run. For
300 ``odr`` variables, if any byte in the value is not equal in all
301 initializers, that byte is a :ref:`poison value <poisonvalues>`. For
302 aliases and ifuncs, apply the rule for the underlying function or variable.
304 These linkage types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.
310 It is illegal for a global variable or function *declaration* to have any
311 linkage type other than ``external`` or ``extern_weak``.
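
The linkage keywords described above might be written as follows. This is
only a sketch; every global and function name in it is hypothetical.

.. code-block:: llvm

    @.msg = private unnamed_addr constant [3 x i8] c"hi\00" ; no symbol table entry
    @counter = internal global i32 0            ; like C 'static'
    @tentative = common global i32 0            ; like C 'int X;', zero initializer
    @optional_data = extern_weak global i32     ; declaration, null if never defined
    @visible = global i32 1                     ; no keyword: externally visible

    ; The odr suffix lets the optimizer inline this body.
    define linkonce_odr i32 @inline_me(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; May be replaced by a stronger definition at link time.
    define weak i32 @overridable() {
      ret i32 0
    }

    ; Declarations may only be external or extern_weak.
    declare extern_weak void @optional_hook()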
LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:
325 "``ccc``" - The C calling convention
326 This calling convention (the default if no other calling convention
327 is specified) matches the target C calling conventions. This calling
328 convention supports varargs function calls and tolerates some
329 mismatch in the declared prototype and implemented declaration of
330 the function (as does normal C).
331 "``fastcc``" - The fast calling convention
332 This calling convention attempts to make calls as fast as possible
333 (e.g. by passing things in registers). This calling convention
334 allows the target to use whatever tricks it wants to produce fast
335 code for the target, without having to conform to an externally
336 specified ABI (Application Binary Interface). `Tail calls can only
337 be optimized when this, the tailcc, the GHC or the HiPE convention is
338 used. <CodeGenerator.html#tail-call-optimization>`_ This calling
339 convention does not support varargs and requires the prototype of all
340 callees to exactly match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such
    function calls for inlining.
350 "``ghccc``" - GHC convention
351 This calling convention has been implemented specifically for use by
352 the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
353 It passes everything in registers, going to extremes to achieve this
354 by disabling callee save registers. This calling convention should
355 not be used lightly but only for specific situations such as an
356 alternative to the *register pinning* performance technique often
357 used when implementing functional programming languages. At the
358 moment only X86, AArch64, and RISCV support this convention. The
359 following limitations exist:
    - On *X86-32* only up to 4 bit type parameters are supported. No
      floating-point types are supported.
    - On *X86-64* only up to 10 bit type parameters and 6
      floating-point parameters are supported.
    - On *AArch64* only up to 4 32-bit floating-point parameters,
      4 64-bit floating-point parameters, and 10 bit type parameters
      are supported.
    - *RISCV64* only supports up to 11 bit type parameters, 4
      32-bit floating-point parameters, and 4 64-bit floating-point
      parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
390 "``anyregcc``" - Dynamic calling convention for code patching
391 This is a special convention that supports patching an arbitrary code
392 sequence in place of a call site. This convention forces the call
393 arguments into registers but allows them to be dynamically
394 allocated. This can currently only be used with calls to
395 llvm.experimental.patchpoint because only this intrinsic records
396 the location of its arguments in a side table. See :doc:`StackMaps`.
397 "``preserve_mostcc``" - The `PreserveMost` calling convention
398 This calling convention attempts to make the code in the caller as
399 unintrusive as possible. This convention behaves identically to the `C`
400 calling convention on how arguments and return values are passed, but it
401 uses a different set of caller/callee-saved registers. This alleviates the
402 burden of saving and recovering a large register set before and after the
403 call in the caller. If the arguments are passed in callee-saved registers,
404 then they will be preserved by the callee across the call. This doesn't
405 apply for values returned in callee-saved registers.
407 - On X86-64 the callee preserves all general purpose registers, except for
408 R11 and return registers, if any. R11 can be used as a scratch register.
409 The treatment of floating-point registers (XMMs/YMMs) matches the OS's C
410 calling convention: on most platforms, they are not preserved and need to
411 be saved by the caller, but on Windows, xmm6-xmm15 are preserved.
413 - On AArch64 the callee preserve all general purpose registers, except X0-X8
416 The idea behind this convention is to support calls to runtime functions
417 that have a hot path and a cold path. The hot path is usually a small piece
418 of code that doesn't use many registers. The cold path might need to call out to
419 another function and therefore only needs to preserve the caller-saved
420 registers, which haven't already been saved by the caller. The
421 `PreserveMost` calling convention is very similar to the `cold` calling
422 convention in terms of caller/callee-saved registers, but they are used for
423 different types of function calls. `coldcc` is for function calls that are
424 rarely executed, whereas `preserve_mostcc` function calls are intended to be
425 on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
426 doesn't prevent the inliner from inlining the function call.
    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    - On AArch64 the callee preserves all general purpose registers, except
      X0-X8 and X16-X18. Furthermore it also preserves the lower 128 bits of
      the V8-V31 SIMD floating-point registers.

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``preserve_nonecc``" - The `PreserveNone` calling convention
    This calling convention doesn't preserve any general registers, so all
    general registers are caller-saved registers. It also uses all general
    registers to pass arguments. This attribute doesn't impact non-general
    purpose registers (e.g. floating point registers, on X86 XMMs/YMMs).
    Non-general purpose registers still follow the standard C calling
    convention. Currently it is for x86_64 and AArch64 only.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.
474 This calling convention aims to minimize overhead in the caller by
475 preserving as many registers as possible (all the registers that are
476 preserved on the fast path, composed of the entry and exit blocks).
    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.
482 Given that each platform has its own lowering sequence, hence its own set
483 of preserved registers, we can't use the existing `PreserveMost`.
485 - On X86-64 the callee preserves all general purpose registers, except for
487 "``tailcc``" - Tail callable calling convention
488 This calling convention ensures that calls in tail position will always be
489 tail call optimized. This calling convention is equivalent to fastcc,
490 except for an additional guarantee that tail calls will be produced
491 whenever possible. `Tail calls can only be optimized when this, the fastcc,
492 the GHC or the HiPE convention is used. <CodeGenerator.html#tail-call-optimization>`_
493 This calling convention does not support varargs and requires the prototype of
494 all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
503 "``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
504 This calling convention is used for the Control Flow Guard check function,
505 calls to which can be inserted before indirect calls to check that the call
506 target is a valid function address. The check function has no return value,
507 but it will trigger an OS-level error if the address is not a valid target.
508 The set of registers preserved by the check function, and the register
509 containing the target address are architecture-specific.
511 - On X86 the target address is passed in ECX.
512 - On ARM the target address is passed in R0.
513 - On AArch64 the target address is passed in X15.
514 "``cc <n>``" - Numbered convention
515 Any calling convention may be specified by number, allowing
516 target-specific calling conventions to be used. Target specific
517 calling conventions start at 64.
More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
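
A short sketch of how calling conventions appear on declarations,
definitions, and call sites follows. All function names are hypothetical.
Note that each call site repeats the convention of its callee, since a
mismatch between caller and callee is undefined behavior.

.. code-block:: llvm

    declare fastcc i32 @fast_helper(i32)
    declare coldcc void @report_error(i32)

    define fastcc i32 @caller(i32 %x) {
      %neg = icmp slt i32 %x, 0
      br i1 %neg, label %bad, label %ok
    bad:
      ; Rarely executed path: the call site uses the callee's coldcc.
      call coldcc void @report_error(i32 %x)
      ret i32 -1
    ok:
      %r = tail call fastcc i32 @fast_helper(i32 %x)
      ret i32 %r
    }

    ; With tailcc, a call in tail position is tail call optimized.
    define tailcc i32 @count_down(i32 %n) {
      %done = icmp eq i32 %n, 0
      br i1 %done, label %exit, label %again
    again:
      %m = sub i32 %n, 1
      %r = musttail call tailcc i32 @count_down(i32 %m)
      ret i32 %r
    exit:
      ret i32 0
    }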
523 .. _visibilitystyles:
All Global Variables and Functions have one of the following visibility
styles:
531 "``default``" - Default style
532 On targets that use the ELF object file format, default visibility
533 means that the declaration is visible to other modules and, in
534 shared libraries, means that the declared entity may be overridden.
535 On Darwin, default visibility means that the declaration is visible
536 to other modules. On XCOFF, default visibility means no explicit
537 visibility bit will be set and whether the symbol is visible
538 (i.e "exported") to other modules depends primarily on export lists
539 provided to the linker. Default visibility corresponds to "external
540 linkage" in the language.
541 "``hidden``" - Hidden style
542 Two declarations of an object with hidden visibility refer to the
543 same object if they are in the same shared object. Usually, hidden
544 visibility indicates that the symbol will not be placed into the
545 dynamic symbol table, so no other module (executable or shared
546 library) can reference it directly.
547 "``protected``" - Protected style
548 On ELF, protected visibility indicates that the symbol will be
549 placed in the dynamic symbol table, but that references within the
550 defining module will bind to the local symbol. That is, the symbol
551 cannot be overridden by another module.
A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
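
As a minimal sketch, the visibility keyword is written before the
``global`` keyword or the function's return type. The names below are
hypothetical.

.. code-block:: llvm

    @api_version = global i32 1          ; default visibility, no keyword needed
    @impl_detail = hidden global i32 0   ; usually kept out of the dynamic symbol table

    ; References from inside the defining module bind to this definition.
    define protected i32 @get_version() {
      %v = load i32, ptr @api_version
      ret i32 %v
    }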
All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    On Microsoft Windows targets, "``dllexport``" causes the compiler to provide
    a global pointer to a pointer in a DLL, so that it can be referenced with the
    ``dllimport`` attribute. The pointer name is formed by combining ``__imp_``
    and the function or variable name. On XCOFF targets, ``dllexport`` indicates
    that the symbol will be made visible to other modules using "exported"
    visibility and thus placed by the linker in the loader section symbol table.
    Since this storage class exists for defining a dll interface, the compiler,
    assembler and linker know it is externally referenced and must refrain from
    deleting the symbol.

A symbol with ``internal`` or ``private`` linkage cannot have a DLL storage
class.
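
A sketch of both storage classes follows, with hypothetical names. The
import side declares the symbol, and the export side defines it.

.. code-block:: llvm

    ; Referenced through a pointer set up by the exporting DLL.
    declare dllimport i32 @imported_fn(i32)

    ; Made available to other modules.
    @exported_var = dllexport global i32 0

    define dllexport i32 @exported_fn(i32 %x) {
      %r = call i32 @imported_fn(i32 %x)
      ret i32 %r
    }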
585 Thread Local Storage Models
586 ---------------------------
A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.
600 If no explicit model is given, the "general dynamic" model is used.
602 The models correspond to the ELF TLS models; see `ELF Handling For
603 Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
604 more information on under which circumstances the different models may
605 be used. The target may choose a different TLS model if the specified
606 model is not supported, or if a better choice of model can be made.
608 A model can also be specified in an alias, but then it only governs how
609 the alias is accessed. It will not have any effect in the aliasee.
611 For platforms without linker support of ELF TLS model, the -femulated-tls
612 flag can be used to generate GCC compatible emulated TLS code.
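
The TLS model, when given, is written in parentheses after
``thread_local``. The following is a minimal sketch with hypothetical
variable names.

.. code-block:: llvm

    @tls_gd = thread_local global i32 0                 ; general dynamic (default)
    @tls_ld = thread_local(localdynamic) global i32 0
    @tls_ie = thread_local(initialexec) global i32 0
    @tls_le = thread_local(localexec) global i32 0

    ; Each thread reads its own copy of the variable.
    define i32 @read_tls() {
      %v = load i32, ptr @tls_le
      ret i32 %v
    }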
614 .. _runtime_preemption_model:
616 Runtime Preemption Specifiers
617 -----------------------------
619 Global variables, functions and aliases may have an optional runtime preemption
620 specifier. If a preemption specifier isn't given explicitly, then a
621 symbol is assumed to be ``dso_preemptable``.
``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.
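
A minimal sketch of both specifiers on globals and on a function
declaration follows. All names are hypothetical.

.. code-block:: llvm

    @local_state = dso_local global i32 0           ; resolves within this linkage unit
    @interposable = dso_preemptable global i32 0    ; same as giving no specifier

    ; Direct access is generated even though the definition is elsewhere.
    declare dso_local void @same_unit_helper()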
637 LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
638 types <t_struct>`. Literal types are uniqued structurally, but identified types
639 are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
640 to forward declare a type that is not yet available.
An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { ptr, i32 }
648 Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
649 literal types are uniqued in recent versions of LLVM.
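
The three kinds of structure type can be sketched as follows. The type and
global names are hypothetical, and ``ptr`` is used for pointer fields to
match the other examples in this document.

.. code-block:: llvm

    %node = type { ptr, i32 }        ; identified type, never uniqued
    %handle = type opaque            ; opaque forward declaration, body not yet known

    @head = global %node { ptr null, i32 0 }
    @pair = global { i32, i32 } { i32 1, i32 2 }   ; literal type, uniqued structurally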
653 Non-Integral Pointer Type
654 -------------------------
656 Note: non-integral pointer types are a work in progress, and they should be
657 considered experimental at this time.
659 LLVM IR optionally allows the frontend to denote pointers in certain address
660 spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
661 Non-integral pointer types represent pointers that have an *unspecified* bitwise
662 representation; that is, the integral representation may be target dependent or
663 unstable (not backed by a fixed integer).
``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
integral (i.e. normal) pointers in that they convert integers to and from
corresponding pointer types, but there are additional implications to be
aware of. Because the bit-representation of a non-integral pointer may
not be stable, two identical casts of the same operand may or may not
return the same value. Said differently, the conversion to or from the
non-integral type depends on environmental state in an implementation
defined manner.

If the frontend wishes to observe a *particular* value following a cast, the
generated IR must fence with the underlying environment in an implementation
defined manner. (In practice, this tends to require ``noinline`` routines for
such operations.)
679 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
680 non-integral types are analogous to ones on integral types with one
681 key exception: the optimizer may not, in general, insert new dynamic
682 occurrences of such casts. If a new cast is inserted, the optimizer would
683 need to either ensure that a) all possible values are valid, or b)
684 appropriate fencing is inserted. Since the appropriate fencing is
685 implementation defined, the optimizer can't do the latter. The former is
686 challenging as many commonly expected properties, such as
687 ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
688 Similar restrictions apply to intrinsics that might examine the pointer bits,
689 such as :ref:`llvm.ptrmask<int_ptrmask>`.
691 The alignment information provided by the frontend for a non-integral pointer
692 (typically using attributes or metadata) must be valid for every possible
693 representation of the pointer.
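
As a hedged sketch, a module might mark address space 1 as non-integral
through an ``ni`` entry in its datalayout string. The choice of address
space 1 and the function name ``@observe_bits`` are assumptions made for
illustration only.

.. code-block:: llvm

    ; Assumption: address space 1 holds the non-integral pointers.
    target datalayout = "ni:1"

    ; The cast is legal, but two executions of it on the same operand
    ; are not guaranteed to return the same integer.
    define i64 @observe_bits(ptr addrspace(1) %p) {
      %bits = ptrtoint ptr addrspace(1) %p to i64
      ret i64 %bits
    }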
Global variables define regions of memory allocated at compilation time
instead of run-time.
703 Global variable definitions must be initialized.
705 Global variables in other translation units can also be declared, in which
706 case they don't have an initializer.
708 Global variables can optionally specify a :ref:`linkage type <linkage>`.
710 Either global variable definitions or declarations may have an explicit section
711 to be placed in and may have an optional explicit alignment specified. If there
712 is a mismatch between the explicit or inferred section information for the
713 variable declaration and its definition the resulting behavior is undefined.
715 A variable may be defined as a global ``constant``, which indicates that
716 the contents of the variable will **never** be modified (enabling better
717 optimization, allowing the global data to be placed in the read-only
718 section of an executable, etc). Note that variables that need runtime
719 initialization cannot be marked ``constant`` as there is a store to the
722 LLVM explicitly allows *declarations* of global variables to be marked
723 constant, even if the final definition of the global is not. This
724 capability can be used to enable slightly better optimization of the
725 program, but requires the language definition to guarantee that
726 optimizations based on the 'constantness' are valid for the translation
727 units that do not include the definition.
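
A minimal sketch of the two cases, using the hypothetical names ``@table`` and
``@ext_table``: the first line defines a constant with an initializer, and the
second declares a global from another translation unit as constant without
providing an initializer.

.. code-block:: llvm

    ; Definition: the contents are never modified after initialization.
    @table = constant [4 x i32] [i32 1, i32 2, i32 3, i32 4]

    ; Declaration: marked constant here even if the definition elsewhere is not.
    @ext_table = external constant [4 x i32]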
729 As SSA values, global variables define pointer values that are in scope
730 (i.e. they dominate) all basic blocks in the program. Global variables
731 always define a pointer to their "content" type because they describe a
732 region of memory, and all memory objects in LLVM are accessed through
735 Global variables can be marked with ``unnamed_addr`` which indicates
736 that the address is not significant, only the content. Constants marked
737 like this can be merged with other constants if they have the same
738 initializer. Note that a constant with significant address *can* be
739 merged with a ``unnamed_addr`` constant, the result being a constant
740 whose address is significant.
742 If the ``local_unnamed_addr`` attribute is given, the address is known to
743 not be significant within the module.
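
The following sketch shows both markers on illustrative globals (all names are
hypothetical). Because ``@a`` and ``@b`` are ``unnamed_addr`` constants with the
same initializer, an optimizer is permitted, but not required, to merge them.

.. code-block:: llvm

    ; Candidates for merging: same initializer, address not significant.
    @a = private unnamed_addr constant [3 x i8] c"hi\00"
    @b = private unnamed_addr constant [3 x i8] c"hi\00"

    ; Address is not significant within this module only.
    @msg = local_unnamed_addr constant i32 7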
745 A global variable may be declared to reside in a target-specific
746 numbered address space. For targets that support them, address spaces
747 may affect how optimizations are performed and/or what target
748 instructions are used to access the variable. The default address space
749 is zero. The address space qualifier must precede any other attributes.
751 LLVM allows an explicit section to be specified for globals. If the
752 target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the necessary
756 External declarations may have an explicit section specified. Section
757 information is retained in LLVM IR for targets that make use of this
758 information. Attaching section information to an external declaration is an
759 assertion that its definition is located in the specified section. If the
760 definition is located in a different section, the behavior is undefined.
762 LLVM allows an explicit code model to be specified for globals. If the
763 target supports it, it will emit globals in the code model specified,
764 overriding the code model used to compile the translation unit.
765 The allowed values are "tiny", "small", "kernel", "medium", "large".
766 This may be extended in the future to specify global data layout that
767 doesn't cleanly fit into a specific code model.
769 By default, global initializers are optimized by assuming that global
770 variables defined within the module are not modified from their
771 initial values before the start of the global initializer. This is
772 true even for variables potentially accessible from outside the
773 module, including those with external linkage or appearing in
774 ``@llvm.used`` or dllexported variables. This assumption may be suppressed
775 by marking the variable with ``externally_initialized``.
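
For instance, a global whose value may be set by the runtime environment before
any code in the module runs could be written as in the sketch below. The name
``@counter`` is illustrative only.

.. code-block:: llvm

    ; The optimizer may not assume @counter still holds 0 when
    ; global initializers start running.
    @counter = externally_initialized global i32 0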
777 An explicit alignment may be specified for a global, which must be a
778 power of 2. If not present, or if the alignment is set to zero, the
779 alignment of the global is set by the target to whatever it feels
780 convenient. If an explicit alignment is specified, the global is forced
781 to have exactly that alignment. Targets and optimizers are not allowed
782 to over-align the global if the global has an assigned section. In this
783 case, the extra alignment could be observable: for example, code could
784 assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
786 iteration. For TLS variables, the module flag ``MaxTLSAlign``, if present,
787 limits the alignment to the given value. Optimizers are not allowed to
788 impose a stronger alignment on these variables. The maximum alignment
791 For global variable declarations, as well as definitions that may be
792 replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
793 linkage types), the allocation size and alignment of the definition it resolves
794 to must be greater than or equal to that of the declaration or replaceable
795 definition, otherwise the behavior is undefined.
797 Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
798 an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
800 an optional list of attached :ref:`metadata <metadata>`.
802 Variables and aliases can have a
803 :ref:`Thread Local Storage Model <tls_model>`.
805 Globals cannot be or contain :ref:`Scalable vectors <t_vector>` because their
806 size is unknown at compile time. They are allowed in structs to facilitate
807 intrinsics returning multiple values. Generally, structs containing scalable
808 vectors are not considered "sized" and cannot be used in loads, stores, allocas,
809 or GEPs. The only exception to this rule is for structs that contain scalable
810 vectors of the same type (e.g. ``{<vscale x 2 x i32>, <vscale x 2 x i32>}``
811 contains the same type while ``{<vscale x 2 x i32>, <vscale x 2 x i64>}``
812 doesn't). These kinds of structs (we may call them homogeneous scalable vector
813 structs) are considered sized and can be used in loads, stores, allocas, but
816 Globals with ``toc-data`` attribute set are stored in TOC of XCOFF. Their
817 alignments are not larger than that of a TOC entry. Optimizations should not
818 increase their alignments to mitigate TOC overflow.
822 @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
823 [DLLStorageClass] [ThreadLocal]
824 [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
825 [ExternallyInitialized]
826 <global | constant> <Type> [<InitializerConstant>]
827 [, section "name"] [, partition "name"]
828 [, comdat [($name)]] [, align <Alignment>]
829 [, code_model "model"]
830 [, no_sanitize_address] [, no_sanitize_hwaddress]
831 [, sanitize_address_dyninit] [, sanitize_memtag]
834 For example, the following defines a global in a numbered address space
835 with an initializer, section, and alignment:
839 @G = addrspace(5) constant float 1.0, section "foo", align 4
The following example just declares a global variable:
845 @G = external global i32
847 The following example defines a global variable with the
848 ``large`` code model:
852 @G = internal global i32 0, code_model "large"
854 The following example defines a thread-local global with the
855 ``initialexec`` TLS model:
859 @G = thread_local(initialexec) global i32 0, align 4
861 .. _functionstructure:
866 LLVM function definitions consist of the "``define``" keyword, an
867 optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
868 specifier <runtime_preemption_model>`, an optional :ref:`visibility
869 style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
870 an optional :ref:`calling convention <callingconv>`,
871 an optional ``unnamed_addr`` attribute, a return type, an optional
872 :ref:`parameter attribute <paramattrs>` for the return type, a function
873 name, a (possibly empty) argument list (each with optional :ref:`parameter
874 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
875 an optional address space, an optional section, an optional partition,
876 an optional alignment, an optional :ref:`comdat <langref_comdats>`,
877 an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
878 an optional :ref:`prologue <prologuedata>`,
879 an optional :ref:`personality <personalityfn>`,
880 an optional list of attached :ref:`metadata <metadata>`,
881 an opening curly brace, a list of basic blocks, and a closing curly brace.
885 define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
887 <ResultType> @<FunctionName> ([argument list])
888 [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
889 [section "name"] [partition "name"] [comdat [($name)]] [align N]
890 [gc] [prefix Constant] [prologue Constant] [personality Constant]
893 The argument list is a comma separated sequence of arguments where each
894 argument is of the following form:
898 <type> [parameter Attrs] [name]
900 LLVM function declarations consist of the "``declare``" keyword, an
901 optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
902 <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
903 optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
904 or ``local_unnamed_addr`` attribute, an optional address space, a return type,
905 an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
906 empty list of arguments, an optional alignment, an optional :ref:`garbage
907 collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
908 :ref:`prologue <prologuedata>`.
912 declare [linkage] [visibility] [DLLStorageClass]
914 <ResultType> @<FunctionName> ([argument list])
915 [(unnamed_addr|local_unnamed_addr)] [align N] [gc]
916 [prefix Constant] [prologue Constant]
918 A function definition contains a list of basic blocks, forming the CFG (Control
919 Flow Graph) for the function. Each basic block may optionally start with a label
920 (giving the basic block a symbol table entry), contains a list of instructions
921 and :ref:`debug records <debugrecords>`,
922 and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
923 function return). If an explicit label name is not provided, a block is assigned
924 an implicit numbered label, using the next value from the same counter as used
925 for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
926 function entry block does not have an explicit label, it will be assigned label
927 "%0", then the first unnamed temporary in that block will be "%1", etc. If a
928 numeric label is explicitly specified, it must match the numeric label that
929 would be used implicitly.
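
The numbering rule can be seen in a small sketch (the function name ``@inc`` is
illustrative). The parameter is named, so the unlabeled entry block takes the
first number from the counter and the first unnamed temporary takes the next:

.. code-block:: llvm

    define i32 @inc(i32 %x) {
      ; The entry block has no explicit label, so it is implicitly %0.
      %1 = add i32 %x, 1
      ret i32 %1
    }

If the parameter were unnamed as well, it would take ``%0``, the entry block
would become ``%1``, and the first unnamed temporary would be ``%2``.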
931 The first basic block in a function is special in two ways: it is
932 immediately executed on entrance to the function, and it is not allowed
933 to have predecessor basic blocks (i.e. there can not be any branches to
934 the entry block of a function). Because the block can have no
935 predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
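
One common consequence is that a loop header must be a separate block from the
entry block. The sketch below (with the illustrative name ``@count``) branches
from ``entry`` into ``loop``, which may have predecessors and therefore PHI
nodes:

.. code-block:: llvm

    define i32 @count(i32 %n) {
    entry:
      ; No predecessors and no PHI nodes are allowed here.
      br label %loop

    loop:
      ; %i is 0 on the first iteration and %next afterwards.
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp eq i32 %next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %next
    }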
937 LLVM allows an explicit section to be specified for functions. If the
938 target supports it, it will emit functions to the section specified.
939 Additionally, the function can be placed in a COMDAT.
941 An explicit alignment may be specified for a function. If not present,
942 or if the alignment is set to zero, the alignment of the function is set
943 by the target to whatever it feels convenient. If an explicit alignment
944 is specified, the function is forced to have at least that much
945 alignment. All alignments must be a power of 2.
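
As a brief sketch, a function with both an explicit section and an explicit
alignment might look as follows. The function name and the section name
``.text.hot`` are assumptions made for this example.

.. code-block:: llvm

    ; Emitted to ".text.hot" if the target supports it, with at least
    ; 32-byte alignment.
    define void @hot_path() section ".text.hot" align 32 {
      ret void
    }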
947 If the ``unnamed_addr`` attribute is given, the address is known to not
948 be significant and two identical functions can be merged.
950 If the ``local_unnamed_addr`` attribute is given, the address is known to
951 not be significant within the module.
953 If an explicit address space is not given, it will default to the program
954 address space from the :ref:`datalayout string<langref_datalayout>`.
Aliases, unlike functions or variables, don't create any new data. They
962 are just a new symbol and metadata for an existing position.
964 Aliases have a name and an aliasee that is either a global value or a
967 Aliases may have an optional :ref:`linkage type <linkage>`, an optional
968 :ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
969 :ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
970 <dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
974 @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
977 The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
978 ``linkonce_odr``, ``weak_odr``, ``external``, ``available_externally``. Note
979 that some system linkers might not correctly handle dropping a weak symbol that
982 Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
983 the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
986 If the ``local_unnamed_addr`` attribute is given, the address is known to
987 not be significant within the module.
989 Since aliases are only a second name, some restrictions apply, of which
990 some can only be checked when producing an object file:
992 * The expression defining the aliasee must be computable at assembly
993 time. Since it is just a name, no relocations can be used.
995 * No alias in the expression can be weak as the possibility of the
996 intermediate alias being overridden cannot be represented in an
999 * If the alias has the ``available_externally`` linkage, the aliasee must be an
1000 ``available_externally`` global value; otherwise the aliasee can be an
1001 expression but no global value in the expression can be a declaration, since
1002 that would require a relocation, which is not possible.
* If either the alias or the aliasee may be replaced by a symbol outside the
  module at link time or runtime, no optimization may replace the alias with
  the aliasee, since the behavior may be different. The alias may be used as a
  name guaranteed to point to the content in the current module.
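
A minimal sketch of aliases for a variable and for a function is shown below.
All names are illustrative, and the example assumes the opaque pointer (``ptr``)
spelling for the aliasee type.

.. code-block:: llvm

    @var = global i32 0
    ; A second name for @var; no new storage is created.
    @var_alias = alias i32, ptr @var

    define void @impl() {
      ret void
    }
    ; A weak alias that may be overridden by another definition at link time.
    @impl_alias = weak alias void (), ptr @impl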
IFuncs, like aliases, don't create any new data or functions. They are just a new
1015 symbol that is resolved at runtime by calling a resolver function.
1017 On ELF platforms, IFuncs are resolved by the dynamic linker at load time. On
1018 Mach-O platforms, they are lowered in terms of ``.symbol_resolver`` functions,
1019 which lazily resolve the callee the first time they are called.
1021 IFunc may have an optional :ref:`linkage type <linkage>` and an optional
1022 :ref:`visibility style <visibility>`.
1026 @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
1027 [, partition "name"]
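
The following sketch shows one possible shape of an IFunc and its resolver. The
names ``@foo``, ``@foo_resolver`` and ``@foo_impl`` are hypothetical, and the
example assumes the opaque pointer (``ptr``) spelling. A real resolver would
typically choose among several implementations.

.. code-block:: llvm

    ; Calls to @foo are bound to whatever the resolver returns.
    @foo = ifunc i32 (i32), ptr @foo_resolver

    define internal i32 @foo_impl(i32 %x) {
      ret i32 %x
    }

    ; The resolver returns the address of the chosen implementation.
    define internal ptr @foo_resolver() {
      ret ptr @foo_impl
    }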
1030 .. _langref_comdats:
1035 Comdat IR provides access to object file COMDAT/section group functionality
1036 which represents interrelated sections.
1038 Comdats have a name which represents the COMDAT key and a selection kind to
1039 provide input on how the linker deduplicates comdats with the same key in two
1040 different object files. A comdat must be included or omitted as a unit.
1041 Discarding the whole comdat is allowed but discarding a subset is not.
1043 A global object may be a member of at most one comdat. Aliases are placed in the
1044 same COMDAT that their aliasee computes to, if any.
1048 $<Name> = comdat SelectionKind
1050 For selection kinds other than ``nodeduplicate``, only one of the duplicate
1051 comdats may be retained by the linker and the members of the remaining comdats
1052 must be discarded. The following selection kinds are supported:
    The linker may choose any COMDAT key; the choice is arbitrary.
1057 The linker may choose any COMDAT key but the sections must contain the
1060 The linker will choose the section containing the largest COMDAT key.
1062 No deduplication is performed.
1064 The linker may choose any COMDAT key but the sections must contain the
1065 same amount of data.
1067 - XCOFF and Mach-O don't support COMDATs.
1068 - COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
1069 a non-local linkage COMDAT symbol.
1070 - ELF supports ``any`` and ``nodeduplicate``.
1071 - WebAssembly only supports ``any``.
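
As a small sketch of the ``any`` selection kind, the following places a
``linkonce_odr`` function in a comdat of the same name, so that duplicate copies
from other object files can be discarded by the linker. The name
``@inline_helper`` is illustrative.

.. code-block:: llvm

    $inline_helper = comdat any

    ; "comdat" with no explicit name refers to $inline_helper.
    define linkonce_odr i32 @inline_helper(i32 %x) comdat {
      %r = mul i32 %x, 2
      ret i32 %r
    }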
1073 Here is an example of a COFF COMDAT where a function will only be selected if
1074 the COMDAT key's section is the largest:
1076 .. code-block:: text
1078 $foo = comdat largest
1079 @foo = global i32 2, comdat($foo)
1081 define void @bar() comdat($foo) {
1085 In a COFF object file, this will create a COMDAT section with selection kind
1086 ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
1087 and another COMDAT section with selection kind
1088 ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
1089 section and contains the contents of the ``@bar`` symbol.
As syntactic sugar, the ``$name`` can be omitted if the name is the same as
1094 .. code-block:: llvm
1097 @foo = global i32 2, comdat
1098 @bar = global i32 3, comdat($foo)
1100 There are some restrictions on the properties of the global object.
1101 It, or an alias to it, must have the same name as the COMDAT group when
1103 The contents and size of this object may be used during link-time to determine
1104 which COMDAT groups get selected depending on the selection kind.
1105 Because the name of the object must match the name of the COMDAT group, the
1106 linkage of the global object must not be local; local symbols can get renamed
1107 if a collision occurs in the symbol table.
The combined use of COMDATs and section attributes may yield surprising results.
1112 .. code-block:: llvm
1116 @g1 = global i32 42, section "sec", comdat($foo)
1117 @g2 = global i32 42, section "sec", comdat($bar)
1119 From the object file perspective, this requires the creation of two sections
1120 with the same name. This is necessary because both globals belong to different
1121 COMDAT groups and COMDATs, at the object file level, are represented by
1124 Note that certain IR constructs like global variables and functions may
1125 create COMDATs in the object file in addition to any which are specified using
1126 COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).
1130 .. _namedmetadatastructure:
1135 Named metadata is a collection of metadata. :ref:`Metadata
1136 nodes <metadata>` (but not metadata strings) are the only valid
1137 operands for a named metadata.
1139 #. Named metadata are represented as a string of characters with the
1140 metadata prefix. The rules for metadata names are the same as for
1141 identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1142 are still valid, which allows any character to be part of a name.
1146 ; Some unnamed metadata nodes, which are referenced by the named metadata.
1151 !name = !{!0, !1, !2}
1155 Parameter Attributes
1156 --------------------
1158 The return type and each parameter of a function type may have a set of
1159 *parameter attributes* associated with them. Parameter attributes are
1160 used to communicate additional information about the result or
1161 parameters of a function. Parameter attributes are considered to be part
1162 of the function, not of the function type, so functions with different
1163 parameter attributes can have the same function type.
1165 Parameter attributes are either simple keywords or strings that follow the
1166 specified type. Multiple parameter attributes, when required, are separated by
1167 spaces. For example:
1169 .. code-block:: llvm
1171 declare i32 @printf(ptr noalias nocapture, ...)
1172 declare i32 @atoi(i8 zeroext)
1173 declare signext i8 @returns_signed_char()
1174 define void @baz(i32 "amdgpu-flat-work-group-size"="1,256" %x)
1176 Note that any attributes for the function result (``nonnull``,
1177 ``signext``) come before the result type.
1179 If an integer argument to a function is not marked signext/zeroext/noext, the
kind of extension used is target-specific. Some targets depend for
correctness on the kind of extension being explicitly specified.
1183 Currently, only the following parameter attributes are defined:
1186 This indicates to the code generator that the parameter or return
1187 value should be zero-extended to the extent required by the target's
1188 ABI by the caller (for a parameter) or the callee (for a return value).
1190 This indicates to the code generator that the parameter or return
1191 value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
1193 the callee (for a return value).
1195 This indicates to the code generator that the parameter or return
1196 value has the high bits undefined, as for a struct in register, and
1197 therefore does not need to be sign or zero extended. This is the same
1198 as default behavior and is only actually used (by some targets) to
1199 validate that one of the attributes is always present.
1201 This indicates that this parameter or return value should be treated
1202 in a special target-dependent fashion while emitting code for
1203 a function call or return (usually, by putting it in a register as
1204 opposed to memory, though some targets use it to distinguish between
1205 two different kinds of registers). Use of this attribute is
1208 This indicates that the pointer parameter should really be passed by
1209 value to the function. The attribute implies that a hidden copy of
1210 the pointee is made between the caller and the callee, so the callee
1211 is unable to modify the value in the caller. This attribute is only
1212 valid on LLVM pointer arguments. It is generally used to pass
1213 structs and arrays by value, but is also valid on pointers to
1214 scalars. The copy is considered to belong to the caller not the
1215 callee (for example, ``readonly`` functions should not write to
1216 ``byval`` parameters). This is not a valid attribute for return
1219 The byval type argument indicates the in-memory value type, and
1220 must be the same as the pointee type of the argument.
1222 The byval attribute also supports specifying an alignment with the
1223 align attribute. It indicates the alignment of the stack slot to
1224 form and the known alignment of the pointer specified to the call
1225 site. If the alignment is not specified, then the code generator
1226 makes a target-specific assumption.
1232 The ``byref`` argument attribute allows specifying the pointee
1233 memory type of an argument. This is similar to ``byval``, but does
1234 not imply a copy is made anywhere, or that the argument is passed
1235 on the stack. This implies the pointer is dereferenceable up to
1236 the storage size of the type.
    It is not generally permissible to introduce a write to a
    ``byref`` pointer. The pointer may have any address space and may
1242 This is not a valid attribute for return values.
    The alignment for a ``byref`` parameter can be explicitly
1245 specified by combining it with the ``align`` attribute, similar to
1246 ``byval``. If the alignment is not specified, then the code generator
1247 makes a target-specific assumption.
1249 This is intended for representing ABI constraints, and is not
1250 intended to be inferred for optimization use.
1252 .. _attr_preallocated:
1254 ``preallocated(<ty>)``
1255 This indicates that the pointer parameter should really be passed by
1256 value to the function, and that the pointer parameter's pointee has
1257 already been initialized before the call instruction. This attribute
1258 is only valid on LLVM pointer arguments. The argument must be the value
1259 returned by the appropriate
1260 :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1261 ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1262 calls, although it is ignored during codegen.
1264 A non ``musttail`` function call with a ``preallocated`` attribute in
1265 any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1266 function call cannot have a ``"preallocated"`` operand bundle.
1268 The preallocated attribute requires a type argument, which must be
1269 the same as the pointee type of the argument.
1271 The preallocated attribute also supports specifying an alignment with the
1272 align attribute. It indicates the alignment of the stack slot to
1273 form and the known alignment of the pointer specified to the call
1274 site. If the alignment is not specified, then the code generator
1275 makes a target-specific assumption.
1281 The ``inalloca`` argument attribute allows the caller to take the
1282 address of outgoing stack arguments. An ``inalloca`` argument must
1283 be a pointer to stack memory produced by an ``alloca`` instruction.
1284 The alloca, or argument allocation, must also be tagged with the
1285 inalloca keyword. Only the last argument may have the ``inalloca``
1286 attribute, and that argument is guaranteed to be passed in memory.
1288 An argument allocation may be used by a call at most once because
1289 the call may deallocate it. The ``inalloca`` attribute cannot be
1290 used in conjunction with other attributes that affect argument
1291 storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1292 ``inalloca`` attribute also disables LLVM's implicit lowering of
1293 large aggregate return values, which means that frontend authors
1294 must lower them with ``sret`` pointers.
1296 When the call site is reached, the argument allocation must have
1297 been the most recent stack allocation that is still live, or the
1298 behavior is undefined. It is possible to allocate additional stack
1299 space after an argument allocation and before its call site, but it
1300 must be cleared off with :ref:`llvm.stackrestore
1301 <int_stackrestore>`.
1303 The inalloca attribute requires a type argument, which must be the
1304 same as the pointee type of the argument.
1306 See :doc:`InAlloca` for more information on how to use this
1310 This indicates that the pointer parameter specifies the address of a
1311 structure that is the return value of the function in the source
1312 program. This pointer must be guaranteed by the caller to be valid:
1313 loads and stores to the structure may be assumed by the callee not
1314 to trap and to be properly aligned.
1316 The sret type argument specifies the in memory type, which must be
1317 the same as the pointee type of the argument.
1319 A function that accepts an ``sret`` argument must return ``void``.
1320 A return value may not be ``sret``.
1322 .. _attr_elementtype:
1324 ``elementtype(<ty>)``
1326 The ``elementtype`` argument attribute can be used to specify a pointer
1327 element type in a way that is compatible with `opaque pointers
1328 <OpaquePointers.html>`__.
1330 The ``elementtype`` attribute by itself does not carry any specific
1331 semantics. However, certain intrinsics may require this attribute to be
1332 present and assign it particular semantics. This will be documented on
1333 individual intrinsics.
1335 The attribute may only be applied to pointer typed arguments of intrinsic
1336 calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1337 to parameters on function declarations. For non-opaque pointers, the type
1338 passed to ``elementtype`` must match the pointer element type.
1342 ``align <n>`` or ``align(<n>)``
1343 This indicates that the pointer value or vector of pointers has the
1344 specified alignment. If applied to a vector of pointers, *all* pointers
1345 (elements) have the specified alignment. If the pointer value does not have
1346 the specified alignment, :ref:`poison value <poisonvalues>` is returned or
1347 passed instead. The ``align`` attribute should be combined with the
1348 ``noundef`` attribute to ensure a pointer is aligned, or otherwise the
1349 behavior is undefined. Note that ``align 1`` has no effect on non-byval,
1350 non-preallocated arguments.
1352 Note that this attribute has additional semantics when combined with the
1353 ``byval`` or ``preallocated`` attribute, which are documented there.
1358 This indicates that memory locations accessed via pointer values
1359 :ref:`based <pointeraliasing>` on the argument or return value are not also
1360 accessed, during the execution of the function, via pointer values not
1361 *based* on the argument or return value. This guarantee only holds for
1362 memory locations that are *modified*, by any means, during the execution of
1363 the function. If there are other accesses not based on the argument or
1364 return value, the behavior is undefined. The attribute on a return value
1365 also has additional semantics described below. The caller shares the
1367 responsibility with the callee for ensuring that these requirements are met.
1368 For further details, please see the discussion of the NoAlias response in
1369 :ref:`alias analysis <Must, May, or No>`.
1371 Note that this definition of ``noalias`` is intentionally similar
1372 to the definition of ``restrict`` in C99 for function arguments.
1374 For function return values, C99's ``restrict`` is not meaningful,
1375 while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1376 attribute on return values are stronger than the semantics of the attribute
1377 when used on function arguments. On function return values, the ``noalias``
1378 attribute indicates that the function acts like a system memory allocation
1379 function, returning a pointer to allocated storage disjoint from the
1380 storage for any other object accessible to the caller.
1385 This indicates that the callee does not :ref:`capture <pointercapture>` the
1386 pointer. This is not a valid attribute for return values.
1387 This attribute applies only to the particular copy of the pointer passed in
1388 this argument. A caller could pass two copies of the same pointer with one
1389 being annotated nocapture and the other not, and the callee could validly
1390 capture through the non annotated parameter.
1392 .. code-block:: llvm
1394 define void @f(ptr nocapture %a, ptr %b) {
1398 call void @f(ptr @glb, ptr @glb) ; well-defined
    This indicates that the callee does not free the pointer argument. This is not
1402 a valid attribute for return values.
1407 This indicates that the pointer parameter can be excised using the
1408 :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1409 attribute for return values and can only be applied to one parameter.
1412 This indicates that the function always returns the argument as its return
1413 value. This is a hint to the optimizer and code generator used when
1414 generating the caller, allowing value propagation, tail call optimization,
1415 and omission of register saves and restores in some cases; it is not
1416 checked or enforced when generating the callee. The parameter and the
1417 function return type must be valid operands for the
1418 :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1419 return values and can only be applied to one parameter.
1422 This indicates that the parameter or return pointer is not null. This
1423 attribute may only be applied to pointer typed parameters. This is not
1424 checked or enforced by LLVM; if the parameter or return pointer is null,
1425 a :ref:`poison value <poisonvalues>` is returned or passed instead.
1426 The ``nonnull`` attribute should be combined with the ``noundef`` attribute
1427 to ensure that a pointer is not null; otherwise, the behavior is undefined.
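The combination described above might be written as follows. This is only a
sketch; ``@get_object`` and ``@read_field`` are hypothetical names.

.. code-block:: llvm

    ; Return value: a null result would be undefined behavior, not just poison.
    declare nonnull noundef ptr @get_object()

    ; Parameter: passing null here is undefined behavior at the call.
    define i32 @read_field(ptr nonnull noundef %obj) {
      %v = load i32, ptr %obj
      ret i32 %v
    }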
1429 ``dereferenceable(<n>)``
1430 This indicates that the parameter or return pointer is dereferenceable. This
1431 attribute may only be applied to pointer typed parameters. A pointer that
1432 is dereferenceable can be loaded from speculatively without a risk of
1433 trapping. The number of bytes known to be dereferenceable must be provided
1434 in parentheses. It is legal for the number of bytes to be less than the
1435 size of the pointee type. The ``nonnull`` attribute does not imply
1436 dereferenceability (consider a pointer to one element past the end of an
1437 array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1438 ``addrspace(0)`` (which is the default address space), except if the
1439 ``null_pointer_is_valid`` function attribute is present.
1440 ``n`` should be a positive number. The pointer should be well defined,
1441 otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
1442 implies ``noundef``.
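A small sketch of how this is typically used, with a hypothetical function
``@select_load``: because four bytes behind ``%p`` are known to be
dereferenceable (and suitably aligned), the load does not need to be guarded
by ``%c``.

.. code-block:: llvm

    define i32 @select_load(i1 %c, ptr align 4 dereferenceable(4) %p) {
      ; The load cannot trap, so executing it regardless of %c is safe.
      %v = load i32, ptr %p
      %r = select i1 %c, i32 %v, i32 0
      ret i32 %r
    }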
1444 ``dereferenceable_or_null(<n>)``
1445 This indicates that the parameter or return value isn't both
1446 non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1447 time. All non-null pointers tagged with
1448 ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1449 For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1450 a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1451 and in other address spaces ``dereferenceable_or_null(<n>)``
1452 implies that a pointer is at least one of ``dereferenceable(<n>)``
1453 or ``null`` (i.e. it may be both ``null`` and
1454 ``dereferenceable(<n>)``). This attribute may only be applied to
1455 pointer typed parameters.
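For illustration, assuming a hypothetical allocator-like function
``@try_alloc16`` that returns either null or 16 usable bytes, a caller could
rely on the attribute after a null check:

.. code-block:: llvm

    declare dereferenceable_or_null(16) ptr @try_alloc16()

    define i64 @use() {
    entry:
      %p = call dereferenceable_or_null(16) ptr @try_alloc16()
      %isnull = icmp eq ptr %p, null
      br i1 %isnull, label %fail, label %ok

    ok:
      ; On this path %p is non-null, hence dereferenceable for 16 bytes.
      %v = load i64, ptr %p
      ret i64 %v

    fail:
      ret i64 0
    }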
1458 This indicates that the parameter is the self/context parameter. This is not
1459 a valid attribute for return values and can only be applied to one
1465 This indicates that the parameter is the asynchronous context parameter and
1466 triggers the creation of a target-specific extended frame record to store
1467 this pointer. This is not a valid attribute for return values and can only
1468 be applied to one parameter.
1471 This attribute exists to model and optimize Swift error handling. It
1472 can be applied to a parameter with pointer to pointer type or a
1473 pointer-sized alloca. At the call site, the actual argument that corresponds
1474 to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
1475 the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
1476 the parameter or the alloca) can only be loaded from and stored to, or used as
1477 a ``swifterror`` argument. This is not a valid attribute for return values
1478 and can only be applied to one parameter.
1480 These constraints allow the calling convention to optimize access to
1481 ``swifterror`` variables by associating them with a specific register at
1482 call boundaries rather than placing them in memory. Since this does change
1483 the calling convention, a function which uses the ``swifterror`` attribute
1484 on a parameter is not ABI-compatible with one which does not.
1486 These constraints also allow LLVM to assume that a ``swifterror`` argument
1487 does not alias any other memory visible within a function and that a
1488 ``swifterror`` alloca passed as an argument does not escape.
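The constraints above can be sketched as follows. The function names are
hypothetical, and whether this lowers successfully depends on the target
supporting ``swifterror``.

.. code-block:: llvm

    declare void @may_fail(ptr swifterror)

    define i1 @caller() {
    entry:
      ; A swifterror alloca: only loaded from, stored to, or passed as a
      ; swifterror argument.
      %err = alloca swifterror ptr
      store ptr null, ptr %err
      call void @may_fail(ptr swifterror %err)
      %e = load ptr, ptr %err
      %failed = icmp ne ptr %e, null
      ret i1 %failed
    }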
1491 This indicates the parameter is required to be an immediate
1492 value. This must be a trivial immediate integer or floating-point
1493 constant. Undef or constant expressions are not valid. This is
1494 only valid on intrinsic declarations and cannot be applied to a
1495 call site or arbitrary function.
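One existing use is the ``isvolatile`` operand of ``llvm.memcpy``. The sketch
below (the wrapper ``@copy16`` is a hypothetical name) passes a literal
``i1 false`` there; an SSA value in that position would be rejected.

.. code-block:: llvm

    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1 immarg)

    define void @copy16(ptr %dst, ptr %src) {
      ; The last operand must be a literal constant because of immarg.
      call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 16, i1 false)
      ret void
    }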
1498 This attribute applies to parameters and return values. If the value
1499 representation contains any undefined or poison bits, the behavior is
1500 undefined. Note that this does not refer to padding introduced by the
1501 type's storage representation.
1505 ``nofpclass(<test mask>)``
1506 This attribute applies to parameters and return values with
1507 floating-point and vector of floating-point types, as well as
1508 :ref:`supported aggregates <fastmath_return_types>` of such types
1509 (matching the supported types for :ref:`fast-math flags <fastmath>`).
1510 The test mask has the same format as the second argument to the
1511 :ref:`llvm.is.fpclass <llvm.is.fpclass>` intrinsic, and indicates which classes
1512 of floating-point values are not permitted for the value. For example
1513 a bitmask of 3 indicates the parameter may not be a NaN.
1515 If the value is a floating-point class indicated by the
1516 ``nofpclass`` test mask, a :ref:`poison value <poisonvalues>` is
1517 passed or returned instead.
1519 .. code-block:: text
1520 :caption: The following invariants hold
1522 @llvm.is.fpclass(nofpclass(test_mask) %x, test_mask) => false
1523 @llvm.is.fpclass(nofpclass(test_mask) %x, ~test_mask) => true
1524 nofpclass(all) => poison
1527 In textual IR, various string names are supported for readability
1528 and can be combined. For example ``nofpclass(nan pinf nzero)``
1529 evaluates to a mask of 547.
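As a sketch using a hypothetical function ``@half``, the combined names from
the example above can be attached to a parameter, and a separate mask to the
return value:

.. code-block:: llvm

    ; nan (3) + pinf (512) + nzero (32) = 547
    define nofpclass(nan) float @half(float nofpclass(nan pinf nzero) %x) {
      ; %x is not a NaN, so halving it cannot produce one.
      %r = fmul float %x, 5.000000e-01
      ret float %r
    }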
1531 This does not depend on the floating-point environment. For
1532 example, a function parameter marked ``nofpclass(zero)`` indicates
1533 no zero inputs. If this is applied to an argument in a function
1534 marked with :ref:`\"denormal-fp-math\" <denormal_fp_math>`
1535 indicating zero treatment of input denormals, it does not imply the
1536 value cannot be a denormal value which would compare equal to 0.
1538 .. table:: Recognized test mask names
1540 +-------+----------------------+---------------+
1541 | Name | floating-point class | Bitmask value |
1542 +=======+======================+===============+
1543 | nan | Any NaN | 3 |
1544 +-------+----------------------+---------------+
1545 | inf | +/- infinity | 516 |
1546 +-------+----------------------+---------------+
1547 | norm | +/- normal | 264 |
1548 +-------+----------------------+---------------+
1549 | sub | +/- subnormal | 144 |
1550 +-------+----------------------+---------------+
1551 | zero | +/- 0 | 96 |
1552 +-------+----------------------+---------------+
1553 | all | All values | 1023 |
1554 +-------+----------------------+---------------+
1555 | snan | Signaling NaN | 1 |
1556 +-------+----------------------+---------------+
1557 | qnan | Quiet NaN | 2 |
1558 +-------+----------------------+---------------+
1559 | ninf | Negative infinity | 4 |
1560 +-------+----------------------+---------------+
1561 | nnorm | Negative normal | 8 |
1562 +-------+----------------------+---------------+
1563 | nsub | Negative subnormal | 16 |
1564 +-------+----------------------+---------------+
1565 | nzero | Negative zero | 32 |
1566 +-------+----------------------+---------------+
1567 | pzero | Positive zero | 64 |
1568 +-------+----------------------+---------------+
1569 | psub | Positive subnormal | 128 |
1570 +-------+----------------------+---------------+
1571 | pnorm | Positive normal | 256 |
1572 +-------+----------------------+---------------+
1573 | pinf | Positive infinity | 512 |
1574 +-------+----------------------+---------------+
1578 This indicates the alignment that should be considered by the backend when
1579 assigning this parameter to a stack slot during calling convention
1580 lowering. The enforcement of the specified alignment is target-dependent,
1581 as target-specific calling convention rules may override this value. This
1582 attribute serves the purpose of carrying language specific alignment
1583 information that is not mapped to base types in the backend (for example,
1584 over-alignment specification through language attributes).
1587 The function parameter marked with this attribute is the alignment in bytes of the
1588 newly allocated block returned by this function. The returned value must either have
1589 the specified alignment or be the null pointer. The return value MAY be more aligned
1590 than the requested alignment, but not less aligned. Invalid (e.g. non-power-of-2)
1591 alignments are permitted for the allocalign parameter, so long as the returned pointer
1592 is null. This attribute may only be applied to integer parameters.
1595 The function parameter marked with this attribute is the pointer
1596 that will be manipulated by the allocator. For a realloc-like
1597 function the pointer will be invalidated upon success (but the
1598 same address may be returned); for a free-like function the
1599 pointer will always be invalidated.
1602 This attribute indicates that the function does not dereference that
1603 pointer argument, even though it may read or write the memory that the
1604 pointer points to if accessed through other pointers.
1606 If a function reads from or writes to a readnone pointer argument, the
1607 behavior is undefined.
1610 This attribute indicates that the function does not write through this
1611 pointer argument, even though it may write to the memory that the pointer
1614 If a function writes to a readonly pointer argument, the behavior is
1618 This attribute indicates that the function may write to, but does not read
1619 through this pointer argument (even though it may read from the memory that
1620 the pointer points to).
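A minimal sketch of the pointer access attributes, using a hypothetical
function ``@copy_i32`` that only reads through one argument and only writes
through the other:

.. code-block:: llvm

    define void @copy_i32(ptr writeonly %dst, ptr readonly %src) {
      %v = load i32, ptr %src      ; read through the readonly argument
      store i32 %v, ptr %dst       ; write through the writeonly argument
      ret void
    }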
1622 This attribute is understood in the same way as the ``memory(write)``
1623 attribute. That is, the pointer may still be read as long as the read is
1624 not observable outside the function. See the ``memory`` documentation for
1628 This attribute is only meaningful in conjunction with ``dereferenceable(N)``
1629 or another attribute that implies the first ``N`` bytes of the pointer
1630 argument are dereferenceable.
1632 In that case, the attribute indicates that the first ``N`` bytes will be
1633 (non-atomically) loaded and stored back on entry to the function.
1635 This implies that it's possible to introduce spurious stores on entry to
1636 the function without introducing traps or data races. This does not
1637 necessarily hold throughout the whole function, as the pointer may escape
1638 to a different thread during the execution of the function. See also the
1639 :ref:`atomic optimization guide <Optimization outside atomic>`
1641 The "other attributes" that imply dereferenceability are
1642 ``dereferenceable_or_null`` (if the pointer is non-null) and the
1643 ``sret``, ``byval``, ``byref``, ``inalloca``, ``preallocated`` family of
1644 attributes. Note that not all of these combinations are useful, e.g.
1645 ``byval`` arguments are known to be writable even without this attribute.
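To illustrate why this matters, consider the following sketch (the function
name ``@maybe_store`` is hypothetical). Because ``%p`` is both ``writable``
and ``dereferenceable(4)``, an optimizer would be allowed to turn the
conditional store into an unconditional load, select and store on entry.

.. code-block:: llvm

    define void @maybe_store(i1 %c, ptr writable dereferenceable(4) %p) {
    entry:
      br i1 %c, label %do, label %done

    do:
      store i32 1, ptr %p
      br label %done

    done:
      ret void
    }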
1647 The ``writable`` attribute cannot be combined with ``readnone``,
1648 ``readonly`` or a ``memory`` attribute that does not contain
1651 ``initializes((Lo1, Hi1), ...)``
1652 This attribute indicates that the function initializes the ranges of the
1653 pointer parameter's memory, ``[%p+LoN, %p+HiN)``. Initialization of memory
1654 means the first memory access is a non-volatile, non-atomic write. The
1655 write must happen before the function returns. If the function unwinds,
1656 the write may not happen.
1658 This attribute only holds for the memory accessed via this pointer
1659 parameter. Other arbitrary accesses to the same memory via other pointers
1662 The ``writable`` and ``dereferenceable`` attributes do not imply the
1663 ``initializes`` attribute. The ``initializes`` attribute does not imply
1664 ``writeonly`` since ``initializes`` allows reading from the pointer
1667 This attribute is a list of constant ranges in ascending order with no
1668 overlapping or consecutive list elements. ``LoN/HiN`` are 64-bit integers,
1669 and negative values are allowed in case the argument points partway into
1670 an allocation. An empty list is not allowed.
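For example, a hypothetical function ``@init_pair`` that writes the first
eight bytes behind its argument before returning could be sketched as:

.. code-block:: llvm

    define void @init_pair(ptr initializes((0, 8)) %p) {
      ; Bytes [0, 4) are first accessed by this write.
      store i32 0, ptr %p
      ; Bytes [4, 8) are first accessed by this write.
      %q = getelementptr inbounds i8, ptr %p, i64 4
      store i32 0, ptr %q
      ret void
    }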
1673 At a high level, this attribute indicates that the pointer argument is dead
1674 if the call unwinds, in the sense that the caller will not depend on the
1675 contents of the memory. Stores that would only be visible on the unwind
1678 More precisely, the behavior is as-if any memory written through the
1679 pointer during the execution of the function is overwritten with a poison
1680 value on unwind. This includes memory written by the implicit write implied
1681 by the ``writable`` attribute. The caller is allowed to access the affected
1682 memory, but all loads that are not preceded by a store will return poison.
1684 This attribute cannot be applied to return values.
1686 ``range(<ty> <a>, <b>)``
1687 This attribute expresses the possible range of the parameter or return value.
1688 If the value is not in the specified range, it is converted to poison.
1689 The arguments passed to ``range`` have the following properties:
1691 - The type must match the scalar type of the parameter or return value.
1692 - The pair ``a,b`` represents the range ``[a,b)``.
1693 - Both ``a`` and ``b`` are constants.
1694 - The range is allowed to wrap.
1695 - The empty range is represented using ``0,0``.
1696 - Otherwise, ``a`` and ``b`` are not allowed to be equal.
1698 This attribute may only be applied to parameters or return values with integer
1699 or vector of integer types.
1701 For vector-typed parameters, the range is applied element-wise.
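A short sketch with a hypothetical function ``@last_digit``: the parameter is
assumed to lie in ``[0, 100)`` and the result is known to lie in ``[0, 10)``.

.. code-block:: llvm

    define range(i8 0, 10) i8 @last_digit(i8 range(i8 0, 100) %x) {
      ; An unsigned remainder by 10 always falls in [0, 10).
      %r = urem i8 %x, 10
      ret i8 %r
    }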
1705 Garbage Collector Strategy Names
1706 --------------------------------
1708 Each function may specify a garbage collector strategy name, which is simply a
1711 .. code-block:: llvm
1713 define void @f() gc "name" { ... }
1715 The supported values of *name* include those :ref:`built in to LLVM
1716 <builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
1717 strategy will cause the compiler to alter its output in order to support the
1718 named garbage collection algorithm. Note that LLVM itself does not contain a
1719 garbage collector; this functionality is restricted to generating machine code
1720 which can interoperate with a collector provided externally.
1727 Prefix data is data associated with a function which the code
1728 generator will emit immediately before the function's entrypoint.
1729 The purpose of this feature is to allow frontends to associate
1730 language-specific runtime metadata with specific functions and make it
1731 available through the function pointer while still allowing the
1732 function pointer to be called.
1734 To access the data for a given function, a program may bitcast the
1735 function pointer to a pointer to the constant's type and dereference
1736 index -1. This implies that the IR symbol points just past the end of
1737 the prefix data. For instance, take the example of a function annotated
1738 with a single ``i32``,
1740 .. code-block:: llvm
1742 define void @f() prefix i32 123 { ... }
1744 The prefix data can be referenced as,
1746 .. code-block:: llvm
1748 %a = getelementptr inbounds i32, ptr @f, i32 -1
1749 %b = load i32, ptr %a
1751 Prefix data is laid out as if it were an initializer for a global variable
1752 of the prefix data's type. The function will be placed such that the
1753 beginning of the prefix data is aligned. This means that if the size
1754 of the prefix data is not a multiple of the alignment size, the
1755 function's entrypoint will not be aligned. If alignment of the
1756 function's entrypoint is desired, padding must be added to the prefix
1759 A function may have prefix data but no body. This has similar semantics
1760 to the ``available_externally`` linkage in that the data may be used by the
1761 optimizers but will not be emitted in the object file.
1768 The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1769 be inserted prior to the function body. This can be used for enabling
1770 function hot-patching and instrumentation.
1772 To maintain the semantics of ordinary function calls, the prologue data must
1773 have a particular format. Specifically, it must begin with a sequence of
1774 bytes which decode to a sequence of machine instructions, valid for the
1775 module's target, which transfer control to the point immediately succeeding
1776 the prologue data, without performing any other visible action. This allows
1777 the inliner and other passes to reason about the semantics of the function
1778 definition without needing to reason about the prologue data. Obviously this
1779 makes the format of the prologue data highly target dependent.
1781 A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1782 which encodes the ``nop`` instruction:
1784 .. code-block:: text
1786 define void @f() prologue i8 144 { ... }
1788 Generally prologue data can be formed by encoding a relative branch instruction
1789 which skips the metadata, as in this example of valid prologue data for the
1790 x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1792 .. code-block:: text
1794 %0 = type <{ i8, i8, ptr }>
1796 define void @f() prologue %0 <{ i8 235, i8 8, ptr @md}> { ... }
1798 A function may have prologue data but no body. This has similar semantics
1799 to the ``available_externally`` linkage in that the data may be used by the
1800 optimizers but will not be emitted in the object file.
1804 Personality Function
1805 --------------------
1807 The ``personality`` attribute permits functions to specify what function
1808 to use for exception handling.
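As an illustration, the sketch below uses the Itanium C++ personality routine
``__gxx_personality_v0``; the other function names are hypothetical.

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()

    define void @f() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; The personality function decides how this landing pad is reached.
      %lp = landingpad { ptr, i32 }
              cleanup
      resume { ptr, i32 } %lp
    }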
1815 Attribute groups are groups of attributes that are referenced by objects within
1816 the IR. They are important for keeping ``.ll`` files readable, because a lot of
1817 functions will use the same set of attributes. In the degenerate case of a
1818 ``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1819 group will capture the important command line flags used to build that file.
1821 An attribute group is a module-level object. To use an attribute group, an
1822 object references the attribute group's ID (e.g. ``#37``). An object may refer
1823 to more than one attribute group. In that situation, the attributes from the
1824 different groups are merged.
1826 Here is an example of attribute groups for a function that should always be
1827 inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1829 .. code-block:: llvm
1831 ; Target-independent attributes:
1832 attributes #0 = { alwaysinline alignstack=4 }
1834 ; Target-dependent attributes:
1835 attributes #1 = { "no-sse" }
1837 ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1838 define void @f() #0 #1 { ... }
1845 Function attributes are set to communicate additional information about
1846 a function. Function attributes are considered to be part of the
1847 function, not of the function type, so functions with different function
1848 attributes can have the same function type.
1850 Function attributes are simple keywords or strings that follow the specified
1851 type. Multiple attributes, when required, are separated by spaces.
1854 .. code-block:: llvm
1856 define void @f() noinline { ... }
1857 define void @f() alwaysinline { ... }
1858 define void @f() alwaysinline optsize { ... }
1859 define void @f() optsize { ... }
1860 define void @f() "no-sse" { ... }
1863 This attribute indicates that, when emitting the prologue and
1864 epilogue, the backend should forcibly align the stack pointer.
1865 Specify the desired alignment, which must be a power of two, in
1867 ``"alloc-family"="FAMILY"``
1868 This indicates which "family" an allocator function is part of. To avoid
1869 collisions, the family name should match the mangled name of the primary
1870 allocator function, that is "malloc" for malloc/calloc/realloc/free,
1871 "_Znwm" for ``::operator new`` and ``::operator delete``, and
1872 "_ZnwmSt11align_val_t" for aligned ``::operator new`` and
1873 ``::operator delete``. Matching malloc/realloc/free calls within a family
1874 can be optimized, but mismatched ones will be left alone.
1875 ``allockind("KIND")``
1876 Describes the behavior of an allocation function. The KIND string contains comma
1877 separated entries from the following options:
1879 * "alloc": the function returns a new block of memory or null.
1880 * "realloc": the function returns a new block of memory or null. If the
1881 result is non-null the memory contents from the start of the block up to
1882 the smaller of the original allocation size and the new allocation size
1883 will match that of the ``allocptr`` argument and the ``allocptr``
1884 argument is invalidated, even if the function returns the same address.
1885 * "free": the function frees the block of memory specified by ``allocptr``.
1886 Functions marked as "free" ``allockind`` must return void.
1887 * "uninitialized": Any newly-allocated memory (either a new block from
1888 an "alloc" function or the enlarged capacity from a "realloc" function)
1889 will be uninitialized.
1890 * "zeroed": Any newly-allocated memory (either a new block from an "alloc"
1891 function or the enlarged capacity from a "realloc" function) will be
1893 * "aligned": the function returns memory aligned according to the
1894 ``allocalign`` parameter.
1896 The first three options are mutually exclusive, and the remaining options
1897 describe more details of how the function behaves. The remaining options
1898 are invalid for "free"-type functions.
1899 ``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1900 This attribute indicates that the annotated function will always return at
1901 least a given number of bytes (or null). Its arguments are zero-indexed
1902 parameter numbers; if one argument is provided, then it's assumed that at
1903 least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1904 returned pointer. If two are provided, then it's assumed that
1905 ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1906 available. The referenced parameters must be integer types. No assumptions
1907 are made about the contents of the returned block of memory.
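Taken together, the allocator attributes might be applied to a custom
allocator roughly as follows. This is a sketch only: the functions and the
family name ``"my_family"`` are hypothetical.

.. code-block:: llvm

    ; aligned_alloc-like: parameter 0 is the alignment, parameter 1 the size.
    declare noalias ptr @my_aligned_alloc(i64 allocalign, i64)
        allockind("alloc,uninitialized,aligned") allocsize(1)
        "alloc-family"="my_family"

    ; realloc-like: the old pointer is invalidated on success.
    declare ptr @my_realloc(ptr allocptr, i64)
        allockind("realloc") allocsize(1) "alloc-family"="my_family"

    ; free-like: must return void.
    declare void @my_free(ptr allocptr)
        allockind("free") "alloc-family"="my_family"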
1909 This attribute indicates that the inliner should attempt to inline
1910 this function into callers whenever possible, ignoring any active
1911 inlining size threshold for this caller.
1913 This indicates that the callee function at a call site should be
1914 recognized as a built-in function, even though the function's declaration
1915 uses the ``nobuiltin`` attribute. This is only valid at call sites for
1916 direct calls to functions that are declared with the ``nobuiltin``
1919 This attribute indicates that this function is rarely called. When
1920 computing edge weights, basic blocks post-dominated by a cold
1921 function call are also considered to be cold; and, thus, given low
1924 .. _attr_convergent:
1927 This attribute indicates that this function is convergent.
1928 When it appears on a call/invoke, the convergent attribute
1929 indicates that we should treat the call as though we’re calling a
1930 convergent function. This is particularly useful on indirect
1931 calls; without this we may treat such calls as though the target
1934 See :doc:`ConvergentOperations` for further details.
1936 It is an error to call :ref:`llvm.experimental.convergence.entry
1937 <llvm.experimental.convergence.entry>` from a function that
1938 does not have this attribute.
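A minimal sketch of both uses, with hypothetical names: the attribute on a
declaration, and on an indirect call whose target is not known statically.

.. code-block:: llvm

    declare void @barrier() convergent

    define void @kernel(ptr %fp) convergent {
      call void @barrier() convergent
      ; Indirect call: the call-site attribute makes it convergent.
      call void %fp() convergent
      ret void
    }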
1939 ``disable_sanitizer_instrumentation``
1940 When instrumenting code with sanitizers, it can be important to skip certain
1941 functions to ensure no instrumentation is applied to them.
1943 This attribute is not always similar to absent ``sanitize_<name>``
1944 attributes: depending on the specific sanitizer, code can be inserted into
1945 functions regardless of the ``sanitize_<name>`` attribute to prevent false
1948 ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
1949 taking precedence over the ``sanitize_<name>`` attributes and other compiler
1951 ``"dontcall-error"``
1952 This attribute denotes that an error diagnostic should be emitted when a
1953 call of a function with this attribute is not eliminated via optimization.
1954 Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1955 such callees to attach information about where in the source language such a
1956 call came from. A string value can be provided as a note.
1958 This attribute denotes that a warning diagnostic should be emitted when a
1959 call of a function with this attribute is not eliminated via optimization.
1960 Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1961 such callees to attach information about where in the source language such a
1962 call came from. A string value can be provided as a note.
1963 ``fn_ret_thunk_extern``
1964 This attribute tells the code generator that returns from functions should
1965 be replaced with jumps to externally-defined architecture-specific symbols.
1966 For X86, this symbol's identifier is ``__x86_return_thunk``.
1968 This attribute tells the code generator whether the function
1969 should keep the frame pointer. The code generator may emit the frame pointer
1970 even if this attribute says the frame pointer can be eliminated.
1971 The allowed string values are:
1973 * ``"none"`` (default) - the frame pointer can be eliminated, and its
1974 register can be used for other purposes.
1975 * ``"reserved"`` - the frame pointer register must either be updated to
1976 point to a valid frame record for the current function, or not be
1978 * ``"non-leaf"`` - the frame pointer should be kept if the function calls
1980 * ``"all"`` - the frame pointer should be kept.
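For instance, assuming the string attribute is spelled ``"frame-pointer"``,
a function (here the hypothetical ``@f``) can request one of these values
through an attribute group:

.. code-block:: llvm

    define void @f() #0 {
      ret void
    }

    ; Keep the frame pointer only if @f calls other functions.
    attributes #0 = { "frame-pointer"="non-leaf" }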
1982 This attribute indicates that this function is a hot spot of the program
1983 execution. The function will be optimized more aggressively and will be
1984 placed into a special subsection of the text section to improve locality.
1986 When profile feedback is enabled, this attribute takes precedence over
1987 the profile information. By marking a function ``hot``, users can work
1988 around the cases where the training input does not have good coverage
1989 on all the hot functions.
1991 This attribute indicates that the source code contained a hint that
1992 inlining this function is desirable (such as the "inline" keyword in
1993 C/C++). It is just a hint; it imposes no requirements on the
1996 This attribute indicates that the function should be added to a
1997 jump-instruction table at code-generation time, and that all address-taken
1998 references to this function should be replaced with a reference to the
1999 appropriate jump-instruction-table function pointer. Note that this creates
2000 a new pointer for the original function, which means that code that depends
2001 on function-pointer identity can break. So, any function annotated with
2002 ``jumptable`` must also be ``unnamed_addr``.
2004 This attribute specifies the possible memory effects of the call-site or
2005 function. It allows specifying the possible access kinds (``none``,
2006 ``read``, ``write``, or ``readwrite``) for the possible memory location
2007 kinds (``argmem``, ``inaccessiblemem``, as well as a default). It is best
2008 understood by example:
2010 - ``memory(none)``: Does not access any memory.
2011 - ``memory(read)``: May read (but not write) any memory.
2012 - ``memory(write)``: May write (but not read) any memory.
2013 - ``memory(readwrite)``: May read or write any memory.
2014 - ``memory(argmem: read)``: May only read argument memory.
2015 - ``memory(argmem: read, inaccessiblemem: write)``: May only read argument
2016 memory and only write inaccessible memory.
2017 - ``memory(read, argmem: readwrite)``: May read any memory (default mode)
2018 and additionally write argument memory.
2019 - ``memory(readwrite, argmem: none)``: May access any memory apart from
2022 The supported access kinds are:
2024 - ``readwrite``: Any kind of access to the location is allowed.
2025 - ``read``: The location is only read. Writing to the location is immediate
2026 undefined behavior. This includes the case where the location is read from
2027 and then the same value is written back.
2028 - ``write``: Only writes to the location are observable outside the function
2029 call. However, the function may still internally read the location after
2030 writing it, as this is not observable. Reading the location prior to
2031 writing it results in a poison value.
2032 - ``none``: No reads or writes to the location are observed outside the
2033 function. It is always valid to read and write allocas, and to read global
2034 constants, even if ``memory(none)`` is used, as these effects are not
2035 externally observable.
2037 The supported memory location kinds are:
2039 - ``argmem``: This refers to accesses that are based on pointer arguments
2041 - ``inaccessiblemem``: This refers to accesses to memory which is not
2042 accessible by the current module (before return from the function -- an
2043 allocator function may return newly accessible memory while only
2044 accessing inaccessible memory itself). Inaccessible memory is often used
2045 to model control dependencies of intrinsics.
2046 - The default access kind (specified without a location prefix) applies to
2047 all locations that haven't been specified explicitly, including those that
2048 don't currently have a dedicated location kind (e.g. accesses to globals
2049 or captured pointers).
2051 If the ``memory`` attribute is not specified, then ``memory(readwrite)``
2052 is implied (all memory effects are possible).
2054 The memory effects of a call can be computed as
2055 ``CallSiteEffects & (FunctionEffects | OperandBundleEffects)``. Thus, the
2056 call-site annotation takes precedence over the potential effects described
2057 by either the function annotation or the operand bundles.
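The intersection can be illustrated with a sketch like the following, where
``@update`` and ``@caller`` are hypothetical. The second call narrows the
declared effects: ``read`` intersected with ``argmem: readwrite`` leaves only
reads of argument memory.

.. code-block:: llvm

    declare void @update(ptr) memory(argmem: readwrite)

    define void @caller(ptr %p) {
      ; Effects as declared: may read and write argument memory only.
      call void @update(ptr %p)
      ; Call-site annotation: effective effects are memory(argmem: read).
      call void @update(ptr %p) memory(read)
      ret void
    }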
2059 This attribute suggests that optimization passes and code generator
2060 passes make choices that keep the code size of this function as small
2061 as possible and perform optimizations that may sacrifice runtime
2062 performance in order to minimize the size of the generated code.
2063 This attribute is incompatible with the ``optdebug`` and ``optnone``
2066 This attribute disables prologue / epilogue emission for the
2067 function. This can have very system-specific consequences. The arguments of
2068 a ``naked`` function cannot be referenced through IR values.
2069 ``"no-inline-line-tables"``
2070 When this attribute is set to true, the inliner discards source locations
2071 when inlining code and instead uses the source location of the call site.
2072 Breakpoints set on code that was inlined into the current function will
2073 not fire during the execution of the inlined call sites. If the debugger
2074 stops inside an inlined call site, it will appear to be stopped at the
2075 outermost inlined call site.
2077 When this attribute is set to true, the jump tables and lookup tables that
2078 can be generated from a switch case lowering are disabled.
2080 This indicates that the callee function at a call site is not recognized as
2081 a built-in function. LLVM will retain the original call and not replace it
2082 with equivalent code based on the semantics of the built-in function, unless
2083 the call site uses the ``builtin`` attribute. This is valid at call sites
2084 and on function declarations and definitions.
2086 This attribute indicates that the function is only allowed to jump back into
2087 the caller's module by a return or an exception, and is not allowed to jump back
2088 by invoking a callback function, a direct, possibly transitive, external
2089 function call, use of ``longjmp``, or other means. It is a compiler hint that
2090 is used at module level to improve dataflow analysis, dropped during linking,
2091 and has no effect on functions defined in the current module.
2092 ``nodivergencesource``
2093 A call to this function is not a source of divergence. In uniformity
2094 analysis, a *source of divergence* is an instruction that generates
2095 divergence even if its inputs are uniform. A call with no further information
2096 would normally be considered a source of divergence; setting this attribute
2097 on a function means that a call to it is not a source of divergence.
2099 This attribute indicates that calls to the function cannot be
2100 duplicated. A call to a ``noduplicate`` function may be moved
2101 within its parent function, but may not be duplicated within
2102 its parent function.
2104 A function containing a ``noduplicate`` call may still
2105 be an inlining candidate, provided that the call is not
2106 duplicated by inlining. That implies that the function has
2107 internal linkage and only has one call site, so the original
2108 call is dead after inlining.
2110 This function attribute indicates that the function does not, directly or
2111 transitively, call a memory-deallocation function (``free``, for example)
2112 on a memory allocation which existed before the call.
2114 As a result, uncaptured pointers that are known to be dereferenceable
2115 prior to a call to a function with the ``nofree`` attribute are still
2116 known to be dereferenceable after the call. The capturing condition is
2117 necessary in environments where the function might communicate the
2118 pointer to another thread which then deallocates the memory. Alternatively,
2119 ``nosync`` would ensure such communication cannot happen and even captured
2120 pointers cannot be freed by the function.
2122 A ``nofree`` function is explicitly allowed to free memory which it
2123 allocated or (if not ``nosync``) arrange for another thread to free
2124 memory on its behalf. As a result, perhaps surprisingly, a ``nofree``
2125 function can return a pointer to a previously deallocated memory object.
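The following sketch illustrates the dereferenceability reasoning above. The
names ``@observe`` and ``@read_after_call`` are hypothetical. Because the
callee is declared both ``nofree`` and ``nosync``, the load after the call may
still rely on ``%p`` being dereferenceable, even though ``%p`` is passed to
the callee:

.. code-block:: llvm

   ; Hypothetical callee: neither frees pre-existing memory nor synchronizes.
   declare void @observe(ptr) nofree nosync

   define i32 @read_after_call(ptr dereferenceable(4) %p) {
     call void @observe(ptr %p)
     ; %p was dereferenceable before the call and remains so afterwards.
     %v = load i32, ptr %p
     ret i32 %v
   }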
2127 Disallows implicit floating-point code. This inhibits optimizations that
2128 use floating-point code and floating-point registers for operations that are
2129 not nominally floating-point. LLVM instructions that perform floating-point
2130 operations or require access to floating-point registers may still cause
2131 floating-point code to be generated.
2133 Also inhibits optimizations that create SIMD/vector code and registers from
2134 scalar code such as vectorization or memcpy/memset optimization. This
2135 includes integer vectors. Vector instructions present in IR may still cause
2136 vector code to be generated.
2138 This attribute indicates that the inliner should never inline this
2139 function in any situation. This attribute may not be used together
2140 with the ``alwaysinline`` attribute.
2142 This attribute indicates that calls to this function should never be merged
2143 during optimization. For example, it will prevent tail merging otherwise
2144 identical code sequences that raise an exception or terminate the program.
2145 Tail merging normally reduces the precision of source location information,
2146 making stack traces less useful for debugging. This attribute gives the
2147 user control over the tradeoff between code size and debug information
2150 This attribute suppresses lazy symbol binding for the function. This
2151 may make calls to the function faster, at the cost of extra program
2152 startup time if the function is not called during program startup.
2154 This function attribute prevents instrumentation based profiling, used for
2155 coverage or profile based optimization, from being added to a function. It
2156 also blocks inlining if the caller and callee have different values of this
2159 This function attribute prevents instrumentation based profiling, used for
2160 coverage or profile based optimization, from being added to a function. This
2161 attribute does not restrict inlining, so instrumented instructions could end
2162 up in this function.
2164 This attribute indicates that the code generator should not use a
2165 red zone, even if the target-specific ABI normally permits it.
2166 ``indirect-tls-seg-refs``
2167 This attribute indicates that the code generator should not use
2168 direct TLS access through segment registers, even if the
2169 target-specific ABI normally permits it.
2171 This function attribute indicates that the function never returns
2172 normally, hence through a return instruction. This produces undefined
2173 behavior at runtime if the function ever does dynamically return. Annotated
2174 functions may still raise an exception, i.e., ``nounwind`` is not implied.
2176 This function attribute indicates that the function does not call itself
2177 either directly or indirectly down any possible call path. This produces
2178 undefined behavior at runtime if the function ever does recurse.
2180 .. _langref_willreturn:
2183 This function attribute indicates that a call of this function will
2184 either exhibit undefined behavior or come back and continue execution
2185 at a point in the existing call stack that includes the current invocation.
2186 Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
2187 If an invocation of an annotated function does not return control back
2188 to a point in the call stack, the behavior is undefined.
2190 This function attribute indicates that the function does not communicate
2191 (synchronize) with another thread through memory or other well-defined means.
2192 Synchronization is considered possible in the presence of ``atomic`` accesses
2193 that enforce an order, thus not "unordered" and "monotonic", ``volatile`` accesses,
2194 as well as ``convergent`` function calls.
2196 Note that ``convergent`` operations can involve communication that is
2197 considered to be not through memory and does not necessarily imply an
2198 ordering between threads for the purposes of the memory model. Therefore,
2199 an operation can be both ``convergent`` and ``nosync``.
2201 If a ``nosync`` function does ever synchronize with another thread,
2202 the behavior is undefined.
2204 This function attribute indicates that the function never raises an
2205 exception. If the function does raise an exception, its runtime
2206 behavior is undefined. However, functions marked ``nounwind`` may still
2207 trap or generate asynchronous exceptions. Exception handling schemes
2208 that are recognized by LLVM to handle asynchronous exceptions, such
2209 as SEH, will still provide their implementation defined semantics.
2210 ``nosanitize_bounds``
2211 This attribute indicates that bounds checking sanitizer instrumentation
2212 is disabled for this function.
2213 ``nosanitize_coverage``
2214 This attribute indicates that SanitizerCoverage instrumentation is disabled
2216 ``null_pointer_is_valid``
2217 If ``null_pointer_is_valid`` is set, then the ``null`` address
2218 in address-space 0 is considered to be a valid address for memory loads and
2219 stores. Any analysis or optimization should not treat dereferencing a
2220 pointer to ``null`` as undefined behavior in this function.
2221 Note: Comparing address of a global variable to ``null`` may still
2222 evaluate to false because of a limitation in querying this attribute inside
2223 constant expressions.
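A minimal sketch, using the hypothetical function name ``@read_address_zero``:
with the attribute present, the load from ``null`` below is an ordinary memory
access in address space 0 and should not be treated as undefined behavior by
optimizations:

.. code-block:: llvm

   define i32 @read_address_zero() null_pointer_is_valid {
     ; Address 0 is a valid location inside this function.
     %v = load i32, ptr null
     ret i32 %v
   }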
2225 This attribute suggests that optimization passes and code generator passes
2226 should make choices that try to preserve debug info without significantly
2227 degrading runtime performance.
2228 This attribute is incompatible with the ``minsize``, ``optsize``, and
2229 ``optnone`` attributes.
2231 This attribute indicates that this function should be optimized
2232 for maximum fuzzing signal.
2234 This function attribute indicates that most optimization passes will skip
2235 this function, with the exception of interprocedural optimization passes.
2236 Code generation defaults to the "fast" instruction selector.
2237 This attribute cannot be used together with the ``alwaysinline``
2238 attribute; this attribute is also incompatible
2239 with the ``minsize``, ``optsize``, and ``optdebug`` attributes.
2241 This attribute requires the ``noinline`` attribute to be specified on
2242 the function as well, so the function is never inlined into any caller.
2243 Only functions with the ``alwaysinline`` attribute are valid
2244 candidates for inlining into the body of this function.
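For illustration, a function carrying ``optnone`` must also carry
``noinline``, as in this sketch (the function name is hypothetical):

.. code-block:: llvm

   ; optnone requires noinline on the same function.
   define i32 @debug_me(i32 %x) noinline optnone {
     %r = add i32 %x, 1
     ret i32 %r
   }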
2246 This attribute suggests that optimization passes and code generator
2247 passes make choices that keep the code size of this function low,
2248 and otherwise do optimizations specifically to reduce code size as
2249 long as they do not significantly impact runtime performance.
2250 This attribute is incompatible with the ``optdebug`` and ``optnone``
2252 ``"patchable-function"``
2253 This attribute tells the code generator that the code
2254 generated for this function needs to follow certain conventions that
2255 make it possible for a runtime function to patch over it later.
2256 The exact effect of this attribute depends on its string value,
2257 for which there currently is one legal possibility:
2259 * ``"prologue-short-redirect"`` - This style of patchable
2260 function is intended to support patching a function prologue to
2261 redirect control away from the function in a thread safe
2262 manner. It guarantees that the first instruction of the
2263 function will be large enough to accommodate a short jump
2264 instruction, and will be sufficiently aligned to allow being
2265 fully changed via an atomic compare-and-swap instruction.
2266 While the first requirement can be satisfied by inserting a large
2267 enough NOP, LLVM can and will try to re-purpose an existing
2268 instruction (i.e. one that would have to be emitted anyway) as
2269 the patchable instruction larger than a short jump.
2271 ``"prologue-short-redirect"`` is currently only supported on
2274 This attribute by itself does not imply restrictions on
2275 inter-procedural optimizations. Any semantic effects the
2276 patching may have must be conveyed separately via the linkage type.
2278 This attribute indicates that the function will trigger a guard region
2279 at the end of the stack. It ensures that accesses to the stack must be
2280 no further apart than the size of the guard region to a previous
2281 access of the stack. It takes one required string value, the name of
2282 the stack probing function that will be called.
2284 If a function that has a ``"probe-stack"`` attribute is inlined into
2285 a function with another ``"probe-stack"`` attribute, the resulting
2286 function has the ``"probe-stack"`` attribute of the caller. If a
2287 function that has a ``"probe-stack"`` attribute is inlined into a
2288 function that has no ``"probe-stack"`` attribute at all, the resulting
2289 function has the ``"probe-stack"`` attribute of the callee.
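The inlining rule above can be sketched as follows. The probe function names
``__probe_a`` and ``__probe_b`` are placeholders, not real runtime symbols.
If ``@callee`` is inlined into ``@caller``, the result keeps the caller's
value ``"__probe_a"``; if it is inlined into ``@plain``, the result gains
``"probe-stack"="__probe_b"`` from the callee:

.. code-block:: llvm

   define void @callee() "probe-stack"="__probe_b" {
     ret void
   }

   define void @caller() "probe-stack"="__probe_a" {
     call void @callee()
     ret void
   }

   define void @plain() {
     call void @callee()
     ret void
   }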
2290 ``"stack-probe-size"``
2291 This attribute controls the behavior of stack probes: either
2292 the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
2293 It defines the size of the guard region. It ensures that if the function
2294 may use more stack space than the size of the guard region, a stack probing
2295 sequence will be emitted. It takes one required integer value, which
2298 If a function that has a ``"stack-probe-size"`` attribute is inlined into
2299 a function with another ``"stack-probe-size"`` attribute, the resulting
2300 function has the ``"stack-probe-size"`` attribute that has the lower
2301 numeric value. If a function that has a ``"stack-probe-size"`` attribute is
2302 inlined into a function that has no ``"stack-probe-size"`` attribute
2303 at all, the resulting function has the ``"stack-probe-size"`` attribute
2305 ``"no-stack-arg-probe"``
2306 This attribute disables ABI-required stack probes, if any.
2308 This attribute indicates that this function can return twice. The C
2309 ``setjmp`` is an example of such a function. The compiler disables
2310 some optimizations (like tail calls) in the caller of these
2313 This attribute indicates that
2314 `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
2315 protection is enabled for this function.
2317 If a function that has a ``safestack`` attribute is inlined into a
2318 function that doesn't have a ``safestack`` attribute or which has an
2319 ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
2320 function will have a ``safestack`` attribute.
2321 ``sanitize_address``
2322 This attribute indicates that AddressSanitizer checks
2323 (dynamic address safety analysis) are enabled for this function.
2325 This attribute indicates that MemorySanitizer checks (dynamic detection
2326 of accesses to uninitialized memory) are enabled for this function.
2328 This attribute indicates that ThreadSanitizer checks
2329 (dynamic thread safety analysis) are enabled for this function.
2330 ``sanitize_hwaddress``
2331 This attribute indicates that HWAddressSanitizer checks
2332 (dynamic address safety analysis based on tagged pointers) are enabled for
2335 This attribute indicates that MemTagSanitizer checks
2336 (dynamic address safety analysis based on Armv8 MTE) are enabled for
2338 ``sanitize_realtime``
2339 This attribute indicates that RealtimeSanitizer checks
2340 (realtime safety analysis - no allocations, syscalls or exceptions) are enabled
2342 ``sanitize_realtime_blocking``
2343 This attribute indicates that RealtimeSanitizer should error immediately
2344 if the attributed function is called during invocation of a function
2345 attributed with ``sanitize_realtime``.
2346 This attribute is incompatible with the ``sanitize_realtime`` attribute.
2347 ``speculative_load_hardening``
2348 This attribute indicates that
2349 `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
2350 should be enabled for the function body.
2352 Speculative Load Hardening is a best-effort mitigation against
2353 information leak attacks that make use of control flow
2354 misspeculation - specifically misspeculation of whether a branch
2355 is taken or not. Typically vulnerabilities enabling such attacks are
2356 classified as "Spectre variant #1". Notably, this does not attempt to
2357 mitigate against misspeculation of branch target, classified as
2358 "Spectre variant #2" vulnerabilities.
2360 When inlining, the attribute is sticky. Inlining a function that carries
2361 this attribute will cause the caller to gain the attribute. This is intended
2362 to provide a maximally conservative model where the code in a function
2363 annotated with this attribute will always (even after inlining) end up
2366 This function attribute indicates that the function does not have any
2367 effects besides calculating its result and does not have undefined behavior.
2368 Note that ``speculatable`` is not enough to conclude that along any
2369 particular execution path the number of calls to this function will not be
2370 externally observable. This attribute is only valid on functions
2371 and declarations, not on individual call sites. If a function is
2372 incorrectly marked as speculatable and really does exhibit
2373 undefined behavior, the undefined behavior may be observed even
2374 if the call site is dead code.
2377 This attribute indicates that the function should emit a stack
2378 smashing protector. It is in the form of a "canary" --- a random value
2379 placed on the stack before the local variables that's checked upon
2380 return from the function to see if it has been overwritten. A
2381 heuristic is used to determine if a function needs stack protectors
2382 or not. The heuristic used will enable protectors for functions with:
2384 - Character arrays larger than ``ssp-buffer-size`` (default 8).
2385 - Aggregates containing character arrays larger than ``ssp-buffer-size``.
2386 - Calls to alloca() with variable sizes or constant sizes greater than
2387 ``ssp-buffer-size``.
2389 Variables that are identified as requiring a protector will be arranged
2390 on the stack such that they are adjacent to the stack protector guard.
2392 If a function with an ``ssp`` attribute is inlined into a calling function,
2393 the attribute is not carried over to the calling function.
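As a sketch of the heuristic described above (function names are
illustrative), the function below contains a 16-byte character array, which
is larger than the default ``ssp-buffer-size`` of 8, so a stack protector
would be expected for it:

.. code-block:: llvm

   declare void @use(ptr)

   define void @fill() ssp {
     ; A character array larger than ssp-buffer-size triggers the heuristic.
     %buf = alloca [16 x i8]
     call void @use(ptr %buf)
     ret void
   }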
2396 This attribute indicates that the function should emit a stack smashing
2397 protector. This attribute causes a strong heuristic to be used when
2398 determining if a function needs stack protectors. The strong heuristic
2399 will enable protectors for functions with:
2401 - Arrays of any size and type
2402 - Aggregates containing an array of any size and type.
2403 - Calls to alloca().
2404 - Local variables that have had their address taken.
2406 Variables that are identified as requiring a protector will be arranged
2407 on the stack such that they are adjacent to the stack protector guard.
2408 The specific layout rules are:
2410 #. Large arrays and structures containing large arrays
2411 (``>= ssp-buffer-size``) are closest to the stack protector.
2412 #. Small arrays and structures containing small arrays
2413 (``< ssp-buffer-size``) are 2nd closest to the protector.
2414 #. Variables that have had their address taken are 3rd closest to the
2417 This overrides the ``ssp`` function attribute.
2419 If a function with an ``sspstrong`` attribute is inlined into a calling
2420 function which has an ``ssp`` attribute, the calling function's attribute
2421 will be upgraded to ``sspstrong``.
2424 This attribute indicates that the function should *always* emit a stack
2425 smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2428 Variables that are identified as requiring a protector will be arranged
2429 on the stack such that they are adjacent to the stack protector guard.
2430 The specific layout rules are:
2432 #. Large arrays and structures containing large arrays
2433 (``>= ssp-buffer-size``) are closest to the stack protector.
2434 #. Small arrays and structures containing small arrays
2435 (``< ssp-buffer-size``) are 2nd closest to the protector.
2436 #. Variables that have had their address taken are 3rd closest to the
2439 If a function with an ``sspreq`` attribute is inlined into a calling
2440 function which has an ``ssp`` or ``sspstrong`` attribute, the calling
2441 function's attribute will be upgraded to ``sspreq``.
2446 This attribute indicates that the function was called from a scope that
2447 requires strict floating-point semantics. LLVM will not attempt any
2448 optimizations that require assumptions about the floating-point rounding
2449 mode or that might alter the state of floating-point status flags that
2450 might otherwise be set or cleared by calling this function. LLVM will
2451 not introduce any new floating-point instructions that may trap.
2453 .. _denormal_fp_math:
2455 ``"denormal-fp-math"``
2456 This indicates the denormal (subnormal) handling that may be
2457 assumed for the default floating-point environment. This is a
2458 comma separated pair. The elements may be one of ``"ieee"``,
2459 ``"preserve-sign"``, ``"positive-zero"``, or ``"dynamic"``. The
2460 first entry indicates the flushing mode for the result of floating
2461 point operations. The second indicates the handling of denormal inputs
2462 to floating point instructions. For compatibility with older
2463 bitcode, if the second value is omitted, both input and output
2464 modes will assume the same mode.
2466 If this attribute is not specified, the default is ``"ieee,ieee"``.
2468 If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2469 denormal outputs may be flushed to zero by standard floating-point
2470 operations. It is not mandated that flushing to zero occurs, but if
2471 a denormal output is flushed to zero, it must respect the sign
2472 mode. Not all targets support all modes.
2474 If the mode is ``"dynamic"``, the behavior is derived from the
2475 dynamic state of the floating-point environment. Transformations
2476 which depend on the behavior of denormal values should not be
2479 While this indicates the expected floating point mode the function
2480 will be executed with, this does not make any attempt to ensure
2481 the mode is consistent. User or platform code is expected to set
2482 the floating point mode appropriately before function entry.
2484 If the input mode is ``"preserve-sign"``, or ``"positive-zero"``,
2485 a floating-point operation must treat any input denormal value as
2486 zero. In some situations, if an instruction does not respect this
2487 mode, the input may need to be converted to 0 as if by
2488 ``@llvm.canonicalize`` during lowering for correctness.
2490 ``"denormal-fp-math-f32"``
2491 Same as ``"denormal-fp-math"``, but only controls the behavior of
2492 the 32-bit float type (or vectors of 32-bit floats). If both
2493 are present, this overrides ``"denormal-fp-math"``. Not all targets
2494 support separately setting the denormal mode per type, and no
2495 attempt is made to diagnose unsupported uses. Currently this
2496 attribute is respected by the AMDGPU and NVPTX backends.
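The two attributes can be combined as in the following sketch (the function
name is hypothetical). Here the general mode is IEEE for both outputs and
inputs, while 32-bit float operations may flush denormal outputs to a
sign-preserving zero and treat denormal inputs as zero:

.. code-block:: llvm

   define float @scale(float %x) #0 {
     %r = fmul float %x, 2.0
     ret float %r
   }

   ; First element: output mode. Second element: input mode.
   attributes #0 = { "denormal-fp-math"="ieee,ieee"
                     "denormal-fp-math-f32"="preserve-sign,preserve-sign" }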
2499 This attribute indicates that the function will delegate to some other
2500 function with a tail call. The prototype of a thunk should not be used for
2501 optimization purposes. The caller is expected to cast the thunk prototype to
2502 match the thunk target prototype.
2503 ``uwtable[(sync|async)]``
2504 This attribute indicates that the ABI being targeted requires that
2505 an unwind table entry be produced for this function even if we can
2506 show that no exception passes through it. This is normally the case for
2507 the ELF x86-64 ABI, but it can be disabled for some compilation
2508 units. The optional parameter describes what kind of unwind tables
2509 to generate: ``sync`` for normal unwind tables, ``async`` for asynchronous
2510 (instruction precise) unwind tables. Without the parameter, the attribute
2511 ``uwtable`` is equivalent to ``uwtable(async)``.
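The three spellings can be sketched as follows (function names are
illustrative); the last two request the same kind of unwind table:

.. code-block:: llvm

   define void @sync_tables() uwtable(sync) {
     ret void
   }

   define void @async_tables() uwtable(async) {
     ret void
   }

   ; No parameter: equivalent to uwtable(async).
   define void @default_tables() uwtable {
     ret void
   }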
2513 This attribute indicates that no control-flow check will be performed on
2514 the attributed entity. It disables -fcf-protection=<> for a specific
2515 entity to fine grain the HW control flow protection mechanism. The flag
2516 is target independent and currently appertains to a function or function
2519 This attribute indicates that the ShadowCallStack checks are enabled for
2520 the function. The instrumentation checks that the return address for the
2521 function has not changed between the function prolog and epilog. It is
2522 currently x86_64-specific.
2524 .. _langref_mustprogress:
2527 This attribute indicates that the function is required to return, unwind,
2528 or interact with the environment in an observable way, e.g., via a volatile
2529 memory access, I/O, or other synchronization. The ``mustprogress``
2530 attribute is intended to model the requirements of the first section of
2531 [intro.progress] of the C++ Standard. As a consequence, a loop in a
2532 function with the ``mustprogress`` attribute can be assumed to terminate if
2533 it does not interact with the environment in an observable way, and
2534 terminating loops without side-effects can be removed. If a ``mustprogress``
2535 function does not satisfy this contract, the behavior is undefined. If a
2536 ``mustprogress`` function calls a function not marked ``mustprogress``,
2537 and that function never returns, the program is well-defined even if there
2538 isn't any other observable progress. Note that ``willreturn`` implies
2540 ``"warn-stack-size"="<threshold>"``
2541 This attribute sets a threshold to emit diagnostics once the frame size is
2542 known should the frame size exceed the specified value. It takes one
2543 required integer value, which should be a non-negative integer, and less
2544 than ``UINT_MAX``. It's unspecified which threshold will be used when
2545 duplicate definitions are linked together with differing values.
2546 ``vscale_range(<min>[, <max>])``
2547 This function attribute indicates ``vscale`` is a power-of-two within a
2548 specified range. ``min`` must be a power-of-two that is greater than 0. When
2549 specified, ``max`` must be a power-of-two greater-than-or-equal to ``min`` or 0
2550 to signify an unbounded maximum. The syntax ``vscale_range(<val>)`` can be
2551 used to set both ``min`` and ``max`` to the same value. Functions that don't
2552 include this attribute make no assumptions about the value of ``vscale``.
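A short sketch of the forms (function names are illustrative):
``@bounded`` may assume a ``vscale`` between 1 and 16, ``@unbounded`` only a
lower bound of 2, and ``@fixed`` a ``vscale`` of exactly 4:

.. code-block:: llvm

   declare i64 @llvm.vscale.i64()

   define i64 @bounded() vscale_range(1,16) {
     %vs = call i64 @llvm.vscale.i64()
     ret i64 %vs
   }

   ; A max of 0 means there is no upper bound.
   define void @unbounded() vscale_range(2,0) {
     ret void
   }

   ; Single-value form: min and max are both 4.
   define void @fixed() vscale_range(4) {
     ret void
   }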
2554 This attribute indicates that outlining passes should not modify the
2557 Call Site Attributes
2558 ----------------------
2560 In addition to function attributes the following call site only
2561 attributes are supported:
2563 ``vector-function-abi-variant``
2564 This attribute can be attached to a :ref:`call <i_call>` to list
2565 the vector functions associated with the function. Notice that the
2566 attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
2567 :ref:`callbr <i_callbr>` instruction. The attribute consists of a
2568 comma separated list of mangled names. The order of the list does
2569 not imply preference (it is logically a set). The compiler is free
2570 to pick any listed vector function of its choosing.
2572 The syntax for the mangled names is as follows::
2574 _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2576 When present, the attribute informs the compiler that the function
2577 ``<scalar_name>`` has a corresponding vector variant that can be
2578 used to perform the concurrent invocation of ``<scalar_name>`` on
2579 vectors. The shape of the vector function is described by the
2580 tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2581 token. The standard name of the vector function is
2582 ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2583 the optional token ``(<vector_redirection>)`` informs the compiler
2584 that a custom name is provided in addition to the standard one
2585 (custom names can be provided for example via the use of ``declare
2586 variant`` in OpenMP 5.0). The declaration of the variant must be
2587 present in the IR Module. The signature of the vector variant is
2588 determined by the rules of the Vector Function ABI (VFABI)
2589 specifications of the target. For Arm and X86, the VFABI can be
2590 found at https://github.com/ARM-software/abi-aa and
2591 https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2594 For X86 and Arm targets, the values of the tokens in the standard
2595 name are those that are defined in the VFABI. LLVM has an internal
2596 ``<isa>`` token that can be used to create scalar-to-vector
2597 mappings for functions that are not directly associated with any of
2598 the target ISAs (for example, some of the mappings stored in the
2599 TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2601 <isa>:= b | c | d | e -> X86 SSE, AVX, AVX2, AVX512
2602 | n | s -> Armv8 Advanced SIMD, SVE
2603 | __LLVM__ -> Internal LLVM Vector ISA
2605 For all targets currently supported (x86, Arm and Internal LLVM),
2606 the remaining tokens can have the following values::
2608 <mask>:= M | N -> mask | no mask
2610 <vlen>:= number -> number of lanes
2611 | x -> VLA (Vector Length Agnostic)
2613 <parameters>:= v -> vector
2614 | l | l <number> -> linear
2615 | R | R <number> -> linear with ref modifier
2616 | L | L <number> -> linear with val modifier
2617 | U | U <number> -> linear with uval modifier
2618 | ls <pos> -> runtime linear
2619 | Rs <pos> -> runtime linear with ref modifier
2620 | Ls <pos> -> runtime linear with val modifier
2621 | Us <pos> -> runtime linear with uval modifier
2624 <scalar_name>:= name of the scalar function
2626 <vector_redirection>:= optional, custom name of the vector function
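As a worked example of the grammar above, consider the mangled name
``_ZGVnN4v_sinf``: ``n`` selects Armv8 Advanced SIMD, ``N`` means no mask,
``4`` is the number of lanes, and ``v`` marks the single parameter as a
vector. The sketch below attaches such a mapping to a scalar call. The custom
vector function name ``hypothetical_vec_sinf`` and the caller ``@f`` are
assumptions made for illustration:

.. code-block:: llvm

   declare float @sinf(float)

   ; The declaration of the variant must be present in the module.
   declare <4 x float> @hypothetical_vec_sinf(<4 x float>)

   define float @f(float %x) {
     %r = call float @sinf(float %x) #0
     ret float %r
   }

   attributes #0 = { "vector-function-abi-variant"="_ZGVnN4v_sinf(hypothetical_vec_sinf)" }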
2628 ``preallocated(<ty>)``
2629 This attribute is required on calls to ``llvm.call.preallocated.arg``
2630 and cannot be used on any other call. See
2631 :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2639 Attributes may be set to communicate additional information about a global variable.
2640 Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2641 are grouped into a single :ref:`attribute group <attrgrp>`.
2643 ``no_sanitize_address``
2644 This attribute indicates that the global variable should not have
2645 AddressSanitizer instrumentation applied to it, because it was annotated
2646 with `__attribute__((no_sanitize("address")))`,
2647 `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2648 `-fsanitize-ignorelist` file.
2649 ``no_sanitize_hwaddress``
2650 This attribute indicates that the global variable should not have
2651 HWAddressSanitizer instrumentation applied to it, because it was annotated
2652 with `__attribute__((no_sanitize("hwaddress")))`,
2653 `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2654 `-fsanitize-ignorelist` file.
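For illustration only, and assuming the textual form printed by recent LLVM
releases, these sanitizer attributes follow the initializer of the global
variable as a comma-separated list (the global names are hypothetical):

.. code-block:: llvm

   @not_asan_checked = global i32 0, no_sanitize_address
   @not_checked_at_all = global i32 0, no_sanitize_address, no_sanitize_hwaddress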
2656 This attribute indicates that the global variable should have AArch64 memory
2657 tags (MTE) instrumentation applied to it. This attribute causes the
2658 suppression of certain optimizations, like GlobalMerge, as well as ensuring
2659 extra directives are emitted in the assembly and extra bits of metadata are
2660 placed in the object file so that the linker can ensure the accesses are
2661 protected by MTE. This attribute is added by clang when
2662 `-fsanitize=memtag-globals` is provided, as long as the global is not marked
2663 with `__attribute__((no_sanitize("memtag")))`,
2664 `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2665 `-fsanitize-ignorelist` file. The AArch64 Globals Tagging pass may remove
2666 this attribute when it's not possible to tag the global (e.g. it's a TLS
2668 ``sanitize_address_dyninit``
2669 This attribute indicates that the global variable, when instrumented with
2670 AddressSanitizer, should be checked for ODR violations. This attribute is
2671 applied to global variables that are dynamically initialized according to
2679 Operand bundles are tagged sets of SSA values or metadata strings that can be
2680 associated with certain LLVM instructions (currently only ``call`` s and
2681 ``invoke`` s). In a way they are like metadata, but dropping them is
2682 incorrect and will change program semantics.
2686 operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2687 operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2688 bundle operand ::= SSA value | metadata string
2689 tag ::= string constant
2691 Operand bundles are **not** part of a function's signature, and a
2692 given function may be called from multiple places with different kinds
2693 of operand bundles. This reflects the fact that the operand bundles
2694 are conceptually a part of the ``call`` (or ``invoke``), not the
2695 callee being dispatched to.
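A brief sketch of this point: the same callee is invoked three times with
different operand bundles, and none of them affects its signature. The tags
``"tag-a"`` and ``"tag-b"`` are made-up names standing in for unknown operand
bundles:

.. code-block:: llvm

   declare void @callee()

   define void @caller() {
     call void @callee() [ "tag-a"(i32 1) ]
     call void @callee() [ "tag-b"(ptr null, i32 2) ]
     call void @callee()
     ret void
   }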
2697 Operand bundles are a generic mechanism intended to support
2698 runtime-introspection-like functionality for managed languages. While
2699 the exact semantics of an operand bundle depend on the bundle tag,
2700 there are certain limitations to how much the presence of an operand
2701 bundle can influence the semantics of a program. These restrictions
2702 are described as the semantics of an "unknown" operand bundle. As
2703 long as the behavior of an operand bundle is describable within these
2704 restrictions, LLVM does not need to have special knowledge of the
2705 operand bundle to not miscompile programs containing it.
2707 - The bundle operands for an unknown operand bundle escape in unknown
2708 ways before control is transferred to the callee or invokee.
2709 - Calls and invokes with operand bundles have unknown read / write
2710 effect on the heap on entry and exit (even if the call target specifies
2711 a ``memory`` attribute), unless they're overridden with
2712 callsite specific attributes.
2713 - An operand bundle at a call site cannot change the implementation
2714 of the called function. Inter-procedural optimizations work as
2715 usual as long as they take into account the first two properties.
2717 More specific types of operand bundles are described below.
2719 .. _deopt_opbundles:
2721 Deoptimization Operand Bundles
2722 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2724 Deoptimization operand bundles are characterized by the ``"deopt"``
2725 operand bundle tag. These operand bundles represent an alternate
2726 "safe" continuation for the call site they're attached to, and can be
2727 used by a suitable runtime to deoptimize the compiled frame at the
2728 specified call site. There can be at most one ``"deopt"`` operand
2729 bundle attached to a call site. Exact details of deoptimization are
2730 out of scope for the language reference, but it usually involves
2731 rewriting a compiled frame into a set of interpreted frames.
2733 From the compiler's perspective, deoptimization operand bundles make
2734 the call sites they're attached to at least ``readonly``. They read
2735 through all of their pointer typed operands (even if they're not
2736 otherwise escaped) and the entire visible heap. Deoptimization
2737 operand bundles do not capture their operands except during
2738 deoptimization, in which case control will not be returned to the
2739 compiled frame.
2741 The inliner knows how to inline through calls that have deoptimization
2742 operand bundles. Just like inlining through a normal call site
2743 involves composing the normal and exceptional continuations, inlining
2744 through a call site with a deoptimization operand bundle needs to
2745 appropriately compose the "safe" deoptimization continuation. The
2746 inliner does this by prepending the parent's deoptimization
2747 continuation to every deoptimization continuation in the inlined body.
2748 E.g. inlining ``@f`` into ``@g`` in the following example
2750 .. code-block:: llvm
2752 define void @f() {
2753 call void @x() ;; no deopt state
2754 call void @y() [ "deopt"(i32 10) ]
2755 call void @y() [ "deopt"(i32 10), "unknown"(ptr null) ]
2756 ret void
2757 }
2759 define void @g() {
2760 call void @f() [ "deopt"(i32 20) ]
2761 ret void
2762 }
2764 will result in
2766 .. code-block:: llvm
2768 define void @g() {
2769 call void @x() ;; still no deopt state
2770 call void @y() [ "deopt"(i32 20, i32 10) ]
2771 call void @y() [ "deopt"(i32 20, i32 10), "unknown"(ptr null) ]
2772 ret void
2773 }
2775 It is the frontend's responsibility to structure or encode the
2776 deoptimization state in a way that syntactically prepending the
2777 caller's deoptimization state to the callee's deoptimization state is
2778 semantically equivalent to composing the caller's deoptimization
2779 continuation after the callee's deoptimization continuation.
2783 Funclet Operand Bundles
2784 ^^^^^^^^^^^^^^^^^^^^^^^
2786 Funclet operand bundles are characterized by the ``"funclet"``
2787 operand bundle tag. These operand bundles indicate that a call site
2788 is within a particular funclet. There can be at most one
2789 ``"funclet"`` operand bundle attached to a call site and it must have
2790 exactly one bundle operand.
2792 If any funclet EH pads have been "entered" but not "exited" (per the
2793 `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2794 it is undefined behavior to execute a ``call`` or ``invoke`` which:
2796 * does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2797 intrinsic, or
2798 * has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2799 not-yet-exited funclet EH pad.
2801 Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2802 executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
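The following is a minimal sketch of a call inside a cleanup funclet. It assumes a Windows-style personality routine, and ``@may_throw`` and ``@release`` are hypothetical functions:

.. code-block:: llvm

    declare void @may_throw()
    declare void @release() nounwind
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      ; No funclet EH pad has been entered here, so no "funclet" bundle.
      invoke void @may_throw()
              to label %exit unwind label %cleanup

    cleanup:
      %cp = cleanuppad within none []
      ; %cp is the most-recently-entered, not-yet-exited funclet EH pad.
      call void @release() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller

    exit:
      ret void
    }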
2804 GC Transition Operand Bundles
2805 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2807 GC transition operand bundles are characterized by the
2808 ``"gc-transition"`` operand bundle tag. These operand bundles mark a
2809 call as a transition from a function with one GC strategy to a
2810 function with a different GC strategy. If coordinating the transition
2811 between GC strategies requires additional code generation at the call
2812 site, these bundles may contain any values that are needed by the
2813 generated code. For more details, see :ref:`GC Transitions
2814 <gc_transition_args>`.
2816 The bundle contains an arbitrary list of Values which need to be passed
2817 to GC transition code. They will be lowered and passed as operands to
2818 the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2819 that these arguments must be available before and after (but not
2820 necessarily during) the execution of the callee.
2822 .. _assume_opbundles:
2824 Assume Operand Bundles
2825 ^^^^^^^^^^^^^^^^^^^^^^
2827 Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2828 assumptions, such as that a :ref:`parameter attribute <paramattrs>` or a
2829 :ref:`function attribute <fnattrs>` holds for a certain value at a certain
2830 location. Operand bundles enable assumptions that are either hard or impossible
2831 to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2833 An assume operand bundle has the form:
2835 ::
2837 "<tag>"([ <arguments> ])
2839 In the case of function or parameter attributes, the operand bundle has the
2840 restricted form:
2842 ::
2844 "<tag>"([ <holds for value> [, <attribute argument>] ])
2846 * The tag of the operand bundle is usually the name of the attribute that can be
2847 assumed to hold. It can also be ``ignore``; this tag doesn't contain any
2848 information and should be ignored.
2849 * The first argument, if present, is the value for which the attribute holds.
2850 * The second argument, if present, is an argument of the attribute.
2852 If there are no arguments the attribute is a property of the call location.
2854 For example:
2856 .. code-block:: llvm
2858 call void @llvm.assume(i1 true) ["align"(ptr %val, i32 8)]
2860 allows the optimizer to assume that at the location of the call to
2861 :ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.
2863 .. code-block:: llvm
2865 call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(ptr %val)]
2867 allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
2868 call location is cold and that ``%val`` is not null.
2870 Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2871 provided guarantees are violated at runtime the behavior is undefined.
2873 While attributes expect constant arguments, assume operand bundles may be
2874 provided a dynamic value, for example:
2876 .. code-block:: llvm
2878 call void @llvm.assume(i1 true) ["align"(ptr %val, i32 %align)]
2880 If the operand bundle value violates any requirements on the attribute value,
2881 the behavior is undefined, unless one of the following exceptions applies:
2883 * ``"align"`` operand bundles may specify a non-power-of-two alignment
2884 (including a zero alignment). If this is the case, then the pointer value
2885 must be a null pointer, otherwise the behavior is undefined.
2887 In addition to allowing operand bundles encoding function and parameter
2888 attributes, an assume operand bundle may also encode a ``separate_storage``
2889 operand bundle. This has the form:
2891 .. code-block:: llvm
2893 separate_storage(<val1>, <val2>)
2895 This indicates that no pointer :ref:`based <pointeraliasing>` on one of its
2896 arguments can alias any pointer based on the other.
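A minimal sketch of how such an assumption might be used follows; the function ``@sum`` and its arguments are hypothetical:

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @sum(ptr %a, ptr %b) {
      ; Assert that nothing based on %a aliases anything based on %b.
      call void @llvm.assume(i1 true) [ "separate_storage"(ptr %a, ptr %b) ]
      store i32 1, ptr %a
      store i32 2, ptr %b
      ; Under the assumption, the store to %b cannot change the value at %a,
      ; so an optimizer may treat %v as 1.
      %v = load i32, ptr %a
      ret i32 %v
    }

If ``%a`` and ``%b`` do alias at runtime, the behavior is undefined, as with any other violated assumption.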
2898 Even if the assumed property can be encoded as a boolean value, like
2899 ``nonnull``, using operand bundles to express the property can still have
2900 benefits:
2902 * Attributes that can be expressed via operand bundles are directly the
2903 property that the optimizer uses and cares about. Encoding attributes as
2904 operand bundles removes the need for an instruction sequence that represents
2905 the property (e.g., ``icmp ne ptr %p, null`` for ``nonnull``) and for the
2906 optimizer to deduce the property from that instruction sequence.
2907 * Expressing the property using operand bundles makes it easy to identify the
2908 use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
2909 simplifies and improves heuristics, e.g., for "use-sensitive"
2910 optimizations.
2912 .. _ob_preallocated:
2914 Preallocated Operand Bundles
2915 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2917 Preallocated operand bundles are characterized by the ``"preallocated"``
2918 operand bundle tag. These operand bundles allow separation of the allocation
2919 of the call argument memory from the call site. This is necessary to pass
2920 non-trivially copyable objects by value in a way that is compatible with MSVC
2921 on some targets. There can be at most one ``"preallocated"`` operand bundle
2922 attached to a call site and it must have exactly one bundle operand, which is
2923 a token generated by ``@llvm.call.preallocated.setup``. A call with this
2924 operand bundle should not adjust the stack before entering the function, as
2925 that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2927 .. code-block:: llvm
2929 %foo = type { i64, i32 }
2933 %t = call token @llvm.call.preallocated.setup(i32 1)
2934 %a = call ptr @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
2936 call void @bar(i32 42, ptr preallocated(%foo) %a) ["preallocated"(token %t)]
2940 GC Live Operand Bundles
2941 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2943 A ``"gc-live"`` operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2944 intrinsic. The operand bundle must contain every pointer to a garbage collected
2945 object which potentially needs to be updated by the garbage collector.
2947 When lowered, any relocated value will be recorded in the corresponding
2948 :ref:`stackmap entry <statepoint-stackmap-format>`. See the intrinsic description
2949 for further details.
2951 ObjC ARC Attached Call Operand Bundles
2952 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2954 A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2955 implicitly followed by a marker instruction and a call to an ObjC runtime
2956 function that uses the result of the call. The operand bundle takes a mandatory
2957 pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
2958 ``@objc_unsafeClaimAutoreleasedReturnValue``).
2959 The return value of a call with this bundle is used by a call to
2960 ``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
2961 void, in which case the operand bundle is ignored.
2963 .. code-block:: llvm
2965 ; The marker instruction and a runtime function call are inserted after the call
2967 call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_retainAutoreleasedReturnValue) ]
2968 call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_unsafeClaimAutoreleasedReturnValue) ]
2970 The operand bundle is needed to ensure the call is immediately followed by the
2971 marker instruction and the ObjC runtime call in the final output.
2975 Pointer Authentication Operand Bundles
2976 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2978 Pointer Authentication operand bundles are characterized by the
2979 ``"ptrauth"`` operand bundle tag. They are described in the
2980 `Pointer Authentication <PointerAuth.html#operand-bundle>`__ document.
2984 KCFI Operand Bundles
2985 ^^^^^^^^^^^^^^^^^^^^
2987 A ``"kcfi"`` operand bundle on an indirect call indicates that the call will
2988 be preceded by a runtime type check, which validates that the call target is
2989 prefixed with a :ref:`type identifier<md_kcfi_type>` that matches the operand
2990 bundle attribute. For example:
2992 .. code-block:: llvm
2994 call void %0() ["kcfi"(i32 1234)]
2996 Clang emits KCFI operand bundles and the necessary metadata with
2997 ``-fsanitize=kcfi``.
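A slightly fuller sketch pairs the call-site bundle with a possible target. It assumes that the type identifier is attached to the function as ``!kcfi_type`` metadata holding a single ``i32``; the function names and the identifier value are hypothetical:

.. code-block:: llvm

    ; A possible target, tagged with the type identifier 1234.
    define void @target() !kcfi_type !0 {
      ret void
    }

    define void @caller(ptr %fn) {
      ; The check before this indirect call expects the identifier 1234.
      call void %fn() [ "kcfi"(i32 1234) ]
      ret void
    }

    !0 = !{i32 1234}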
2999 .. _convergencectrl:
3001 Convergence Control Operand Bundles
3002 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3004 A ``"convergencectrl"`` operand bundle is only valid on a ``convergent`` operation.
3005 When present, the operand bundle must contain exactly one value of token type.
3006 See the :doc:`ConvergentOperations` document for details.
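As a minimal sketch, a convergent call can be tied to the token produced at function entry. The function names ``@kernel`` and ``@sync`` are hypothetical, and the exact rules for obtaining tokens are defined in the document referenced above:

.. code-block:: llvm

    declare token @llvm.experimental.convergence.entry()
    declare void @sync() convergent

    define void @kernel() convergent {
    entry:
      %tok = call token @llvm.experimental.convergence.entry()
      ; Exactly one value of token type in the bundle.
      call void @sync() [ "convergencectrl"(token %tok) ]
      ret void
    }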
3010 Module-Level Inline Assembly
3011 ----------------------------
3013 Modules may contain "module-level inline asm" blocks, which correspond
3014 to the GCC "file scope inline asm" blocks. These blocks are internally
3015 concatenated by LLVM and treated as a single unit, but may be separated
3016 in the ``.ll`` file if desired. The syntax is very simple:
3018 .. code-block:: llvm
3020 module asm "inline asm code goes here"
3021 module asm "more can go here"
3023 The strings can contain any character by escaping non-printable
3024 characters. The escape sequence used is simply "\\xx" where "xx" is the
3025 two digit hex code for the number.
3027 Note that the assembly string *must* be parseable by LLVM's integrated assembler
3028 (unless it is disabled), even when emitting a ``.s`` file.
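The escape syntax can be sketched as follows, assuming a target whose assembler accepts these GNU-style directives; the symbol name ``my_marker`` is hypothetical:

.. code-block:: llvm

    module asm ".globl my_marker"
    module asm "my_marker:"
    ; \09 is a tab character and \22 is a double quote.
    module asm "\09.ascii \22hi\22"

The three strings are concatenated in order, so the label and the ``.ascii`` directive end up adjacent in the assembled output.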
3030 .. _langref_datalayout:
3032 Data Layout
3033 -----------
3035 A module may specify a target specific data layout string that specifies
3036 how data is to be laid out in memory. The syntax for the data layout is
3037 simply:
3039 .. code-block:: llvm
3041 target datalayout = "layout specification"
3043 The *layout specification* consists of a list of specifications
3044 separated by the minus sign character ('-'). Each specification starts
3045 with a letter and may include other information after the letter to
3046 define some aspect of the data layout. The specifications accepted are
3047 as follows:
3049 ``E``
3050 Specifies that the target lays out data in big-endian form. That is,
3051 the bits with the most significance have the lowest address
3052 location.
3053 ``e``
3054 Specifies that the target lays out data in little-endian form. That
3055 is, the bits with the least significance have the lowest address
3056 location.
3057 ``S<size>``
3058 Specifies the natural alignment of the stack in bits. Alignment
3059 promotion of stack variables is limited to the natural stack
3060 alignment to avoid dynamic stack realignment. The stack alignment
3061 must be a multiple of 8-bits. If omitted, the natural stack
3062 alignment defaults to "unspecified", which does not prevent any
3063 alignment promotions.
3064 ``P<address space>``
3065 Specifies the address space that corresponds to program memory.
3066 Harvard architectures can use this to specify what space LLVM
3067 should place things such as functions into. If omitted, the
3068 program memory space defaults to the default address space of 0,
3069 which corresponds to a Von Neumann architecture that has code
3070 and data in the same space.
3071 ``G<address space>``
3072 Specifies the address space to be used by default when creating global
3073 variables. If omitted, the globals address space defaults to the default
3074 address space 0.
3075 Note: variable declarations without an address space are always created in
3076 address space 0; this property only affects the default value to be used
3077 when creating globals without additional contextual information (e.g. in
3078 LLVM passes).
3080 .. _alloca_addrspace:
3082 ``A<address space>``
3083 Specifies the address space of objects created by '``alloca``'.
3084 Defaults to the default address space of 0.
3085 ``p[n]:<size>:<abi>[:<pref>][:<idx>]``
3086 This specifies the *size* of a pointer and its ``<abi>`` and
3087 ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
3088 and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
3089 index that is used for address calculation, which must be less than or equal
3090 to the pointer size. If not
3091 specified, the default index size is equal to the pointer size. All sizes
3092 are in bits. The address space, ``n``, is optional, and if not specified,
3093 denotes the default address space 0. The value of ``n`` must be
3094 in the range [1,2^24).
3095 ``i<size>:<abi>[:<pref>]``
3096 This specifies the alignment for an integer type of a given bit
3097 ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
3098 ``<pref>`` is optional and defaults to ``<abi>``.
3099 For ``i8``, the ``<abi>`` value must equal 8,
3100 that is, ``i8`` must be naturally aligned.
3101 ``v<size>:<abi>[:<pref>]``
3102 This specifies the alignment for a vector type of a given bit
3103 ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
3104 ``<pref>`` is optional and defaults to ``<abi>``.
3105 ``f<size>:<abi>[:<pref>]``
3106 This specifies the alignment for a floating-point type of a given bit
3107 ``<size>``. Only values of ``<size>`` that are supported by the target
3108 will work. 32 (float) and 64 (double) are supported on all targets; 80
3109 or 128 (different flavors of long double) are also supported on some
3110 targets. The value of ``<size>`` must be in the range [1,2^24).
3111 ``<pref>`` is optional and defaults to ``<abi>``.
3112 ``a:<abi>[:<pref>]``
3113 This specifies the alignment for an object of aggregate type.
3114 ``<pref>`` is optional and defaults to ``<abi>``.
3115 ``F<type><abi>``
3116 This specifies the alignment for function pointers.
3117 The options for ``<type>`` are:
3119 * ``i``: The alignment of function pointers is independent of the alignment
3120 of functions, and is a multiple of ``<abi>``.
3121 * ``n``: The alignment of function pointers is a multiple of the explicit
3122 alignment specified on the function, and is a multiple of ``<abi>``.
3123 ``m:<mangling>``
3124 If present, specifies that LLVM names are mangled in the output. Symbols
3125 prefixed with the mangling escape character ``\01`` are passed through
3126 directly to the assembler without the escape character. The mangling style
3127 options are
3129 * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
3130 * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
3131 * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
3132 * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
3133 symbols get a ``_`` prefix.
3134 * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
3135 Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
3136 ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
3137 ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
3138 starting with ``?`` are not mangled in any way.
3139 * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
3140 symbols do not receive a ``_`` prefix.
3141 * ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix.
3142 ``n<size1>:<size2>:<size3>...``
3143 This specifies a set of native integer widths for the target CPU in
3144 bits. For example, it might contain ``n32`` for 32-bit PowerPC,
3145 ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
3146 this set are considered to support most general arithmetic operations
3147 efficiently.
3148 ``ni:<address space0>:<address space1>:<address space2>...``
3149 This specifies pointer types with the specified address spaces
3150 as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
3151 address space cannot be specified as non-integral.
3153 On every specification that takes a ``<abi>:<pref>``, specifying the
3154 ``<pref>`` alignment is optional. If omitted, the preceding ``:``
3155 should be omitted too and ``<pref>`` will be equal to ``<abi>``.
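To make the syntax concrete, here is a hypothetical layout string annotated piece by piece. It is an illustrative sketch and not the layout of any particular target:

.. code-block:: llvm

    ; e            little-endian
    ; m:e          ELF mangling (private symbols get a .L prefix)
    ; p:64:64      64-bit pointers in address space 0 with 64-bit ABI alignment;
    ;              <pref> is omitted and so equals <abi>
    ; i64:64       i64 has 64-bit ABI and preferred alignment
    ; n8:16:32:64  native integer widths
    ; S128         natural stack alignment of 128 bits
    target datalayout = "e-m:e-p:64:64-i64:64-n8:16:32:64-S128"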
3157 When constructing the data layout for a given target, LLVM starts with a
3158 default set of specifications which are then (possibly) overridden by
3159 the specifications in the ``datalayout`` keyword. The default
3160 specifications are given in this list:
3162 - ``e`` - little endian
3163 - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
3164 - ``p[n]:64:64:64`` - Other address spaces are assumed to be the
3165 same as the default address space.
3166 - ``S0`` - natural stack alignment is unspecified
3167 - ``i1:8:8`` - i1 is 8-bit (byte) aligned
3168 - ``i8:8:8`` - i8 is 8-bit (byte) aligned as mandated
3169 - ``i16:16:16`` - i16 is 16-bit aligned
3170 - ``i32:32:32`` - i32 is 32-bit aligned
3171 - ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
3172 alignment of 64-bits
3173 - ``f16:16:16`` - half is 16-bit aligned
3174 - ``f32:32:32`` - float is 32-bit aligned
3175 - ``f64:64:64`` - double is 64-bit aligned
3176 - ``f128:128:128`` - quad is 128-bit aligned
3177 - ``v64:64:64`` - 64-bit vector is 64-bit aligned
3178 - ``v128:128:128`` - 128-bit vector is 128-bit aligned
3179 - ``a:0:64`` - aggregates are 64-bit aligned
3181 When LLVM is determining the alignment for a given type, it uses the
3182 following rules:
3184 #. If the type sought is an exact match for one of the specifications,
3185 that specification is used.
3186 #. If no match is found, and the type sought is an integer type, then
3187 the smallest integer type that is larger than the bitwidth of the
3188 sought type is used. If none of the specifications are larger than
3189 the bitwidth then the largest integer type is used. For example,
3190 given the default specifications above, the i7 type will use the
3191 alignment of i8 (next largest) while both i65 and i256 will use the
3192 alignment of i64 (largest specified).
3194 The function of the data layout string may not be what you expect.
3195 Notably, this is not a specification from the frontend of what alignment
3196 the code generator should use.
3198 Instead, if specified, the target data layout is required to match what
3199 the ultimate *code generator* expects. This string is used by the
3200 mid-level optimizers to improve code, and this only works if it matches
3201 what the ultimate code generator uses. There is no way to generate IR
3202 that does not embed this target-specific detail into the IR. If you
3203 don't specify the string, the default specifications will be used to
3204 generate a Data Layout and the optimization phases will operate
3205 accordingly and introduce target specificity into the IR with respect to
3206 these default specifications.
3210 Target Triple
3211 -------------
3213 A module may specify a target triple string that describes the target
3214 host. The syntax for the target triple is simply:
3216 .. code-block:: llvm
3218 target triple = "x86_64-apple-macosx10.7.0"
3220 The *target triple* string consists of a series of identifiers delimited
3221 by the minus sign character ('-'). The canonical forms are:
3223 ::
3225 ARCHITECTURE-VENDOR-OPERATING_SYSTEM
3226 ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
3228 This information is passed along to the backend so that it generates
3229 code for the proper architecture. It's possible to override this on the
3230 command line with the ``-mtriple`` command line option.
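As an illustration of the four-component form, the following hypothetical module header names each component in a comment:

.. code-block:: llvm

    ; ARCHITECTURE = x86_64, VENDOR = unknown,
    ; OPERATING_SYSTEM = linux, ENVIRONMENT = gnu
    target triple = "x86_64-unknown-linux-gnu"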
3234 Object Lifetime
3235 ----------------------
3237 A memory object, or simply object, is a region of a memory space that is
3238 reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
3239 allocation calls, and global variable definitions.
3240 Once it is allocated, the bytes stored in the region can only be read or written
3241 through a pointer that is :ref:`based on <pointeraliasing>` the allocation
3242 value.
3243 If a pointer that is not based on the object tries to read or write to the
3244 object, it is undefined behavior.
3246 The lifetime of a memory object is a property that decides its accessibility.
3247 Unless stated otherwise, a memory object is alive from its allocation, and
3248 dead after its deallocation.
3249 It is undefined behavior to access a memory object that isn't alive, but
3250 operations that don't dereference it such as
3251 :ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
3252 :ref:`icmp <i_icmp>` return a valid result.
3253 This explains code motion of these instructions across operations that
3254 impact the object's lifetime.
3255 A stack object's lifetime can be explicitly specified using
3256 :ref:`llvm.lifetime.start <int_lifestart>` and
3257 :ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
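The distinction between accessing a dead object and merely computing with its address can be sketched with a heap allocation. The declarations of ``@malloc`` and ``@free`` are assumed to be the usual C library functions:

.. code-block:: llvm

    declare ptr @malloc(i64)
    declare void @free(ptr)

    define i64 @addr_after_free() {
      %p = call ptr @malloc(i64 4)
      call void @free(ptr %p)                ; the object is dead from here on
      ; No dereference happens below, so these still return valid results.
      %q = getelementptr i8, ptr %p, i64 1
      %i = ptrtoint ptr %q to i64
      ; A load or store through %p or %q at this point would be undefined
      ; behavior.
      ret i64 %i
    }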
3259 .. _pointeraliasing:
3261 Pointer Aliasing Rules
3262 ----------------------
3264 Any memory access must be done through a pointer value associated with
3265 an address range of the memory access, otherwise the behavior is
3266 undefined. Pointer values are associated with address ranges according
3267 to the following rules:
3269 - A pointer value is associated with the addresses associated with any
3270 value it is *based* on.
3271 - An address of a global variable is associated with the address range
3272 of the variable's storage.
3273 - The result value of an allocation instruction is associated with the
3274 address range of the allocated storage.
3275 - A null pointer in the default address-space is associated with no
3276 address.
3277 - An :ref:`undef value <undefvalues>` in *any* address-space is
3278 associated with no address.
3279 - An integer constant other than zero or a pointer value returned from
3280 a function not defined within LLVM may be associated with address
3281 ranges allocated through mechanisms other than those provided by
3282 LLVM. Such ranges shall not overlap with any ranges of addresses
3283 allocated by mechanisms provided by LLVM.
3285 A pointer value is *based* on another pointer value according to the
3286 following rules:
3288 - A pointer value formed from a scalar ``getelementptr`` operation is *based* on
3289 the pointer-typed operand of the ``getelementptr``.
3290 - The pointer in lane *l* of the result of a vector ``getelementptr`` operation
3291 is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
3292 of the ``getelementptr``.
3293 - The result value of a ``bitcast`` is *based* on the operand of the
3294 ``bitcast``.
3295 - A pointer value formed by an ``inttoptr`` is *based* on all pointer
3296 values that contribute (directly or indirectly) to the computation of
3297 the pointer's value.
3298 - The "*based* on" relationship is transitive.
3300 Note that this definition of *"based"* is intentionally similar to the
3301 definition of *"based"* in C99, though it is slightly weaker.
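The rules above can be traced through a short, hypothetical function:

.. code-block:: llvm

    define i32 @based(ptr %p, i64 %i) {
      ; %q is based on %p (scalar getelementptr).
      %q = getelementptr i32, ptr %p, i64 %i
      ; %q contributes to the integer computation that produces %r ...
      %n = ptrtoint ptr %q to i64
      %m = add i64 %n, 4
      ; ... so %r is based on %q and, by transitivity, on %p.
      %r = inttoptr i64 %m to ptr
      %v = load i32, ptr %r
      ret i32 %v
    }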
3303 LLVM IR does not associate types with memory. The result type of a
3304 ``load`` merely indicates the size and alignment of the memory from
3305 which to load, as well as the interpretation of the value. The first
3306 operand type of a ``store`` similarly only indicates the size and
3307 alignment of the store.
3309 Consequently, type-based alias analysis, aka TBAA, aka
3310 ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
3311 :ref:`Metadata <metadata>` may be used to encode additional information
3312 which specialized optimization passes may use to implement type-based
3313 alias analysis.
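As a minimal sketch of memory being untyped, the hypothetical function below stores a ``float`` and loads the same bytes back as an ``i32``:

.. code-block:: llvm

    define i32 @float_bits(float %f) {
      %slot = alloca float
      store float %f, ptr %slot
      ; The load type only selects the size, alignment and interpretation
      ; of the bytes; it need not match the type that was stored.
      %i = load i32, ptr %slot
      ret i32 %i
    }

In unadorned IR this is well defined and yields the bit pattern of ``%f``, which is why the two accesses may not be assumed to be independent merely because their types differ.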
3317 Pointer Capture
3318 ---------------
3320 Given a function call and a pointer that is passed as an argument or stored in
3321 the memory before the call, a pointer is *captured* by the call if it makes a
3322 copy of any part of the pointer that outlives the call.
3323 To be precise, a pointer is captured if one or more of the following conditions
3324 hold:
3326 1. The call stores any bit of the pointer carrying information into a place,
3327 and the stored bits can be read from the place by the caller after this call
3328 exits.
3330 .. code-block:: llvm
3332 @glb = global ptr null
3333 @glb2 = global ptr null
3334 @glb3 = global ptr null
3335 @glbi = global i32 0
3337 define ptr @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) {
3338 store ptr %a, ptr @glb ; %a is captured by this call
3340 store ptr %b, ptr @glb2 ; %b isn't captured because the stored value is overwritten by the store below
3341 store ptr null, ptr @glb2
3343 store ptr %c, ptr @glb3
3344 call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
3345 store ptr null, ptr @glb3
3347 %i = ptrtoint ptr %d to i64
3348 %j = trunc i64 %i to i32
3349 store i32 %j, ptr @glbi ; %d is captured
3351 ret ptr %e ; %e is captured
3352 }
3354 2. The call stores any bit of the pointer carrying information into a place,
3355 and the stored bits can be safely read from the place by another thread via
3356 synchronization.
3358 .. code-block:: llvm
3360 @lock = global i1 true
3362 define void @f(ptr %a) {
3363 store ptr %a, ptr @glb
3364 store atomic i1 false, ptr @lock release, align 1 ; %a is captured because another thread can safely read @glb
3365 store ptr null, ptr @glb
3366 ret void
3367 }
3369 3. The call's behavior depends on any bit of the pointer carrying information.
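3371 .. code-block:: llvm
3373 @glb = global i8 0
3375 define void @f(ptr %a) {
3376 %c = icmp eq ptr %a, @glb
3377 br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; escapes %a
3378 BB_EXIT:
3379 call void @exit()
3380 unreachable
3381 BB_CONTINUE:
3382 ret void
3383 }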
3385 4. The pointer is used in a volatile access as its address.
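A minimal sketch of the fourth condition, using a hypothetical function:

.. code-block:: llvm

    define void @f(ptr %a) {
      ; %a is captured: it is the address operand of a volatile access.
      store volatile i32 0, ptr %a
      ret void
    }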
3390 Volatile Memory Accesses
3391 ------------------------
3393 Certain memory accesses, such as :ref:`load <i_load>`'s,
3394 :ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
3395 marked ``volatile``. The optimizers must not change the number of
3396 volatile operations or change their order of execution relative to other
3397 volatile operations. The optimizers *may* change the order of volatile
3398 operations relative to non-volatile operations. This is not Java's
3399 "volatile" and has no cross-thread synchronization behavior.
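The ordering rules can be sketched with a hypothetical function that reads the same location twice. The names ``%mmio`` and ``%buf`` are illustrative only:

.. code-block:: llvm

    define i32 @poll(ptr %mmio, ptr %buf) {
      ; Both volatile loads must remain, and in this order.
      %a = load volatile i32, ptr %mmio
      ; This non-volatile store may be reordered relative to the volatile
      ; loads, subject to the usual aliasing rules.
      store i32 0, ptr %buf
      %b = load volatile i32, ptr %mmio
      %s = add i32 %a, %b
      ret i32 %s
    }

An optimizer may not fold ``%b`` into ``%a``, since that would change the number of volatile operations.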
3401 A volatile load or store may have additional target-specific semantics.
3402 Any volatile operation can have side effects, and any volatile operation
3403 can read and/or modify state which is not accessible via a regular load
3404 or store in this module. Volatile operations may use addresses which do
3405 not point to memory (like MMIO registers). This means the compiler may
3406 not use a volatile operation to prove a non-volatile access to that
3407 address has defined behavior.
3409 The allowed side-effects for volatile accesses are limited. If a
3410 non-volatile store to a given address would be legal, a volatile
3411 operation may modify the memory at that address. A volatile operation
3412 may not modify any other memory accessible by the module being compiled.
3413 A volatile operation may not call any code in the current module.
3415 In general (without target specific context), the address space of a
3416 volatile operation may not be changed. Different address spaces may
3417 have different trapping behavior when dereferencing an invalid
3418 pointer.
3420 The compiler may assume execution will continue after a volatile operation,
3421 so operations which modify memory or may have undefined behavior can be
3422 hoisted past a volatile operation.
3424 As an exception to the preceding rule, the compiler may not assume execution
3425 will continue after a volatile store operation. This restriction is necessary
3426 to support the somewhat common pattern in C of intentionally storing to an
3427 invalid pointer to crash the program. In the future, it might make sense to
3428 allow frontends to control this behavior.
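The crash pattern mentioned above might look as follows in IR. This is a sketch with a hypothetical function name, and the comments restate the rule instead of describing what any particular pass does:

.. code-block:: llvm

    define void @crash_then_store(ptr %p) {
      ; Intentional store to an invalid pointer to crash the program.
      store volatile i32 0, ptr null
      ; Execution may not continue past the volatile store, so this store
      ; may not be hoisted above it.
      store i32 1, ptr %p
      ret void
    }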
3430 IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
3431 or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
3432 Likewise, the backend should never split or merge target-legal volatile
3433 load/store instructions. Similarly, IR-level volatile loads and stores cannot
3434 change from integer to floating-point or vice versa.
3436 .. admonition:: Rationale
3438 Platforms may rely on volatile loads and stores of natively supported
3439 data width to be executed as a single instruction. For example, in C
3440 this holds for an l-value of volatile primitive type with native
3441 hardware support, but not necessarily for aggregate types. The
3442 frontend upholds these expectations, which are intentionally
3443 unspecified in the IR. The rules above ensure that IR transformations
3444 do not violate the frontend's contract with the language.
3448 Memory Model for Concurrent Operations
3449 --------------------------------------
3451 The LLVM IR does not define any way to start parallel threads of
3452 execution or to register signal handlers. Nonetheless, there are
3453 platform-specific ways to create them, and we define LLVM IR's behavior
3454 in their presence. This model is inspired by the C++ memory model.
3456 For a more informal introduction to this model, see the :doc:`Atomics`.
3458 We define a *happens-before* partial order as the least partial order
3459 that:
3461 - Is a superset of single-thread program order, and
3462 - When ``a`` *synchronizes-with* ``b``, includes an edge from ``a`` to
3463 ``b``. *Synchronizes-with* pairs are introduced by platform-specific
3464 techniques, like pthread locks, thread creation, thread joining,
3465 etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
3466 Constraints <ordering>`).
3468 Note that program order does not introduce *happens-before* edges
3469 between a thread and signals executing inside that thread.
3471 Every (defined) read operation (load instructions, memcpy, atomic
3472 loads/read-modify-writes, etc.) R reads a series of bytes written by
3473 (defined) write operations (store instructions, atomic
3474 stores/read-modify-writes, memcpy, etc.). For the purposes of this
3475 section, initialized globals are considered to have a write of the
3476 initializer which is atomic and happens before any other read or write
3477 of the memory in question. For each byte of a read R, R\ :sub:`byte`
3478 may see any write to the same byte, except:
3480 - If write\ :sub:`1` happens before write\ :sub:`2`, and
3481 write\ :sub:`2` happens before R\ :sub:`byte`, then
3482 R\ :sub:`byte` does not see write\ :sub:`1`.
3483 - If R\ :sub:`byte` happens before write\ :sub:`3`, then
3484 R\ :sub:`byte` does not see write\ :sub:`3`.
Given that definition, R\ :sub:`byte` is defined as follows:

-  If R is volatile, the result is target-dependent. (Volatile is
   supposed to give guarantees which can support ``sig_atomic_t`` in
   C/C++, and may be used for accesses to addresses that do not behave
   like normal memory. It does not generally provide cross-thread
   synchronization.)
-  Otherwise, if there is no write to the same byte that happens before
   R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
-  Otherwise, if R\ :sub:`byte` may see exactly one write,
   R\ :sub:`byte` returns the value written by that write.
-  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
   see are atomic, it chooses one of the values written. See the :ref:`Atomic
   Memory Ordering Constraints <ordering>` section for additional
   constraints on how the choice is made.
-  Otherwise R\ :sub:`byte` returns ``undef``.
3503 R returns the value composed of the series of bytes it read. This
3504 implies that some bytes within the value may be ``undef`` **without**
3505 the entire value being ``undef``. Note that this only defines the
3506 semantics of the operation; it doesn't mean that targets will emit more
3507 than one instruction to read the series of bytes.
3509 Note that in cases where none of the atomic intrinsics are used, this
3510 model places only one restriction on IR transformations on top of what
3511 is required for single-threaded execution: introducing a store to a byte
3512 which might not otherwise be stored is not allowed in general.
3513 (Specifically, in the case where another thread might write to and read
3514 from an address, introducing a store can change a load that may see
3515 exactly one write into a load that may see multiple writes.)
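The restriction on introduced stores can be illustrated with a small sketch. The function names ``@store_if`` and ``@store_always`` are hypothetical and chosen only for this example; the second function shows a rewrite that would be fine for single-threaded code but is not allowed in general under this model.

.. code-block:: llvm

    ; Original: the store to %p executes only when %cond is true.
    define void @store_if(ptr %p, i1 %cond) {
    entry:
      br i1 %cond, label %then, label %exit

    then:
      store i32 1, ptr %p
      br label %exit

    exit:
      ret void
    }

    ; Not a valid rewrite in general: %p is now written on every path.
    ; If another thread writes %p while %cond is false, the stale %old
    ; value can overwrite that write, and a load in another thread that
    ; previously could see exactly one write may now see several.
    define void @store_always(ptr %p, i1 %cond) {
    entry:
      %old = load i32, ptr %p
      %new = select i1 %cond, i32 1, i32 %old
      store i32 %new, ptr %p
      ret void
    }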
3519 Atomic Memory Ordering Constraints
3520 ----------------------------------
3522 Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
3523 :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
3524 :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
3525 ordering parameters that determine which other atomic instructions on
3526 the same address they *synchronize with*. These semantics implement
3527 the Java or C++ memory models; if these descriptions aren't precise
3528 enough, check those specs (see spec references in the
3529 :doc:`atomics guide <Atomics>`). :ref:`fence <i_fence>` instructions
3530 treat these orderings somewhat differently since they don't take an
3531 address. See that instruction's documentation for details.
For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
    The set of values that can be read is governed by the happens-before
    partial order. A value cannot be read unless some operation wrote
    it. This is intended to provide a guarantee strong enough to model
    Java's non-volatile shared variables. This ordering cannot be
    specified for read-modify-write operations; it is not strong enough
    to make them atomic in any interesting way.
``monotonic``
    In addition to the guarantees of ``unordered``, there is a single
    total order for modifications by ``monotonic`` operations on each
    address. All modification orders must be compatible with the
    happens-before order. There is no guarantee that the modification
    orders can be combined to a global total order for the whole program
    (and this often will not be possible). The read in an atomic
    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
    order immediately before the value it writes. If one atomic read
    happens before another atomic read of the same address, the later
    read must see the same value or a later value in the address's
    modification order. This disallows reordering of ``monotonic`` (or
    stronger) operations on the same address. If an address is written
    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
    read that address repeatedly, the other threads must eventually see
    the write. This corresponds to the C/C++ ``memory_order_relaxed``.
``acquire``
    In addition to the guarantees of ``monotonic``, a
    *synchronizes-with* edge may be formed with a ``release`` operation.
    This is intended to model C/C++'s ``memory_order_acquire``.
``release``
    In addition to the guarantees of ``monotonic``, if this operation
    writes a value which is subsequently read by an ``acquire``
    operation, it *synchronizes-with* that operation. Furthermore,
    this occurs even if the value written by a ``release`` operation
    has been modified by a read-modify-write operation before being
    read. (Such a set of operations comprises a *release
    sequence*). This corresponds to the C/C++
    ``memory_order_release``.
``acq_rel`` (acquire+release)
    Acts as both an ``acquire`` and ``release`` operation on its
    address. This corresponds to the C/C++ ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
    operation that only reads, ``release`` for an operation that only
    writes), there is a global total order on all
    sequentially-consistent operations on all addresses. Each
    sequentially-consistent read sees the last preceding write to the
    same address in this global order. This corresponds to the C/C++
    ``memory_order_seq_cst`` and Java ``volatile``.

    Note: this global total order is *not* guaranteed to be fully
    consistent with the *happens-before* partial order if
    non-``seq_cst`` accesses are involved. See the C++ standard
    `[atomics.order] <https://wg21.link/atomics.order>`_ section
    for more details on the exact guarantees.
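As a rough illustration of how these orderings are written in IR, the following sketch shows a message-passing pattern built from a ``release`` store and an ``acquire`` load, plus a ``monotonic`` counter. The globals ``@data`` and ``@flag`` and all function names are hypothetical, and the two functions are assumed to run on different threads created by some platform-specific mechanism.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      ; Plain store, published by the release store that follows it.
      store i32 42, ptr @data
      store atomic i32 1, ptr @flag release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      ; Once this acquire load reads the value written by the release
      ; store, the release store synchronizes-with it.
      %f = load atomic i32, ptr @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      ; The store of 42 happens before this load, so it reads 42.
      %v = load i32, ptr @data
      ret i32 %v
    }

    ; A monotonic read-modify-write is enough for a simple event counter:
    ; increments are totally ordered per address, but no other memory is
    ; published by them.
    define i32 @count_event(ptr %counter) {
      %old = atomicrmw add ptr %counter, i32 1 monotonic
      ret i32 %old
    }

If the ``release``/``acquire`` pair were weakened to ``monotonic``, no *synchronizes-with* edge would be formed, and the final load of ``@data`` would race with the producer's store.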
3593 If an atomic operation is marked ``syncscope("singlethread")``, it only
3594 *synchronizes with* and only participates in the seq\_cst total orderings of
3595 other operations running in the same thread (for example, in signal handlers).
3597 If an atomic operation is marked ``syncscope("<target-scope>")``, where
3598 ``<target-scope>`` is a target specific synchronization scope, then it is target
3599 dependent if it *synchronizes with* and participates in the seq\_cst total
3600 orderings of other operations.
3602 Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3603 or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3604 seq\_cst total orderings of other operations that are not marked
3605 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
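A minimal sketch of the ``singlethread`` scope, assuming a thread that publishes data to a signal handler running in the same thread. The function names and the two-pointer interface are hypothetical.

.. code-block:: llvm

    ; Runs in normal thread context.
    define void @publish_to_handler(ptr %data, ptr %flag) {
      store i32 42, ptr %data
      ; Orders the two stores only with respect to code running in this
      ; same thread, such as a signal handler.
      fence syncscope("singlethread") release
      store atomic i32 1, ptr %flag syncscope("singlethread") monotonic, align 4
      ret void
    }

    ; Runs in a signal handler on the same thread.
    define i32 @handler_read(ptr %data, ptr %flag) {
    entry:
      %f = load atomic i32, ptr %flag syncscope("singlethread") monotonic, align 4
      fence syncscope("singlethread") acquire
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %read, label %none

    read:
      %v = load i32, ptr %data
      ret i32 %v

    none:
      ret i32 0
    }

These operations provide no ordering with respect to other threads; cross-thread communication needs operations without the ``singlethread`` scope.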
3609 Floating-Point Environment
3610 --------------------------
3612 The default LLVM floating-point environment assumes that traps are disabled and
3613 status flags are not observable. Therefore, floating-point math operations do
3614 not have side effects and may be speculated freely. Results assume the
3615 round-to-nearest rounding mode, and subnormals are assumed to be preserved.
3617 Running LLVM code in an environment where these assumptions are not met
3618 typically leads to undefined behavior. The ``strictfp`` and ``denormal-fp-math``
3619 attributes as well as :ref:`Constrained Floating-Point Intrinsics
3620 <constrainedfp>` can be used to weaken LLVM's assumptions and ensure defined
3621 behavior in non-default floating-point environments; see their respective
3622 documentation for details.
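For code that must run with a non-default rounding mode or observable exceptions, the addition below sketches the constrained form. The function name ``@add_dynamic_rounding`` is hypothetical; the intrinsic and its metadata arguments are described in the constrained intrinsics section.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    ; The strictfp attribute marks the function (and the call) as running
    ; in a possibly non-default floating-point environment.
    define double @add_dynamic_rounding(double %a, double %b) strictfp {
      %r = call double @llvm.experimental.constrained.fadd.f64(
               double %a, double %b,
               metadata !"round.dynamic",
               metadata !"fpexcept.strict") strictfp
      ret double %r
    }

A plain ``fadd double %a, %b`` in the same position would instead assume round-to-nearest and unobservable status flags.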
3626 Behavior of Floating-Point NaN values
3627 -------------------------------------
3629 A floating-point NaN value consists of a sign bit, a quiet/signaling bit, and a
3630 payload (which makes up the rest of the mantissa except for the quiet/signaling
3631 bit). LLVM assumes that the quiet/signaling bit being set to ``1`` indicates a
3632 quiet NaN (QNaN), and a value of ``0`` indicates a signaling NaN (SNaN). In the
3633 following we will hence just call it the "quiet bit".
3635 The representation bits of a floating-point value do not mutate arbitrarily; in
3636 particular, if there is no floating-point operation being performed, NaN signs,
3637 quiet bits, and payloads are preserved.
3639 For the purpose of this section, ``bitcast`` as well as the following operations
3640 are not "floating-point math operations": ``fneg``, ``llvm.fabs``, and
3641 ``llvm.copysign``. These operations act directly on the underlying bit
3642 representation and never change anything except possibly for the sign bit.
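The following sketch illustrates the point for ``fneg`` applied to a signaling NaN. The function name is hypothetical, and the integer constant is simply the ``float`` bit pattern ``0x7FA00000`` written in decimal.

.. code-block:: llvm

    define i32 @negate_snan_bits() {
      ; 2141192192 is 0x7FA00000: sign bit 0, exponent all ones,
      ; quiet bit 0, non-zero payload, i.e. a signaling NaN.
      %snan = bitcast i32 2141192192 to float
      %neg = fneg float %snan
      ; Only the sign bit changes, so the result is 0xFFA00000.
      %bits = bitcast float %neg to i32
      ret i32 %bits
    }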
3644 Floating-point math operations that return a NaN are an exception from the
3645 general principle that LLVM implements IEEE-754 semantics. Unless specified
3646 otherwise, the following rules apply whenever the IEEE-754 semantics say that a
3647 NaN value is returned: the result has a non-deterministic sign; the quiet bit
3648 and payload are non-deterministically chosen from the following set of options:
-  The quiet bit is set and the payload is all-zero. ("Preferred NaN" case)
-  The quiet bit is set and the payload is copied from any input operand that is
   a NaN. ("Quieting NaN propagation" case)
-  The quiet bit and payload are copied from any input operand that is a NaN.
   ("Unchanged NaN propagation" case)
-  The quiet bit is set and the payload is picked from a target-specific set of
   "extra" possible NaN payloads. The set can depend on the input operand values.
   This set is empty on x86 and ARM, but can be non-empty on other architectures.
   (For instance, on wasm, if any input NaN does not have the preferred all-zero
   payload or any input NaN is an SNaN, then this set contains all possible
   payloads; otherwise, it is empty. On SPARC, this set consists of the all-one
   payload.)
3663 In particular, if all input NaNs are quiet (or if there are no input NaNs), then
3664 the output NaN is definitely quiet. Signaling NaN outputs can only occur if they
3665 are provided as an input value. For example, "fmul SNaN, 1.0" may be simplified
3666 to SNaN rather than QNaN. Similarly, if all input NaNs are preferred (or if
3667 there are no input NaNs) and the target does not have any "extra" NaN payloads,
3668 then the output NaN is guaranteed to be preferred.
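As a worked example of these rules, consider the hypothetical function below called with a ``float`` signaling NaN whose bit pattern is ``0x7FA00000``, on a target whose set of "extra" payloads is empty.

.. code-block:: llvm

    define float @add_one(float %x) {
      ; With %x = 0x7FA00000 (SNaN, payload bit 21 set), the result may
      ; be any of the following, each with either sign:
      ;   0x7FC00000  preferred NaN (quiet bit set, zero payload)
      ;   0x7FE00000  quieting NaN propagation (quiet bit set, payload kept)
      ;   0x7FA00000  unchanged NaN propagation (still signaling)
      %r = fadd float %x, 1.0
      ret float %r
    }

Code must therefore not rely on which of these bit patterns is produced, nor on the choice being the same across runs or optimization levels.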
3670 Floating-point math operations are allowed to treat all NaNs as if they were
3671 quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.
3673 Code that requires different behavior than this should use the
3674 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
3675 In particular, constrained intrinsics rule out the "Unchanged NaN propagation"
3676 case; they are guaranteed to return a QNaN.
3678 Unfortunately, due to hard-or-impossible-to-fix issues, LLVM violates its own
3679 specification on some architectures:
3681 - x86-32 without SSE2 enabled may convert floating-point values to x86_fp80 and
3682 back when performing floating-point math operations; this can lead to results
3683 with different precision than expected and it can alter NaN values. Since
3684 optimizations can make contradicting assumptions, this can lead to arbitrary
3685 miscompilations. See `issue #44218
3686 <https://github.com/llvm/llvm-project/issues/44218>`_.
3687 - x86-32 (even with SSE2 enabled) may implicitly perform such a conversion on
3688 values returned from a function for some calling conventions. See `issue
3689 #66803 <https://github.com/llvm/llvm-project/issues/66803>`_.
3690 - Older MIPS versions use the opposite polarity for the quiet/signaling bit, and
3691 LLVM does not correctly represent this. See `issue #60796
3692 <https://github.com/llvm/llvm-project/issues/60796>`_.
3696 Floating-Point Semantics
3697 ------------------------
This section defines the semantics for core floating-point operations on types
that use a format specified by IEEE-754. These types are: ``half``, ``float``,
``double``, and ``fp128``, which correspond to the binary16, binary32, binary64,
and binary128 formats, respectively. The "core" operations are those defined in
section 5 of IEEE-754, which all have corresponding LLVM operations.
3705 The value returned by those operations matches that of the corresponding
3706 IEEE-754 operation executed in the :ref:`default LLVM floating-point environment
3707 <floatenv>`, except that the behavior of NaN results is instead :ref:`as
3708 specified here <floatnan>`. In particular, such a floating-point instruction
3709 returning a non-NaN value is guaranteed to always return the same bit-identical
3710 result on all machines and optimization levels.
This means that optimizations and backends may not change the observed bitwise
result of these operations in any way (unless NaNs are returned), and frontends
can rely on these operations providing correctly rounded results as described in
the standard.

(Note that this is only about the value returned by these operations; see the
:ref:`floating-point environment section <floatenv>` regarding flags and
exceptions.)
3721 Various flags, attributes, and metadata can alter the behavior of these
3722 operations and thus make them not bit-identical across machines and optimization
3723 levels any more: most notably, the :ref:`fast-math flags <fastmath>` as well as
3724 the :ref:`strictfp <strictfp>` and :ref:`denormal-fp-math <denormal_fp_math>`
3725 attributes and :ref:`!fpmath metadata <fpmath-metadata>`. See their
3726 corresponding documentation for details.
3733 LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3734 :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3735 :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), and :ref:`phi <i_phi>`,
3736 :ref:`select <i_select>`, or :ref:`call <i_call>` instructions that return
3737 floating-point types may use the following flags to enable otherwise unsafe
3738 floating-point transformations.
``fast``
   This flag is a shorthand for specifying all fast-math flags at once, and
   imparts no additional semantics from using all of them.

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
   a :ref:`poison value <poisonvalues>` instead.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
   produces a :ref:`poison value <poisonvalues>` instead.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or zero result as insignificant. This does not imply that -0.0
   is poison and/or guaranteed to not exist in the operation.
Note: For :ref:`phi <i_phi>`, :ref:`select <i_select>`, and :ref:`call <i_call>`
instructions, the following return types are considered to be floating-point
types:

.. _fastmath_return_types:

-  Floating-point scalar or vector types
-  Array types (nested to any depth) of floating-point scalar or vector types
-  Homogeneous literal struct types of floating-point scalar or vector types
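A brief sketch of where these flags may appear. The function names are hypothetical; ``llvm.sqrt.f32`` is used only as an example of a call returning a floating-point type.

.. code-block:: llvm

    declare float @llvm.sqrt.f32(float)

    ; If %x or %y is a NaN or an infinity, or the sum would be, %sum is
    ; poison rather than the IEEE-754 result.
    define float @sum_finite(float %x, float %y) {
      %sum = fadd nnan ninf float %x, %y
      ret float %sum
    }

    ; Fast-math flags are also accepted on calls and selects that return
    ; a floating-point type.
    define float @root_or_zero(float %x, i1 %use_root) {
      %root = call nnan afn float @llvm.sqrt.f32(float %x)
      %r = select nsz i1 %use_root, float %root, float 0.0
      ret float %r
    }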
The following flags have rewrite-based semantics. These flags allow expressions,
potentially containing multiple non-consecutive instructions, to be rewritten
into alternative instructions. When multiple instructions are involved in an
expression, it is necessary that all of the instructions have the necessary
rewrite-based flag present on them, and the rewritten instructions will
generally have the intersection of the flags present on the input instructions.
In the following example, the floating-point expression in the body of ``@orig``
has ``contract`` and ``reassoc`` in common, and thus if it is rewritten into the
expression in the body of ``@target``, all of the new instructions get those two
flags and only those flags as a result. Since ``arcp`` is present on only
one of the instructions in the expression, it is not present in the transformed
expression. Furthermore, this reassociation is only legal because both
instructions have the ``reassoc`` flag; if only one had it, the transformation
would not be legal.

.. code-block:: llvm

    define double @orig(double %a, double %b, double %c) {
      %t1 = fmul contract reassoc double %a, %b
      %val = fmul contract reassoc arcp double %t1, %c
      ret double %val
    }

    define double @target(double %a, double %b, double %c) {
      %t1 = fmul contract reassoc double %b, %c
      %val = fmul contract reassoc double %a, %t1
      ret double %val
    }
3802 These rules do not apply to the other fast-math flags. Whether or not a flag
3803 like ``nnan`` is present on any or all of the rewritten instructions is based
3804 on whether or not it is possible for said instruction to have a NaN input or
3805 output, given the original flags.
``arcp``
   Allows division to be treated as a multiplication by a reciprocal.
   Specifically, this permits ``a / b`` to be considered equivalent to
   ``a * (1.0 / b)`` (which may subsequently be susceptible to code motion),
   and it also permits ``a / (b / c)`` to be considered equivalent to
   ``a * (c / b)``. Both of these rewrites can be applied in either direction:
   ``a * (c / b)`` can be rewritten into ``a / (b / c)``.

``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add). This does not enable reassociation
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
   be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.

``afn``
   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
   for places where this can apply to LLVM's intrinsic math functions.

``reassoc``
   Allow reassociation transformations for floating-point instructions.
   This may dramatically change floating-point results.
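The ``arcp`` and ``contract`` rewrites can be sketched as before/after pairs. All function names here are hypothetical, and whether an optimizer actually performs either rewrite depends on the pass pipeline and target.

.. code-block:: llvm

    declare double @llvm.fma.f64(double, double, double)

    ; arcp: a / b may be treated as a * (1.0 / b).
    define float @div_orig(float %a, float %b) {
      %q = fdiv arcp float %a, %b
      ret float %q
    }

    define float @div_rewritten(float %a, float %b) {
      %recip = fdiv arcp float 1.0, %b
      %q = fmul arcp float %a, %recip
      ret float %q
    }

    ; contract: a multiply feeding an add may be fused. Both
    ; instructions must carry the flag.
    define double @muladd_orig(double %a, double %b, double %c) {
      %m = fmul contract double %a, %b
      %s = fadd contract double %m, %c
      ret double %s
    }

    define double @muladd_fused(double %a, double %b, double %c) {
      %s = call contract double @llvm.fma.f64(double %a, double %b, double %c)
      ret double %s
    }

The fused form rounds once instead of twice, so its result can differ from the unfused one in the last bit; this is exactly the difference the ``contract`` flag permits.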
3834 Use-list Order Directives
3835 -------------------------
3837 Use-list directives encode the in-memory order of each use-list, allowing the
3838 order to be recreated. ``<order-indexes>`` is a comma-separated list of
3839 indexes that are assigned to the referenced value's uses. The referenced
3840 value's use-list is immediately sorted by these indexes.
3842 Use-list directives may appear at function scope or global scope. They are not
3843 instructions, and have no effect on the semantics of the IR. When they're at
3844 function scope, they must appear after the terminator of the final basic block.
If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder ptr @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3878 .. _source_filename:
The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.
3888 This is currently necessary to generate a consistent unique global
3889 identifier for local functions used in profile data, which prepends the
3890 source file name to the local function name.
3892 The syntax for the source file name is simply:
3894 .. code-block:: text
3896 source_filename = "/path/to/source.c"
3903 The LLVM type system is one of the most important features of the
3904 intermediate representation. Being typed enables a number of
3905 optimizations to be performed on the intermediate representation
3906 directly, without having to do extra analyses on the side before the
3907 transformation. A strong type system makes it easier to read the
3908 generated code and enables novel analyses and transformations that are
3909 not feasible to perform on normal three address code representations.
3919 The void type does not represent any value and has no size.
3937 The function type can be thought of as a function signature. It consists of a
3938 return type and a list of formal parameter types. The return type of a function
3939 type is a void type or first class type --- except for :ref:`label <t_label>`
3940 and :ref:`metadata <t_metadata>` types.
3946 <returntype> (<parameter list>)
3948 ...where '``<parameter list>``' is a comma-separated list of type
3949 specifiers. Optionally, the parameter list may include a type ``...``, which
3950 indicates that the function takes a variable number of arguments. Variable
3951 argument functions can access their arguments with the :ref:`variable argument
3952 handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
3953 except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
3957 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3958 | ``i32 (i32)`` | function taking an ``i32``, returning an ``i32`` |
3959 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3960 | ``i32 (ptr, ...)`` | A vararg function that takes at least one :ref:`pointer <t_pointer>` argument and returns an integer. This is the signature for ``printf`` in LLVM. |
3961 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3962 | ``{i32, i32} (i32)`` | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values |
3963 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
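The function types in the table can be seen in use in the sketch below. ``@fmt``, ``@print_int`` and ``@pair`` are hypothetical names; ``@printf`` is assumed to be provided by the C library.

.. code-block:: llvm

    @fmt = private unnamed_addr constant [4 x i8] c"%d\0A\00"

    ; Function type: i32 (ptr, ...)
    declare i32 @printf(ptr, ...)

    ; Function type: i32 (i32)
    define i32 @print_int(i32 %x) {
      ; A call through a vararg function type spells out the full type.
      %n = call i32 (ptr, ...) @printf(ptr @fmt, i32 %x)
      ret i32 %n
    }

    ; Function type: {i32, i32} (i32)
    define {i32, i32} @pair(i32 %x) {
      %p0 = insertvalue {i32, i32} poison, i32 %x, 0
      %p1 = insertvalue {i32, i32} %p0, i32 %x, 1
      ret {i32, i32} %p1
    }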
The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.
3979 These are the types that are valid in registers from CodeGen's perspective.
3988 The integer type is a very simple type that simply specifies an
3989 arbitrary bit width for the integer type desired. Any bit width from 1
3990 bit to 2\ :sup:`23`\ (about 8 million) can be specified.
The number of bits the integer will occupy is specified by the ``N``
value.
4004 +----------------+------------------------------------------------+
4005 | ``i1`` | a single-bit integer. |
4006 +----------------+------------------------------------------------+
4007 | ``i32`` | a 32-bit integer. |
4008 +----------------+------------------------------------------------+
4009 | ``i1942652`` | a really big integer of over 1 million bits. |
4010 +----------------+------------------------------------------------+
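Integer widths need not be powers of two or multiples of eight. A small sketch with a hypothetical function operating on a 7-bit integer:

.. code-block:: llvm

    define i1 @is_odd(i7 %x) {
      ; Truncation keeps the low bit of the 7-bit value.
      %low = trunc i7 %x to i1
      ret i1 %low
    }

    define i32 @widen(i7 %x) {
      ; Zero-extension fills the upper 25 bits with zeros.
      %w = zext i7 %x to i32
      ret i32 %w
    }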
4014 Floating-Point Types
4015 """"""""""""""""""""
.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating-point value (IEEE-754 binary16)

   * - ``bfloat``
     - 16-bit "brain" floating-point value (7-bit significand). Provides the
       same number of exponent bits as ``float``, so that it matches its dynamic
       range, but with greatly reduced precision. Used in Intel's AVX-512 BF16
       extensions and Arm's ARMv8.6-A extensions, among others.

   * - ``float``
     - 32-bit floating-point value (IEEE-754 binary32)

   * - ``double``
     - 64-bit floating-point value (IEEE-754 binary64)

   * - ``fp128``
     - 128-bit floating-point value (IEEE-754 binary128)

   * - ``x86_fp80``
     - 80-bit floating-point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bits)
The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few intrinsics
are allowed: stride load and store, zero, and dot product. No instructions are
allowed for this type. There are no arguments, arrays, pointers, vectors
or constants of this type.
4073 The pointer type ``ptr`` is used to specify memory locations. Pointers are
4074 commonly used to reference objects in memory.
Pointer types may have an optional address space attribute defining
the numbered address space where the pointed-to object resides. For
example, ``ptr addrspace(5)`` is a pointer to address space 5.
In addition to integer constants, ``addrspace`` can also reference one of the
address spaces defined in the :ref:`datalayout string<langref_datalayout>`.
``addrspace("A")`` will use the alloca address space, ``addrspace("G")``
the default globals address space and ``addrspace("P")`` the program address
space.
4085 The default address space is number zero.
4087 The semantics of non-zero address spaces are target-specific. Memory
4088 access through a non-dereferenceable pointer is undefined behavior in
4089 any address space. Pointers with the bit-value 0 are only assumed to
4090 be non-dereferenceable in address space 0, unless the function is
4091 marked with the ``null_pointer_is_valid`` attribute.
4093 If an object can be proven accessible through a pointer with a
4094 different address space, the access may be modified to use that
4095 address space. Exceptions apply if the operation is ``volatile``.
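A short sketch of pointers in a non-default address space. The function names are hypothetical, and the meaning of address space 5, including whether the cast below is meaningful, is target-specific.

.. code-block:: llvm

    ; Loads through a pointer that lives in address space 5.
    define i32 @load_from_as5(ptr addrspace(5) %p) {
      %v = load i32, ptr addrspace(5) %p
      ret i32 %v
    }

    ; Converts the pointer to the default address space (number zero).
    define ptr @to_default(ptr addrspace(5) %p) {
      %q = addrspacecast ptr addrspace(5) %p to ptr
      ret ptr %q
    }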
4097 Prior to LLVM 15, pointer types also specified a pointee type, such as
4098 ``i8*``, ``[4 x i32]*`` or ``i32 (i32*)*``. In LLVM 15, such "typed
4099 pointers" are still supported under non-default options. See the
4100 `opaque pointers document <OpaquePointers.html>`__ for more information.
4104 Target Extension Type
4105 """""""""""""""""""""
4109 Target extension types represent types that must be preserved through
4110 optimization, but are otherwise generally opaque to the compiler. They may be
4111 used as function parameters or arguments, and in :ref:`phi <i_phi>` or
4112 :ref:`select <i_select>` instructions. Some types may be also used in
4113 :ref:`alloca <i_alloca>` instructions or as global values, and correspondingly
4114 it is legal to use :ref:`load <i_load>` and :ref:`store <i_store>` instructions
4115 on them. Full semantics for these types are defined by the target.
4117 The only constants that target extension types may have are ``zeroinitializer``,
4118 ``undef``, and ``poison``. Other possible values for target extension types may
4119 arise from target-specific intrinsics and functions.
4121 These types cannot be converted to other types. As such, it is not legal to use
4122 them in :ref:`bitcast <i_bitcast>` instructions (as a source or target type),
4123 nor is it legal to use them in :ref:`ptrtoint <i_ptrtoint>` or
4124 :ref:`inttoptr <i_inttoptr>` instructions. Similarly, they are not legal to use
4125 in an :ref:`icmp <i_icmp>` instruction.
4127 Target extension types have a name and optional type or integer parameters. The
4128 meanings of name and parameters are defined by the target. When being defined in
4129 LLVM IR, all of the type parameters must precede all of the integer parameters.
4131 Specific target extension types are registered with LLVM as having specific
4132 properties. These properties can be used to restrict the type from appearing in
4133 certain contexts, such as being the type of a global variable or having a
4134 ``zeroinitializer`` constant be valid. A complete list of type properties may be
4135 found in the documentation for ``llvm::TargetExtType::Property`` (`doxygen
4136 <https://llvm.org/doxygen/classllvm_1_1TargetExtType.html>`_).
4140 .. code-block:: llvm
4143 target("label", void)
4144 target("label", void, i32)
4145 target("label", 0, 1, 2)
4146 target("label", void, i32, 0, 1, 2)
4156 A vector type is a simple derived type that represents a vector of
4157 elements. Vector types are used when multiple primitive data are
4158 operated in parallel using a single instruction (SIMD). A vector type
4159 requires a size (number of elements), an underlying primitive data type,
4160 and a scalable property to represent vectors where the exact hardware
4161 vector length is unknown at compile time. Vector types are considered
4162 :ref:`first class <t_firstclass>`.
In general vector elements are laid out in memory in the same way as
:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
elements are byte sized. However, when the elements of the vector aren't byte
sized it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcasted to an
integer type with N*M bits, and then following the rules for storing such an
integer to memory.
4174 A bitcast from a vector type to a scalar integer type will see the elements
4175 being packed together (without padding). The order in which elements are
4176 inserted in the integer depends on endianness. For little endian element zero
4177 is put in the least significant bits of the integer, and for big endian
4178 element zero is put in the most significant bits.
4180 Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
4181 with the analogy that we can replace a vector store by a bitcast followed by
4182 an integer store, we get this for big endian:
4184 .. code-block:: llvm
4186 %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
4188 ; Bitcasting from a vector to an integral type can be seen as
4189 ; concatenating the values:
4190 ; %val now has the hexadecimal value 0x1235.
4192 store i16 %val, ptr %ptr
4194 ; In memory the content will be (8-bit addressing):
4196 ; [%ptr + 0]: 00010010 (0x12)
4197 ; [%ptr + 1]: 00110101 (0x35)
4199 The same example for little endian:
4201 .. code-block:: llvm
4203 %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
4205 ; Bitcasting from a vector to an integral type can be seen as
4206 ; concatenating the values:
4207 ; %val now has the hexadecimal value 0x5321.
4209 store i16 %val, ptr %ptr
4211 ; In memory the content will be (8-bit addressing):
4213 ; [%ptr + 0]: 00100001 (0x21)
4214 ; [%ptr + 1]: 01010011 (0x53)
4216 When ``<N*M>`` isn't evenly divisible by the byte size the exact memory layout
4217 is unspecified (just like it is for an integral type of the same size). This
4218 is because different targets could put the padding at different positions when
4219 the type size is smaller than the type's store size.
4225 < <# elements> x <elementtype> > ; Fixed-length vector
4226 < vscale x <# elements> x <elementtype> > ; Scalable vector
4228 The number of elements is a constant integer value larger than 0;
4229 elementtype may be any integer, floating-point or pointer type. Vectors
4230 of size zero are not allowed. For scalable vectors, the total number of
4231 elements is a constant multiple (called vscale) of the specified number
4232 of elements; vscale is a positive integer that is unknown at compile time
4233 and the same hardware-dependent constant for all scalable vectors at run
4234 time. The size of a specific scalable vector type is thus constant within
4235 IR, even if the exact size in bytes cannot be determined until run time.
+------------------------+----------------------------------------------------+
| ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
+------------------------+----------------------------------------------------+
| ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
+------------------------+----------------------------------------------------+
| ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
+------------------------+----------------------------------------------------+
| ``<4 x ptr>``          | Vector of 4 pointers                               |
+------------------------+----------------------------------------------------+
| ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
+------------------------+----------------------------------------------------+
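
As a minimal sketch of how these types are used (the function names
``@add_fixed`` and ``@add_scalable`` are illustrative and not part of this
reference), the same element-wise ``add`` can be written for a fixed-length
and for a scalable vector:

.. code-block:: llvm

    ; Element-wise addition of two fixed-length vectors of four i32 values.
    define <4 x i32> @add_fixed(<4 x i32> %a, <4 x i32> %b) {
      %sum = add <4 x i32> %a, %b
      ret <4 x i32> %sum
    }

    ; The same operation on scalable vectors. Each operand holds
    ; vscale * 4 elements, where vscale is only known at run time.
    define <vscale x 4 x i32> @add_scalable(<vscale x 4 x i32> %a,
                                            <vscale x 4 x i32> %b) {
      %sum = add <vscale x 4 x i32> %a, %b
      ret <vscale x 4 x i32> %sum
    }

The IR for the two functions is identical apart from the type, which is what
lets the size of a scalable vector be treated as constant within IR.
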
4258 The label type represents code labels.
4273 The token type is used when a value is associated with an instruction
4274 but all uses of the value must not attempt to introspect or obscure it.
4275 As such, it is not appropriate to have a :ref:`phi <i_phi>` or
4276 :ref:`select <i_select>` of type token.
4293 The metadata type represents embedded metadata. No derived types may be
4294 created from metadata except for :ref:`function <t_function>` arguments.
4307 Aggregate Types are a subset of derived types that can contain multiple
4308 member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
4309 aggregate types. :ref:`Vectors <t_vector>` are not considered to be
4319 The array type is a very simple derived type that arranges elements
4320 sequentially in memory. The array type requires a size (number of
4321 elements) and an underlying data type.
4327 [<# elements> x <elementtype>]
4329 The number of elements is a constant integer value; ``elementtype`` may
4330 be any type with a size.
+------------------+--------------------------------------+
| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
+------------------+--------------------------------------+
| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
+------------------+--------------------------------------+
4342 Here are some examples of multidimensional arrays:
+-----------------------------+----------------------------------------------------------+
| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
+-----------------------------+----------------------------------------------------------+
| ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
+-----------------------------+----------------------------------------------------------+
| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
+-----------------------------+----------------------------------------------------------+
4352 There is no restriction on indexing beyond the end of the array implied
4353 by a static type (though there are restrictions on indexing beyond the
4354 bounds of an allocated object in some cases). This means that
4355 single-dimension 'variable sized array' addressing can be implemented in
4356 LLVM with a zero length array type. An implementation of 'pascal style
4357 arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
4367 The structure type is used to represent a collection of data members
4368 together in memory. The elements of a structure may be any type that has
4371 Structures in memory are accessed using '``load``' and '``store``' by
4372 getting a pointer to a field with the '``getelementptr``' instruction.
4373 Structures in registers are accessed using the '``extractvalue``' and
4374 '``insertvalue``' instructions.
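
The two access styles can be sketched as follows. The type name
``%struct.Pair`` and the function names are hypothetical, chosen only for
illustration:

.. code-block:: llvm

    %struct.Pair = type { i32, float }

    ; Structure in memory: compute the address of field 1 with
    ; getelementptr, then load through that pointer.
    define float @second_from_memory(ptr %p) {
      %field = getelementptr inbounds %struct.Pair, ptr %p, i32 0, i32 1
      %v = load float, ptr %field
      ret float %v
    }

    ; Structure in a register: read field 0 with extractvalue.
    define i32 @first_from_value(%struct.Pair %agg) {
      %v = extractvalue %struct.Pair %agg, 0
      ret i32 %v
    }

    ; Structure in a register: produce a copy with field 0 replaced.
    define %struct.Pair @with_first(%struct.Pair %agg, i32 %x) {
      %new = insertvalue %struct.Pair %agg, i32 %x, 0
      ret %struct.Pair %new
    }
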
4376 Structures may optionally be "packed" structures, which indicate that
4377 the alignment of the struct is one byte, and that there is no padding
4378 between the elements. In non-packed structs, padding between field types
4379 is inserted as defined by the DataLayout string in the module, which is
4380 required to match what the underlying code generator expects.
4382 Structures can either be "literal" or "identified". A literal structure
4383 is defined inline with other types (e.g. ``[2 x {i32, i32}]``) whereas
4384 identified types are always defined at the top level with a name.
4385 Literal types are uniqued by their contents and can never be recursive
4386 or opaque since there is no way to write one. Identified types can be
4387 opaqued and are never uniqued. Identified types must not be recursive.
4393 %T1 = type { <type list> } ; Identified normal struct type
4394 %T2 = type <{ <type list> }> ; Identified packed struct type
+------------------------+--------------------------------------------------------------------------------------------------------+
| ``{ i32, i32, i32 }``  | A triple of three ``i32`` values (this is a "homogeneous" struct as all element types are the same)    |
+------------------------+--------------------------------------------------------------------------------------------------------+
| ``{ float, ptr }``     | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>`. |
+------------------------+--------------------------------------------------------------------------------------------------------+
| ``<{ i8, i32 }>``      | A packed struct known to be 5 bytes in size.                                                           |
+------------------------+--------------------------------------------------------------------------------------------------------+
4408 Opaque Structure Types
4409 """"""""""""""""""""""
4413 Opaque structure types are used to represent structure types that
4414 do not have a body specified. This corresponds (for example) to the C
4415 notion of a forward declared structure. They can be named (``%X``) or
+--------------+-------------------+
| ``opaque``   | An opaque type.   |
+--------------+-------------------+
4436 LLVM has several different basic types of constants. This section
4437 describes them all and their syntax.
4442 **Boolean constants**
4443 The two strings '``true``' and '``false``' are both valid constants
4445 **Integer constants**
4446 Standard integers (such as '4') are constants of the :ref:`integer
4447 <t_integer>` type. They can be either decimal or
hexadecimal. Decimal integers can be prefixed with ``-`` to represent
negative integers, e.g. '``-1234``'. Hexadecimal integers must be
prefixed with either ``u`` or ``s`` to indicate whether they are unsigned
or signed respectively; for example, '``u0x8000``' gives 32768, whilst
'``s0x8000``' gives -32768.
4454 Note that hexadecimal integers are sign extended from the number
4455 of active bits, i.e. the bit width minus the number of leading
4456 zeros. So '``s0x0001``' of type '``i16``' will be -1, not 1.
4457 **Floating-point constants**
4458 Floating-point constants use standard decimal notation (e.g.
4459 123.421), exponential notation (e.g. 1.23421e+2), or a more precise
4460 hexadecimal notation (see below). The assembler requires the exact
4461 decimal value of a floating-point constant. For example, the
4462 assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
4463 decimal in binary. Floating-point constants must have a
4464 :ref:`floating-point <t_floating>` type.
4465 **Null pointer constants**
4466 The identifier '``null``' is recognized as a null pointer constant
4467 and must be of :ref:`pointer type <t_pointer>`.
4469 The identifier '``none``' is recognized as an empty token constant
4470 and must be of :ref:`token type <t_token>`.
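
As a small sketch, the simple constant forms described above could appear as
global initializers like this. The global names are illustrative only, and
the values noted in the comments for the hexadecimal integers follow the
rules stated above:

.. code-block:: llvm

    @flag     = global i1 true          ; Boolean constant
    @negative = global i32 -1234        ; Decimal integer with a sign
    @unsigned = global i32 u0x8000      ; Unsigned hexadecimal: 32768
    @signed   = global i32 s0x8000      ; Signed hexadecimal: -32768
    @exact    = global float 1.25       ; Exactly representable in binary
    @nothing  = global ptr null         ; Null pointer constant
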
4472 The one non-intuitive notation for constants is the hexadecimal form of
4473 floating-point constants. For example, the form
4474 '``double 0x432ff973cafa8000``' is equivalent to (but harder to read
4475 than) '``double 4.5e+15``'. The only time hexadecimal floating-point
4476 constants are required (and the only time that they are generated by the
4477 disassembler) is when a floating-point constant must be emitted but it
cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.
4483 When using the hexadecimal form, constants of types bfloat, half, float, and
4484 double are represented using the 16-digit form shown above (which matches the
4485 IEEE754 representation for double); bfloat, half and float values must, however,
4486 be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
4487 precision respectively. Hexadecimal format is always used for long double, and
4488 there are three forms of long double. The 80-bit format used by x86 is
4489 represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
4490 used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
4491 hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
4492 by 32 hexadecimal digits. Long doubles will only work if they match the long
4493 double format on your target. The IEEE 16-bit format (half precision) is
4494 represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
4495 format is represented by ``0xR`` followed by 4 hexadecimal digits. All
4496 hexadecimal formats are big-endian (sign bit at the left).
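
The following sketch shows one constant in each of several hexadecimal
forms. The global names are illustrative; the bit patterns are the standard
encodings of infinity and of 1.0 in the respective formats:

.. code-block:: llvm

    ; 16-digit form, shared by double, float, half and bfloat.
    @big    = global double 0x432ff973cafa8000   ; Same as double 4.5e+15
    @d_inf  = global double 0x7FF0000000000000   ; +infinity
    @f_inf  = global float  0x7FF0000000000000   ; +infinity as a float

    ; Type-specific prefixes.
    @h_one  = global half     0xH3C00                   ; 1.0, 4 digits
    @bf_one = global bfloat   0xR3F80                   ; 1.0, 4 digits
    @x_one  = global x86_fp80 0xK3FFF8000000000000000   ; 1.0, 20 digits

Note that the ``float`` infinity is still written with 16 digits, because the
unprefixed form always uses the ``double`` layout.
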
4498 There are no constants of type x86_amx.
4500 .. _complexconstants:
4505 Complex constants are a (potentially recursive) combination of simple
4506 constants and smaller complex constants.
4508 **Structure constants**
4509 Structure constants are represented with notation similar to
4510 structure type definitions (a comma separated list of elements,
4511 surrounded by braces (``{}``)). For example:
4512 "``{ i32 4, float 17.0, ptr @G }``", where "``@G``" is declared as
4513 "``@G = external global i32``". Structure constants must have
4514 :ref:`structure type <t_struct>`, and the number and types of elements
4515 must match those specified by the type.
4517 Array constants are represented with notation similar to array type
4518 definitions (a comma separated list of elements, surrounded by
4519 square brackets (``[]``)). For example:
4520 "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
4521 :ref:`array type <t_array>`, and the number and types of elements must
4522 match those specified by the type. As a special case, character array
4523 constants may also be represented as a double-quoted string using the ``c``
4524 prefix. For example: "``c"Hello World\0A\00"``".
4525 **Vector constants**
Vector constants are represented with notation similar to vector
type definitions (a comma separated list of elements, surrounded by
angle brackets (``<>``)). For example:
"``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
4530 must have :ref:`vector type <t_vector>`, and the number and types of
4531 elements must match those specified by the type.
4533 When creating a vector whose elements have the same constant value, the
4534 preferred syntax is ``splat (<Ty> Val)``. For example: "``splat (i32 11)``".
4535 These vector constants must have :ref:`vector type <t_vector>` with an
4536 element type that matches the ``splat`` operand.
4537 **Zero initialization**
The string '``zeroinitializer``' can be used to zero initialize a
value of *any* type, including scalar and
:ref:`aggregate <t_aggregate>` types. This is often used to avoid
having to print large zero initializers (e.g. for large arrays) and
is always exactly equivalent to using explicit zero initializers.
4544 A metadata node is a constant tuple without types. For example:
4545 "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
4546 for example: "``!{!0, i32 0, ptr @global, ptr @function, !"str"}``".
4547 Unlike other typed constants that are meant to be interpreted as part of
4548 the instruction stream, metadata is a place to attach additional
4549 information such as debug info.
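
The complex constant forms above can be sketched together as global
initializers. All global names here are illustrative, and the ``splat`` form
assumes an LLVM version recent enough to accept that syntax:

.. code-block:: llvm

    @G = external global i32

    ; Structure constant: element count and types match the struct type.
    @S = global { i32, float, ptr } { i32 4, float 17.0, ptr @G }

    ; Array constant, and the equivalent string shorthand for i8 arrays.
    @A   = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @Msg = constant [13 x i8] c"Hello World\0A\00"

    ; Vector constant, and a splat of one repeated element.
    @V     = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >
    @Splat = global <4 x i32> splat (i32 11)

    ; Zero initialization of a large aggregate.
    @Zeros = global [1024 x i32] zeroinitializer

The string in ``@Msg`` has 11 visible characters plus the two escaped bytes
``\0A`` and ``\00``, hence the ``[13 x i8]`` type.
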
4551 Global Variable and Function Addresses
4552 --------------------------------------
4554 The addresses of :ref:`global variables <globalvars>` and
4555 :ref:`functions <functionstructure>` are always implicitly valid
4556 (link-time) constants. These constants are explicitly referenced when
4557 the :ref:`identifier for the global <identifiers>` is used and always have
4558 :ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
4561 .. code-block:: llvm
4565 @Z = global [2 x ptr] [ ptr @X, ptr @Y ]
4572 The string '``undef``' can be used anywhere a constant is expected, and
4573 indicates that the user of the value may receive an unspecified
4574 bit-pattern. Undefined values may be of any type (other than '``label``'
4575 or '``void``') and be used anywhere a constant is permitted.
4579 A '``poison``' value (described in the next section) should be used instead of
4580 '``undef``' whenever possible. Poison values are stronger than undef, and
4581 enable more optimizations. Just the existence of '``undef``' blocks certain
4582 optimizations (see the examples below).
4584 Undefined values are useful because they indicate to the compiler that
4585 the program is well defined no matter what value is used. This gives the
4586 compiler more freedom to optimize. Here are some examples of
4587 (potentially surprising) transformations that are valid (in pseudo IR):
4589 .. code-block:: llvm
4599 This is safe because all of the output bits are affected by the undef
4600 bits. Any output bit can have a zero or one depending on the input bits.
4602 .. code-block:: llvm
4610 %A = %X ;; By choosing undef as 0
4611 %B = %X ;; By choosing undef as -1
4616 These logical operations have bits that are not always affected by the
4617 input. For example, if ``%X`` has a zero bit, then the output of the
4618 '``and``' operation will always be a zero for that bit, no matter what
4619 the corresponding bit from the '``undef``' is. As such, it is unsafe to
4620 optimize or assume that the result of the '``and``' is '``undef``'.
4621 However, it is safe to assume that all bits of the '``undef``' could be
4622 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
4623 all the bits of the '``undef``' operand to the '``or``' could be set,
4624 allowing the '``or``' to be folded to -1.
4626 .. code-block:: llvm
4628 %A = select undef, %X, %Y
4629 %B = select undef, 42, %Y
4630 %C = select %X, %Y, undef
4634 %C = %Y (if %Y is provably not poison; unsafe otherwise)
4640 This set of examples shows that undefined '``select``' (and conditional
4641 branch) conditions can go *either way*, but they have to come from one
4642 of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
4643 both known to have a clear low bit, then ``%A`` would have to have a
4644 cleared low bit. However, in the ``%C`` example, the optimizer is
4645 allowed to assume that the '``undef``' operand could be the same as
4646 ``%Y`` if ``%Y`` is provably not '``poison``', allowing the whole '``select``'
4647 to be eliminated. This is because '``poison``' is stronger than '``undef``'.
4649 .. code-block:: llvm
4651 %A = xor undef, undef
This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (C has matching
semantics here). This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.
4680 To ensure all uses of a given register observe the same value (even if
4681 '``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
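
A minimal sketch of the difference, using hypothetical function names:

.. code-block:: llvm

    ; Each use of undef may observe a different value, so the result
    ; is not guaranteed to be zero.
    define i32 @xor_unfrozen() {
      %r = xor i32 undef, undef
      ret i32 %r
    }

    ; freeze picks one arbitrary but fixed value, so both operands of
    ; the xor are the same and the result is always zero.
    define i32 @xor_frozen() {
      %f = freeze i32 undef
      %r = xor i32 %f, %f
      ret i32 %r
    }
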
4683 .. code-block:: llvm
4691 These examples show the crucial difference between an *undefined value*
4692 and *undefined behavior*. An undefined value (like '``undef``') is
4693 allowed to have an arbitrary bit-pattern. This means that the ``%A``
4694 operation can be constant folded to '``0``', because the '``undef``'
4695 could be zero, and zero divided by any value is zero.
4696 However, in the second example, we can make a more aggressive
4697 assumption: because the ``undef`` is allowed to be an arbitrary value,
4698 we are allowed to assume that it could be zero. Since a divide by zero
4699 has *undefined behavior*, we are allowed to assume that the operation
4700 does not execute at all. This allows us to delete the divide and all
4701 code after it. Because the undefined operation "can't happen", the
4702 optimizer can assume that it occurs in dead code.
4704 .. code-block:: text
4706 a: store undef -> %X
4707 b: store %X -> undef
4709 a: <deleted> (if the stored value in %X is provably not poison)
4712 A store *of* an undefined value can be assumed to not have any effect;
4713 we can assume that the value is overwritten with bits that happen to
4714 match what was already there. This argument is only valid if the stored value
4715 is provably not ``poison``. However, a store *to* an undefined
4716 location could clobber arbitrary memory, therefore, it has undefined
Branching on an undefined value is undefined behavior.
This is what justifies optimizations that depend on branch conditions to
construct predicates, such as Correlated Value Propagation and Global Value
Numbering. In the case of a ``switch`` instruction, the condition should be
frozen; otherwise, switching on an undefined value is undefined behavior.
4725 .. code-block:: llvm
4728 br undef, BB1, BB2 ; UB
4730 %X = and i32 undef, 255
4731 switch %X, label %ret [ .. ] ; UB
store i8 undef, ptr %ptr
%X = load i8, ptr %ptr ; %X is undef
4735 switch i8 %X, label %ret [ .. ] ; UB
4738 %X = or i8 undef, 255 ; always 255
4739 switch i8 %X, label %ret [ .. ] ; Well-defined
4741 %X = freeze i1 undef
4742 br %X, BB1, BB2 ; Well-defined (non-deterministic jump)
4751 A poison value is a result of an erroneous operation.
4752 In order to facilitate speculative execution, many instructions do not
4753 invoke immediate undefined behavior when provided with illegal operands,
4754 and return a poison value instead.
4755 The string '``poison``' can be used anywhere a constant is expected, and
4756 operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
4759 Most instructions return '``poison``' when one of their arguments is
4760 '``poison``'. A notable exception is the :ref:`select instruction <i_select>`.
4761 Propagation of poison can be stopped with the
4762 :ref:`freeze instruction <i_freeze>`.
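
As a sketch of both points (the function names are illustrative), a
``select`` only propagates poison from its condition or from the operand it
actually chooses, and ``freeze`` turns a possibly-poison value into an
arbitrary but fixed one:

.. code-block:: llvm

    ; %inc is poison when %x is the maximum signed value. The select
    ; yields poison only if %safe is poison, or if %safe is true and
    ; %inc is poison.
    define i32 @guarded_increment(i1 %safe, i32 %x) {
      %inc = add nsw i32 %x, 1
      %r = select i1 %safe, i32 %inc, i32 0
      ret i32 %r
    }

    ; freeze stops propagation: %fr is never poison, although its value
    ; is unspecified when %inc is poison.
    define i32 @frozen_increment(i32 %x) {
      %inc = add nsw i32 %x, 1
      %fr = freeze i32 %inc
      ret i32 %fr
    }
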
4764 It is correct to replace a poison value with an
4765 :ref:`undef value <undefvalues>` or any value of the type.
4767 This means that immediate undefined behavior occurs if a poison value is
4768 used as an instruction operand that has any values that trigger undefined
4769 behavior. Notably this includes (but is not limited to):
4771 - The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
4772 any other pointer dereferencing instruction (independent of address
4774 - The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
4776 - The condition operand of a :ref:`br <i_br>` instruction.
4777 - The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4779 - The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4780 instruction, when the function or invoking call site has a ``noundef``
4781 attribute in the corresponding position.
- The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
  call site has a ``noundef`` attribute in the return value position.
4785 Here are some examples:
4787 .. code-block:: llvm
4790 %poison = sub nuw i32 0, 1 ; Results in a poison value.
4791 %poison2 = sub i32 poison, 1 ; Also results in a poison value.
4792 %still_poison = and i32 %poison, 0 ; 0, but also poison.
4793 %poison_yet_again = getelementptr i32, ptr @h, i32 %still_poison
4794 store i32 0, ptr %poison_yet_again ; Undefined behavior due to
4797 store i32 %poison, ptr @g ; Poison value stored to memory.
4798 %poison3 = load i32, ptr @g ; Poison value loaded back from memory.
4800 %poison4 = load i16, ptr @g ; Returns a poison value.
4801 %poison5 = load i64, ptr @g ; Returns a poison value.
4803 %cmp = icmp slt i32 %poison, 0 ; Returns a poison value.
4804 br i1 %cmp, label %end, label %end ; undefined behavior
4808 .. _welldefinedvalues:
4813 Given a program execution, a value is *well defined* if the value does not
4814 have an undef bit and is not poison in the execution.
4815 An aggregate value or vector is well defined if its elements are well defined.
4816 The padding of an aggregate isn't considered, since it isn't visible
4817 without storing it into memory and loading it with a different type.
A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
4821 The result of :ref:`freeze instruction <i_freeze>` is well defined regardless
4826 Addresses of Basic Blocks
4827 -------------------------
4829 ``blockaddress(@function, %block)``
4831 The '``blockaddress``' constant computes the address of the specified
4832 basic block in the specified function.
It always has a ``ptr addrspace(P)`` type, where ``P`` is the address space
of the function containing ``%block`` (usually ``addrspace(0)``).
4837 Taking the address of the entry block is illegal.
This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction or for comparisons against
null. Pointer equality tests between label addresses result in undefined
behavior, though, again, comparison against null is OK, and no label is equal
to the null pointer. This may be passed around as an opaque pointer sized
value as long as the bits are not inspected. This allows ``ptrtoint`` and
arithmetic to be performed on these values so long as the original value is
reconstituted before the ``indirectbr`` instruction.
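
A minimal sketch of the intended use, with a hypothetical function
``@dispatch`` that selects one of two block addresses and jumps to it:

.. code-block:: llvm

    define i32 @dispatch(i1 %c) {
    entry:
      ; Neither %a nor %b is the entry block, so taking their
      ; addresses is allowed.
      %target = select i1 %c, ptr blockaddress(@dispatch, %a),
                              ptr blockaddress(@dispatch, %b)
      ; Every possible destination must be listed.
      indirectbr ptr %target, [label %a, label %b]

    a:
      ret i32 1

    b:
      ret i32 2
    }
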
4848 Finally, some targets may provide defined semantics when using the value
4849 as the operand to an inline assembly, but that is target specific.
4851 .. _dso_local_equivalent:
4853 DSO Local Equivalent
4854 --------------------
4856 ``dso_local_equivalent @func``
4858 A '``dso_local_equivalent``' constant represents a function which is
4859 functionally equivalent to a given function, but is always defined in the
4860 current linkage unit. The resulting pointer has the same type as the underlying
4861 function. The resulting pointer is permitted, but not required, to be different
4862 from a pointer to the function, and it may have different values in different
4865 The target function may not have ``extern_weak`` linkage.
``dso_local_equivalent`` can be implemented in one of the following ways:
4869 - If the function has local linkage, hidden visibility, or is
4870 ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
- ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
  function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on whether the function is defined
  within the linkage unit; LLVM will use this when available. (This is commonly
  called a "PLT stub".) On other targets, the stub may need to be emitted
  explicitly.
4878 This can be used wherever a ``dso_local`` instance of a function is needed without
4879 needing to explicitly make the original function ``dso_local``. An instance where
4880 this can be used is for static offset calculations between a function and some other
4881 ``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
4882 where dynamic relocations for function pointers in VTables can be replaced with
4883 static relocations for offsets between the VTable and virtual functions which
4884 may not be ``dso_local``.
4886 This is currently only supported for ELF binary formats.
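
The static offset calculation mentioned above can be sketched as follows.
The names ``@extern_func`` and ``@rel_slot`` are hypothetical, and the example
assumes an ELF target:

.. code-block:: llvm

    declare void @extern_func()

    ; A 32-bit offset from @rel_slot to a dso_local stand-in for
    ; @extern_func. Both addresses are in the current linkage unit, so
    ; the difference can be resolved with a static relocation.
    @rel_slot = dso_local constant i32 trunc (
        i64 sub (i64 ptrtoint (ptr dso_local_equivalent @extern_func to i64),
                 i64 ptrtoint (ptr @rel_slot to i64)) to i32)
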
4895 With `Control-Flow Integrity (CFI)
4896 <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, a '``no_cfi``'
4897 constant represents a function reference that does not get replaced with a
4898 reference to the CFI jump table in the ``LowerTypeTests`` pass. These constants
4899 may be useful in low-level programs, such as operating system kernels, which
4900 need to refer to the actual function body.
4902 .. _ptrauth_constant:
4904 Pointer Authentication Constants
4905 --------------------------------
4907 ``ptrauth (ptr CST, i32 KEY[, i64 DISC[, ptr ADDRDISC]?]?)``
4909 A '``ptrauth``' constant represents a pointer with a cryptographic
4910 authentication signature embedded into some bits, as described in the
4911 `Pointer Authentication <PointerAuth.html>`__ document.
4913 A '``ptrauth``' constant is simply a constant equivalent to the
4914 ``llvm.ptrauth.sign`` intrinsic, potentially fed by a discriminator
4915 ``llvm.ptrauth.blend`` if needed.
4917 Its type is the same as the first argument. An integer constant discriminator
4918 and an address discriminator may be optionally specified. Otherwise, they have
4919 values ``i64 0`` and ``ptr null``.
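
As a sketch of the optional operands (the global names and the key and
discriminator values are arbitrary placeholders, and whether a given key is
meaningful depends on the target):

.. code-block:: llvm

    @var = global i32 0

    ; Key only: the discriminators default to i64 0 and ptr null.
    @basic = global ptr ptrauth (ptr @var, i32 0)

    ; Key and integer discriminator.
    @with_disc = global ptr ptrauth (ptr @var, i32 2, i64 1234)

    ; Key, integer discriminator, and an address discriminator. Here the
    ; address of the storage slot itself is used.
    @with_addr = global ptr ptrauth (ptr @var, i32 2, i64 1234, ptr @with_addr)
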
If the address discriminator is ``null`` then the expression is equivalent to:

.. code-block:: llvm

    %tmp = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr CST to i64), i32 KEY, i64 DISC)
    %val = inttoptr i64 %tmp to ptr

Otherwise, the expression is equivalent to:

.. code-block:: llvm

    %tmp1 = call i64 @llvm.ptrauth.blend(i64 ptrtoint (ptr ADDRDISC to i64), i64 DISC)
    %tmp2 = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr CST to i64), i32 KEY, i64 %tmp1)
    %val = inttoptr i64 %tmp2 to ptr

4938 Constant Expressions
4939 --------------------
4941 Constant expressions are used to allow expressions involving other
4942 constants to be used as constants. Constant expressions may be of any
4943 :ref:`first class <t_firstclass>` type and may involve any LLVM operation
4944 that does not have side effects (e.g. load and call are not supported).
4945 The following is the syntax for constant expressions:
4947 ``trunc (CST to TYPE)``
4948 Perform the :ref:`trunc operation <i_trunc>` on constants.
4949 ``ptrtoint (CST to TYPE)``
4950 Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
4951 ``inttoptr (CST to TYPE)``
4952 Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
4953 This one is *really* dangerous!
4954 ``bitcast (CST to TYPE)``
4955 Convert a constant, CST, to another TYPE.
4956 The constraints of the operands are the same as those for the
4957 :ref:`bitcast instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
Convert a constant pointer or constant vector of pointers, CST, to another
TYPE in a different address space. The constraints of the operands are the
same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
4962 ``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
4963 Perform the :ref:`getelementptr operation <i_getelementptr>` on
4964 constants. As with the :ref:`getelementptr <i_getelementptr>`
4965 instruction, the index list may have one or more indexes, which are
4966 required to make sense for the type of "pointer to TY". These indexes
4967 may be implicitly sign-extended or truncated to match the index size
4968 of CSTPTR's address space.
4969 ``extractelement (VAL, IDX)``
4970 Perform the :ref:`extractelement operation <i_extractelement>` on
4972 ``insertelement (VAL, ELT, IDX)``
4973 Perform the :ref:`insertelement operation <i_insertelement>` on
4975 ``shufflevector (VEC1, VEC2, IDXMASK)``
4976 Perform the :ref:`shufflevector operation <i_shufflevector>` on
4979 Perform an addition on constants.
4981 Perform a subtraction on constants.
4983 Perform a multiplication on constants.
4985 Perform a left shift on constants.
4987 Perform a bitwise xor on constants.
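
As a short sketch, constant expressions typically appear in global
initializers, where no instruction can be executed. The global names below
are illustrative only:

.. code-block:: llvm

    @arr = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of the third element of @arr.
    @third = global ptr getelementptr inbounds ([4 x i32], ptr @arr, i64 0, i64 2)

    ; Address of @arr as an integer, plus 8 bytes.
    @addr_plus_8 = global i64 add (i64 ptrtoint (ptr @arr to i64), i64 8)

    ; Low 32 bits of the address of @arr.
    @low_bits = global i32 trunc (i64 ptrtoint (ptr @arr to i64) to i32)

Each initializer is resolved when the module is linked and loaded rather than
by running code, which is why operations with side effects are excluded.
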
4994 Inline Assembler Expressions
4995 ----------------------------
4997 LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
4998 Inline Assembly <moduleasm>`) through the use of a special value. This value
4999 represents the inline assembler as a template string (containing the
5000 instructions to emit), a list of operand constraints (stored as a string), a
5001 flag that indicates whether or not the inline asm expression has side effects,
5002 and a flag indicating whether the function containing the asm needs to align its
5003 stack conservatively.
5005 The template string supports argument substitution of the operands using "``$``"
5006 followed by a number, to indicate substitution of the given register/memory
5007 location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
5008 be used, where ``MODIFIER`` is a target-specific annotation for how to print the
5009 operand (See :ref:`inline-asm-modifiers`).
5011 A literal "``$``" may be included by using "``$$``" in the template. To include
5012 other special characters into the output, the usual "``\XX``" escapes may be
5013 used, just as in other strings. Note that after template substitution, the
5014 resulting assembly string is parsed by LLVM's integrated assembler unless it is
5015 disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
5016 syntax known to LLVM.
5018 LLVM also supports a few more substitutions useful for writing inline assembly:
5020 - ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
5021 This substitution is useful when declaring a local label. Many standard
5022 compiler optimizations, such as inlining, may duplicate an inline asm blob.
5023 Adding a blob-unique identifier ensures that the two labels will not conflict
5024 during assembly. This is used to implement `GCC's %= special format
5025 string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
5026 - ``${:comment}``: Expands to the comment character of the current target's
5027 assembly dialect. This is usually ``#``, but many targets use other strings,
5028 such as ``;``, ``//``, or ``!``.
5029 - ``${:private}``: Expands to the assembler private label prefix. Labels with
5030 this prefix will not appear in the symbol table of the assembled object.
5031 Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
5034 LLVM's support for inline asm is modeled closely on the requirements of Clang's
5035 GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
5036 modifier codes listed here are similar or identical to those in GCC's inline asm
5037 support. However, to be clear, the syntax of the template and constraint strings
5038 described here is *not* the same as the syntax accepted by GCC and Clang, and,
5039 while most constraint letters are passed through as-is by Clang, some get
5040 translated to other codes when converting from the C source to the LLVM
An example inline assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"

5049 Inline assembler expressions may **only** be used as the callee operand
5050 of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
Thus, typically we have:

.. code-block:: llvm

    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)

5057 Inline asms with side effects not visible in the constraint list must be
5058 marked as having side effects. This is done through the use of the
5059 '``sideeffect``' keyword, like so:
5061 .. code-block:: llvm
5063 call void asm sideeffect "eieio", ""()
5065 In some cases inline asms will contain code that will not work unless
5066 the stack is aligned in some way, such as calls or SSE instructions on
5067 x86, yet will not contain code that does that alignment within the asm.
5068 The compiler should make conservative assumptions about what the asm
5069 might contain and should generate its usual stack alignment code in the
5070 prologue if the '``alignstack``' keyword is present:
5072 .. code-block:: llvm
5074 call void asm alignstack "eieio", ""()
5076 Inline asms also support using non-standard assembly dialects. The
5077 assumed dialect is ATT. When the '``inteldialect``' keyword is present,
5078 the inline asm is using the Intel dialect. Currently, ATT and Intel are
5079 the only supported dialects. An example is:
5081 .. code-block:: llvm
5083 call void asm inteldialect "eieio", ""()
5085 In the case that the inline asm might unwind the stack,
5086 the '``unwind``' keyword must be used, so that the compiler emits
5087 unwinding information:
5089 .. code-block:: llvm
5091 call void asm unwind "call func", ""()
5093 If the inline asm unwinds the stack and isn't marked with
5094 the '``unwind``' keyword, the behavior is undefined.
5096 If multiple keywords appear, the '``sideeffect``' keyword must come
5097 first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
5098 third and the '``unwind``' keyword last.
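
The ordering rule can be sketched with a call that carries all four keywords.
This is only an illustration: ``func`` is the placeholder symbol from the
``unwind`` example above, and the Intel-dialect template assumes an x86 target.

.. code-block:: llvm

   ; Keywords in the required order: sideeffect, alignstack,
   ; inteldialect, unwind. "func" is a placeholder symbol.
   call void asm sideeffect alignstack inteldialect unwind "call func", ""()
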
5100 Inline Asm Constraint String
5101 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5103 The constraint list is a comma-separated string, each element containing one or
5104 more constraint codes.
5106 For each element in the constraint list an appropriate register or memory
5107 operand will be chosen, and it will be made available to assembly template
5108 string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
5111 There are three different types of constraints, which are distinguished by a
5112 prefix symbol in front of the constraint code: Output, Input, and Clobber. The
5113 constraints must always be given in that order: outputs first, then inputs, then
5114 clobbers. They cannot be intermingled.
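
As a minimal sketch of this ordering, the following function uses one output,
one input, and one clobber. It assumes an x86 target with the default ATT
dialect; the function name ``@negate`` is made up for illustration.

.. code-block:: llvm

   define i32 @negate(i32 %x) {
     ; "=r"        output, referenced as $0 and returned by the call
     ; "r"         input, referenced as $1, consumes the argument %x
     ; "~{flags}"  clobber: negl modifies the x86 flags register
     %neg = call i32 asm "movl $1, $0\0A\09negl $0", "=r,r,~{flags}"(i32 %x)
     ret i32 %neg
   }

The template stays correct even if LLVM assigns the same register to ``$0``
and ``$1``, because the input is read before the output is modified further.
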
5116 There are also three different categories of constraint codes:
5118 - Register constraint. This is either a register class, or a fixed physical
5119 register. This kind of constraint will allocate a register, and if necessary,
5120 bitcast the argument or result to the appropriate type.
5121 - Memory constraint. This kind of constraint is for use with an instruction
5122 taking a memory operand. Different constraints allow for different addressing
5123 modes used by the target.
5124 - Immediate value constraint. This kind of constraint is for an integer or other
5125 immediate value which can be rendered directly into an instruction. The
5126 various target-specific constraints allow the selection of a value in the
5127 proper range for the instruction you wish to use it with.
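
The three categories might look as follows in practice. This is a sketch under
the assumption of an x86 target; the function name is hypothetical and the
instructions were chosen only because each one takes the operand kind in
question.

.. code-block:: llvm

   define void @constraint_categories(ptr %p, i32 %v) {
     ; Register constraint: %v is placed in a general-purpose register.
     call void asm sideeffect "testl $0, $0", "r,~{flags}"(i32 %v)
     ; Memory constraint (indirect input): the asm addresses the byte at %p.
     call void asm sideeffect "prefetcht0 $0", "*m"(ptr elementtype(i8) %p)
     ; Immediate constraint: the constant 3 is rendered into the instruction.
     call void asm sideeffect "int $0", "i"(i32 3)
     ret void
   }
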
5132 Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
5133 indicates that the assembly will write to this operand, and the operand will
5134 then be made available as a return value of the ``asm`` expression. Output
5135 constraints do not consume an argument from the call instruction. (Except, see
5136 below about indirect outputs).
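
When there are several outputs, the call returns them together as a struct
with one field per output; this matches what Clang emits, but it is not
stated in the text above. The sketch below assumes an x86 target, and the
function name ``@read_tsc`` is made up.

.. code-block:: llvm

   define i64 @read_tsc() {
     ; Two outputs pinned to EAX and EDX; no call arguments are consumed.
     %pair = call { i32, i32 } asm sideeffect "rdtsc", "={ax},={dx}"()
     %lo = extractvalue { i32, i32 } %pair, 0
     %hi = extractvalue { i32, i32 } %pair, 1
     ; Combine the two 32-bit halves into one 64-bit value.
     %lo64 = zext i32 %lo to i64
     %hi64 = zext i32 %hi to i64
     %shifted = shl i64 %hi64, 32
     %tsc = or i64 %shifted, %lo64
     ret i64 %tsc
   }
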
5138 Normally, it is expected that no output locations are written to by the assembly
5139 expression until *all* of the inputs have been read. As such, LLVM may assign
5140 the same register to an output and an input. If this is not safe (e.g. if the
5141 assembly contains two instructions, where the first writes to one output, and
5142 the second reads an input and writes to a second output), then the "``&``"
5143 modifier must be used (e.g. "``=&r``") to specify that the output is an
5144 "early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
5145 will not use the same register for any inputs (other than an input tied to this
5151 Input constraints do not have a prefix -- just the constraint codes. Each input
5152 constraint will consume one argument from the call instruction. It is not
5153 permitted for the asm to write to any input register or memory location (unless
5154 that input is tied to an output). Note also that multiple inputs may all be
5155 assigned to the same register, if LLVM can determine that they necessarily all
5156 contain the same value.
5158 Instead of providing a Constraint Code, input constraints may also "tie"
5159 themselves to an output constraint, by providing an integer as the constraint
5160 string. Tied inputs still consume an argument from the call instruction, and
5161 take up a position in the asm template numbering as is usual -- they will simply
5162 be constrained to always use the same register as the output they've been tied
5163 to. For example, a constraint string of "``=r,0``" says to assign a register for
5164 output, and use that register as an input as well (it being the 0'th
5167 It is permitted to tie an input to an "early-clobber" output. In that case, no
5168 *other* input may share the same register as the input tied to the early-clobber
5169 (even when the other input has the same value).
5171 You may only tie an input to an output which has a register constraint, not a
5172 memory constraint. Only a single input may be tied to an output.
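
A tied input suits instructions that modify a register in place. The
following sketch assumes an x86 target; the function name is hypothetical.

.. code-block:: llvm

   define i32 @byte_swap(i32 %y) {
     ; "=r" allocates the output register ($0). The "0" ties the input %y
     ; (operand $1) to that same register, so bswap can update it in place.
     %x = call i32 asm "bswap $0", "=r,0"(i32 %y)
     ret i32 %x
   }
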
5174 There is also an "interesting" feature which deserves a bit of explanation: if a
5175 register class constraint allocates a register which is too small for the value
5176 type operand provided as input, the input value will be split into multiple
5177 registers, and all of them passed to the inline asm.
5179 However, this feature is often not as useful as you might think.
5181 Firstly, the registers are *not* guaranteed to be consecutive. So, on those
5182 architectures that have instructions which operate on multiple consecutive
5183 registers, this is not an appropriate way to support them. (e.g. the 32-bit
5184 SparcV8 has a 64-bit load instruction which takes a single 32-bit register. The
5185 hardware then loads into both the named register, and the next register. This
5186 feature of inline asm would not be useful to support that.)
5188 A few of the targets provide a template string modifier allowing explicit access
5189 to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
5190 ``D``). On such an architecture, you can actually access the second allocated
5191 register (yet, still, not any subsequent ones). But, in that case, you're still
5192 probably better off simply splitting the value into two separate operands, for
5193 clarity. (e.g. see the description of the ``A`` constraint on X86, which,
5194 despite existing only for use with this feature, is not really a good idea to
5197 Indirect inputs and outputs
5198 """""""""""""""""""""""""""
5200 Indirect output or input constraints can be specified by the "``*``" modifier
5201 (which goes after the "``=``" in case of an output). This indicates that the asm
5202 will write to or read from the contents of an *address* provided as an input
5203 argument. (Note that in this way, indirect outputs act more like an *input* than
5204 an output: just like an input, they consume an argument of the call expression,
5205 rather than producing a return value. An indirect output constraint is an
5206 "output" only in that the asm is expected to write to the contents of the input
5207 memory location, instead of just read from it).
5209 This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
5210 address of a variable as a value.
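
As an illustration only, an indirect memory output might be written like
this. It assumes an x86 target and a made-up function name. ``$$`` in the
template produces a literal ``$``, and the ``elementtype(i32)`` attribute on
the argument records the type being written.

.. code-block:: llvm

   define void @store_zero(ptr %p) {
     ; "=*m" consumes the pointer argument; the asm writes through it
     ; instead of producing a return value.
     call void asm "movl $$0, $0", "=*m"(ptr elementtype(i32) %p)
     ret void
   }
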
5212 It is also possible to use an indirect *register* constraint, but only on output
5213 (e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
5214 value normally, and then, separately emit a store to the address provided as
5215 input, after the provided inline asm. (It's not clear what value this
5216 functionality provides, compared to writing the store explicitly after the asm
5217 statement, and it can only produce worse code, since it bypasses many
5218 optimization passes. Its use is not recommended.)
5220 Call arguments for indirect constraints must have pointer type and must specify
5221 the :ref:`elementtype <attr_elementtype>` attribute to indicate the pointer
5227 A clobber constraint is indicated by a "``~``" prefix. A clobber does not
5228 consume an input operand, nor generate an output. Clobbers cannot use any of the
5229 general constraint code letters -- they may use only explicit register
5230 constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
5231 "``~{memory}``" indicates that the assembly writes to arbitrary undeclared
5232 memory locations -- not only the memory pointed to by a declared indirect
5235 Note that clobbering named registers that are also present in output
5236 constraints is not legal.
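
Two common uses of clobbers are sketched below, assuming an x86 target. The
function names are hypothetical.

.. code-block:: llvm

   define void @compiler_barrier() {
     ; Empty template with a memory clobber: emits no instructions, but
     ; tells LLVM that arbitrary memory may have been written.
     call void asm sideeffect "", "~{memory}"()
     ret void
   }

   define void @scratch_eax() {
     ; The asm overwrites EAX and the flags without declaring them as
     ; outputs, so both are listed as clobbers.
     call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()
     ret void
   }
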
5241 A label constraint is indicated by a "``!``" prefix and typically used in the
5242 form ``"!i"``. Instead of consuming call arguments, label constraints consume
5243 indirect destination labels of ``callbr`` instructions.
5245 Label constraints can only be used in conjunction with ``callbr`` and the
5246 number of label constraints must match the number of indirect destination
5247 labels in the ``callbr`` instruction.
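
A minimal sketch of a label constraint, assuming an x86 target: the single
``!i`` constraint corresponds to the single indirect destination
``%nonzero``. It is operand ``$1`` here because the register input comes
first. The function and block names are made up.

.. code-block:: llvm

   define i32 @is_nonzero(i32 %x) {
   entry:
     ; ${1:l} prints the label operand without target punctuation.
     callbr void asm sideeffect "testl $0, $0\0A\09jne ${1:l}", "r,!i,~{flags}"(i32 %x)
             to label %fallthrough [label %nonzero]

   fallthrough:
     ret i32 0

   nonzero:
     ret i32 1
   }
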
5252 After a potential prefix comes the constraint code, or codes.
5254 A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
5255 followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
5258 The one and two letter constraint codes are typically chosen to be the same as
5259 GCC's constraint codes.
5261 A single constraint may include more than one constraint code, leaving
5262 it up to LLVM to choose which one to use. This is included mainly for
5263 compatibility with the translation of GCC inline asm coming from clang.
5265 There are two ways to specify alternatives, and either or both may be used in an
5266 inline asm constraint list:
5268 1) Append the codes to each other, making a constraint code set. E.g. "``im``"
5269 or "``{eax}m``". This means "choose any of the options in the set". The
5270 choice of constraint is made independently for each constraint in the
5273 2) Use "``|``" between constraint code sets, creating alternatives. Every
5274 constraint in the constraint list must have the same number of alternative
5275 sets. With this syntax, the same alternative in *all* of the items in the
5276 constraint list will be chosen together.
5278 Putting those together, you might have a two operand constraint string like
5279 ``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
5280 operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
5281 may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m.
5283 However, the use of either of the alternatives features is *NOT* recommended, as
5284 LLVM is not able to make an intelligent choice about which one to use. (At the
5285 point it currently needs to choose, not enough information is available to do so
5286 in a smart way.) Thus, it simply tries to make a choice that's most likely to
5287 compile, not one that will give optimal performance. (e.g., given "``rm``", it'll
5288 always choose to use memory, not registers). And, if given multiple registers,
5289 or multiple register classes, it will simply choose the first one. (In fact, it
5290 doesn't currently even ensure explicitly specified physical registers are
5291 unique, so specifying multiple physical registers as alternatives, like
5292 ``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
5295 Supported Constraint Code List
5296 """"""""""""""""""""""""""""""
5298 The constraint codes are, in general, expected to behave the same way they do in
5299 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
5300 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
5301 and GCC likely indicates a bug in LLVM.
5303 Some constraint codes are typically supported by all targets:
5305 - ``r``: A register in the target's general purpose register class.
5306 - ``m``: A memory address operand. The supported addressing modes are
5307 target-specific; typical examples are register, register + register offset,
5308 or register + immediate offset (of some target-specific size).
5309 - ``p``: An address operand. Similar to ``m``, but used by "load address"
5310 type instructions without touching memory.
5311 - ``i``: An integer constant (of target-specific width). Allows either a simple
5312 immediate, or a relocatable value.
5313 - ``n``: An integer constant -- *not* including relocatable values.
5314 - ``s``: A symbol or label reference with a constant offset.
5315 - ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
5316 useful to pass a label for an asm branch or call.
5318 .. FIXME: but that surely isn't actually okay to jump out of an asm
5319 block without telling llvm about the control transfer???)
5321 - ``{register-name}``: Requires exactly the named physical register.
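
For instance, the ``{register-name}`` form can pin operands to the fixed
registers that an instruction requires. The sketch below assumes an x86
target and uses a hypothetical function name. The ``b`` and ``w`` template
modifiers select the 8-bit and 16-bit register names.

.. code-block:: llvm

   define void @write_port(i8 %value, i16 %port) {
     ; outb requires its data in AL and the port number in DX.
     call void asm sideeffect "outb ${0:b}, ${1:w}", "{ax},{dx}"(i8 %value, i16 %port)
     ret void
   }
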
5323 Other constraints are target-specific:
5327 - ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
5328 - ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
5329 i.e. 0 to 4095 with optional shift by 12.
5330 - ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
5331 ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
5332 - ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
5333 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
5334 - ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
5335 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
5336 - ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
5337 32-bit register. This is a superset of ``K``: in addition to the bitmask
5338 immediate, also allows immediate integers which can be loaded with a single
5339 ``MOVZ`` or ``MOVN`` instruction.
5340 - ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
5341 64-bit register. This is a superset of ``L``.
5342 - ``Q``: Memory address operand must be in a single register (no
5343 offsets). (However, LLVM currently does this for the ``m`` constraint as
5345 - ``r``: A 32 or 64-bit integer register (W* or X*).
5346 - ``S``: A symbol or label reference with a constant offset. The generic ``s``
5348 - ``Uci``: Like r, but restricted to registers 8 to 11 inclusive.
5349 - ``Ucj``: Like r, but restricted to registers 12 to 15 inclusive.
5350 - ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
5351 - ``x``: Like w, but restricted to registers 0 to 15 inclusive.
5352 - ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
5353 - ``Uph``: One of the upper eight SVE predicate registers (P8 to P15).
5354 - ``Upl``: One of the lower eight SVE predicate registers (P0 to P7).
5355 - ``Upa``: Any of the SVE predicate registers (P0 to P15).
5359 - ``r``: A 32 or 64-bit integer register.
5360 - ``[0-9]v``: The 32-bit VGPR register, number 0-9.
5361 - ``[0-9]s``: The 32-bit SGPR register, number 0-9.
5362 - ``[0-9]a``: The 32-bit AGPR register, number 0-9.
5363 - ``I``: An integer inline constant in the range from -16 to 64.
5364 - ``J``: A 16-bit signed integer constant.
5365 - ``A``: An integer or a floating-point inline constant.
5366 - ``B``: A 32-bit signed integer constant.
5367 - ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
5368 - ``DA``: A 64-bit constant that can be split into two "A" constants.
5369 - ``DB``: A 64-bit constant that can be split into two "B" constants.
5373 - ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
5374 operand. Treated the same as operand ``m``, at the moment.
5375 - ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
5376 - ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
5378 ARM and ARM's Thumb2 mode:
5380 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
5381 - ``I``: An immediate integer valid for a data-processing instruction.
5382 - ``J``: An immediate integer between -4095 and 4095.
5383 - ``K``: An immediate integer whose bitwise inverse is valid for a
5384 data-processing instruction. (Can be used with template modifier "``B``" to
5385 print the inverted value).
5386 - ``L``: An immediate integer whose negation is valid for a data-processing
5387 instruction. (Can be used with template modifier "``n``" to print the negated
5389 - ``M``: A power of two or an integer between 0 and 32.
5390 - ``N``: Invalid immediate constraint.
5391 - ``O``: Invalid immediate constraint.
5392 - ``r``: A general-purpose 32-bit integer register (``r0-r15``).
5393 - ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
5395 - ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
5397 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5398 ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
5399 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5400 ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
5401 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5402 ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
5406 - ``I``: An immediate integer between 0 and 255.
5407 - ``J``: An immediate integer between -255 and -1.
5408 - ``K``: An immediate integer between 0 and 255, with optional left-shift by
5410 - ``L``: An immediate integer between -7 and 7.
5411 - ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
5412 - ``N``: An immediate integer between 0 and 31.
5413 - ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
5414 - ``r``: A low 32-bit GPR register (``r0-r7``).
5415 - ``l``: A low 32-bit GPR register (``r0-r7``).
5416 - ``h``: A high GPR register (``r8-r15``).
5417 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5418 ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
5419 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5420 ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
5421 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5422 ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
5426 - ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
5428 - ``r``: A 32 or 64-bit register.
5432 - ``f``: A floating-point register (if available).
5433 - ``k``: A memory operand whose address is formed by a base register and
5434 (optionally scaled) index register.
5435 - ``l``: A signed 16-bit constant.
5436 - ``m``: A memory operand whose address is formed by a base register and
5437 offset that is suitable for use in instructions with the same addressing
5438 mode as st.w and ld.w.
5439 - ``I``: A signed 12-bit constant (for arithmetic instructions).
5440 - ``J``: An immediate integer zero.
5441 - ``K``: An unsigned 12-bit constant (for logic instructions).
5442 - ``ZB``: An address that is held in a general-purpose register. The offset
5444 - ``ZC``: A memory operand whose address is formed by a base register and
5445 offset that is suitable for use in instructions with the same addressing
5446 mode as ll.w and sc.w.
5450 - ``r``: An 8 or 16-bit register.
5454 - ``I``: An immediate signed 16-bit integer.
5455 - ``J``: An immediate integer zero.
5456 - ``K``: An immediate unsigned 16-bit integer.
5457 - ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
5458 - ``N``: An immediate integer between -65535 and -1.
5459 - ``O``: An immediate signed 15-bit integer.
5460 - ``P``: An immediate integer between 1 and 65535.
5461 - ``m``: A memory address operand. In MIPS-SE mode, allows a base address
5462 register plus 16-bit immediate offset. In MIPS mode, just a base register.
5463 - ``R``: A memory address operand. In MIPS-SE mode, allows a base address
5464 register plus a 9-bit signed offset. In MIPS mode, the same as constraint
5466 - ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
5467 ``sc`` instruction on the given subtarget (details vary).
5468 - ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
5469 - ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
5470 (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
5471 argument modifier for compatibility with GCC.
5472 - ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
5474 - ``l``: The ``lo`` register, 32 or 64-bit.
5479 - ``b``: A 1-bit integer register.
5480 - ``c`` or ``h``: A 16-bit integer register.
5481 - ``r``: A 32-bit integer register.
5482 - ``l`` or ``N``: A 64-bit integer register.
5483 - ``q``: A 128-bit integer register.
5484 - ``f``: A 32-bit float register.
5485 - ``d``: A 64-bit float register.
5490 - ``I``: An immediate signed 16-bit integer.
5491 - ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
5492 - ``K``: An immediate unsigned 16-bit integer.
5493 - ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
5494 - ``M``: An immediate integer greater than 31.
5495 - ``N``: An immediate integer that is an exact power of 2.
5496 - ``O``: The immediate integer constant 0.
5497 - ``P``: An immediate integer constant whose negation is a signed 16-bit
5499 - ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
5500 treated the same as ``m``.
5501 - ``r``: A 32 or 64-bit integer register.
5502 - ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
5504 - ``f``: A 32 or 64-bit float register (``F0-F31``).
5505 - ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
5506 register (``V0-V31``).
5508 - ``y``: Condition register (``CR0-CR7``).
5509 - ``wc``: An individual CR bit in a CR register.
5510 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
5511 register set (overlapping both the floating-point and vector register files).
5512 - ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
5517 - ``A``: An address operand (using a general-purpose register, without an
5519 - ``I``: A 12-bit signed integer immediate operand.
5520 - ``J``: A zero integer immediate operand.
5521 - ``K``: A 5-bit unsigned integer immediate operand.
5522 - ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
5523 - ``r``: A 32- or 64-bit general-purpose register (depending on the platform
5525 - ``S``: Alias for ``s``.
5526 - ``vd``: A vector register, excluding ``v0`` (requires V extension).
5527 - ``vm``: The vector register ``v0`` (requires V extension).
5528 - ``vr``: A vector register (requires V extension).
5532 - ``I``: An immediate 13-bit signed integer.
5533 - ``r``: A 32-bit integer register.
5534 - ``f``: Any floating-point register on SparcV8, or a floating-point
5535 register in the "low" half of the registers on SparcV9.
5536 - ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
5540 - ``I``: An immediate unsigned 8-bit integer.
5541 - ``J``: An immediate unsigned 12-bit integer.
5542 - ``K``: An immediate signed 16-bit integer.
5543 - ``L``: An immediate signed 20-bit integer.
5544 - ``M``: An immediate integer 0x7fffffff.
5545 - ``Q``: A memory address operand with a base address and a 12-bit immediate
5546 unsigned displacement.
5547 - ``R``: A memory address operand with a base address, a 12-bit immediate
5548 unsigned displacement, and an index register.
5549 - ``S``: A memory address operand with a base address and a 20-bit immediate
5550 signed displacement.
5551 - ``T``: A memory address operand with a base address, a 20-bit immediate
5552 signed displacement, and an index register.
5553 - ``r`` or ``d``: A 32, 64, or 128-bit integer register.
5554 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
5555 address context evaluates as zero).
5556 - ``h``: A 32-bit value in the high part of a 64-bit data register
5558 - ``f``: A 32, 64, or 128-bit floating-point register.
5562 - ``I``: An immediate integer between 0 and 31.
5563 - ``J``: An immediate integer between 0 and 63.
5564 - ``K``: An immediate signed 8-bit integer.
5565 - ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
5567 - ``M``: An immediate integer between 0 and 3.
5568 - ``N``: An immediate unsigned 8-bit integer.
5569 - ``O``: An immediate integer between 0 and 127.
5570 - ``e``: An immediate 32-bit signed integer.
5571 - ``Z``: An immediate 32-bit unsigned integer.
5572 - ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
5573 ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
5574 registers, and on X86-64, it is all of the integer registers. When feature
5575 ``egpr`` and ``inline-asm-use-gpr32`` are both on, it will be extended to gpr32.
5576 - ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
5577 ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
5578 - ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register. When feature
5579 ``egpr`` and ``inline-asm-use-gpr32`` are both on, it will be extended to gpr32.
5580 - ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
5581 existed since i386, and can be accessed without the REX prefix.
5582 - ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
5583 - ``y``: A 64-bit MMX register, if MMX is enabled.
5584 - ``v``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
5585 operand in a SSE register. If AVX is also enabled, can also be a 256-bit
5586 vector operand in an AVX register. If AVX-512 is also enabled, can also be a
5587 512-bit vector operand in an AVX512 register. Otherwise, an error.
5588 - ``Ws``: A symbolic reference with an optional constant addend or a label
5590 - ``x``: The same as ``v``, except that when AVX-512 is enabled, the ``x`` code
5591 only allocates into the first 16 AVX-512 registers, while the ``v`` code
5592 allocates into any of the 32 AVX-512 registers.
5593 - ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
5594 - ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
5595 32-bit mode, a 64-bit integer operand will get split into two registers). It
5596 is not recommended to use this constraint, as in 64-bit mode, the 64-bit
5597 operand will get allocated only to RAX -- if two 32-bit operands are needed,
5598 you're better off splitting it yourself, before passing it to the asm
5600 - ``jr``: An 8, 16, 32, or 64-bit integer gpr16. It won't be extended to gpr32
5601 when feature ``egpr`` or ``inline-asm-use-gpr32`` is on.
5602 - ``jR``: An 8, 16, 32, or 64-bit integer gpr32 when feature ``egpr`` is on.
5603 Otherwise, same as ``r``.
5607 - ``r``: A 32-bit integer register.
5610 .. _inline-asm-modifiers:
5612 Asm template argument modifiers
5613 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5615 In the asm template string, modifiers can be used on the operand reference, like
5618 The modifiers are, in general, expected to behave the same way they do in
5619 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
5620 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
5621 and GCC likely indicates a bug in LLVM.
5625 - ``c``: Print an immediate integer constant unadorned, without
5626 the target-specific immediate punctuation (e.g. no ``$`` prefix).
5627 - ``n``: Negate and print immediate integer constant unadorned, without the
5628 target-specific immediate punctuation (e.g. no ``$`` prefix).
5629 - ``l``: Print as an unadorned label, without the target-specific label
5630 punctuation (e.g. no ``$`` prefix).
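
As a hedged illustration of the ``c`` modifier on an x86 target with the ATT
dialect: a plain ``$0`` would print the immediate as ``$42``, while
``${0:c}`` prints ``42``. The template below only emits an assembly comment
(using ``${:comment}``), so it adds no instructions. The function name is
made up.

.. code-block:: llvm

   define void @emit_marker() {
     ; Emits a comment line such as "# marker 42" into the assembly output.
     call void asm sideeffect "${:comment} marker ${0:c}", "i"(i32 42)
     ret void
   }
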
5634 - ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
5635 instead of ``x30``, print ``w30``.
5636 - ``x``: Print a GPR register with an ``x*`` name. (This is the default, anyhow.)
5637 - ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
5638 ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
5647 - ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
5651 - ``y``: Print a VFP single-precision register as an indexed double (e.g. print
5652 as ``d4[1]`` instead of ``s9``)
5653 - ``B``: Bitwise invert and print an immediate integer constant without ``#``
5655 - ``L``: Print the low 16 bits of an immediate integer constant.
5656 - ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
5657 register operands subsequent to the specified one (!), so use carefully.
5658 - ``Q``: Print the low-order register of a register-pair, or the low-order
5659 register of a two-register operand.
5660 - ``R``: Print the high-order register of a register-pair, or the high-order
5661 register of a two-register operand.
5662 - ``H``: Print the second register of a register-pair. (On a big-endian system,
5663 ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is equivalent
5666 .. FIXME: H doesn't currently support printing the second register
5667 of a two-register operand.
5669 - ``e``: Print the low doubleword register of a NEON quad register.
5670 - ``f``: Print the high doubleword register of a NEON quad register.
5671 - ``m``: Print the base register of a memory operand without the ``[`` and ``]``
5676 - ``L``: Print the second register of a two-register operand. Requires that it
5677 has been allocated consecutively to the first.
5679 .. FIXME: why is it restricted to consecutive ones? And there's
5680 nothing that ensures that happens, is there?
5682 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5683 nothing. Used to print 'addi' vs 'add' instructions.
5687 - ``z``: Print the ``$zero`` register if the operand is zero, otherwise print it normally.
5691 No additional modifiers.
5695 - ``X``: Print an immediate integer as hexadecimal.
5696 - ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
5697 - ``d``: Print an immediate integer as decimal.
5698 - ``m``: Subtract one and print an immediate integer as decimal.
5699 - ``z``: Print $0 if an immediate zero, otherwise print normally.
5700 - ``L``: Print the low-order register of a two-register operand, or print the
5701 address of the low-order word of a double-word memory operand.
5703 .. FIXME: L seems to be missing memory operand support.
5705 - ``M``: Print the high-order register of a two-register operand, or print the
5706 address of the high-order word of a double-word memory operand.
5708 .. FIXME: M seems to be missing memory operand support.
5710 - ``D``: Print the second register of a two-register operand, or print the
5711 second word of a double-word memory operand. (On a big-endian system, ``D`` is
5712 equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
5714 - ``w``: No effect. Provided for compatibility with GCC which requires this
5715 modifier in order to print MSA registers (``W0-W31``) with the ``f``
5724 - ``L``: Print the second register of a two-register operand. Requires that it
5725 has been allocated consecutively to the first.
5727 .. FIXME: why is it restricted to consecutive ones? And there's
5728 nothing that ensures that happens, is there?
5730 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5731 nothing. Used to print 'addi' vs 'add' instructions.
5732 - ``y``: For a memory operand, prints formatter for a two-register X-form
5733 instruction. (Currently always prints ``r0,OPERAND``).
5734 - ``U``: Prints 'u' if the memory operand is an update form, and nothing
5735 otherwise. (NOTE: LLVM does not support update form, so this will currently
5736 always print nothing.)
5737 - ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
5738 not support indexed form, so this will currently always print nothing.)
5742 - ``i``: Print the letter 'i' if the operand is not a register, otherwise print
5743 nothing. Used to print 'addi' vs 'add' instructions, etc.
5744 - ``z``: Print the register ``zero`` if an immediate zero, otherwise print
5749 - ``L``: Print the low-order register of a two-register operand.
5750 - ``H``: Print the high-order register of a two-register operand.
5755 SystemZ implements only ``n``, and does *not* support any of the other
5756 target-independent modifiers.
5760 - ``c``: Print an unadorned integer or symbol name. (The latter is
5761 target-specific behavior for this typically target-independent modifier).
5762 - ``A``: Print a register name with a '``*``' before it.
5763 - ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
5765 - ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
5767 - ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
5769 - ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
5771 - ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5772 available, otherwise the 32-bit register name; do nothing on a memory operand.
5773 - ``n``: Negate and print an unadorned integer, or, for operands other than an
5774 immediate integer (e.g. a relocatable symbol expression), print a '-' before
5775 the operand. (The behavior for relocatable symbol expressions is a
5776 target-specific behavior for this typically target-independent modifier)
5777 - ``H``: Print a memory reference with additional offset +8.
5778 - ``p``: Print a raw symbol name (without syntax-specific prefixes).
5779 - ``P``: Print a memory reference used as the argument of a call instruction or
5780 used with an explicit base register and index register as its offset, so it
5781 cannot use additional registers to represent the memory reference (e.g. it
5782 omits ``(rip)``, even though the reference is PC-relative).
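
As a minimal sketch of how these modifiers appear in IR, the function below
applies ``b`` and ``k`` to register operands. It assumes an x86-64 target and
the default AT&T dialect; the function name ``@low_byte`` is hypothetical.

.. code-block:: llvm

    ; ${1:b} prints the 8-bit name of the input register and ${0:k} prints
    ; the 32-bit name of the output register, so the emitted instruction
    ; has the form "movzbl %dil, %eax" (actual registers depend on allocation).
    define i32 @low_byte(i32 %x) {
      %r = call i32 asm "movzbl ${1:b}, ${0:k}", "=r,r"(i32 %x)
      ret i32 %r
    }

Without the modifiers, both operands would be printed with the register name
that matches the operand's own type.
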
5786 No additional modifiers.
5792 A call instruction that wraps an inline asm node may have a
5793 "``!srcloc``" MDNode attached to it that contains a list of constant
5794 integers. If present, the code generator will use the integer as the
5795 location cookie value when reporting errors through the ``LLVMContext``
5796 error reporting mechanisms. This allows a front-end to correlate backend
5797 errors that occur with inline asm back to the source code that produced
5800 .. code-block:: llvm
5802 call void asm sideeffect "something bad", ""(), !srcloc !42
5804 !42 = !{ i64 1234567 }
5806 It is up to the front-end to make sense of the magic numbers it places
5807 in the IR. If the MDNode contains multiple constants, the code generator
5808 will use the one that corresponds to the line of the asm that the error
5816 LLVM IR allows metadata to be attached to instructions and global objects in
5817 the program that can convey extra information about the code to the optimizers
5820 There are two metadata primitives: strings and nodes. There are
5821 also specialized nodes which have a distinguished name and a set of named
5826 One example application of metadata is source-level debug information,
5827 which is currently the only user of specialized nodes.
5829 Metadata does not have a type, and is not a value.
5831 A value of non-\ ``metadata`` type can be used in a metadata context using the
5832 syntax '``<type> <value>``'.
5834 All other metadata is identified in syntax as starting with an exclamation
5837 Metadata may be used in the following value contexts by using the ``metadata``
5840 - Arguments to certain intrinsic functions, as described in their specification.
5841 - Arguments to the ``catchpad``/``cleanuppad`` instructions.
5845 Metadata can be "wrapped" in a ``MetadataAsValue`` so it can be referenced
5846 in a value context: ``MetadataAsValue`` is-a ``Value``.
5848 A typed value can be "wrapped" in ``ValueAsMetadata`` so it can be
5849 referenced in a metadata context: ``ValueAsMetadata`` is-a ``Metadata``.
5851 There is no explicit syntax for a ``ValueAsMetadata``, and instead
5852 the fact that a type identifier cannot begin with an exclamation point
5853 is used to resolve ambiguity.
5855 A ``metadata`` type implies a ``MetadataAsValue``, and when followed with a
5856 '``<type> <value>``' pair it wraps the typed value in a ``ValueAsMetadata``.
5858 For example, the first argument
5859 to this call is a ``MetadataAsValue(ValueAsMetadata(Value))``:
5861 .. code-block:: llvm
5863 call void @llvm.foo(metadata i32 1)
5865 Whereas the first argument to this call is a ``MetadataAsValue(MDNode)``:
5867 .. code-block:: llvm
5869 call void @llvm.foo(metadata !0)
5871 The first element of this ``MDTuple`` is a ``MDNode``:
5873 .. code-block:: llvm
5877 And the first element of this ``MDTuple`` is a ``ValueAsMetadata(Value)``:
5879 .. code-block:: llvm
5883 .. _metadata-string:
5885 Metadata Strings (``MDString``)
5886 -------------------------------
5888 .. FIXME Either fix all references to "MDString" in the docs, or make that
5889 identifier a formal part of the document.
5891 A metadata string is a string surrounded by double quotes. It can
5892 contain any character by escaping non-printable characters with
5893 "``\xx``" where "``xx``" is the two digit hex code. For example:
5898 A metadata string is metadata, but is not a metadata node.
5902 Metadata Nodes (``MDNode``)
5903 ---------------------------
5905 .. FIXME Either fix all references to "MDNode" in the docs, or make that
5906 identifier a formal part of the document.
5908 Metadata tuples are represented with notation similar to structure
5909 constants: a comma separated list of elements, surrounded by braces and
5910 preceded by an exclamation point. Metadata nodes can have any values as
5911 their operands. For example:
5913 .. code-block:: llvm
5915 !{!"test\00", i32 10}
5917 Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
5919 .. code-block:: text
5921 !0 = distinct !{!"test\00", i32 10}
5923 ``distinct`` nodes are useful when nodes shouldn't be merged based on their
5924 content. They can also occur when transformations cause uniquing collisions
5925 when metadata operands change.
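
The difference can be sketched with four nodes that share the same content.
The node numbers are arbitrary and chosen only for illustration.

.. code-block:: llvm

    ; !0 and !1 are uniqued (the default). Their content is identical, so
    ; they may be merged into a single node when the module is parsed.
    !0 = !{!"test\00", i32 10}
    !1 = !{!"test\00", i32 10}

    ; !2 and !3 have the same content as well, but each distinct node keeps
    ; its own identity and is never merged with another node.
    !2 = distinct !{!"test\00", i32 10}
    !3 = distinct !{!"test\00", i32 10}
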
5927 A :ref:`named metadata <namedmetadatastructure>` is a collection of
5928 metadata nodes, which can be looked up in the module symbol table. For
5931 .. code-block:: llvm
5935 Metadata can be used as function arguments. Here the ``llvm.dbg.value``
5936 intrinsic is using three metadata arguments:
5938 .. code-block:: llvm
5940 call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
5943 .. FIXME Attachments cannot be ValueAsMetadata, but we don't have a
5944 particularly clear way to refer to ValueAsMetadata without getting into
5945 implementation details. Ideally the restriction would be explicit somewhere,
5948 Metadata can be attached to an instruction. Here metadata ``!21`` is attached
5949 to the ``add`` instruction using the ``!dbg`` identifier:
5951 .. code-block:: llvm
5953 %indvar.next = add i64 %indvar, 1, !dbg !21
5955 Instructions may not have multiple metadata attachments with the same
5958 Metadata can also be attached to a function or a global variable. Here metadata
5959 ``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
5960 and ``g2`` using the ``!dbg`` identifier:
5962 .. code-block:: llvm
5964 declare !dbg !22 void @f1()
5965 define void @f2() !dbg !22 {
5969 @g1 = global i32 0, !dbg !22
5970 @g2 = external global i32, !dbg !22
5972 Unlike instructions, global objects (functions and global variables) may have
5973 multiple metadata attachments with the same identifier.
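
For illustration, the global below carries two attachments of the same kind.
The global name ``@vt`` and the type identifier strings are hypothetical, and
the sketch assumes the usual ``!type`` layout of a byte offset followed by an
identifier.

.. code-block:: llvm

    ; Two !type attachments on one global variable. The same pair of
    ; attachments on a single instruction would be rejected.
    @vt = constant [2 x ptr] zeroinitializer, !type !0, !type !1

    !0 = !{i64 0, !"typeid1"}
    !1 = !{i64 8, !"typeid2"}
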
5975 A transformation is required to drop any metadata attachment that it
5976 does not know or knows it can't preserve. Currently there is an
5977 exception for metadata attachment to globals for ``!func_sanitize``,
5978 ``!type``, ``!absolute_symbol`` and ``!associated`` which can't be
5979 unconditionally dropped unless the global is itself deleted.
5981 Metadata attached to a module using named metadata may not be dropped, with
5982 the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
5984 More information about specific metadata nodes recognized by the
5985 optimizers and code generator is found below.
5987 .. _specialized-metadata:
5989 Specialized Metadata Nodes
5990 ^^^^^^^^^^^^^^^^^^^^^^^^^^
5992 Specialized metadata nodes are custom data structures in metadata (as opposed
5993 to generic tuples). Their fields are labelled, and can be specified in any
5996 These aren't inherently debug info centric, but currently all the specialized
5997 metadata nodes are related to debug info.
6004 ``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
6005 ``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
6006 containing the debug info to be emitted along with the compile unit, regardless
6007 of code optimizations (some nodes are only emitted if there are references to
6008 them from instructions). The ``debugInfoForProfiling:`` field is a boolean
6009 indicating whether or not line-table discriminators are updated to provide
6010 more-accurate debug info for profiling results.
6012 .. code-block:: text
6014 !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
6015 isOptimized: true, flags: "-O2", runtimeVersion: 2,
6016 splitDebugFilename: "abc.debug", emissionKind: FullDebug,
6017 enums: !2, retainedTypes: !3, globals: !4, imports: !5,
6018 macros: !6, dwoId: 0x0abcd)
6020 Compile unit descriptors provide the root scope for objects declared in a
6021 specific compilation unit. File descriptors are defined using this scope. These
6022 descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
6023 track of global variables, type information, and imported entities (declarations
6031 ``DIFile`` nodes represent files. The ``filename:`` can include slashes.
6033 .. code-block:: none
6035 !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
6036 checksumkind: CSK_MD5,
6037 checksum: "000102030405060708090a0b0c0d0e0f")
6039 Files are sometimes used in ``scope:`` fields, and are the only valid target
6040 for ``file:`` fields.
6042 The ``checksum:`` and ``checksumkind:`` fields are optional. If one of these
6043 fields is present, then the other is required to be present as well. Valid
6044 values for the ``checksumkind:`` field are ``CSK_MD5``, ``CSK_SHA1``, and ``CSK_SHA256``.
6051 ``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
6052 ``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
6054 .. code-block:: text
6056 !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
6057 encoding: DW_ATE_unsigned_char)
6058 !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
6060 The ``encoding:`` describes the details of the type. Usually it's one of the
6063 .. code-block:: text
6069 DW_ATE_signed_char = 6
6071 DW_ATE_unsigned_char = 8
6073 .. _DISubroutineType:
6078 ``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
6079 refers to a tuple; the first operand is the return type, while the rest are the
6080 types of the formal arguments in order. If the first operand is ``null``, that
6081 represents a function with no return value (such as ``void foo() {}`` in C++).
6083 .. code-block:: text
6085 !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
6086 !1 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed_char)
6087 !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
6094 ``DIDerivedType`` nodes represent types derived from other types, such as
6097 .. code-block:: text
6099 !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
6100 encoding: DW_ATE_unsigned_char)
6101 !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
6104 The following ``tag:`` values are valid:
6106 .. code-block:: text
6109 DW_TAG_pointer_type = 15
6110 DW_TAG_reference_type = 16
6112 DW_TAG_inheritance = 28
6113 DW_TAG_ptr_to_member_type = 31
6114 DW_TAG_const_type = 38
6116 DW_TAG_volatile_type = 53
6117 DW_TAG_restrict_type = 55
6118 DW_TAG_atomic_type = 71
6119 DW_TAG_immutable_type = 75
6121 .. _DIDerivedTypeMember:
6123 ``DW_TAG_member`` is used to define a member of a :ref:`composite type
6124 <DICompositeType>`. The type of the member is the ``baseType:``. The
6125 ``offset:`` is the member's bit offset. If the composite type has an ODR
6126 ``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
6127 uniqued based only on its ``name:`` and ``scope:``.
6129 ``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
6130 field of :ref:`composite types <DICompositeType>` to describe parents and
6133 ``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
6135 ``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
6136 ``DW_TAG_volatile_type``, ``DW_TAG_restrict_type``, ``DW_TAG_atomic_type`` and
6137 ``DW_TAG_immutable_type`` are used to qualify the ``baseType:``.
6139 Note that the ``void *`` type is expressed as a type derived from NULL.
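
A short sketch of that convention follows. It assumes a target with 64-bit
pointers; the node numbers are arbitrary.

.. code-block:: text

    ; "void *": a pointer type whose baseType is null.
    !0 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)

    ; "const void *": qualify a null baseType first, then point at the result.
    !1 = !DIDerivedType(tag: DW_TAG_const_type, baseType: null)
    !2 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !1, size: 64)
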
6141 .. _DICompositeType:
6146 ``DICompositeType`` nodes represent types composed of other types, like
6147 structures and unions. ``elements:`` points to a tuple of the composed types.
6149 If the source language supports ODR, the ``identifier:`` field gives the unique
6150 identifier used for type merging between modules. When specified,
6151 :ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
6152 derived types <DIDerivedTypeMember>` that reference the ODR-type in their
6153 ``scope:`` change uniquing rules.
6155 For a given ``identifier:``, there should only be a single composite type that
6156 does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
6157 together will unique such definitions at parse time via the ``identifier:``
6158 field, even if the nodes are ``distinct``.
6160 .. code-block:: text
6162 !0 = !DIEnumerator(name: "SixKind", value: 7)
6163 !1 = !DIEnumerator(name: "SevenKind", value: 7)
6164 !2 = !DIEnumerator(name: "NegEightKind", value: -8)
6165 !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
6166 line: 2, size: 32, align: 32, identifier: "_M4Enum",
6167 elements: !{!0, !1, !2})
6169 The following ``tag:`` values are valid:
6171 .. code-block:: text
6173 DW_TAG_array_type = 1
6174 DW_TAG_class_type = 2
6175 DW_TAG_enumeration_type = 4
6176 DW_TAG_structure_type = 19
6177 DW_TAG_union_type = 23
6179 For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
6180 descriptors <DISubrange>`, each representing the range of subscripts at that
6181 level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
6182 array type is a native packed vector. The optional ``dataLocation`` is a
6183 DIExpression that describes how to get from an object's address to the actual
6184 raw data, if they aren't equivalent. This is only supported for array types,
6185 particularly to describe Fortran arrays, which have an array descriptor in
6186 addition to the array data. Alternatively, it can be a DIVariable that
6187 holds the address of the actual raw data. The Fortran language supports pointer
6188 arrays which can be attached to actual arrays; this attachment between pointer
6189 and pointee is called association. The optional ``associated`` is a
6190 DIExpression that describes whether the pointer array is currently associated.
6191 The optional ``allocated`` is a DIExpression that describes whether the
6192 allocatable array is currently allocated. The optional ``rank`` is a
6193 DIExpression that describes the rank (number of dimensions) of a Fortran
6194 assumed-rank array (the rank is known only at run time).
6196 For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
6197 descriptors <DIEnumerator>`, each representing the definition of an enumeration
6198 value for the set. All enumeration type descriptors are collected in the
6199 ``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
6201 For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
6202 ``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
6203 <DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
6204 ``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
6205 ``isDefinition: false``.
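
As an illustration of the structure case, the sketch below describes a
hypothetical C type ``struct Point { int x; int y; };``. File names, line
numbers, and node numbers are assumptions made for the example.

.. code-block:: text

    !0 = !DIFile(filename: "point.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

    ; The composite type lists its members through the elements: tuple.
    !2 = !DICompositeType(tag: DW_TAG_structure_type, name: "Point", file: !0,
                          line: 1, size: 64, elements: !3)
    !3 = !{!4, !5}

    ; Each member is a derived type scoped to the struct; offset: is in bits.
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 2, baseType: !1, size: 32)
    !5 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 3, baseType: !1, size: 32, offset: 32)
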
6212 ``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
6213 :ref:`DICompositeType`.
6215 - ``count: -1`` indicates an empty array.
6216 - ``count: !10`` describes the count with a :ref:`DILocalVariable`.
6217 - ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
6219 .. code-block:: text
6221 !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
6222 !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
6223 !2 = !DISubrange(count: -1) ; empty array.
6225 ; Scopes used in rest of example
6226 !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
6227 !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
6228 !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
6230 ; Use of local variable as count value
6231 !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6232 !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
6233 !11 = !DISubrange(count: !10, lowerBound: 0)
6235 ; Use of global variable as count value
6236 !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
6237 !13 = !DISubrange(count: !12, lowerBound: 0)
6244 ``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
6245 variants of :ref:`DICompositeType`.
6247 .. code-block:: text
6249 !0 = !DIEnumerator(name: "SixKind", value: 7)
6250 !1 = !DIEnumerator(name: "SevenKind", value: 7)
6251 !2 = !DIEnumerator(name: "NegEightKind", value: -8)
6253 DITemplateTypeParameter
6254 """""""""""""""""""""""
6256 ``DITemplateTypeParameter`` nodes represent type parameters to generic source
6257 language constructs. They are used (optionally) in :ref:`DICompositeType` and
6258 :ref:`DISubprogram` ``templateParams:`` fields.
6260 .. code-block:: text
6262 !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
6264 DITemplateValueParameter
6265 """"""""""""""""""""""""
6267 ``DITemplateValueParameter`` nodes represent value parameters to generic source
6268 language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
6269 but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
6270 ``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
6271 :ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
6273 .. code-block:: text
6275 !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
6280 ``DINamespace`` nodes represent namespaces in the source language.
6282 .. code-block:: text
6284 !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
6286 .. _DIGlobalVariable:
6291 ``DIGlobalVariable`` nodes represent global variables in the source language.
6293 .. code-block:: text
6295 @foo = global i32 0, !dbg !0
6296 !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
6297 !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
6298 file: !3, line: 7, type: !4, isLocal: true,
6299 isDefinition: false, declaration: !5)
6302 DIGlobalVariableExpression
6303 """"""""""""""""""""""""""
6305 ``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
6306 with a :ref:`DIExpression`.
6308 .. code-block:: text
6310 @lower = global i32 0, !dbg !0
6311 @upper = global i32 0, !dbg !1
6312 !0 = !DIGlobalVariableExpression(
6314 expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
6316 !1 = !DIGlobalVariableExpression(
6318 expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
6320 !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
6321 file: !4, line: 8, type: !5, declaration: !6)
6323 All global variable expressions should be referenced by the ``globals:`` field of
6324 a :ref:`compile unit <DICompileUnit>`.
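
The relationship can be sketched with a small module. All names, paths, and
the producer string are hypothetical; the point is only that the node attached
to the global with ``!dbg`` also appears in the compile unit's ``globals:``
tuple.

.. code-block:: text

    @foo = global i32 0, !dbg !0

    !llvm.dbg.cu = !{!2}
    !llvm.module.flags = !{!6}

    !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
    !1 = distinct !DIGlobalVariable(name: "foo", scope: !2, file: !3, line: 1,
                                    type: !5, isLocal: false, isDefinition: true)
    !2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3,
                                 producer: "example", emissionKind: FullDebug,
                                 globals: !4)
    !3 = !DIFile(filename: "foo.c", directory: "/path/to/dir")
    !4 = !{!0}   ; the globals: tuple references the expression node
    !5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !6 = !{i32 2, !"Debug Info Version", i32 3}
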
6331 ``DISubprogram`` nodes represent functions from the source language. A distinct
6332 ``DISubprogram`` may be attached to a function definition using ``!dbg``
6333 metadata. A unique ``DISubprogram`` may be attached to a function declaration
6334 used for call site debug info. The ``retainedNodes:`` field is a list of
6335 :ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
6336 retained, even if their IR counterparts are optimized out of the IR. The
6337 ``type:`` field must point at a :ref:`DISubroutineType`.
6339 .. _DISubprogramDeclaration:
6341 When ``spFlags: DISPFlagDefinition`` is not present, subprograms describe a
6342 declaration in the type tree as opposed to a definition of a function. In this
6343 case, the ``declaration`` field must be empty. If the scope is a composite type
6344 with an ODR ``identifier:`` and that does not set ``flags: DIFlagFwdDecl``, then
6345 the subprogram declaration is uniqued based only on its ``linkageName:`` and
6348 .. code-block:: text
6350 define void @_Z3foov() !dbg !0 {
6354 !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
6355 file: !2, line: 7, type: !3,
6356 spFlags: DISPFlagDefinition | DISPFlagLocalToUnit,
6357 scopeLine: 8, containingType: !4,
6358 virtuality: DW_VIRTUALITY_pure_virtual,
6359 virtualIndex: 10, flags: DIFlagPrototyped,
6360 isOptimized: true, unit: !5, templateParams: !6,
6361 declaration: !7, retainedNodes: !8,
6369 ``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
6370 <DISubprogram>`. The line number and column numbers are used to distinguish
6371 two lexical blocks at the same depth. They are valid targets for ``scope:``
6374 .. code-block:: text
6376 !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
6378 Usually lexical blocks are ``distinct`` to prevent node merging based on
6381 .. _DILexicalBlockFile:
6386 ``DILexicalBlockFile`` nodes are used to discriminate between sections of a
6387 :ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
6388 indicate textual inclusion, or the ``discriminator:`` field can be used to
6389 discriminate between control flow within a single block in the source language.
6391 .. code-block:: text
6393 !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
6394 !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
6395 !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
6402 ``DILocation`` nodes represent source debug locations. The ``scope:`` field is
6403 mandatory, and points at a :ref:`DILexicalBlockFile`, a
6404 :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6406 .. code-block:: text
6408 !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
6410 .. _DILocalVariable:
6415 ``DILocalVariable`` nodes represent local variables in the source language. If
6416 the ``arg:`` field is set to non-zero, then this variable is a subprogram
6417 parameter, and it will be included in the ``retainedNodes:`` field of its
6418 :ref:`DISubprogram`.
6420 .. code-block:: text
6422 !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
6423 type: !3, flags: DIFlagArtificial)
6424 !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
6426 !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
6433 ``DIExpression`` nodes represent expressions that are inspired by the DWARF
6434 expression language. They are used in :ref:`debug records <debugrecords>`
6435 (such as ``#dbg_declare`` and ``#dbg_value``) to describe how the
6436 referenced LLVM variable relates to the source language variable. Debug
6437 expressions are interpreted left-to-right: start by pushing the value/address
6438 operand of the record onto a stack, then repeatedly push and evaluate
6439 opcodes from the DIExpression until the final variable description is produced.
6441 The currently supported opcode vocabulary is limited:
6443 - ``DW_OP_deref`` dereferences the top of the expression stack.
6444 - ``DW_OP_plus`` pops the last two entries from the expression stack, adds
6445 them together and appends the result to the expression stack.
6446 - ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
6447 the last entry from the second last entry and appends the result to the
6449 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
6450 - ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
6451 here, respectively) of the variable fragment from the working expression. Note
6452 that contrary to DW_OP_bit_piece, the offset is describing the location
6453 within the described source variable.
6454 - ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
6455 (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
6456 expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
6457 that references a base type constructed from the supplied values.
6458 - ``DW_OP_LLVM_extract_bits_sext, 16, 8`` specifies the offset and size
6459 (``16`` and ``8`` here, respectively) of bits that are to be extracted and
6460 sign-extended from the value at the top of the expression stack. If the top of
6461 the expression stack is a memory location then these bits are extracted from
6462 the value pointed to by that memory location. Maps into a ``DW_OP_shl``
6463 followed by ``DW_OP_shra``.
6464 - ``DW_OP_LLVM_extract_bits_zext`` behaves similarly to
6465 ``DW_OP_LLVM_extract_bits_sext``, but zero-extends instead of sign-extending.
6466 Maps into a ``DW_OP_shl`` followed by ``DW_OP_shr``.
6467 - ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
6468 optionally applied to the pointer. The memory tag is derived from the
6469 given tag offset in an implementation-defined manner.
6470 - ``DW_OP_swap`` swaps the top two stack entries.
6471 - ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the top
6472 of the stack is treated as an address. The second stack entry is treated as an
6473 address space identifier.
6474 - ``DW_OP_stack_value`` marks a constant value.
6475 - ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
6476 function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
6477 DIExpression(DW_OP_LLVM_entry_value, 1, ...))`` is lowered to
6478 ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
6479 function entry onto the DWARF expression stack.
6481 The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
6482 block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
6483 DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
6484 the entry value of ``reg`` is pushed onto the stack, and 123 is added to it.
6485 Due to framework limitations ``N`` must be 1, in other words,
6486 ``DW_OP_entry_value`` always refers to the value/address operand of the
6489 Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
6490 usually used in MIR, but it is also allowed in LLVM IR when targeting a
6491 :ref:`swiftasync <swiftasync>` argument. The operation is introduced by:
6493 - ``LiveDebugValues`` pass, which applies it to function parameters that
6494 are unmodified throughout the function. Support is limited to simple
6495 register location descriptions, or as indirect locations (e.g.,
6496 parameters passed-by-value to a callee via a pointer to a temporary copy
6497 made in the caller).
6498 - ``AsmPrinter`` pass when a call site parameter value
6499 (``DW_AT_call_site_parameter_value``) is represented as entry value of
6501 - ``CoroSplit`` pass, which may move variables from allocas into a
6502 coroutine frame. If the coroutine frame is a
6503 :ref:`swiftasync <swiftasync>` argument, the variable is described with
6504 a ``DW_OP_LLVM_entry_value`` operation.
6506 - ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
6507 value, such as one that calculates the sum of two registers. This is always
6508 used in combination with an ordered list of values, such that
6509 ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
6510 example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
6511 DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
6512 ``%reg1 - %reg2``. This list of values should be provided by the containing
6513 intrinsic/instruction.
6514 - ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the content at the provided
6515 signed offset from the specified register. The opcode is only generated by the
6516 ``AsmPrinter`` pass to describe a call site parameter value which requires an
6517 expression over two registers.
6518 - ``DW_OP_push_object_address`` pushes the address of the object which can then
6519 serve as a descriptor in subsequent calculation. This opcode can be used to
6520 calculate the bounds of a Fortran allocatable array which has array descriptors.
6521 - ``DW_OP_over`` duplicates the entry currently second in the stack at the top
6522 of the stack. This opcode can be used to calculate the bounds of a Fortran
6523 assumed-rank array whose rank is known at run time and whose current dimension
6524 number is implicitly the first element of the stack.
6525 - ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
6526 be used to represent pointer variables which are optimized out but whose
6527 pointee value is known. This operator is required because it differs from the
6528 DWARF operator ``DW_OP_implicit_pointer`` in representation and specification
6529 (number and types of operands), and the latter cannot be used for multiple levels.
6531 .. code-block:: text
6535 #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
6536 !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
6538 !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
6539 !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6540 !20 = !DILocation(line: 10, scope: !12)
6544 #dbg_value(i32 4, !17,
6545 !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
6547 !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
6549 !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
6550 !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
6551 !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6552 !21 = !DILocation(line: 10, scope: !12)
6554 DWARF specifies three kinds of simple location descriptions: register, memory,
6555 and implicit location descriptions. Note that a location description is
6556 defined over certain ranges of a program, i.e., the location of a variable may
6557 change over the course of the program. Register and memory location
6558 descriptions describe the *concrete location* of a source variable (in the
6559 sense that a debugger might modify its value), whereas *implicit locations*
6560 describe merely the actual *value* of a source variable which might not exist
6561 in registers or in memory (see ``DW_OP_stack_value``).
6563 A ``#dbg_declare`` record describes an indirect value (the address) of a
6564 source variable. The first operand of the record must be an address of some
6565 kind. A DIExpression operand to the record refines this address to produce a
6566 concrete location for the source variable.
6568 A ``#dbg_value`` record describes the direct value of a source variable.
6569 The first operand of the record may be a direct or indirect value. A
6570 DIExpression operand to the record refines the first operand to produce a
6571 direct value. For example, if the first operand is an indirect value, it may be
necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
direct value.
6577 A DIExpression is interpreted in the same way regardless of which kind of
6578 debug record it's attached to.
6580 DIExpressions are always printed and parsed inline; they can never be
6581 referenced by an ID (e.g. ``!1``).
6583 .. code-block:: text
6585 !DIExpression(DW_OP_deref)
6586 !DIExpression(DW_OP_plus_uconst, 3)
6587 !DIExpression(DW_OP_constu, 3, DW_OP_plus)
6588 !DIExpression(DW_OP_bit_piece, 3, 7)
6589 !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
6590 !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
6591 !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
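
The stack-machine reading of a few of the operators above can be sketched in
Python. This is an illustrative model only, not LLVM's implementation: the
function name ``eval_diexpression`` is hypothetical, only ``DW_OP_constu``,
``DW_OP_plus``, ``DW_OP_plus_uconst`` and ``DW_OP_swap`` are handled, and the
sketch assumes that the value of the record's first operand starts out as the
only stack entry.

.. code-block:: python

    def eval_diexpression(initial, ops):
        """Evaluate a tiny subset of DIExpression operators on an integer stack.

        `initial` models the value of the debug record's first operand and
        `ops` is the flat operand list of the DIExpression.
        """
        stack = [initial]
        i = 0
        while i < len(ops):
            op = ops[i]
            if op == "DW_OP_constu":          # push an unsigned constant
                stack.append(ops[i + 1])
                i += 2
            elif op == "DW_OP_plus_uconst":   # add a constant to the top entry
                stack[-1] += ops[i + 1]
                i += 2
            elif op == "DW_OP_plus":          # pop two entries, push their sum
                rhs = stack.pop()
                lhs = stack.pop()
                stack.append(lhs + rhs)
                i += 1
            elif op == "DW_OP_swap":          # exchange the top two entries
                stack[-1], stack[-2] = stack[-2], stack[-1]
                i += 1
            else:
                raise ValueError(f"operator not modeled: {op}")
        return stack[-1]


    # Both spellings add 3 to the input value.
    a = eval_diexpression(100, ["DW_OP_plus_uconst", 3])
    b = eval_diexpression(100, ["DW_OP_constu", 3, "DW_OP_plus"])

Under these assumptions, ``DW_OP_plus_uconst, 3`` and
``DW_OP_constu, 3, DW_OP_plus`` produce the same result for any input value.
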
``DIAssignID`` nodes have no operands and are always distinct. They are used to
link together :ref:`#dbg_assign records <debugrecords>` and the IR instructions
that store to memory. See `Debug Info Assignment Tracking
<AssignmentTracking.html>`_ for more info.
6601 .. code-block:: llvm
6603 store i32 %a, ptr %a.addr, align 4, !DIAssignID !2
6604 #dbg_assign(%a, !1, !DIExpression(), !2, %a.addr, !DIExpression(), !3)
6606 !2 = distinct !DIAssignID()
6611 .. FIXME In the implementation this is not a "node", but as it can only appear
6612 inline in a function context that distinction isn't observable anyway. Even
6613 if it is not required, it would be nice to be more clear about what is a
6614 "node", and what that actually means. The names in the implementation could
6615 also be updated to mirror whatever we decide here.
6617 ``DIArgList`` nodes hold a list of constant or SSA value references. These are
6618 used in :ref:`debug records <debugrecords>` in combination with a
6619 ``DIExpression`` that uses the
6620 ``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
6621 within a function, it must only be used as a function argument, must always be
6622 inlined, and cannot appear in named metadata.
6624 .. code-block:: text
6626 #dbg_value(!DIArgList(i32 %a, i32 %b),
6628 !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus),
6634 These flags encode various properties of DINodes.
6636 The `ExportSymbols` flag marks a class, struct or union whose members
6637 may be referenced as if they were defined in the containing class or
6638 union. This flag is used to decide whether the DW_AT_export_symbols can
6639 be used for the structure type.
6644 ``DIObjCProperty`` nodes represent Objective-C property nodes.
6646 .. code-block:: text
6648 !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
6649 getter: "getFoo", attributes: 7, type: !2)
6654 ``DIImportedEntity`` nodes represent entities (such as modules) imported into a
6655 compile unit. The ``elements`` field is a list of renamed entities (such as
variables and subprograms) in the imported entity (such as a module).
6658 .. code-block:: text
6660 !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
6661 entity: !1, line: 7, elements: !3)
6663 !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
6664 entity: !5, line: 7)
``DIMacro`` nodes represent the definition or undefinition of a macro identifier.
6670 The ``name:`` field is the macro identifier, followed by macro parameters when
6671 defining a function-like macro, and the ``value`` field is the token-string
6672 used to expand the macro identifier.
6674 .. code-block:: text
6676 !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
6678 !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")
6683 ``DIMacroFile`` nodes represent inclusion of source files.
6684 The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
6685 appear in the included source file.
6687 .. code-block:: text
6689 !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2,
6697 ``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
6698 a ``DILabel`` are mandatory. The ``scope:`` field must be one of either a
6699 :ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6700 The ``name:`` field is the label identifier. The ``file:`` field is the
6701 :ref:`DIFile` the label is present in. The ``line:`` field is the source line
6702 within the file where the label is declared.
6704 .. code-block:: text
6706 !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
6711 ``DICommonBlock`` nodes represent Fortran common blocks. The ``scope:`` field
6712 is mandatory and points to a :ref:`DILexicalBlockFile`, a
6713 :ref:`DILexicalBlock`, or a :ref:`DISubprogram`. The ``declaration:``,
6714 ``name:``, ``file:``, and ``line:`` fields are optional.
6719 ``DIModule`` nodes represent a source language module, for example, a Clang
6720 module, or a Fortran module. The ``scope:`` field is mandatory and points to a
6721 :ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6722 The ``name:`` field is mandatory. The ``configMacros:``, ``includePath:``,
6723 ``apinotes:``, ``file:``, ``line:``, and ``isDecl:`` fields are optional.
6728 ``DIStringType`` nodes represent a Fortran ``CHARACTER(n)`` type, with a
6729 dynamic length and location encoded as an expression.
6730 The ``tag:`` field is optional and defaults to ``DW_TAG_string_type``. The ``name:``,
``stringLength:``, ``stringLengthExpression:``, ``stringLocationExpression:``,
6732 ``size:``, ``align:``, and ``encoding:`` fields are optional.
6734 If not present, the ``size:`` and ``align:`` fields default to the value zero.
The length in bits of the string is specified by the first of the following
fields that is present:

- ``stringLength:``, which points to a ``DIVariable`` whose value is the string
  length in bits.
- ``stringLengthExpression:``, which points to a ``DIExpression`` which
  computes the length in bits.
- ``size:``, which contains the literal length in bits.

6745 The ``stringLocationExpression:`` points to a ``DIExpression`` which describes
6746 the "data location" of the string object, if present.
In LLVM IR, memory does not have types, so LLVM's own type system is not
suitable for doing type-based alias analysis (TBAA). Instead, metadata is
added to the IR to describe a type system of a higher-level language. This
6754 can be used to implement C/C++ strict type aliasing rules, but it can also
6755 be used to implement custom alias analysis behavior for other languages.
6757 This description of LLVM's TBAA system is broken into two parts:
6758 :ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
6759 :ref:`Representation<tbaa_node_representation>` talks about the metadata
6760 encoding of various entities.
6762 It is always possible to trace any TBAA node to a "root" TBAA node (details
6763 in the :ref:`Representation<tbaa_node_representation>` section). TBAA
6764 nodes with different roots have an unknown aliasing relationship, and LLVM
6765 conservatively infers ``MayAlias`` between them. The rules mentioned in
6766 this section only pertain to TBAA nodes living under the same root.
6768 .. _tbaa_node_semantics:
6773 The TBAA metadata system, referred to as "struct path TBAA" (not to be
6774 confused with ``tbaa.struct``), consists of the following high level
6775 concepts: *Type Descriptors*, further subdivided into scalar type
6776 descriptors and struct type descriptors; and *Access Tags*.
6778 **Type descriptors** describe the type system of the higher level language
6779 being compiled. **Scalar type descriptors** describe types that do not
6780 contain other types. Each scalar type has a parent type, which must also
6781 be a scalar type or the TBAA root. Via this parent relation, scalar types
6782 within a TBAA root form a tree. **Struct type descriptors** denote types
6783 that contain a sequence of other type descriptors, at known offsets. These
6784 contained type descriptors can either be struct type descriptors themselves
6785 or scalar type descriptors.
6787 **Access tags** are metadata nodes attached to load and store instructions.
6788 Access tags use type descriptors to describe the *location* being accessed
6789 in terms of the type system of the higher level language. Access tags are
6790 tuples consisting of a base type, an access type and an offset. The base
6791 type is a scalar type descriptor or a struct type descriptor, the access
6792 type is a scalar type descriptor, and the offset is a constant integer.
The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
things:

6797 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
6798 or store) of a value of type ``AccessTy`` contained in the struct type
6799 ``BaseTy`` at offset ``Offset``.
6801 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
6802 ``AccessTy`` must be the same; and the access tag describes a scalar
6803 access with scalar type ``AccessTy``.
We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
tuples as follows:

6808 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
6809 ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
6810 described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
6811 undefined if ``Offset`` is non-zero.
6813 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
6814 is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
6815 ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
6816 to be relative within that inner type.
A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa. If memory
accesses alias even though they are noalias according to ``!tbaa`` metadata,
the behavior is undefined.
6825 As a concrete example, the type descriptor graph for the following program
6831 float f; // offset 4
6835 float f; // offset 0
6836 double d; // offset 4
6837 struct Inner inner_a; // offset 12
6840 void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
6841 outer->f = 0; // tag0: (OuterStructTy, FloatScalarTy, 0)
6842 outer->inner_a.i = 0; // tag1: (OuterStructTy, IntScalarTy, 12)
6843 outer->inner_a.f = 0.0; // tag2: (OuterStructTy, FloatScalarTy, 16)
6844 *f = 0.0; // tag3: (FloatScalarTy, FloatScalarTy, 0)
is (note that in C and C++, ``char`` can be used to access any arbitrary
type):
6850 .. code-block:: text
6853 CharScalarTy = ("char", Root, 0)
6854 FloatScalarTy = ("float", CharScalarTy, 0)
6855 DoubleScalarTy = ("double", CharScalarTy, 0)
6856 IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
6858 OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
6859 (InnerStructTy, 12)}
6862 with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
6863 0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
6864 ``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
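
The ``ImmediateParent`` relation and the aliasing rule can be modeled in a few
lines of Python. This sketch follows the rule exactly as stated in this section
and is not LLVM's implementation. The dictionaries and the function names
``immediate_parent``, ``reachable`` and ``may_alias`` are hypothetical, and
reachability is assumed to include the starting tuple itself. For a struct, the
containing field is taken to be the last field that starts at or before the
offset.

.. code-block:: python

    # Scalar descriptors: name -> parent name (None marks the TBAA root).
    SCALARS = {
        "Root": None,
        "CharScalarTy": "Root",
        "FloatScalarTy": "CharScalarTy",
        "DoubleScalarTy": "CharScalarTy",
        "IntScalarTy": "CharScalarTy",
    }

    # Struct descriptors: name -> list of (field type, offset), sorted by offset.
    STRUCTS = {
        "InnerStructTy": [("IntScalarTy", 0), ("FloatScalarTy", 4)],
        "OuterStructTy": [("FloatScalarTy", 0), ("DoubleScalarTy", 4),
                          ("InnerStructTy", 12)],
    }


    def immediate_parent(base_ty, offset):
        """Return ImmediateParent(base_ty, offset), or None if it is undefined."""
        if base_ty in STRUCTS:
            # Pick the last field that starts at or before `offset`.
            candidates = [f for f in STRUCTS[base_ty] if f[1] <= offset]
            if not candidates:
                return None
            field_ty, field_offset = candidates[-1]
            return (field_ty, offset - field_offset)
        parent = SCALARS[base_ty]
        if parent is None or offset != 0:
            return None
        return (parent, 0)


    def reachable(start, target):
        """True if `target` is `start` or lies on its ImmediateParent chain."""
        node = start
        while node is not None:
            if node == target:
                return True
            node = immediate_parent(*node)
        return False


    def may_alias(tag1, tag2):
        """Apply the aliasing rule to two (BaseTy, AccessTy, Offset) tags."""
        a = (tag1[0], tag1[2])
        b = (tag2[0], tag2[2])
        return reachable(a, b) or reachable(b, a)


    tag0 = ("OuterStructTy", "FloatScalarTy", 0)
    tag1 = ("OuterStructTy", "IntScalarTy", 12)
    tag2 = ("OuterStructTy", "FloatScalarTy", 16)
    tag3 = ("FloatScalarTy", "FloatScalarTy", 0)

In this model, ``tag0`` and ``tag2`` alias ``tag3`` because both reach
``(FloatScalarTy, 0)``, while ``tag1`` does not, since its chain passes through
``(IntScalarTy, 0)`` and ``(CharScalarTy, 0)`` only.
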
6866 .. _tbaa_node_representation:
6871 The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
6872 with exactly one ``MDString`` operand.
Scalar type descriptors are represented as ``MDNode`` s with two
operands. The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root. Scalar
type descriptors can have an optional third operand, but that must be the
constant integer zero.
6883 Struct type descriptors are represented as ``MDNode`` s with an odd number
6884 of operands greater than 1. The first operand is an ``MDString`` denoting
6885 the name of the struct type. Like in scalar type descriptors the actual
6886 value of this name operand is irrelevant to LLVM. After the name operand,
6887 the struct type descriptors have a sequence of alternating ``MDNode`` and
6888 ``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
6889 an ``MDNode``, denotes a contained field, and the 2N th operand, a
6890 ``ConstantInt``, is the offset of the said contained field. The offsets
6891 must be in non-decreasing order.
6893 Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
6894 The first operand is an ``MDNode`` pointing to the node representing the
6895 base type. The second operand is an ``MDNode`` pointing to the node
6896 representing the access type. The third operand is a ``ConstantInt`` that
6897 states the offset of the access. If a fourth field is present, it must be
6898 a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
6899 that the location being accessed is "constant" (meaning
6900 ``pointsToConstantMemory`` should return true; see `other useful
6901 AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
6902 the access type and the base type of an access tag must be the same, and
6903 that is the TBAA root of the access tag.
6905 '``tbaa.struct``' Metadata
6906 ^^^^^^^^^^^^^^^^^^^^^^^^^^
The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.
6915 ``!tbaa.struct`` metadata can describe which memory subregions in a
6916 memcpy are padding and what the TBAA tags of the struct are.
The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its TBAA tag. For example:
6924 .. code-block:: llvm
6926 !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
6928 This describes a struct with two fields. The first is at offset 0 bytes
6929 with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
6930 and has size 4 bytes and has tbaa tag !2.
6932 Note that the fields need not be contiguous. In this example, there is a
6933 4 byte gap between the two fields. This gap represents padding which
6934 does not carry useful data and need not be preserved.
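
As a rough illustration of how the gaps follow from the operand triples, the
Python sketch below regroups a flat operand list and reports the byte ranges
that no field covers. The helper name ``padding_gaps`` and the ``total_size``
parameter (the number of bytes copied) are assumptions made for this example,
and tags are represented by plain strings.

.. code-block:: python

    def padding_gaps(tbaa_struct, total_size):
        """List the [start, end) byte ranges not covered by any field.

        `tbaa_struct` is the flat operand list (offset, size, tag, ...) and
        `total_size` is the number of bytes copied by the memcpy.
        """
        # Regroup the flat list into (offset, size, tag) triples.
        fields = [tuple(tbaa_struct[i:i + 3])
                  for i in range(0, len(tbaa_struct), 3)]
        gaps = []
        cursor = 0
        for offset, size, _tag in sorted(fields, key=lambda f: f[0]):
            if offset > cursor:
                gaps.append((cursor, offset))
            cursor = max(cursor, offset + size)
        if cursor < total_size:
            gaps.append((cursor, total_size))
        return gaps


    # The node from the example above, copied as a 12-byte aggregate.
    node = [0, 4, "!1", 8, 4, "!2"]
    gaps = padding_gaps(node, 12)

For the node above the only uncovered range is bytes 4 through 7, which is the
4 byte gap between the two fields.
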
6936 '``noalias``' and '``alias.scope``' Metadata
6937 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6939 ``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
6940 noalias memory-access sets. This means that some collection of memory access
6941 instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
6943 collection of memory access instructions that carry ``alias.scope`` metadata. If
6944 accesses from different collections alias, the behavior is undefined. Each type
6945 of metadata specifies a list of scopes where each scope has an id and a domain.
6947 When evaluating an aliasing query, if for some domain, the set
6948 of scopes with that domain in one instruction's ``alias.scope`` list is a
6949 subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.
6953 Because scopes in one domain don't affect scopes in other domains, separate
6954 domains can be used to compose multiple independent noalias sets. This is
6955 used for example during inlining. As the noalias function parameters are
6956 turned into noalias scope metadata, a new domain is used every time the
6957 function is inlined.
6959 The metadata identifying each domain is itself a list containing one or two
6960 entries. The first entry is the name of the domain. Note that if the name is a
6961 string then it can be combined across functions and translation units. A
6962 self-reference can be used to create globally unique domain names. A
6963 descriptive string may optionally be provided as a second list entry.
6965 The metadata identifying each scope is also itself a list containing two or
6966 three entries. The first entry is the name of the scope. Note that if the name
6967 is a string then it can be combined across functions and translation units. A
6968 self-reference can be used to create globally unique scope names. A metadata
6969 reference to the scope's domain is the second entry. A descriptive string may
6970 optionally be provided as a third list entry.
6974 .. code-block:: llvm
6976 ; Two scope domains:
6980 ; Some scopes in these domains:
6986 !5 = !{!4} ; A list containing only scope !4
6990 ; These two instructions don't alias:
6991 %0 = load float, ptr %c, align 4, !alias.scope !5
6992 store float %0, ptr %arrayidx.i, align 4, !noalias !5
6994 ; These two instructions also don't alias (for domain !1, the set of scopes
6995 ; in the !alias.scope equals that in the !noalias list):
6996 %2 = load float, ptr %c, align 4, !alias.scope !5
6997 store float %2, ptr %arrayidx.i2, align 4, !noalias !6
6999 ; These two instructions may alias (for domain !0, the set of scopes in
7000 ; the !noalias list is not a superset of, or equal to, the scopes in the
7001 ; !alias.scope list):
7002 %2 = load float, ptr %c, align 4, !alias.scope !6
7003 store float %0, ptr %arrayidx.i, align 4, !noalias !7
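
The per-domain subset rule can be modeled in Python as follows. This is a
sketch of the rule as described in this section, not LLVM's implementation.
Scopes are modeled as hypothetical ``(id, domain)`` pairs and the function names
are made up for the example. It assumes that only domains that actually occur
in the ``alias.scope`` list are considered, so that an empty set never matches
trivially. It checks one direction only; a full query would also test the
second instruction's ``alias.scope`` list against the first instruction's
``noalias`` list.

.. code-block:: python

    from collections import defaultdict


    def scopes_by_domain(scope_list):
        """Group a list of (scope_id, domain) pairs into {domain: set of ids}."""
        grouped = defaultdict(set)
        for scope_id, domain in scope_list:
            grouped[domain].add(scope_id)
        return grouped


    def assumed_no_alias(alias_scope, noalias):
        """Check one direction of the rule.

        Returns True if, for some domain present in `alias_scope`, every scope
        of that domain also appears in `noalias`.
        """
        wanted = scopes_by_domain(alias_scope)
        excluded = scopes_by_domain(noalias)
        return any(ids <= excluded.get(domain, set())
                   for domain, ids in wanted.items())


    # Hypothetical scopes: s2 and s3 live in domain d0, s4 lives in domain d1.
    s2, s3, s4 = ("s2", "d0"), ("s3", "d0"), ("s4", "d1")
    only_s4 = [s4]
    all_three = [s4, s3, s2]
    only_s3 = [s3]

With these lists, an access carrying ``only_s4`` as its scope list is assumed
not to alias an access whose noalias list is ``only_s4`` or ``all_three``,
whereas a scope list of ``all_three`` is not covered by a noalias list of
``only_s3``.
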
7005 .. _fpmath-metadata:
7007 '``fpmath``' Metadata
7008 ^^^^^^^^^^^^^^^^^^^^^
7010 ``fpmath`` metadata may be attached to any instruction of floating-point
7011 type. It can be used to express the maximum acceptable error in the
7012 result of that instruction, in ULPs, thus potentially allowing the
7013 compiler to use a more efficient but less accurate method of computing
7014 it. ULP is defined as follows:
7016 If ``x`` is a real number that lies between two finite consecutive
7017 floating-point numbers ``a`` and ``b``, without being equal to one
7018 of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
7019 distance between the two non-equal finite floating-point numbers
7020 nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
7022 The metadata node shall consist of a single positive float type number
7023 representing the maximum relative error, for example:
7025 .. code-block:: llvm
7027 !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
7031 '``range``' Metadata
7032 ^^^^^^^^^^^^^^^^^^^^
7034 ``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
7035 integer or vector of integer types. It expresses the possible ranges the loaded
7036 value or the value returned by the called function at this call site is in. If
7037 the loaded or returned value is not in the specified range, a poison value is
7038 returned instead. The ranges are represented with a flattened list of integers.
7039 The loaded value or the value returned is known to be in the union of the ranges
7040 defined by each consecutive pair. Each pair has the following properties:
7042 - The type must match the scalar type of the instruction.
7043 - The pair ``a,b`` represents the range ``[a,b)``.
7044 - Both ``a`` and ``b`` are constants.
7045 - The range is allowed to wrap.
- The range should not represent the full or empty set. That is,
  ``a!=b``.
7049 In addition, the pairs must be in signed order of the lower bound and
7050 they must be non-contiguous.
7052 For vector-typed instructions, the range is applied element-wise.
7056 .. code-block:: llvm
7058 %a = load i8, ptr %x, align 1, !range !0 ; Can only be 0 or 1
7059 %b = load i8, ptr %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
7060 %c = call i8 @foo(), !range !2 ; Can only be 0, 1, 3, 4 or 5
7061 %d = invoke i8 @bar() to label %cont
7062 unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    %e = load <2 x i8>, ptr %x, !range !0 ; Can only be <0 or 1, 0 or 1>
7065 !0 = !{ i8 0, i8 2 }
7066 !1 = !{ i8 255, i8 2 }
7067 !2 = !{ i8 0, i8 2, i8 3, i8 6 }
7068 !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
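
The half-open, possibly wrapping ranges can be checked with a small Python
model. The function names and the ``bits`` parameter (the width of the scalar
integer type) are assumptions made for this sketch. It only illustrates the
membership rule and does not validate ordering or contiguity of the pairs.

.. code-block:: python

    def in_range_metadata(value, pairs, bits):
        """Return True if `value` lies in the union of the [a, b) ranges.

        `pairs` is the flattened list from the metadata node and `bits` is the
        width of the integer type; all arithmetic is modulo 2**bits.
        """
        mask = (1 << bits) - 1
        v = value & mask
        for i in range(0, len(pairs), 2):
            a = pairs[i] & mask
            b = pairs[i + 1] & mask
            if a == b:
                raise ValueError("a pair must not denote the full or empty set")
            if a < b:
                hit = a <= v < b          # ordinary range
            else:
                hit = v >= a or v < b     # range wraps past the maximum value
            if hit:
                return True
        return False


    def allowed_values(pairs, bits):
        """Enumerate every value of the type that the metadata permits."""
        return [v for v in range(1 << bits)
                if in_range_metadata(v, pairs, bits)]

For example, ``allowed_values([255, 2], 8)`` yields 0, 1 and 255, matching the
comment on ``%b`` above, and ``allowed_values([-2, 0, 3, 6], 8)`` yields 3, 4,
5, 254 and 255, where 254 and 255 are the unsigned spellings of -2 and -1.
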
7070 '``absolute_symbol``' Metadata
7071 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7073 ``absolute_symbol`` metadata may be attached to a global variable
7074 declaration. It marks the declaration as a reference to an absolute symbol,
7075 which causes the backend to use absolute relocations for the symbol even
7076 in position independent code, and expresses the possible ranges that the
7077 global variable's *address* (not its value) is in, in the same format as
7078 ``range`` metadata, with the extension that the pair ``all-ones,all-ones``
7079 may be used to represent the full set.
7081 Example (assuming 64-bit pointers):
7083 .. code-block:: llvm
7085 @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
7086 @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
7089 !0 = !{ i64 0, i64 256 }
7090 !1 = !{ i64 -1, i64 -1 }
7092 '``callees``' Metadata
7093 ^^^^^^^^^^^^^^^^^^^^^^
7095 ``callees`` metadata may be attached to indirect call sites. If ``callees``
7096 metadata is attached to a call site, and any callee is not among the set of
7097 functions provided by the metadata, the behavior is undefined. The intent of
7098 this metadata is to facilitate optimizations such as indirect-call promotion.
7099 For example, in the code below, the call instruction may only target the
7100 ``add`` or ``sub`` functions:
7102 .. code-block:: llvm
7104 %result = call i64 %binop(i64 %x, i64 %y), !callees !0
7107 !0 = !{ptr @add, ptr @sub}
7109 '``callback``' Metadata
7110 ^^^^^^^^^^^^^^^^^^^^^^^
``callback`` metadata may be attached to a function declaration or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with this metadata as a broker
function. The metadata describes how the arguments of a call to the broker are
in turn passed to the callback function specified by the metadata. Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
7119 semantic restriction on the broker function itself is that it is not allowed to
7120 inspect or modify arguments referenced in the ``callback`` metadata as
7121 pass-through to the callback function.
7123 The broker is not required to actually invoke the callback function at runtime.
7124 However, the assumptions about not inspecting or modifying arguments that would
7125 be passed to the specified callback function still hold, even if the callback
7126 function is not dynamically invoked. The broker is allowed to invoke the
7127 callback function more than once per invocation of the broker. The broker is
7128 also allowed to invoke (directly or indirectly) the function passed as a
7129 callback through another use. Finally, the broker is also allowed to relay the
7130 callback callee invocation to a different thread.
7132 The metadata is structured as follows: At the outer level, ``callback``
7133 metadata is a list of ``callback`` encodings. Each encoding starts with a
7134 constant ``i64`` which describes the argument position of the callback function
7135 in the call to the broker. The following elements, except the last, describe
7136 what arguments are passed to the callback function. Each element is again an
7137 ``i64`` constant identifying the argument of the broker that is passed through,
7138 or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
7139 they are listed has to be the same in which they are passed to the callback
7140 callee. The last element of the encoding is a boolean which specifies how
7141 variadic arguments of the broker are handled. If it is true, all variadic
7142 arguments of the broker are passed through to the callback function *after* the
7143 arguments encoded explicitly before.
In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. This encoding
identifies the callback function as the broker argument at zero-based
position 2 (``i64 2``) and the sole argument of the callback function as the
broker argument at zero-based position 3 (``i64 3``).
7152 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
7153 error if the below is set to highlight as 'llvm', despite that we
7154 have misc.highlighting_failure set?
7156 .. code-block:: text
7158 declare !callback !1 dso_local i32 @pthread_create(ptr, ptr, ptr, ptr)
7161 !2 = !{i64 2, i64 3, i1 false}
Another example is shown below. The callback callee is the argument at
zero-based position 2 of the ``__kmpc_fork_call`` function (``i64 2``). The
callee is given two unknown values (each identified by a ``i64 -1``) and
afterwards all variadic arguments that are passed to the ``__kmpc_fork_call``
call (due to the final ``i1 true``).
7170 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
7171 error if the below is set to highlight as 'llvm', despite that we
7172 have misc.highlighting_failure set?
7174 .. code-block:: text
7176 declare !callback !0 dso_local void @__kmpc_fork_call(ptr, i32, ptr, ...)
7179 !1 = !{i64 2, i64 -1, i64 -1, i1 true}
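
To make the encoding concrete, the following Python sketch decodes one
``callback`` encoding against the arguments of a broker call. It is a model
for illustration only. The function name ``callback_call``, the
``num_fixed_params`` parameter and the placeholder argument names are all
hypothetical, and argument positions are assumed to be zero-based.

.. code-block:: python

    def callback_call(encoding, broker_args, num_fixed_params):
        """Return (callee, args) of the call described by one callback encoding.

        `broker_args` are the arguments of a call to the broker and
        `num_fixed_params` is the broker's number of non-variadic parameters.
        Unknown or inspected arguments (-1) are reported as None.
        """
        callee_pos, *payload, pass_varargs = encoding
        callee = broker_args[callee_pos]
        args = [None if pos == -1 else broker_args[pos] for pos in payload]
        if pass_varargs:
            # Variadic broker arguments follow the explicitly encoded ones.
            args.extend(broker_args[num_fixed_params:])
        return callee, args


    # Encoding !{i64 2, i64 3, i1 false} on a four-argument broker call.
    pthread = callback_call([2, 3, False],
                            ["thread", "attr", "start_routine", "arg"], 4)

    # Encoding !{i64 2, i64 -1, i64 -1, i1 true} on a variadic broker call.
    kmpc = callback_call([2, -1, -1, True],
                         ["loc", 2, "microtask", "x", "y"], 3)

In the first case the callee is ``"start_routine"`` and it receives ``"arg"``
only. In the second case the callee is ``"microtask"``, which receives two
unknown values followed by the variadic arguments ``"x"`` and ``"y"``.
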
7182 '``exclude``' Metadata
7183 ^^^^^^^^^^^^^^^^^^^^^^
7185 ``exclude`` metadata may be attached to a global variable to signify that its
7186 section should not be included in the final executable or shared library. This
7187 option is only valid for global variables with an explicit section targeting ELF
7188 or COFF. This is done using the ``SHF_EXCLUDE`` flag on ELF targets and the
7189 ``IMAGE_SCN_LNK_REMOVE`` and ``IMAGE_SCN_MEM_DISCARDABLE`` flags for COFF
7190 targets. Additionally, this metadata is only used as a flag, so the associated
7191 node must be empty. The explicit section should not conflict with any other
7192 sections that the user does not want removed after linking.
7194 .. code-block:: text
7196 @object = private constant [1 x i8] c"\00", section ".foo" !exclude !0
7201 '``unpredictable``' Metadata
7202 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7204 ``unpredictable`` metadata may be attached to any branch or switch
7205 instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
7207 optimizations related to compare and branch instructions. The metadata
7208 is treated as a boolean value; if it exists, it signals that the branch
7209 or switch that it is attached to is completely unpredictable.
7211 .. _md_dereferenceable:
7213 '``dereferenceable``' Metadata
7214 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7216 The existence of the ``!dereferenceable`` metadata on the instruction
7217 tells the optimizer that the value loaded is known to be dereferenceable,
7218 otherwise the behavior is undefined.
7219 The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
7221 attribute on parameters and return values.
7223 .. _md_dereferenceable_or_null:
7225 '``dereferenceable_or_null``' Metadata
7226 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7228 The existence of the ``!dereferenceable_or_null`` metadata on the
7229 instruction tells the optimizer that the value loaded is known to be either
7230 dereferenceable or null, otherwise the behavior is undefined.
7231 The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
7233 attribute on parameters and return values.
7240 It is sometimes useful to attach information to loop constructs. Currently,
7241 loop metadata is implemented as metadata attached to the branch instruction
7242 in the loop latch block. The loop metadata node is a list of
7243 other metadata nodes, each representing a property of the loop. Usually,
7244 the first item of the property node is a string. For example, the
``llvm.loop.unroll.count`` suggests an unroll factor to the loop
unroller:
7248 .. code-block:: llvm
7250 br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
7253 !1 = !{!"llvm.loop.unroll.enable"}
7254 !2 = !{!"llvm.loop.unroll.count", i32 4}
7256 For legacy reasons, the first item of a loop metadata node must be a
7257 reference to itself. Before the advent of the 'distinct' keyword, this
7258 forced the preservation of otherwise identical metadata nodes. Since
7259 the loop-metadata node can be attached to multiple nodes, the 'distinct'
7260 keyword has become unnecessary.
7262 Prior to the property nodes, one or two ``DILocation`` (debug location)
7263 nodes can be present in the list. The first, if present, identifies the
7264 source-code location where the loop begins. The second, if present,
7265 identifies the source-code location where the loop ends.
7267 Loop metadata nodes cannot be used as unique identifiers. They are
7268 neither persistent for the same loop through transformations nor
7269 necessarily unique to just one loop.
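
The shape of a loop metadata node can be mimicked with nested Python lists.
This is only a toy model. The helper ``loop_properties`` is hypothetical, a
property node is modeled as a list whose first item is a string, and any other
entries (such as debug locations) are simply skipped.

.. code-block:: python

    def loop_properties(loop_md):
        """Map property names to their operands for a modeled loop metadata node.

        `loop_md` is a list whose first item is the node itself (the
        self-reference); property nodes are lists starting with a string.
        """
        props = {}
        for item in loop_md[1:]:
            if isinstance(item, list) and item and isinstance(item[0], str):
                props[item[0]] = item[1:]
        return props


    # A node with a self-reference followed by two property nodes.
    loop_md = []
    loop_md.append(loop_md)                       # self-reference as first item
    loop_md.append(["llvm.loop.unroll.enable"])
    loop_md.append(["llvm.loop.unroll.count", 4])
    props = loop_properties(loop_md)

Looking up ``props["llvm.loop.unroll.count"]`` then gives the single operand
``4``, while ``llvm.loop.unroll.enable`` maps to an empty operand list.
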
7271 '``llvm.loop.disable_nonforced``'
7272 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7274 This metadata disables all optional loop transformations unless
7275 explicitly instructed using other transformation metadata such as
7276 ``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed into a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion can make
7280 other transformations impossible. Mandatory loop canonicalizations such
7281 as loop rotation are still applied.
It is recommended to use this metadata in addition to any ``llvm.loop.*``
7284 transformation directive. Also, any loop should have at most one
7285 directive applied to it (and a sequence of transformations built using
7286 followup-attributes). Otherwise, which transformation will be applied
7287 depends on implementation details such as the pass pipeline order.
7289 See :ref:`transformation-metadata` for details.
7291 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
7292 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7294 Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
7295 used to control per-loop vectorization and interleaving parameters such as
7296 vectorization width and interleave count. These metadata should be used in
7297 conjunction with ``llvm.loop`` loop identification metadata. The
7298 ``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
7299 optimization hints and the optimizer will only interleave and vectorize loops if
7300 it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata
7301 which contains information about loop-carried memory dependencies can be helpful
7302 in determining the safety of these transformations.
7304 '``llvm.loop.interleave.count``' Metadata
7305 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7307 This metadata suggests an interleave count to the loop interleaver.
7308 The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For example:
7312 .. code-block:: llvm
7314 !0 = !{!"llvm.loop.interleave.count", i32 4}
7316 Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
7317 multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
7318 then the interleave count will be determined automatically.
7320 '``llvm.loop.vectorize.enable``' Metadata
7321 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7323 This metadata selectively enables or disables vectorization for the loop. The
7324 first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
7325 is a bit. If the bit operand value is 1 vectorization is enabled. A value of
7326 0 disables vectorization:
7328 .. code-block:: llvm
7330 !0 = !{!"llvm.loop.vectorize.enable", i1 0}
7331 !1 = !{!"llvm.loop.vectorize.enable", i1 1}
7333 '``llvm.loop.vectorize.predicate.enable``' Metadata
7334 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7336 This metadata selectively enables or disables creating predicated instructions
7337 for the loop, which can enable folding of the scalar epilogue loop into the
7338 main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1, predication is enabled. A value of 0 disables
predication:
7343 .. code-block:: llvm
7345 !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
7346 !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
7348 '``llvm.loop.vectorize.scalable.enable``' Metadata
7349 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7351 This metadata selectively enables or disables scalable vectorization for the
7352 loop, and only has any effect if vectorization for the loop is already enabled.
7353 The first operand is the string ``llvm.loop.vectorize.scalable.enable``
7354 and the second operand is a bit. If the bit operand value is 1 scalable
7355 vectorization is enabled, whereas a value of 0 reverts to the default fixed
7356 width vectorization:
7358 .. code-block:: llvm
7360 !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
7361 !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
7363 '``llvm.loop.vectorize.width``' Metadata
7364 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7366 This metadata sets the target width of the vectorizer. The first
7367 operand is the string ``llvm.loop.vectorize.width`` and the second
7368 operand is an integer specifying the width. For example:
7370 .. code-block:: llvm
7372 !0 = !{!"llvm.loop.vectorize.width", i32 4}
7374 Note that setting ``llvm.loop.vectorize.width`` to 1 disables
7375 vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
7376 0 or if the loop does not have this metadata the width will be
7377 determined automatically.
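
The width hint can be combined with the other vectorization hints on the same
loop. The following is a sketch only; the pairing shown here is an assumption
about typical usage, and whether scalable vectors are actually used depends on
target support:

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2, !3}
   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
   !2 = !{!"llvm.loop.vectorize.width", i32 4}
   !3 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
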
7379 '``llvm.loop.vectorize.followup_vectorized``' Metadata
7380 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7382 This metadata defines which loop attributes the vectorized loop will
7383 have. See :ref:`transformation-metadata` for details.
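
As a hedged sketch, and assuming the followup node lists attribute nodes after
its name string as described in :ref:`transformation-metadata`, the following
asks for the vectorized loop to carry an unroll count hint:

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
   !2 = !{!"llvm.loop.vectorize.followup_vectorized", !3}
   !3 = !{!"llvm.loop.unroll.count", i32 2}  ; attribute for the vectorized loop
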
7385 '``llvm.loop.vectorize.followup_epilogue``' Metadata
7386 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7388 This metadata defines which loop attributes the epilogue will have. The
7389 epilogue is not vectorized and is executed when either the vectorized
7390 loop is not known to preserve semantics (because e.g., it processes two
7391 arrays that are found to alias by a runtime check) or for the last
7392 iterations that do not fill a complete set of vector lanes. See
7393 :ref:`Transformation Metadata <transformation-metadata>` for details.
7395 '``llvm.loop.vectorize.followup_all``' Metadata
7396 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Attributes in the metadata will be added to both the vectorized and
epilogue loops.
7400 See :ref:`Transformation Metadata <transformation-metadata>` for details.
7402 '``llvm.loop.unroll``'
7403 ^^^^^^^^^^^^^^^^^^^^^^
7405 Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
7406 optimization hints such as the unroll factor. ``llvm.loop.unroll``
7407 metadata should be used in conjunction with ``llvm.loop`` loop
7408 identification metadata. The ``llvm.loop.unroll`` metadata are only
7409 optimization hints and the unrolling will only be performed if the
7410 optimizer believes it is safe to do so.
7412 '``llvm.loop.unroll.count``' Metadata
7413 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7415 This metadata suggests an unroll factor to the loop unroller. The
7416 first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For example:
7420 .. code-block:: llvm
7422 !0 = !{!"llvm.loop.unroll.count", i32 4}
If the trip count of the loop is less than the unroll count, the loop
7425 will be partially unrolled.
7427 '``llvm.loop.unroll.disable``' Metadata
7428 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7430 This metadata disables loop unrolling. The metadata has a single operand
7431 which is the string ``llvm.loop.unroll.disable``. For example:
7433 .. code-block:: llvm
7435 !0 = !{!"llvm.loop.unroll.disable"}
7437 '``llvm.loop.unroll.runtime.disable``' Metadata
7438 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7440 This metadata disables runtime loop unrolling. The metadata has a single
7441 operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
7443 .. code-block:: llvm
7445 !0 = !{!"llvm.loop.unroll.runtime.disable"}
7447 '``llvm.loop.unroll.enable``' Metadata
7448 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7450 This metadata suggests that the loop should be fully unrolled if the trip count
7451 is known at compile time and partially unrolled if the trip count is not known
7452 at compile time. The metadata has a single operand which is the string
7453 ``llvm.loop.unroll.enable``. For example:
7455 .. code-block:: llvm
7457 !0 = !{!"llvm.loop.unroll.enable"}
7459 '``llvm.loop.unroll.full``' Metadata
7460 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7462 This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:
7466 .. code-block:: llvm
7468 !0 = !{!"llvm.loop.unroll.full"}
7470 '``llvm.loop.unroll.followup``' Metadata
7471 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7473 This metadata defines which loop attributes the unrolled loop will have.
7474 See :ref:`Transformation Metadata <transformation-metadata>` for details.
7476 '``llvm.loop.unroll.followup_remainder``' Metadata
7477 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7479 This metadata defines which loop attributes the remainder loop after
7480 partial/runtime unrolling will have. See
7481 :ref:`Transformation Metadata <transformation-metadata>` for details.
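
A possible use, sketched under the assumption that followup nodes list
attribute nodes after their name string: unroll the loop by a factor of four
and mark the remainder loop so that it is not unrolled again.

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.unroll.count", i32 4}
   !2 = !{!"llvm.loop.unroll.followup_remainder", !3}
   !3 = !{!"llvm.loop.unroll.disable"}  ; attribute for the remainder loop
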
7483 '``llvm.loop.unroll_and_jam``'
7484 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, and ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
7493 The metadata for unroll and jam otherwise is the same as for ``unroll``.
7494 ``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
7495 ``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
7496 ``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
7497 and the normal safety checks will still be performed.
7499 '``llvm.loop.unroll_and_jam.count``' Metadata
7500 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7502 This metadata suggests an unroll and jam factor to use, similarly to
7503 ``llvm.loop.unroll.count``. The first operand is the string
7504 ``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
7505 specifying the unroll factor. For example:
7507 .. code-block:: llvm
7509 !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.
7514 '``llvm.loop.unroll_and_jam.disable``' Metadata
7515 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata disables loop unroll and jam. The metadata has a single
7518 operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
7520 .. code-block:: llvm
7522 !0 = !{!"llvm.loop.unroll_and_jam.disable"}
7524 '``llvm.loop.unroll_and_jam.enable``' Metadata
7525 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata suggests that the loop should be fully unrolled and jammed if the
7528 trip count is known at compile time and partially unrolled if the trip count is
7529 not known at compile time. The metadata has a single operand which is the
7530 string ``llvm.loop.unroll_and_jam.enable``. For example:
7532 .. code-block:: llvm
7534 !0 = !{!"llvm.loop.unroll_and_jam.enable"}
7536 '``llvm.loop.unroll_and_jam.followup_outer``' Metadata
7537 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7539 This metadata defines which loop attributes the outer unrolled loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.
7543 '``llvm.loop.unroll_and_jam.followup_inner``' Metadata
7544 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7546 This metadata defines which loop attributes the inner jammed loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.
7550 '``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
7551 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7553 This metadata defines which attributes the epilogue of the outer loop
7554 will have. This loop is usually unrolled, meaning there is no such
7555 loop. This attribute will be ignored in this case. See
7556 :ref:`Transformation Metadata <transformation-metadata>` for details.
7558 '``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
7559 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7561 This metadata defines which attributes the inner loop of the epilogue
7562 will have. The outer epilogue will usually be unrolled, meaning there
7563 can be multiple inner remainder loops. See
7564 :ref:`Transformation Metadata <transformation-metadata>` for details.
7566 '``llvm.loop.unroll_and_jam.followup_all``' Metadata
7567 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Attributes specified in the metadata are added to all
7570 ``llvm.loop.unroll_and_jam.*`` loops. See
7571 :ref:`Transformation Metadata <transformation-metadata>` for details.
7573 '``llvm.loop.licm_versioning.disable``' Metadata
7574 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7576 This metadata indicates that the loop should not be versioned for the purpose
7577 of enabling loop-invariant code motion (LICM). The metadata has a single operand
7578 which is the string ``llvm.loop.licm_versioning.disable``. For example:
7580 .. code-block:: llvm
7582 !0 = !{!"llvm.loop.licm_versioning.disable"}
7584 '``llvm.loop.distribute.enable``' Metadata
7585 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7587 Loop distribution allows splitting a loop into multiple loops. Currently,
7588 this is only performed if the entire loop cannot be vectorized due to unsafe
7589 memory dependencies. The transformation will attempt to isolate the unsafe
7590 dependencies into their own loop.
7592 This metadata can be used to selectively enable or disable distribution of the
7593 loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
7594 second operand is a bit. If the bit operand value is 1 distribution is
7595 enabled. A value of 0 disables distribution:
7597 .. code-block:: llvm
7599 !0 = !{!"llvm.loop.distribute.enable", i1 0}
7600 !1 = !{!"llvm.loop.distribute.enable", i1 1}
7602 This metadata should be used in conjunction with ``llvm.loop`` loop
7603 identification metadata.
7605 '``llvm.loop.distribute.followup_coincident``' Metadata
7606 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7608 This metadata defines which attributes extracted loops with no cyclic
7609 dependencies will have (i.e. can be vectorized). See
7610 :ref:`Transformation Metadata <transformation-metadata>` for details.
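
For instance, a frontend might want the dependency-free loops produced by
distribution to be vectorized. The following is a sketch, assuming the followup
node lists attribute nodes after its name string:

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.distribute.enable", i1 1}
   !2 = !{!"llvm.loop.distribute.followup_coincident", !3}
   !3 = !{!"llvm.loop.vectorize.enable", i1 1}  ; applied to extracted loops
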
7612 '``llvm.loop.distribute.followup_sequential``' Metadata
7613 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7615 This metadata defines which attributes the isolated loops with unsafe
7616 memory dependencies will have. See
7617 :ref:`Transformation Metadata <transformation-metadata>` for details.
7619 '``llvm.loop.distribute.followup_fallback``' Metadata
7620 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If loop versioning is necessary, this metadata defines the attributes
7623 the non-distributed fallback version will have. See
7624 :ref:`Transformation Metadata <transformation-metadata>` for details.
7626 '``llvm.loop.distribute.followup_all``' Metadata
7627 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The attributes in this metadata are added to all followup loops of the
7630 loop distribution pass. See
7631 :ref:`Transformation Metadata <transformation-metadata>` for details.
7633 '``llvm.licm.disable``' Metadata
7634 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7636 This metadata indicates that loop-invariant code motion (LICM) should not be
7637 performed on this loop. The metadata has a single operand which is the string
7638 ``llvm.licm.disable``. For example:
7640 .. code-block:: llvm
7642 !0 = !{!"llvm.licm.disable"}
Note that although it operates per loop, it isn't given the ``llvm.loop`` prefix
7645 as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.
7647 '``llvm.access.group``' Metadata
7648 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7650 ``llvm.access.group`` metadata can be attached to any instruction that
7651 potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, as illustrated by the following example.
7657 .. code-block:: llvm
7659 %val = load i32, ptr %arrayidx, !llvm.access.group !0
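
The list node and the access groups that ``!0`` refers to could be defined as
in the following sketch. The wrapper function ``@read`` is hypothetical and
only makes the fragment self-contained:

.. code-block:: llvm

   define i32 @read(ptr %arrayidx) {
     %val = load i32, ptr %arrayidx, align 4, !llvm.access.group !0
     ret i32 %val
   }

   !0 = !{!1, !2}     ; list node: the load belongs to both access groups
   !1 = distinct !{}  ; first access group (distinct and empty)
   !2 = distinct !{}  ; second access group (distinct and empty)
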
7665 It is illegal for the list node to be empty since it might be confused
7666 with an access group.
7668 The access group metadata node must be 'distinct' to avoid collapsing
7669 multiple access groups by content. An access group metadata node must
always be empty, which distinguishes an access group
metadata node from a list of access groups. Being empty also avoids the
situation where the content must be updated, which, because metadata is
immutable by design, would require finding and updating all references
7674 to the access group node.
7676 The access group can be used to refer to a memory access instruction
7677 without pointing to it directly (which is not possible in global
7678 metadata). Currently, the only metadata making use of it is
7679 ``llvm.loop.parallel_accesses``.
7681 '``llvm.loop.parallel_accesses``' Metadata
7682 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7684 The ``llvm.loop.parallel_accesses`` metadata refers to one or more
7685 access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between it and other instructions
7687 in the loop with this metadata.
7689 Let ``m1`` and ``m2`` be two instructions that both have the
7690 ``llvm.access.group`` metadata to the access group ``g1``, respectively
7691 ``g2`` (which might be identical). If a loop contains both access groups
7692 in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
7693 assume that there is no dependency between ``m1`` and ``m2`` carried by
7694 this loop. Instructions that belong to multiple access groups are
7695 considered having this property if at least one of the access groups
7696 matches the ``llvm.loop.parallel_accesses`` list.
7698 If all memory-accessing instructions in a loop have
7699 ``llvm.access.group`` metadata that each refer to one of the access
7700 groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
7701 loop has no loop carried memory dependencies and is considered to be a
parallel loop. If there is a loop-carried dependency, the behavior is
undefined.
7705 Note that if not all memory access instructions belong to an access
7706 group referred to by ``llvm.loop.parallel_accesses``, then the loop must
7707 not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be
considered sequential (if optimization passes that are unaware of the parallel
semantics insert new memory instructions into the loop body).
7713 Example of a loop that is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses`` metadata types:

.. code-block:: llvm

   for.body:
     ...
     %val0 = load i32, ptr %arrayidx, !llvm.access.group !1
     store i32 %val0, ptr %arrayidx1, !llvm.access.group !1
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
   !1 = distinct !{}
7732 It is also possible to have nested parallel loops:
.. code-block:: llvm

   outer.for.body:
     ...
     %val1 = load i32, ptr %arrayidx3, !llvm.access.group !4
     ...
     br label %inner.for.body

   inner.for.body:
     ...
     %val0 = load i32, ptr %arrayidx1, !llvm.access.group !3
     ...
     store i32 %val0, ptr %arrayidx2, !llvm.access.group !3
     ...
     br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

   inner.for.end:
     ...
     store i32 %val1, ptr %arrayidx4, !llvm.access.group !4
     ...
     br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

   outer.for.end:                                          ; preds = %inner.for.end
   ...
   !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
   !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
   !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
   !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
7763 .. _langref_llvm_loop_mustprogress:
7765 '``llvm.loop.mustprogress``' Metadata
7766 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7768 The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
7769 terminate, unwind, or interact with the environment in an observable way e.g.
7770 via a volatile memory access, I/O, or other synchronization. If such a loop is
7771 not found to interact with the environment in an observable way, the loop may
7772 be removed. This corresponds to the ``mustprogress`` function attribute.
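
A minimal sketch of how the metadata might be attached, assuming a hypothetical
function ``@spin`` that polls a flag with an ordinary (non-volatile) load.
Because the loop body has no observable effect, the optimizer is allowed to
assume that the loop terminates:

.. code-block:: llvm

   define void @spin(ptr %flag) {
   entry:
     br label %loop

   loop:
     ; Non-volatile load: not an observable interaction with the environment.
     %v = load i32, ptr %flag, align 4
     %done = icmp ne i32 %v, 0
     br i1 %done, label %exit, label %loop, !llvm.loop !0

   exit:
     ret void
   }

   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.mustprogress"}
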
7774 '``irr_loop``' Metadata
7775 ^^^^^^^^^^^^^^^^^^^^^^^
7777 ``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
7780 terminator instruction of a basic block that is not really an irreducible loop
7781 header, the behavior is undefined. The intent of this metadata is to improve the
7782 accuracy of the block frequency propagation. For example, in the code below, the
7783 block ``header0`` may have a loop header weight (relative to the other headers of
7784 the irreducible loop) of 100:
.. code-block:: llvm

   header0:
   ...
   br i1 %cmp, label %t1, label %t2, !irr_loop !0

   ...
   !0 = !{!"loop_header_weight", i64 100}
7795 Irreducible loop header weights are typically based on profile data.
7797 .. _md_invariant.group:
7799 '``invariant.group``' Metadata
7800 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The experimental ``invariant.group`` metadata may be attached to
``load``/``store`` instructions, referencing a single metadata node with no
entries.
7804 The existence of the ``invariant.group`` metadata on the instruction tells
7805 the optimizer that every ``load`` and ``store`` to the same pointer operand
7806 can be assumed to load or store the same
7807 value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
7808 when two pointers are considered the same). Pointers returned by bitcast or
7809 getelementptr with only zero indices are considered the same.
Examples:

.. code-block:: llvm

   @unknownPtr = external global i8
   ...
   %ptr = alloca i8
   store i8 42, ptr %ptr, !invariant.group !0
   call void @foo(ptr %ptr)

   %a = load i8, ptr %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
   call void @foo(ptr %ptr)

   %newPtr = call ptr @getPointer(ptr %ptr)
   %c = load i8, ptr %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

   %unknownValue = load i8, ptr @unknownPtr
   store i8 %unknownValue, ptr %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

   call void @foo(ptr %ptr)
   %newPtr2 = call ptr @llvm.launder.invariant.group.p0(ptr %ptr)
   %d = load i8, ptr %newPtr2, !invariant.group !0 ; Can't step through launder.invariant.group to get value of %ptr

   ...
   declare void @foo(ptr)
   declare ptr @getPointer(ptr)
   declare ptr @llvm.launder.invariant.group.p0(ptr)

   !0 = !{}
The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is
tied to the SSA value of the pointer operand.
7845 .. code-block:: llvm
7847 %v = load i8, ptr %x, !invariant.group !0
7848 ; if %x mustalias %y then we can replace the above instruction with
7849 %v = load i8, ptr %y
7851 Note that this is an experimental feature, which means that its semantics might
7852 change in the future.
'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.
7859 '``associated``' Metadata
7860 ^^^^^^^^^^^^^^^^^^^^^^^^^
7862 The ``associated`` metadata may be attached to a global variable definition with
7863 a single argument that references a global object (optionally through an alias).
7865 This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
7866 discarding of the global variable in linker GC unless the referenced object is
7867 also discarded. The linker support for this feature is spotty. For best
7868 compatibility, globals carrying this metadata should:
7870 - Be in ``@llvm.compiler.used``.
7871 - If the referenced global variable is in a comdat, be in the same comdat.
``!associated`` cannot express a many-to-one relationship. A global variable with
7874 the metadata should generally not be referenced by a function: the function may
7875 be inlined into other functions, leading to more references to the metadata.
7876 Ideally we would want to keep metadata alive as long as any inline location is
7877 alive, but this many-to-one relationship is not representable. Moreover, if the
7878 metadata is retained while the function is discarded, the linker will report an
7879 error of a relocation referencing a discarded section.
7881 The metadata is often used with an explicit section consisting of valid C
7882 identifiers so that the runtime can find the metadata section with
7883 linker-defined encapsulation symbols ``__start_<section_name>`` and
7884 ``__stop_<section_name>``.
7886 It does not have any effect on non-ELF targets.
Example:

.. code-block:: text

   $a = comdat any
   @a = global i32 1, comdat $a
   @b = internal global i32 2, comdat $a, section "abc", !associated !0
   !0 = !{ptr @a}
'``prof``' Metadata
^^^^^^^^^^^^^^^^^^^

The ``prof`` metadata is used to record profile data in the IR.
7902 The first operand of the metadata node indicates the profile metadata
7903 type. There are currently 3 types:
7904 :ref:`branch_weights<prof_node_branch_weights>`,
7905 :ref:`function_entry_count<prof_node_function_entry_count>`, and
7906 :ref:`VP<prof_node_VP>`.
.. _prof_node_branch_weights:

branch_weights
""""""""""""""
7913 Branch weight metadata attached to a branch, select, switch or call instruction
represents the likelihood of the associated branch being taken.
7915 For more information, see :doc:`BranchWeightMetadata`.
7917 .. _prof_node_function_entry_count:
7919 function_entry_count
7920 """"""""""""""""""""
7922 Function entry count metadata can be attached to function definitions
7923 to record the number of times the function is called. Used with BFI
7924 information, it is also used to derive the basic block profile count.
7925 For more information, see :doc:`BranchWeightMetadata`.
.. _prof_node_VP:

VP
""

VP (value profile) metadata can be attached to instructions that have
7933 value profile information. Currently this is indirect calls (where it
7934 records the hottest callees) and calls to memory intrinsics such as memcpy,
7935 memmove, and memset (where it records the hottest byte lengths).
Each VP metadata node contains the string ``"VP"``, then a ``uint32_t`` value
for the value profiling kind, a ``uint64_t`` value for the total number of times
the instruction is executed, followed by pairs of ``uint64_t`` value and
execution count. The value profiling kind is 0 for indirect call targets and 1
for memory operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.
7945 Note that the value counts do not need to add up to the total count
7946 listed in the third operand (in practice only the top hottest values
7947 are tracked and reported).
7949 Indirect call example:
7951 .. code-block:: llvm
7953 call void %f(), !prof !1
7954 !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective preceding target
functions was called.
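
A memory operation profile follows the same layout with a profiling kind of 1.
The following is an illustrative sketch only: the operand names, the total
count, and the byte-length/count pairs are invented for the example.

.. code-block:: llvm

   call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 %len, i1 false), !prof !1
   ; kind 1, executed 1000 times; length 16 seen 700 times, length 64 seen 250 times
   !1 = !{!"VP", i32 1, i64 1000, i64 16, i64 700, i64 64, i64 250}

Here the two recorded counts add up to 950 rather than 1000, which is allowed
because only the hottest values are tracked.
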
7966 '``annotation``' Metadata
7967 ^^^^^^^^^^^^^^^^^^^^^^^^^
7969 The ``annotation`` metadata can be used to attach a tuple of annotation strings
7970 or a tuple of a tuple of annotation strings to any instruction. This metadata does
7971 not impact the semantics of the program and may only be used to provide additional
7972 insight about the program and transformations to users.
7976 .. code-block:: text
7978 %a.addr = alloca ptr, align 8, !annotation !0
7979 !0 = !{!"auto-init"}
7981 Embedding tuple of strings example:
.. code-block:: text

   %a.ptr = getelementptr ptr, ptr %base, i64 0, !annotation !0
   !0 = !{!1}
   !1 = !{!"gep offset", !"0"}
7989 '``func_sanitize``' Metadata
7990 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7992 The ``func_sanitize`` metadata is used to attach two values for the function
7993 sanitizer instrumentation. The first value is the ubsan function signature.
7994 The second value is the address of the proxy variable which stores the address
7995 of the RTTI descriptor. If :ref:`prologue <prologuedata>` and '``func_sanitize``'
7996 are used at the same time, :ref:`prologue <prologuedata>` is emitted before
7997 '``func_sanitize``' in the output.
.. code-block:: text

   @__llvm_rtti_proxy = private unnamed_addr constant ptr @_ZTIFvvE
   define void @_Z3funv() !func_sanitize !0 {
     ret void
   }
   !0 = !{i32 846595819, ptr @__llvm_rtti_proxy}
8011 '``kcfi_type``' Metadata
8012 ^^^^^^^^^^^^^^^^^^^^^^^^
8014 The ``kcfi_type`` metadata can be used to attach a type identifier to
8015 functions that can be called indirectly. The type data is emitted before the
8016 function entry in the assembly. Indirect calls with the :ref:`kcfi operand
bundle<ob_kcfi>` will emit a check that compares the type identifier to the
metadata.
.. code-block:: text

   define dso_local i32 @f() !kcfi_type !0 {
     ret i32 0
   }
   !0 = !{i32 12345678}
8029 Clang emits ``kcfi_type`` metadata nodes for address-taken functions with
8030 ``-fsanitize=kcfi``.
.. _md_memprof:

'``memprof``' Metadata
8035 ^^^^^^^^^^^^^^^^^^^^^^^^
8037 The ``memprof`` metadata is used to record memory profile data on heap
8038 allocation calls. Multiple context-sensitive profiles can be represented
8039 with a single ``memprof`` metadata attachment.
.. code-block:: text

   %call = call ptr @_Znam(i64 10), !memprof !0, !callsite !5
   !0 = !{!1, !3}
   !1 = !{!2, !"cold"}
   !2 = !{i64 4854880825882961848, i64 1905834578520680781}
   !3 = !{!4, !"notcold"}
   !4 = !{i64 4854880825882961848, i64 -6528110295079665978}
   !5 = !{i64 4854880825882961848}
8053 Each operand in the ``memprof`` metadata attachment describes the profiled
8054 behavior of memory allocated by the associated allocation for a given context.
8055 In the above example, there were 2 profiled contexts, one allocating memory
8056 that was typically cold and one allocating memory that was typically not cold.
8058 The format of the metadata describing a context specific profile (e.g.
8059 ``!1`` and ``!3`` above) requires a first operand that is a metadata node
8060 describing the context, followed by a list of string metadata tags describing
the profile behavior (e.g. ``cold`` and ``notcold`` above). The metadata nodes
8062 describing the context (e.g. ``!2`` and ``!4`` above) are unique ids
8063 corresponding to callsites, which can be matched to associated IR calls via
8064 :ref:`callsite metadata<md_callsite>`. In practice these ids are formed via
8065 a hash of the callsite's debug info, and the associated call may be in a
8066 different module. The contexts are listed in order from leaf-most call (the
8067 allocation itself) to the outermost callsite context required for uniquely
8068 identifying the described profile behavior (note this may not be the top of
8069 the profiled call stack).
.. _md_callsite:

'``callsite``' Metadata
8074 ^^^^^^^^^^^^^^^^^^^^^^^^
8076 The ``callsite`` metadata is used to identify callsites involved in memory
8077 profile contexts described in :ref:`memprof metadata<md_memprof>`.
It is attached both to the profiled allocation calls (see the example in
8080 :ref:`memprof metadata<md_memprof>`), as well as to other callsites
8081 in profiled contexts described in heap allocation ``memprof`` metadata.
8085 .. code-block:: text
8087 %call = call ptr @_Z1Bb(void), !callsite !0
8088 !0 = !{i64 -6528110295079665978, i64 5462047985461644151}
8090 Each operand in the ``callsite`` metadata attachment is a unique id
8091 corresponding to a callsite (possibly inlined). In practice these ids are
8092 formed via a hash of the callsite's debug info. If the call was not inlined
8093 into any callers it will contain a single operand (id). If it was inlined
8094 it will contain a list of ids, including the ids of the callsites in the
full inline sequence, in order from the leaf-most call's id to the outermost
inlined call.
8099 '``noalias.addrspace``' Metadata
8100 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8102 The ``noalias.addrspace`` metadata is used to identify memory
8103 operations which cannot access objects allocated in a range of address
8104 spaces. It is attached to memory instructions, including
8105 :ref:`atomicrmw <i_atomicrmw>`, :ref:`cmpxchg <i_cmpxchg>`, and
8106 :ref:`call <i_call>` instructions.
8108 This follows the same form as :ref:`range metadata <range-metadata>`,
except the field entries must be of type ``i32``. The values are interpreted
as the same numeric address spaces that apply to IR values.
.. code-block:: llvm

   ; %ptr cannot point to an object allocated in addrspace(5)
   %rmw.valid = atomicrmw and ptr %ptr, i64 %value seq_cst, !noalias.addrspace !0

   ; Undefined behavior. The underlying object is allocated in one of the listed
   ; address spaces.
   %alloca = alloca i64, addrspace(5)
   %alloca.cast = addrspacecast ptr addrspace(5) %alloca to ptr
   %rmw.ub = atomicrmw and ptr %alloca.cast, i64 %value seq_cst, !noalias.addrspace !0

   !0 = !{i32 5, i32 6} ; Exclude addrspace(5) only
8128 This is intended for use on targets with a notion of generic address
8129 spaces, which at runtime resolve to different physical memory
8130 spaces. The interpretation of the address space values is target
8131 specific. The behavior is undefined if the runtime memory address does
8132 resolve to an object defined in one of the indicated address spaces.
8135 Module Flags Metadata
8136 =====================
8138 Information about the module as a whole is difficult to convey to LLVM's
8139 subsystems. The LLVM IR isn't sufficient to transmit this information.
8140 The ``llvm.module.flags`` named metadata exists in order to facilitate
8141 this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
8145 The ``llvm.module.flags`` metadata contains a list of metadata triplets.
8146 Each triplet has the following form:
- The first element is a *behavior* flag, which specifies the behavior
  when two (or more) modules are merged together and the linker encounters two
  (or more) metadata with the same ID. The supported behaviors are
8152 - The second element is a metadata string that is a unique ID for the
8153 metadata. Each module may only have one flag entry for each unique ID (not
8154 including entries with the **Require** behavior).
8155 - The third element is the value of the flag.
8157 When two (or more) modules are merged together, the resulting
8158 ``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
8159 each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.
8164 The following behaviors are supported:
8175 Emits an error if two values disagree, otherwise the resulting value
8176 is that of the operands.
8180 Emits a warning if two values disagree. The result value will be the
8181 operand for the flag from the first module being linked, unless the
8182 other module uses **Min** or **Max**, in which case the result will
8183 be **Min** (with the min value) or **Max** (with the max value),
8188 Adds a requirement that another module flag be present and have a
8189 specified value after linking is performed. The value must be a
8190 metadata pair, where the first element of the pair is the ID of the
8191 module flag to be restricted, and the second element of the pair is
8192 the value the module flag should be restricted to. This behavior can
8193 be used to restrict the allowable results (via triggering of an
8194 error) of linking IDs with the **Override** behavior.
8198 Uses the specified value, regardless of the behavior or value of the
8199 other module. If both modules specify **Override**, but the values
8200 differ, an error will be emitted.
8204 Appends the two values, which are required to be metadata nodes.
8208 Appends the two values, which are required to be metadata
8209 nodes. However, duplicate entries in the second list are dropped
8210 during the append operation.
8214 Takes the max of the two values, which are required to be integers.
8218 Takes the min of the two values, which are required to be non-negative integers.
8219 An absent module flag is treated as having the value 0.
8221 It is an error for a particular unique flag ID to have multiple behaviors,
8222 except in the case of **Require** (which adds restrictions on another metadata
8223 value) or **Override**.
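As a rough sketch of how merging plays out for the **Max** behavior, assume two
hypothetical modules that both define a flag with the made-up ID
``"example-level"``. The behavior value ``7`` is assumed here to be the encoding
of **Max**. Each group below stands for a separate module, and the last group
shows the flag the linked module would be expected to carry:

.. code-block:: llvm

   ; Module A (hypothetical)
   !llvm.module.flags = !{!0}
   !0 = !{i32 7, !"example-level", i32 1}

   ; Module B (hypothetical)
   !llvm.module.flags = !{!0}
   !0 = !{i32 7, !"example-level", i32 2}

   ; Expected result of linking A and B: max(1, 2) = 2
   !llvm.module.flags = !{!0}
   !0 = !{i32 7, !"example-level", i32 2}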
8225 An example of module flags:
8227 .. code-block:: llvm
8229 !0 = !{ i32 1, !"foo", i32 1 }
8230 !1 = !{ i32 4, !"bar", i32 37 }
8231 !2 = !{ i32 2, !"qux", i32 42 }
8232 !3 = !{ i32 3, !"qux",
8237 !llvm.module.flags = !{ !0, !1, !2, !3 }
8239 - Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
8240 if two or more ``!"foo"`` flags are seen is to emit an error if their
8241 values are not equal.
8243 - Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
8244 behavior if two or more ``!"bar"`` flags are seen is to use the value
8247 - Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
8248 behavior if two or more ``!"qux"`` flags are seen is to emit a
8249 warning if their values are not equal.
8251 - Metadata ``!3`` has the ID ``!"qux"`` and the value:
8257 The behavior is to emit an error if the ``llvm.module.flags`` does not
8258 contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
8261 Synthesized Functions Module Flags Metadata
8262 -------------------------------------------
8264 These metadata specify the default attributes synthesized functions should have.
8265 These metadata are currently respected by a few instrumentation passes, such as
8268 These metadata correspond to a few function attributes with significant code
8269 generation behaviors. Function attributes with just optimization purposes
8270 should not be listed because the performance impact of these synthesized
8273 - "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
8274 will get the "frame-pointer" function attribute, with value being "none",
8275 "non-leaf", or "all", respectively.
8276 - "function_return_thunk_extern": The synthesized function will get the
8277 ``fn_return_thunk_extern`` function attribute.
- "uwtable": **Max**. The value can be 0, 1, or 2. If the value is 1, a synthesized
  function will get the ``uwtable(sync)`` function attribute; if the value is 2,
  a synthesized function will get the ``uwtable(async)`` function attribute.
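For illustration, a module that wants synthesized functions to keep all frame
pointers and to use asynchronous unwind tables might carry flags along these
lines. This is a sketch that assumes the **Max** behavior is encoded as ``7``:

.. code-block:: llvm

   !llvm.module.flags = !{!0, !1}
   ; Value 2: synthesized functions get "frame-pointer"="all".
   !0 = !{i32 7, !"frame-pointer", i32 2}
   ; Value 2: synthesized functions get uwtable(async).
   !1 = !{i32 7, !"uwtable", i32 2}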
8282 Objective-C Garbage Collection Module Flags Metadata
8283 ----------------------------------------------------
8285 On the Mach-O platform, Objective-C stores metadata about garbage
8286 collection in a special section called "image info". The metadata
8287 consists of a version number and a bitmask specifying what types of
8288 garbage collection are supported (if any) by the file. If two or more
8289 modules are linked together their garbage collection metadata needs to
8290 be merged rather than appended together.
8292 The Objective-C garbage collection module flags metadata consists of the
8293 following key-value pairs:
8302 * - ``Objective-C Version``
8303 - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
8305 * - ``Objective-C Image Info Version``
8306 - **[Required]** --- The version of the image info section. Currently
8309 * - ``Objective-C Image Info Section``
8310 - **[Required]** --- The section to place the metadata. Valid values are
8311 ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
8312 ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
8313 Objective-C ABI version 2.
8315 * - ``Objective-C Garbage Collection``
8316 - **[Required]** --- Specifies whether garbage collection is supported or
8317 not. Valid values are 0, for no garbage collection, and 2, for garbage
8318 collection supported.
8320 * - ``Objective-C GC Only``
8321 - **[Optional]** --- Specifies that only garbage collection is supported.
8322 If present, its value must be 6. This flag requires that the
8323 ``Objective-C Garbage Collection`` flag have the value 2.
8325 Some important flag interactions:
8327 - If a module with ``Objective-C Garbage Collection`` set to 0 is
8328 merged with a module with ``Objective-C Garbage Collection`` set to
8329 2, then the resulting module has the
8330 ``Objective-C Garbage Collection`` flag set to 0.
8331 - A module with ``Objective-C Garbage Collection`` set to 0 cannot be
8332 merged with a module with ``Objective-C GC Only`` set to 6.
8334 C type width Module Flags Metadata
8335 ----------------------------------
8337 The ARM backend emits a section into each generated object file describing the
8338 options that it was compiled with (in a compiler-independent way) to prevent
8339 linking incompatible objects, and to allow automatic library selection. Some
8340 of these options are not visible at the IR level, namely wchar_t width and enum
8343 To pass this information to the backend, these options are encoded in module
8344 flags metadata, using the following key-value pairs:
8354 - * 0 --- sizeof(wchar_t) == 4
8355 * 1 --- sizeof(wchar_t) == 2
8358 - * 0 --- Enums are at least as large as an ``int``.
8359 * 1 --- Enums are stored in the smallest integer type which can
8360 represent all of its values.
8362 For example, the following metadata section specifies that the module was
8363 compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
8364 enum is the smallest type which can represent all of its values::
    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}
8370 Stack Alignment Metadata
8371 ------------------------
8373 Changes the default stack alignment from the target ABI's implicit default
8374 stack alignment. Takes an i32 value in bytes. It is considered an error to link
8375 two modules together with different values for this metadata.
8379 !llvm.module.flags = !{!0}
8380 !0 = !{i32 1, !"override-stack-alignment", i32 8}
This will change the stack alignment to 8 bytes.
8384 Embedded Objects Names Metadata
8385 ===============================
8387 Offloading compilations need to embed device code into the host section table to
8388 create a fat binary. This metadata node references each global that will be
8389 embedded in the module. The primary use for this is to make referencing these
8390 globals more efficient in the IR. The metadata references nodes containing
8391 pointers to the global to be embedded followed by the section name it will be
8394 !llvm.embedded.objects = !{!0}
8395 !0 = !{ptr @object, !".section"}
8397 Automatic Linker Flags Named Metadata
8398 =====================================
8400 Some targets support embedding of flags to the linker inside individual object
8401 files. Typically this is used in conjunction with language extensions which
8402 allow source files to contain linker command line options, and have these
8403 automatically be transmitted to the linker via object files.
8405 These flags are encoded in the IR using named metadata with the name
8406 ``!llvm.linker.options``. Each operand is expected to be a metadata node
8407 which should be a list of other metadata nodes, each of which should be a
8408 list of metadata strings defining linker options.
8410 For example, the following metadata section specifies two separate sets of
8411 linker options, presumably to link against ``libz`` and the ``Cocoa``
8415 !1 = !{ !"-framework", !"Cocoa" }
8416 !llvm.linker.options = !{ !0, !1 }
8418 The metadata encoding as lists of lists of options, as opposed to a collapsed
8419 list of options, is chosen so that the IR encoding can use multiple option
8420 strings to specify e.g., a single library, while still having that specifier be
8421 preserved as an atomic element that can be recognized by a target specific
8422 assembly writer or object file emitter.
8424 Each individual option is required to be either a valid option for the target's
8425 linker, or an option that is reserved by the target specific assembly writer or
8426 object file emitter. No other aspect of these options is defined by the IR.
8428 Dependent Libs Named Metadata
8429 =============================
8431 Some targets support embedding of strings into object files to indicate
8432 a set of libraries to add to the link. Typically this is used in conjunction
8433 with language extensions which allow source files to explicitly declare the
8434 libraries they depend on, and have these automatically be transmitted to the
8435 linker via object files.
8437 The list is encoded in the IR using named metadata with the name
8438 ``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
8439 which should contain a single string operand.
8441 For example, the following metadata section contains two library specifiers::
8443 !0 = !{!"a library specifier"}
8444 !1 = !{!"another library specifier"}
8445 !llvm.dependent-libraries = !{ !0, !1 }
8447 Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.
8455 Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
8456 causes the building of a compact summary of the module that is emitted into
8457 the bitcode. The summary is emitted into the LLVM assembly and identified
8458 in syntax by a caret ('``^``').
8460 The summary is parsed into a bitcode output, along with the Module
8461 IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
8462 of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
8463 summary entries (just as they currently ignore summary entries in a bitcode
Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where the summary index is currently built from bitcode:
specifically, by tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), or when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag
(this part is not yet implemented; use llvm-as to create a bitcode object
before feeding into thin link tools for now).
8474 There are currently 3 types of summary entries in the LLVM assembly:
8475 :ref:`module paths<module_path_summary>`,
8476 :ref:`global values<gv_summary>`, and
8477 :ref:`type identifiers<typeid_summary>`.
8479 .. _module_path_summary:
8481 Module Path Summary Entry
8482 -------------------------
8484 Each module path summary entry lists a module containing global values included
8485 in the summary. For a single IR module there will be one such entry, but
8486 in a combined summary index produced during the thin link, there will be
8487 one module path entry per linked module with summary.
8491 .. code-block:: text
8493 ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
8495 The ``path`` field is a string path to the bitcode file, and the ``hash``
8496 field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
8497 incremental builds and caching.
8501 Global Value Summary Entry
8502 --------------------------
8504 Each global value summary entry corresponds to a global value defined or
8505 referenced by a summarized module.
8509 .. code-block:: text
8511 ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
8513 For declarations, there will not be a summary list. For definitions, a
8514 global value will contain a list of summaries, one per module containing
8515 a definition. There can be multiple entries in a combined summary index
8516 for symbols with weak linkage.
8518 Each ``Summary`` format will depend on whether the global value is a
8519 :ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
8520 :ref:`alias<alias_summary>`.
8522 .. _function_summary:
8527 If the global value is a function, the ``Summary`` entry will look like:
8529 .. code-block:: text
    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
8533 The ``module`` field includes the summary entry id for the module containing
8534 this definition, and the ``flags`` field contains information such as
8535 the linkage type, a flag indicating whether it is legal to import the
8536 definition, whether it is globally live and whether the linker resolved it
8537 to a local definition (the latter two are populated during the thin link).
8538 The ``insts`` field contains the number of IR instructions in the function.
8539 Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
8540 :ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
8541 :ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.
8543 .. _variable_summary:
8545 Global Variable Summary
8546 ^^^^^^^^^^^^^^^^^^^^^^^
8548 If the global value is a variable, the ``Summary`` entry will look like:
8550 .. code-block:: text
    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
8554 The variable entry contains a subset of the fields in a
8555 :ref:`function summary <function_summary>`, see the descriptions there.
8562 If the global value is an alias, the ``Summary`` entry will look like:
8564 .. code-block:: text
8566 alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
8568 The ``module`` and ``flags`` fields are as described for a
8569 :ref:`function summary <function_summary>`. The ``aliasee`` field
8570 contains a reference to the global value summary entry of the aliasee.
8572 .. _funcflags_summary:
8577 The optional ``FuncFlags`` field looks like:
8579 .. code-block:: text
8581 funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)
8583 If unspecified, flags are assumed to hold the conservative ``false`` value of
8591 The optional ``Calls`` field looks like:
8593 .. code-block:: text
8595 calls: ((Callee)[, (Callee)]*)
8597 where each ``Callee`` looks like:
8599 .. code-block:: text
8601 callee: ^1[, hotness: None]?[, relbf: 0]?
8603 The ``callee`` refers to the summary entry id of the callee. At most one
8604 of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
8605 ``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
8606 branch frequency relative to the entry frequency, scaled down by 2^8)
8607 may be specified. The defaults are ``Unknown`` and ``0``, respectively.
8614 The optional ``Params`` is used by ``StackSafety`` and looks like:
8616 .. code-block:: text
    params: ((Param)[, (Param)]*)
8620 where each ``Param`` describes pointer parameter access inside of the
8621 function and looks like:
8623 .. code-block:: text
8625 param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?
where ``param`` is the number of the parameter it describes and
``offset`` is the inclusive range of offsets from the pointer parameter to bytes
which can be accessed by the function. This range does not include accesses by
function calls from the ``calls`` list.
Each ``Callee`` describes how the parameter is forwarded into other
functions and looks like:
8635 .. code-block:: text
8637 callee: ^3, param: 5, offset: [-3, 3]
The ``callee`` refers to the summary entry id of the callee, ``param`` is
the number of the callee parameter which points into the caller's parameter
with offset known to be inside of the ``offset`` range. ``calls`` will be
consumed and removed by the thin link stage to update ``Param::offset`` so it
covers all accesses possible by ``calls``.
A pointer parameter without a corresponding ``Param`` is considered unsafe, and
access at any offset is assumed to be possible.
8650 If we have the following function:
8652 .. code-block:: text
8654 define i64 @foo(ptr %0, ptr %1, ptr %2, i8 %3) {
8655 store ptr %1, ptr @x
8656 %5 = getelementptr inbounds i8, ptr %2, i64 5
8657 %6 = load i8, ptr %5
8658 %7 = getelementptr inbounds i8, ptr %2, i8 %3
8659 tail call void @bar(i8 %3, ptr %7)
8660 %8 = load i64, ptr %0
8664 We can expect the record like this:
8666 .. code-block:: text
8668 params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
The function may access just 8 bytes of the parameter %0. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter %1 escapes, so access is unknown.
The function itself can access just a single byte of the parameter %2. Additional
access is possible inside of ``@bar``, that is, ``^3``. The function adds a signed
offset to the pointer and passes the result as the argument %1 into ``^3``.
This record itself does not tell us how ``^3`` will access the parameter.
Parameter %3 is not a pointer.
8685 The optional ``Refs`` field looks like:
8687 .. code-block:: text
8689 refs: ((Ref)[, (Ref)]*)
8691 where each ``Ref`` contains a reference to the summary id of the referenced
8692 value (e.g. ``^1``).
8694 .. _typeidinfo_summary:
8699 The optional ``TypeIdInfo`` field, used for
8700 `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
8703 .. code-block:: text
8705 typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
8707 These optional fields have the following forms:
8712 .. code-block:: text
8714 typeTests: (TypeIdRef[, TypeIdRef]*)
8716 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
8717 by summary id or ``GUID``.
8719 TypeTestAssumeVCalls
8720 """"""""""""""""""""
8722 .. code-block:: text
8724 typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
8726 Where each VFuncId has the format:
8728 .. code-block:: text
8730 vFuncId: (TypeIdRef, offset: 16)
8732 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
8733 by summary id or ``GUID`` preceded by a ``guid:`` tag.
8735 TypeCheckedLoadVCalls
8736 """""""""""""""""""""
8738 .. code-block:: text
8740 typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
8742 Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
8744 TypeTestAssumeConstVCalls
8745 """""""""""""""""""""""""
8747 .. code-block:: text
8749 typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
8751 Where each ConstVCall has the format:
8753 .. code-block:: text
8755 (VFuncId, args: (Arg[, Arg]*))
8757 and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
8758 and each Arg is an integer argument number.
8760 TypeCheckedLoadConstVCalls
8761 """"""""""""""""""""""""""
8763 .. code-block:: text
8765 typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
8767 Where each ConstVCall has the format described for
8768 ``TypeTestAssumeConstVCalls``.
8772 Type ID Summary Entry
8773 ---------------------
8775 Each type id summary entry corresponds to a type identifier resolution
8776 which is generated during the LTO link portion of the compile when building
8777 with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
8778 so these are only present in a combined summary index.
8782 .. code-block:: text
8784 ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
8786 The ``typeTestRes`` gives the type test resolution ``kind`` (which may
8787 be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
8788 the ``size-1`` bit width. It is followed by optional flags, which default to 0,
8789 and an optional WpdResolutions (whole program devirtualization resolution)
8790 field that looks like:
8792 .. code-block:: text
    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
8796 where each entry is a mapping from the given byte offset to the whole-program
8797 devirtualization resolution WpdRes, that has one of the following formats:
8799 .. code-block:: text
8801 wpdRes: (kind: branchFunnel)
8802 wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
8803 wpdRes: (kind: indir)
8805 Additionally, each wpdRes has an optional ``resByArg`` field, which
8806 describes the resolutions for calls with all constant integer arguments:
8808 .. code-block:: text
8810 resByArg: (ResByArg[, ResByArg]*)
8814 .. code-block:: text
8816 args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])
8818 Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
8819 or ``VirtualConstProp``. The ``info`` field is only used if the kind
8820 is ``UniformRetVal`` (indicates the uniform return value), or
8821 ``UniqueRetVal`` (holds the return value associated with the unique vtable
8822 (0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
8823 not support the use of absolute symbols to store constants.
8825 .. _intrinsicglobalvariables:
8827 Intrinsic Global Variables
8828 ==========================
8830 LLVM has a number of "magic" global variables that contain data that
8831 affect code generation or other IR semantics. These are documented here.
8832 All globals of this sort should have a section specified as
8833 "``llvm.metadata``". This section and all globals that start with
8834 "``llvm.``" are reserved for use by LLVM.
8838 The '``llvm.used``' Global Variable
8839 -----------------------------------
8841 The ``@llvm.used`` global is an array which has
8842 :ref:`appending linkage <linkage_appending>`. This array contains a list of
8843 pointers to named global variables, functions and aliases which may optionally
8844 have a pointer cast formed of bitcast or getelementptr. For example, a legal
8847 .. code-block:: llvm
8852 @llvm.used = appending global [2 x ptr] [
8855 ], section "llvm.metadata"
8857 If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
8858 and linker are required to treat the symbol as if there is a reference to the
8859 symbol that it cannot see (which is why they have to be named). For example, if
8860 a variable has internal linkage and no references other than that from the
8861 ``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
8862 references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.
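As a minimal sketch of the internal-linkage case described above, the following
module keeps an otherwise unreferenced global alive by listing it in
``@llvm.used``. The name ``@keep_me`` is hypothetical:

.. code-block:: llvm

   ; An internal global with no other references in the module.
   @keep_me = internal global i32 42

   ; Listing it here makes the compiler, assembler, and linker treat it as
   ; referenced, so it cannot be deleted.
   @llvm.used = appending global [1 x ptr] [ptr @keep_me], section "llvm.metadata"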
8865 On some targets, the code generator must emit a directive to the
8866 assembler or object file to prevent the assembler and linker from
8867 removing the symbol.
8869 .. _gv_llvmcompilerused:
8871 The '``llvm.compiler.used``' Global Variable
8872 --------------------------------------------
8874 The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
8875 directive, except that it only prevents the compiler from touching the
8876 symbol. On targets that support it, this allows an intelligent linker to
8877 optimize references to the symbol without being impeded as it would be
This construct should only be used in rare circumstances,
and should not be exposed to source languages.
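The declaration has the same shape as ``@llvm.used``. A sketch, using the
hypothetical name ``@helper_table``:

.. code-block:: llvm

   ; The compiler must leave @helper_table alone, but the linker remains free
   ; to optimize references to it on targets that support this.
   @helper_table = internal global i8 0

   @llvm.compiler.used = appending global [1 x ptr] [ptr @helper_table], section "llvm.metadata"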
8883 .. _gv_llvmglobalctors:
8885 The '``llvm.global_ctors``' Global Variable
8886 -------------------------------------------
8888 .. code-block:: llvm
8890 %0 = type { i32, ptr, ptr }
8891 @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, ptr @ctor, ptr @data }]
8893 The ``@llvm.global_ctors`` array contains a list of constructor
8894 functions, priorities, and an associated global or function.
8895 The functions referenced by this array will be called in ascending order
8896 of priority (i.e. lowest first) when the module is loaded. The order of
8897 functions with the same priority is not defined.
8899 If the third field is non-null, and points to a global variable
8900 or function, the initializer function will only run if the associated
8901 data from the current module is not discarded.
8902 On ELF the referenced global variable or function must be in a comdat.
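To illustrate the ordering rule, the sketch below registers two constructors
with different priorities and no associated data. The function names and the
priority ``101`` are chosen only for this example. Since lower priorities run
first, ``@init_early`` would be expected to run before ``@init_late``:

.. code-block:: llvm

   @llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [
     { i32, ptr, ptr } { i32 101, ptr @init_early, ptr null },
     { i32, ptr, ptr } { i32 65535, ptr @init_late, ptr null }
   ]

   ; Priority 101: called first when the module is loaded.
   define internal void @init_early() {
     ret void
   }

   ; Priority 65535: called after @init_early.
   define internal void @init_late() {
     ret void
   }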
8904 .. _llvmglobaldtors:
8906 The '``llvm.global_dtors``' Global Variable
8907 -------------------------------------------
8909 .. code-block:: llvm
8911 %0 = type { i32, ptr, ptr }
8912 @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, ptr @dtor, ptr @data }]
8914 The ``@llvm.global_dtors`` array contains a list of destructor
8915 functions, priorities, and an associated global or function.
8916 The functions referenced by this array will be called in descending
8917 order of priority (i.e. highest first) when the module is unloaded. The
8918 order of functions with the same priority is not defined.
8920 If the third field is non-null, and points to a global variable
8921 or function, the destructor function will only run if the associated
8922 data from the current module is not discarded.
8923 On ELF the referenced global variable or function must be in a comdat.
8925 Instruction Reference
8926 =====================
8928 The LLVM instruction set consists of several different classifications
8929 of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
8930 instructions <binaryops>`, :ref:`bitwise binary
8931 instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
8932 :ref:`other instructions <otherops>`. There are also :ref:`debug records
8933 <debugrecords>`, which are not instructions themselves but are printed
8934 interleaved with instructions to describe changes in the state of the program's
8935 debug information at each position in the program's execution.
8939 Terminator Instructions
8940 -----------------------
8942 As mentioned :ref:`previously <functionstructure>`, every basic block in a
8943 program ends with a "Terminator" instruction, which indicates which
8944 block should be executed after the current block is finished. These
8945 terminator instructions typically yield a '``void``' value: they produce
8946 control flow, not values (the one exception being the
8947 ':ref:`invoke <i_invoke>`' instruction).
8949 The terminator instructions are: ':ref:`ret <i_ret>`',
8950 ':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
8951 ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
8953 ':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
8954 ':ref:`catchret <i_catchret>`',
8955 ':ref:`cleanupret <i_cleanupret>`',
8956 and ':ref:`unreachable <i_unreachable>`'.
8960 '``ret``' Instruction
8961 ^^^^^^^^^^^^^^^^^^^^^
8968 ret <type> <value> ; Return a value from a non-void function
8969 ret void ; Return from void function
8974 The '``ret``' instruction is used to return control flow (and optionally
8975 a value) from a function back to the caller.
8977 There are two forms of the '``ret``' instruction: one that returns a
8978 value and then causes control flow, and one that just causes control
8984 The '``ret``' instruction optionally accepts a single argument, the
8985 return value. The type of the return value must be a ':ref:`first
8986 class <t_firstclass>`' type.
8988 A function is not :ref:`well formed <wellformed>` if it has a non-void
8989 return type and contains a '``ret``' instruction with no return value or
8990 a return value with a type that does not match its type, or if it has a
8991 void return type and contains a '``ret``' instruction with a return
8997 When the '``ret``' instruction is executed, control flow returns back to
8998 the calling function's context. If the caller is a
8999 ":ref:`call <i_call>`" instruction, execution continues at the
9000 instruction after the call. If the caller was an
9001 ":ref:`invoke <i_invoke>`" instruction, execution continues at the
9002 beginning of the "normal" destination block. If the instruction returns
9003 a value, that value shall set the call or invoke instruction's return
9009 .. code-block:: llvm
9011 ret i32 5 ; Return an integer value of 5
9012 ret void ; Return from a void function
9013 ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
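For context, here is a small sketch of both forms inside complete functions.
The function names are hypothetical. In each case the ``ret`` operand agrees
with the function's declared return type, as required for the function to be
well formed:

.. code-block:: llvm

   ; Non-void function: the returned value has the declared type i32.
   define i32 @add_one(i32 %x) {
     %r = add i32 %x, 1
     ret i32 %r
   }

   ; Void function: ret carries no value.
   define void @do_nothing() {
     ret void
   }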
9017 '``br``' Instruction
9018 ^^^^^^^^^^^^^^^^^^^^
9025 br i1 <cond>, label <iftrue>, label <iffalse>
9026 br label <dest> ; Unconditional branch
9031 The '``br``' instruction is used to cause control flow to transfer to a
9032 different basic block in the current function. There are two forms of
9033 this instruction, corresponding to a conditional branch and an
9034 unconditional branch.
9039 The conditional branch form of the '``br``' instruction takes a single
9040 '``i1``' value and two '``label``' values. The unconditional form of the
9041 '``br``' instruction takes a single '``label``' value as a target.
Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If the value is ``false``, control flows
to the '``iffalse``' ``label`` argument.
If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
behavior.
9056 .. code-block:: llvm
9059 %cond = icmp eq i32 %a, %b
9060 br i1 %cond, label %IfEqual, label %IfUnequal
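
The two forms can be seen together in a complete function. The sketch
below is illustrative only; the function name ``@max`` and the block names
are assumptions, not part of the reference text.

.. code-block:: llvm

   define i32 @max(i32 %a, i32 %b) {
   entry:
     %cond = icmp sgt i32 %a, %b
     ; Conditional form: one i1 operand and two label operands.
     br i1 %cond, label %IfGreater, label %IfNotGreater

   IfGreater:
     ; Unconditional form: a single label operand.
     br label %Done

   IfNotGreater:
     br label %Done

   Done:
     ; The phi selects a value based on which predecessor branched here.
     %result = phi i32 [ %a, %IfGreater ], [ %b, %IfNotGreater ]
     ret i32 %result
   }
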
9068 '``switch``' Instruction
9069 ^^^^^^^^^^^^^^^^^^^^^^^^
9076 switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.
9089 The '``switch``' instruction uses three parameters: an integer
9090 comparison value '``value``', a default '``label``' destination, and an
9091 array of pairs of comparison value constants and '``label``'s. The table
9092 is not allowed to contain duplicate constant entries.
9097 The ``switch`` instruction specifies a table of values and destinations.
9098 When the '``switch``' instruction is executed, this table is searched
9099 for the given value. If the value is found, control flow is transferred
9100 to the corresponding destination; otherwise, control flow is transferred
9101 to the default destination.
If '``value``' is ``poison`` or ``undef``, this instruction has undefined
behavior.
9108 Depending on properties of the target machine and the particular
9109 ``switch`` instruction, this instruction may be code generated in
9110 different ways. For example, it could be generated as a series of
9111 chained conditional branches or with a lookup table.
9116 .. code-block:: llvm
9118 ; Emulate a conditional br instruction
9119 %Val = zext i1 %value to i32
9120 switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
9122 ; Emulate an unconditional br instruction
9123 switch i32 0, label %dest [ ]
9125 ; Implement a jump table:
9126 switch i32 %val, label %otherwise [ i32 0, label %onzero
9128 i32 2, label %ontwo ]
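
For context, a ``switch`` might sit inside a complete function as
sketched below. The function name ``@classify`` and the returned
constants are hypothetical.

.. code-block:: llvm

   define i32 @classify(i32 %val) {
   entry:
     ; Each table entry pairs a constant with a destination; no constant
     ; may appear twice.
     switch i32 %val, label %otherwise [ i32 0, label %onzero
                                         i32 1, label %onone
                                         i32 2, label %ontwo ]

   onzero:
     ret i32 10

   onone:
     ret i32 11

   ontwo:
     ret i32 12

   otherwise:
     ; Reached for every value that is not listed in the table.
     ret i32 -1
   }
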
9132 '``indirectbr``' Instruction
9133 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9140 indirectbr ptr <address>, [ label <dest1>, label <dest2>, ... ]
9145 The '``indirectbr``' instruction implements an indirect branch to a
9146 label within the current function, whose address is specified by
9147 "``address``". Address must be derived from a
9148 :ref:`blockaddress <blockaddress>` constant.
9153 The '``address``' argument is the address of the label to jump to. The
9154 rest of the arguments indicate the full set of possible destinations
9155 that the address may point to. Blocks are allowed to occur multiple
9156 times in the destination list, though this isn't particularly useful.
9158 This destination list is required so that dataflow analysis has an
9159 accurate understanding of the CFG.
9164 Control transfers to the block specified in the address argument. All
9165 possible destination blocks must be listed in the label list, otherwise
9166 this instruction has undefined behavior. This implies that jumps to
9167 labels defined in other functions have undefined behavior as well.
If '``address``' is ``poison`` or ``undef``, this instruction has undefined
behavior.
9174 This is typically implemented with a jump through a register.
9179 .. code-block:: llvm
9181 indirectbr ptr %Addr, [ label %bb1, label %bb2, label %bb3 ]
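
A minimal sketch of how the address operand is typically produced from
``blockaddress`` constants follows. The function name ``@dispatch`` and
its ``%pick`` parameter are assumptions made for illustration.

.. code-block:: llvm

   define i32 @dispatch(i1 %pick) {
   entry:
     ; Both candidate addresses are derived from blockaddress constants
     ; that refer to blocks of this same function.
     %addr = select i1 %pick, ptr blockaddress(@dispatch, %bb1),
                              ptr blockaddress(@dispatch, %bb2)
     ; Every block the address may point to is listed as a destination.
     indirectbr ptr %addr, [ label %bb1, label %bb2 ]

   bb1:
     ret i32 1

   bb2:
     ret i32 2
   }
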
9185 '``invoke``' Instruction
9186 ^^^^^^^^^^^^^^^^^^^^^^^^
9193 <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
9194 [operand bundles] to label <normal label> unwind label <exception label>
9199 The '``invoke``' instruction causes control to transfer to a specified
9200 function, with the possibility of control flow transfer to either the
9201 '``normal``' label or the '``exception``' label. If the callee function
9202 returns with the "``ret``" instruction, control flow will return to the
9203 "normal" label. If the callee (or any indirect callees) returns via the
9204 ":ref:`resume <i_resume>`" instruction or other exception handling
9205 mechanism, control is interrupted and continued at the dynamically
9206 nearest "exception" label.
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
9221 This instruction requires several arguments:
9223 #. The optional "cconv" marker indicates which :ref:`calling
9224 convention <callingconv>` the call should use. If none is
9225 specified, the call defaults to using C calling conventions.
9226 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
9227 values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``'
9228 attributes are valid here.
9229 #. The optional addrspace attribute can be used to indicate the address space
9230 of the called function. If it is not specified, the program address space
9231 from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
9235 #. '``fnty``': shall be the signature of the function being invoked. The
9236 argument types must match the types implied by this signature. This
9237 type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked. In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
9242 #. '``function args``': argument list whose types match the function
9243 signature argument types and parameter attributes. All arguments must
9244 be of :ref:`first class <t_firstclass>` type. If the function signature
9245 indicates the function accepts a variable number of arguments, the
9246 extra arguments can be specified.
9247 #. '``normal label``': the label reached when the called function
9248 executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
9252 #. The optional :ref:`function attributes <fnattrs>` list.
9253 #. The optional :ref:`operand bundles <opbundles>` list.
9258 This instruction is designed to operate as a standard '``call``'
9259 instruction in most regards. The primary difference is that it
9260 establishes an association with a label, which is used by the runtime
9261 library to unwind the stack.
9263 This instruction is used in languages with destructors to ensure that
9264 proper cleanup is performed in the case of either a ``longjmp`` or a
9265 thrown exception. Additionally, this is important for implementation of
9266 '``catch``' clauses in high-level languages that support them.
9268 For the purposes of the SSA form, the definition of the value returned
9269 by the '``invoke``' instruction is deemed to occur on the edge from the
9270 current block to the "normal" label. If the callee unwinds then no
9271 return value is available.
9276 .. code-block:: llvm
9278 %retval = invoke i32 @Test(i32 15) to label %Continue
9279 unwind label %TestCleanup ; i32:retval set
9280 %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
9281 unwind label %TestCleanup ; i32:retval set
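
To show how ``invoke``, ``landingpad`` and ``resume`` fit together, here
is a minimal sketch of a complete function. It assumes the Itanium C++
personality routine ``@__gxx_personality_v0``; the function name
``@caller`` is hypothetical, and ``@Test`` is only declared, not defined.

.. code-block:: llvm

   declare i32 @Test(i32)
   declare i32 @__gxx_personality_v0(...)

   define i32 @caller() personality ptr @__gxx_personality_v0 {
   entry:
     %retval = invoke i32 @Test(i32 15)
             to label %Continue unwind label %TestCleanup

   Continue:
     ; %retval is defined on the edge to the normal label, so it is
     ; usable here.
     ret i32 %retval

   TestCleanup:
     ; The landingpad must be the first non-PHI instruction of the block.
     %exn = landingpad { ptr, i32 }
             cleanup
     ; After any cleanup work, continue propagating the exception.
     resume { ptr, i32 } %exn
   }
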
9285 '``callbr``' Instruction
9286 ^^^^^^^^^^^^^^^^^^^^^^^^
9293 <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
9294 [operand bundles] to label <fallthrough label> [indirect labels]
9299 The '``callbr``' instruction causes control to transfer to a specified
9300 function, with the possibility of control flow transfer to either the
9301 '``fallthrough``' label or one of the '``indirect``' labels.
9303 This instruction should only be used to implement the "goto" feature of gcc
9304 style inline assembly. Any other usage is an error in the IR verifier.
9306 Note that in order to support outputs along indirect edges, LLVM may need to
9307 split critical edges, which may require synthesizing a replacement block for
9308 the ``indirect labels``. Therefore, the address of a label as seen by another
9309 ``callbr`` instruction, or for a :ref:`blockaddress <blockaddress>` constant,
9310 may not be equal to the address provided for the same block to this
9311 instruction's ``indirect labels`` operand. The assembly code may only transfer
9312 control to addresses provided via this instruction's ``indirect labels``.
9317 This instruction requires several arguments:
9319 #. The optional "cconv" marker indicates which :ref:`calling
9320 convention <callingconv>` the call should use. If none is
9321 specified, the call defaults to using C calling conventions.
9322 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
9323 values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``'
9324 attributes are valid here.
9325 #. The optional addrspace attribute can be used to indicate the address space
9326 of the called function. If it is not specified, the program address space
9327 from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
9331 #. '``fnty``': shall be the signature of the function being called. The
9332 argument types must match the types implied by this signature. This
9333 type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   other ``callbr``'s are just as possible, calling an arbitrary pointer
   to function value.
9338 #. '``function args``': argument list whose types match the function
9339 signature argument types and parameter attributes. All arguments must
9340 be of :ref:`first class <t_firstclass>` type. If the function signature
9341 indicates the function accepts a variable number of arguments, the
9342 extra arguments can be specified.
9343 #. '``fallthrough label``': the label reached when the inline assembly's
9344 execution exits the bottom.
9345 #. '``indirect labels``': the labels reached when a callee transfers control
9346 to a location other than the '``fallthrough label``'. Label constraints
9347 refer to these destinations.
9348 #. The optional :ref:`function attributes <fnattrs>` list.
9349 #. The optional :ref:`operand bundles <opbundles>` list.
9354 This instruction is designed to operate as a standard '``call``'
9355 instruction in most regards. The primary difference is that it
9356 establishes an association with additional labels to define where control
9357 flow goes after the call.
The output values of a '``callbr``' instruction are available both in the
'``fallthrough``' block and in any '``indirect``' block(s).
9362 The only use of this today is to implement the "goto" feature of gcc inline
9363 assembly where additional labels can be provided as locations for the inline
9364 assembly to jump to.
9369 .. code-block:: llvm
9371 ; "asm goto" without output constraints.
9372 callbr void asm "", "r,!i"(i32 %x)
9373 to label %fallthrough [label %indirect]
9375 ; "asm goto" with output constraints.
9376 <result> = callbr i32 asm "", "=r,r,!i"(i32 %x)
9377 to label %fallthrough [label %indirect]
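
The fragments above could appear inside a function as sketched below.
The function name ``@asm_goto`` and the returned constants are
hypothetical, the inline assembly string is left empty, and the ``!i``
label constraint assumes a recent LLVM release that accepts it.

.. code-block:: llvm

   define i32 @asm_goto(i32 %x) {
   entry:
     ; The assembly may fall through or jump to the listed indirect label.
     callbr void asm sideeffect "", "r,!i"(i32 %x)
             to label %fallthrough [label %indirect]

   fallthrough:
     ret i32 0

   indirect:
     ret i32 1
   }
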
9381 '``resume``' Instruction
9382 ^^^^^^^^^^^^^^^^^^^^^^^^
9389 resume <type> <value>
The '``resume``' instruction is a terminator instruction that has no
successors.
The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.
9407 The '``resume``' instruction resumes propagation of an existing
9408 (in-flight) exception whose unwinding was interrupted with a
9409 :ref:`landingpad <i_landingpad>` instruction.
9414 .. code-block:: llvm
9416 resume { ptr, i32 } %exn
9420 '``catchswitch``' Instruction
9421 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9428 <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
9429 <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
9434 The '``catchswitch``' instruction is used by `LLVM's exception handling system
9435 <ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
9436 that may be executed by the :ref:`EH personality routine <personalityfn>`.
9441 The ``parent`` argument is the token of the funclet that contains the
9442 ``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
9443 this operand may be the token ``none``.
9445 The ``default`` argument is the label of another basic block beginning with
9446 either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
9447 must be a legal target with respect to the ``parent`` links, as described in
9448 the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
9450 The ``handlers`` are a nonempty list of successor blocks that each begin with a
9451 :ref:`catchpad <i_catchpad>` instruction.
Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
appropriate.
9460 The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
9461 it must be both the first non-phi instruction and last instruction in the basic
9462 block. Therefore, it must be the only non-phi instruction in the block.
9467 .. code-block:: text
9470 %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
9472 %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
9476 '``catchret``' Instruction
9477 ^^^^^^^^^^^^^^^^^^^^^^^^^^
9484 catchret from <token> to label <normal>
The '``catchret``' instruction is a terminator instruction that has a
single successor.
The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.
The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception. Control then transfers to
``normal``.
9510 The ``token`` argument must be a token produced by a ``catchpad`` instruction.
9511 If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
9512 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9513 the ``catchret``'s behavior is undefined.
9518 .. code-block:: text
9520 catchret from %catch to label %continue
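
A minimal sketch of how ``catchswitch``, ``catchpad`` and ``catchret``
combine in one function is given below. It assumes the MSVC C++
personality ``@__CxxFrameHandler3``, and the ``catchpad`` operands shown
are an assumed catch-all encoding for that personality. ``@may_throw``
and ``@try_it`` are hypothetical names.

.. code-block:: llvm

   declare void @may_throw()
   declare i32 @__CxxFrameHandler3(...)

   define void @try_it() personality ptr @__CxxFrameHandler3 {
   entry:
     invoke void @may_throw()
             to label %done unwind label %dispatch

   dispatch:
     ; The catchswitch is the only non-phi instruction in its block.
     %cs = catchswitch within none [label %handler] unwind to caller

   handler:
     ; Each handler block begins with a catchpad whose parent is the
     ; catchswitch token.
     %cp = catchpad within %cs [ptr null, i32 64, ptr null]
     ; Leave the funclet and continue normal execution at %done.
     catchret from %cp to label %done

   done:
     ret void
   }
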
9524 '``cleanupret``' Instruction
9525 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9532 cleanupret from <value> unwind label <continue>
9533 cleanupret from <value> unwind to caller
9538 The '``cleanupret``' instruction is a terminator instruction that has
9539 an optional successor.
9545 The '``cleanupret``' instruction requires one argument, which indicates
9546 which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
9547 If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
9548 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9549 the ``cleanupret``'s behavior is undefined.
9551 The '``cleanupret``' instruction also has an optional successor, ``continue``,
9552 which must be the label of another basic block beginning with either a
9553 ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
9554 be a legal target with respect to the ``parent`` links, as described in the
9555 `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
9560 The '``cleanupret``' instruction indicates to the
9561 :ref:`personality function <personalityfn>` that one
9562 :ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
9563 It transfers control to ``continue`` or unwinds out of the function.
9568 .. code-block:: text
9570 cleanupret from %cleanup unwind to caller
9571 cleanupret from %cleanup unwind label %continue
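
In context, a ``cleanupret`` ends a cleanup funclet that was entered
through a ``cleanuppad``. The sketch below again assumes the MSVC C++
personality ``@__CxxFrameHandler3``; ``@may_throw``, ``@release`` and
``@with_cleanup`` are hypothetical names.

.. code-block:: llvm

   declare void @may_throw()
   declare void @release()
   declare i32 @__CxxFrameHandler3(...)

   define void @with_cleanup() personality ptr @__CxxFrameHandler3 {
   entry:
     invoke void @may_throw()
             to label %done unwind label %cleanup

   cleanup:
     %cu = cleanuppad within none []
     ; Calls inside the funclet carry a "funclet" operand bundle.
     call void @release() [ "funclet"(token %cu) ]
     ; Exit the cleanuppad and keep unwinding out of the function.
     cleanupret from %cu unwind to caller

   done:
     ret void
   }
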
9575 '``unreachable``' Instruction
9576 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9588 The '``unreachable``' instruction has no defined semantics. This
9589 instruction is used to inform the optimizer that a particular portion of
9590 the code is not reachable. This can be used to indicate that the code
9591 after a no-return function cannot be reached, and other facts.
9596 The '``unreachable``' instruction has no defined semantics.
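
A common use, sketched here under the assumption that ``@abort`` never
returns, is to terminate a block that follows a ``noreturn`` call. The
function name ``@checked_udiv`` is hypothetical.

.. code-block:: llvm

   declare void @abort() noreturn

   define i32 @checked_udiv(i32 %a, i32 %b) {
   entry:
     %is_zero = icmp eq i32 %b, 0
     br i1 %is_zero, label %fail, label %ok

   fail:
     call void @abort()
     ; Every block needs a terminator; control never gets past the call.
     unreachable

   ok:
     %q = udiv i32 %a, %b
     ret i32 %q
   }
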
9603 Unary operators require a single operand, execute an operation on
9604 it, and produce a single value. The operand might represent multiple
9605 data, as is the case with the :ref:`vector <t_vector>` data type. The
9606 result value has the same type as its operand.
9610 '``fneg``' Instruction
9611 ^^^^^^^^^^^^^^^^^^^^^^
9618 <result> = fneg [fast-math flags]* <ty> <op1> ; yields ty:result
9623 The '``fneg``' instruction returns the negation of its operand.
9628 The argument to the '``fneg``' instruction must be a
9629 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9630 floating-point values.
9635 The value produced is a copy of the operand with its sign bit flipped.
9636 The value is otherwise completely identical; in particular, if the input is a
9637 NaN, then the quiet/signaling bit and payload are perfectly preserved.
9639 This instruction can also take any number of :ref:`fast-math
9640 flags <fastmath>`, which are optimization hints to enable otherwise
9641 unsafe floating-point optimizations:
9646 .. code-block:: text
   <result> = fneg float %val          ; yields float:result = -%val
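
As an illustrative sketch, ``fneg`` applies equally to scalars and to
vectors, and may carry fast-math flags. The function names below are
hypothetical.

.. code-block:: llvm

   ; Scalar negation: only the sign bit of %val changes.
   define float @negate(float %val) {
     %result = fneg float %val
     ret float %result
   }

   ; Element-wise negation of a vector, with the nnan fast-math flag.
   define <4 x float> @negate_vec(<4 x float> %v) {
     %result = fneg nnan <4 x float> %v
     ret <4 x float> %result
   }
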
9655 Binary operators are used to do most of the computation in a program.
9656 They require two operands of the same type, execute an operation on
9657 them, and produce a single value. The operands might represent multiple
9658 data, as is the case with the :ref:`vector <t_vector>` data type. The
9659 result value has the same type as its operands.
9661 There are several different binary operators:
9665 '``add``' Instruction
9666 ^^^^^^^^^^^^^^^^^^^^^
9673 <result> = add <ty> <op1>, <op2> ; yields ty:result
9674 <result> = add nuw <ty> <op1>, <op2> ; yields ty:result
9675 <result> = add nsw <ty> <op1>, <op2> ; yields ty:result
9676 <result> = add nuw nsw <ty> <op1>, <op2> ; yields ty:result
9681 The '``add``' instruction returns the sum of its two operands.
9686 The two arguments to the '``add``' instruction must be
9687 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9688 arguments must have identical types.
9693 The value produced is the integer sum of the two operands.
If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.
9699 Because LLVM integers use a two's complement representation, this
9700 instruction is appropriate for both signed and unsigned integers.
9702 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9703 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9704 result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
9705 unsigned and/or signed overflow, respectively, occurs.
9710 .. code-block:: text
9712 <result> = add i32 4, %var ; yields i32:result = 4 + %var
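
The effect of the wrap flags can be sketched with three small functions
that differ only in their flags. The function names are hypothetical, and
the input values mentioned in the comments are examples rather than
anything the functions enforce.

.. code-block:: llvm

   ; No flags: for %x = 255 the result wraps to 0 (256 modulo 2^8).
   define i8 @inc_wrapping(i8 %x) {
     %r = add i8 %x, 1
     ret i8 %r
   }

   ; nuw: for %x = 255 the result is a poison value instead of 0.
   define i8 @inc_nuw(i8 %x) {
     %r = add nuw i8 %x, 1
     ret i8 %r
   }

   ; nsw: for %x = 127 the result is a poison value, because 128 is not
   ; representable as a signed 8-bit integer.
   define i8 @inc_nsw(i8 %x) {
     %r = add nsw i8 %x, 1
     ret i8 %r
   }
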
9716 '``fadd``' Instruction
9717 ^^^^^^^^^^^^^^^^^^^^^^
9724 <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9729 The '``fadd``' instruction returns the sum of its two operands.
9734 The two arguments to the '``fadd``' instruction must be
9735 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9736 floating-point values. Both arguments must have identical types.
9741 The value produced is the floating-point sum of the two operands.
9742 This instruction is assumed to execute in the default :ref:`floating-point
9743 environment <floatenv>`.
9744 This instruction can also take any number of :ref:`fast-math
9745 flags <fastmath>`, which are optimization hints to enable otherwise
9746 unsafe floating-point optimizations:
9751 .. code-block:: text
9753 <result> = fadd float 4.0, %var ; yields float:result = 4.0 + %var
9757 '``sub``' Instruction
9758 ^^^^^^^^^^^^^^^^^^^^^
9765 <result> = sub <ty> <op1>, <op2> ; yields ty:result
9766 <result> = sub nuw <ty> <op1>, <op2> ; yields ty:result
9767 <result> = sub nsw <ty> <op1>, <op2> ; yields ty:result
9768 <result> = sub nuw nsw <ty> <op1>, <op2> ; yields ty:result
9773 The '``sub``' instruction returns the difference of its two operands.
9775 Note that the '``sub``' instruction is used to represent the '``neg``'
9776 instruction present in most other intermediate representations.
9781 The two arguments to the '``sub``' instruction must be
9782 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9783 arguments must have identical types.
9788 The value produced is the integer difference of the two operands.
If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.
9794 Because LLVM integers use a two's complement representation, this
9795 instruction is appropriate for both signed and unsigned integers.
9797 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9798 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9799 result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
9800 unsigned and/or signed overflow, respectively, occurs.
9805 .. code-block:: text
9807 <result> = sub i32 4, %var ; yields i32:result = 4 - %var
   <result> = sub i32 0, %val          ; yields i32:result = -%val
9812 '``fsub``' Instruction
9813 ^^^^^^^^^^^^^^^^^^^^^^
9820 <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9825 The '``fsub``' instruction returns the difference of its two operands.
9830 The two arguments to the '``fsub``' instruction must be
9831 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9832 floating-point values. Both arguments must have identical types.
9837 The value produced is the floating-point difference of the two operands.
9838 This instruction is assumed to execute in the default :ref:`floating-point
9839 environment <floatenv>`.
9840 This instruction can also take any number of :ref:`fast-math
9841 flags <fastmath>`, which are optimization hints to enable otherwise
9842 unsafe floating-point optimizations:
9847 .. code-block:: text
9849 <result> = fsub float 4.0, %var ; yields float:result = 4.0 - %var
   <result> = fsub float -0.0, %val    ; yields float:result = -%val
9854 '``mul``' Instruction
9855 ^^^^^^^^^^^^^^^^^^^^^
9862 <result> = mul <ty> <op1>, <op2> ; yields ty:result
9863 <result> = mul nuw <ty> <op1>, <op2> ; yields ty:result
9864 <result> = mul nsw <ty> <op1>, <op2> ; yields ty:result
9865 <result> = mul nuw nsw <ty> <op1>, <op2> ; yields ty:result
9870 The '``mul``' instruction returns the product of its two operands.
9875 The two arguments to the '``mul``' instruction must be
9876 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9877 arguments must have identical types.
9882 The value produced is the integer product of the two operands.
9884 If the result of the multiplication has unsigned overflow, the result
9885 returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
9886 bit width of the result.
Because LLVM integers use a two's complement representation, and the
result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.
9895 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9896 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9897 result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
9898 unsigned and/or signed overflow, respectively, occurs.
9903 .. code-block:: text
9905 <result> = mul i32 4, %var ; yields i32:result = 4 * %var
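
The full-product advice above can be sketched as follows: extend both
operands to the wider type first, then multiply. The function names are
hypothetical. The wrap flags are safe here because a product of two
extended 32-bit values always fits in 64 bits.

.. code-block:: llvm

   ; Unsigned 32 x 32 -> 64 bit product.
   define i64 @full_umul(i32 %a, i32 %b) {
     %a64 = zext i32 %a to i64
     %b64 = zext i32 %b to i64
     %p = mul nuw i64 %a64, %b64
     ret i64 %p
   }

   ; Signed 32 x 32 -> 64 bit product.
   define i64 @full_smul(i32 %a, i32 %b) {
     %a64 = sext i32 %a to i64
     %b64 = sext i32 %b to i64
     %p = mul nsw i64 %a64, %b64
     ret i64 %p
   }
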
9909 '``fmul``' Instruction
9910 ^^^^^^^^^^^^^^^^^^^^^^
9917 <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9922 The '``fmul``' instruction returns the product of its two operands.
9927 The two arguments to the '``fmul``' instruction must be
9928 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9929 floating-point values. Both arguments must have identical types.
9934 The value produced is the floating-point product of the two operands.
9935 This instruction is assumed to execute in the default :ref:`floating-point
9936 environment <floatenv>`.
9937 This instruction can also take any number of :ref:`fast-math
9938 flags <fastmath>`, which are optimization hints to enable otherwise
9939 unsafe floating-point optimizations:
9944 .. code-block:: text
9946 <result> = fmul float 4.0, %var ; yields float:result = 4.0 * %var
9950 '``udiv``' Instruction
9951 ^^^^^^^^^^^^^^^^^^^^^^
9958 <result> = udiv <ty> <op1>, <op2> ; yields ty:result
9959 <result> = udiv exact <ty> <op1>, <op2> ; yields ty:result
9964 The '``udiv``' instruction returns the quotient of its two operands.
9969 The two arguments to the '``udiv``' instruction must be
9970 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9971 arguments must have identical types.
9976 The value produced is the unsigned integer quotient of the two operands.
9978 Note that unsigned integer division and signed integer division are
9979 distinct operations; for signed integer division, use '``sdiv``'.
9981 Division by zero is undefined behavior. For vectors, if any element
9982 of the divisor is zero, the operation has undefined behavior.
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2``
(as such, ``((a udiv exact b) mul b) == a``).
9992 .. code-block:: text
9994 <result> = udiv i32 4, %var ; yields i32:result = 4 / %var
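
A minimal sketch of where ``exact`` is appropriate: dividing a byte count
that is known to be a multiple of the element size. The function name
``@elem_count`` and the element size of 4 are assumptions for
illustration.

.. code-block:: llvm

   ; For %bytes = 12 the result is 3. For %bytes = 10 the division would
   ; leave a remainder, so the result is a poison value.
   define i32 @elem_count(i32 %bytes) {
     %n = udiv exact i32 %bytes, 4
     ret i32 %n
   }
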
9998 '``sdiv``' Instruction
9999 ^^^^^^^^^^^^^^^^^^^^^^
10006 <result> = sdiv <ty> <op1>, <op2> ; yields ty:result
10007 <result> = sdiv exact <ty> <op1>, <op2> ; yields ty:result
10012 The '``sdiv``' instruction returns the quotient of its two operands.
10017 The two arguments to the '``sdiv``' instruction must be
10018 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10019 arguments must have identical types.
10024 The value produced is the signed integer quotient of the two operands
10025 rounded towards zero.
10027 Note that signed integer division and unsigned integer division are
10028 distinct operations; for unsigned integer division, use '``udiv``'.
10030 Division by zero is undefined behavior. For vectors, if any element
10031 of the divisor is zero, the operation has undefined behavior.
10032 Overflow also leads to undefined behavior; this is a rare case, but can
10033 occur, for example, by doing a 32-bit division of -2147483648 by -1.
10035 If the ``exact`` keyword is present, the result value of the ``sdiv`` is
10036 a :ref:`poison value <poisonvalues>` if the result would be rounded.
10041 .. code-block:: text
10043 <result> = sdiv i32 4, %var ; yields i32:result = 4 / %var
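
Because both division by zero and the overflowing case are undefined
behavior, a front end that cannot rule them out may guard the division.
The sketch below is one possible guard, not a prescribed lowering; the
function name ``@guarded_sdiv`` and the fallback result of 0 are
hypothetical choices.

.. code-block:: llvm

   define i32 @guarded_sdiv(i32 %a, i32 %b) {
   entry:
     ; Case 1: the divisor is zero.
     %b_zero = icmp eq i32 %b, 0
     ; Case 2: -2147483648 / -1, which overflows.
     %a_min = icmp eq i32 %a, -2147483648
     %b_m1 = icmp eq i32 %b, -1
     %overflow = and i1 %a_min, %b_m1
     %bad = or i1 %b_zero, %overflow
     br i1 %bad, label %fallback, label %divide

   divide:
     ; Only reached when the division is well defined.
     %q = sdiv i32 %a, %b
     ret i32 %q

   fallback:
     ret i32 0
   }
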
10047 '``fdiv``' Instruction
10048 ^^^^^^^^^^^^^^^^^^^^^^
10055 <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
10060 The '``fdiv``' instruction returns the quotient of its two operands.
10065 The two arguments to the '``fdiv``' instruction must be
10066 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
10067 floating-point values. Both arguments must have identical types.
10072 The value produced is the floating-point quotient of the two operands.
10073 This instruction is assumed to execute in the default :ref:`floating-point
10074 environment <floatenv>`.
10075 This instruction can also take any number of :ref:`fast-math
10076 flags <fastmath>`, which are optimization hints to enable otherwise
10077 unsafe floating-point optimizations:
10082 .. code-block:: text
10084 <result> = fdiv float 4.0, %var ; yields float:result = 4.0 / %var
10088 '``urem``' Instruction
10089 ^^^^^^^^^^^^^^^^^^^^^^
10096 <result> = urem <ty> <op1>, <op2> ; yields ty:result
10101 The '``urem``' instruction returns the remainder from the unsigned
10102 division of its two arguments.
10107 The two arguments to the '``urem``' instruction must be
10108 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10109 arguments must have identical types.
This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.
10118 Note that unsigned integer remainder and signed integer remainder are
10119 distinct operations; for signed integer remainder, use '``srem``'.
10121 Taking the remainder of a division by zero is undefined behavior.
10122 For vectors, if any element of the divisor is zero, the operation has
10123 undefined behavior.
10128 .. code-block:: text
10130 <result> = urem i32 4, %var ; yields i32:result = 4 % %var
10134 '``srem``' Instruction
10135 ^^^^^^^^^^^^^^^^^^^^^^
10142 <result> = srem <ty> <op1>, <op2> ; yields ty:result
The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.
10155 The two arguments to the '``srem``' instruction must be
10156 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10157 arguments must have identical types.
This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
10172 Note that signed integer remainder and unsigned integer remainder are
10173 distinct operations; for unsigned integer remainder, use '``urem``'.
10175 Taking the remainder of a division by zero is undefined behavior.
10176 For vectors, if any element of the divisor is zero, the operation has
10177 undefined behavior.
10178 Overflow also leads to undefined behavior; this is a rare case, but can
10179 occur, for example, by taking the remainder of a 32-bit division of
10180 -2147483648 by -1. (The remainder doesn't actually overflow, but this
10181 rule lets srem be implemented using instructions that return both the
10182 result of the division and the remainder.)
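
The difference between remainder and modulo can be sketched in IR as follows.
The helper ``@floor_mod`` is a hypothetical name, not an LLVM facility; it
derives a result whose sign follows the divisor from the ``srem`` result, and
it assumes the caller has already excluded a zero divisor and the overflowing
case described above.

.. code-block:: llvm

   ; The remainder takes the sign of the dividend.
   define i32 @rem_example() {
     %r = srem i32 -7, 3                   ; %r = -1
     ret i32 %r
   }

   ; Hypothetical helper: modulo whose sign follows the divisor.
   ; Precondition (assumed): %b != 0, and not (%a == INT_MIN and %b == -1).
   define i32 @floor_mod(i32 %a, i32 %b) {
     %r = srem i32 %a, %b
     %nonzero = icmp ne i32 %r, 0
     %x = xor i32 %r, %b                   ; sign bit set iff the signs differ
     %differ = icmp slt i32 %x, 0
     %fix = and i1 %nonzero, %differ
     %adj = add i32 %r, %b                 ; cannot overflow: opposite signs
     %m = select i1 %fix, i32 %adj, i32 %r
     ret i32 %m
   }

For ``%a = -7`` and ``%b = 3``, ``srem`` produces -1 and the adjusted value
is 2.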
10187 .. code-block:: text
10189 <result> = srem i32 4, %var ; yields i32:result = 4 % %var
10193 '``frem``' Instruction
10194 ^^^^^^^^^^^^^^^^^^^^^^
10201 <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
10206 The '``frem``' instruction returns the remainder from the division of its two operands.
10211 The instruction is implemented as a call to libm's '``fmod``'
10212 for some targets, and using the instruction may thus require linking libm.
10218 The two arguments to the '``frem``' instruction must be
10219 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
10220 floating-point values. Both arguments must have identical types.
10225 The value produced is the floating-point remainder of the two operands.
10226 This is the same output as a libm '``fmod``' function, but without any
10227 possibility of setting ``errno``. The remainder has the same sign as the dividend.
10229 This instruction is assumed to execute in the default :ref:`floating-point
10230 environment <floatenv>`.
10231 This instruction can also take any number of :ref:`fast-math
10232 flags <fastmath>`, which are optimization hints to enable otherwise
10233 unsafe floating-point optimizations:
10238 .. code-block:: text
10240 <result> = frem float 4.0, %var ; yields float:result = 4.0 % %var
10244 Bitwise Binary Operations
10245 -------------------------
10247 Bitwise binary operators are used to do various forms of bit-twiddling
10248 in a program. They are generally very efficient instructions and can
10249 commonly be strength reduced from other instructions. They require two
10250 operands of the same type, execute an operation on them, and produce a
10251 single value. The resulting value is the same type as its operands.
10255 '``shl``' Instruction
10256 ^^^^^^^^^^^^^^^^^^^^^
10263 <result> = shl <ty> <op1>, <op2> ; yields ty:result
10264 <result> = shl nuw <ty> <op1>, <op2> ; yields ty:result
10265 <result> = shl nsw <ty> <op1>, <op2> ; yields ty:result
10266 <result> = shl nuw nsw <ty> <op1>, <op2> ; yields ty:result
10271 The '``shl``' instruction returns the first operand shifted to the left
10272 a specified number of bits.
10277 Both arguments to the '``shl``' instruction must be the same
10278 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
10279 '``op2``' is treated as an unsigned value.
10284 The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
10285 where ``n`` is the width of the result. If ``op2`` is (statically or
10286 dynamically) equal to or larger than the number of bits in
10287 ``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
10288 If the arguments are vectors, each vector element of ``op1`` is shifted
10289 by the corresponding shift amount in ``op2``.
10291 If the ``nuw`` keyword is present, then the shift produces a poison
10292 value if it shifts out any non-zero bits.
10293 If the ``nsw`` keyword is present, then the shift produces a poison
10294 value if it shifts out any bits that disagree with the resultant sign bit.
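
As a sketch of these rules (the function names are illustrative only), the
first function shows how the flags change the result for specific inputs, and
the second shows one way to avoid the poison result for an out-of-range shift
amount: ``select`` only propagates poison from the operand it actually
chooses.

.. code-block:: llvm

   define i8 @shl_flags(i8 %x) {
     %plain = shl i8 %x, 2                 ; for %x = 64: 0, the set bit is dropped
     %nuw = shl nuw i8 %x, 2               ; for %x = 64: poison, a set bit is shifted out
     %nsw = shl nsw i8 %x, 2               ; for %x = 32: poison, result 0x80 has sign bit 1
                                           ; but the bits shifted out are 0
     ret i8 %plain
   }

   ; Yields 0 instead of poison when %amt is 32 or larger.
   define i32 @shl_or_zero(i32 %x, i32 %amt) {
     %in_range = icmp ult i32 %amt, 32
     %shifted = shl i32 %x, %amt           ; poison if %amt >= 32
     %res = select i1 %in_range, i32 %shifted, i32 0
     ret i32 %res
   }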
10299 .. code-block:: text
10301 <result> = shl i32 4, %var ; yields i32: 4 << %var
10302 <result> = shl i32 4, 2 ; yields i32: 16
10303 <result> = shl i32 1, 10 ; yields i32: 1024
10304 <result> = shl i32 1, 32 ; poison
10305 <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4>
10310 '``lshr``' Instruction
10311 ^^^^^^^^^^^^^^^^^^^^^^
10318 <result> = lshr <ty> <op1>, <op2> ; yields ty:result
10319 <result> = lshr exact <ty> <op1>, <op2> ; yields ty:result
10324 The '``lshr``' instruction (logical shift right) returns the first
10325 operand shifted to the right a specified number of bits with zero fill.
10330 Both arguments to the '``lshr``' instruction must be the same
10331 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
10332 '``op2``' is treated as an unsigned value.
10337 This instruction always performs a logical shift right operation. The
10338 most significant bits of the result will be filled with zero bits after
10339 the shift. If ``op2`` is (statically or dynamically) equal to or larger
10340 than the number of bits in ``op1``, this instruction returns a :ref:`poison
10341 value <poisonvalues>`. If the arguments are vectors, each vector element
10342 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
10344 If the ``exact`` keyword is present, the result value of the ``lshr`` is
10345 a poison value if any of the bits shifted out are non-zero.
10350 .. code-block:: text
10352 <result> = lshr i32 4, 1 ; yields i32:result = 2
10353 <result> = lshr i32 4, 2 ; yields i32:result = 1
10354 <result> = lshr i8 4, 3 ; yields i8:result = 0
10355 <result> = lshr i8 -2, 1 ; yields i8:result = 0x7F
10356 <result> = lshr i32 1, 32 ; poison
10357 <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
10361 '``ashr``' Instruction
10362 ^^^^^^^^^^^^^^^^^^^^^^
10369 <result> = ashr <ty> <op1>, <op2> ; yields ty:result
10370 <result> = ashr exact <ty> <op1>, <op2> ; yields ty:result
10375 The '``ashr``' instruction (arithmetic shift right) returns the first
10376 operand shifted to the right a specified number of bits with sign extension.
10382 Both arguments to the '``ashr``' instruction must be the same
10383 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
10384 '``op2``' is treated as an unsigned value.
10389 This instruction always performs an arithmetic shift right operation.
10390 The most significant bits of the result will be filled with the sign bit
10391 of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
10392 than the number of bits in ``op1``, this instruction returns a :ref:`poison
10393 value <poisonvalues>`. If the arguments are vectors, each vector element
10394 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
10396 If the ``exact`` keyword is present, the result value of the ``ashr`` is
10397 a poison value if any of the bits shifted out are non-zero.
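
The following sketch contrasts the two right shifts on the same negative
``i8`` operand and shows the effect of ``exact``. The function name is
illustrative; the constants were chosen so that each result can be checked by
hand.

.. code-block:: llvm

   define i8 @shift_right_demo() {
     %l = lshr i8 -8, 2                    ; 0xF8 with zero fill: 0x3E (62)
     %a = ashr i8 -8, 2                    ; 0xF8 with sign fill: 0xFE (-2)
     %e = ashr exact i8 -8, 2              ; -2, both bits shifted out are zero
     %p = ashr exact i8 -7, 1              ; poison, a one bit is shifted out
     %sum = add i8 %l, %a                  ; 62 + (-2) = 60
     ret i8 %sum
   }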
10402 .. code-block:: text
10404 <result> = ashr i32 4, 1 ; yields i32:result = 2
10405 <result> = ashr i32 4, 2 ; yields i32:result = 1
10406 <result> = ashr i8 4, 3 ; yields i8:result = 0
10407 <result> = ashr i8 -2, 1 ; yields i8:result = -1
10408 <result> = ashr i32 1, 32 ; poison
10409 <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0>
10413 '``and``' Instruction
10414 ^^^^^^^^^^^^^^^^^^^^^
10421 <result> = and <ty> <op1>, <op2> ; yields ty:result
10426 The '``and``' instruction returns the bitwise logical and of its two operands.
10432 The two arguments to the '``and``' instruction must be
10433 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10434 arguments must have identical types.
10439 The truth table used for the '``and``' instruction is:
10441 +-----+-----+-----+
10442 | In0 | In1 | Out |
10443 +-----+-----+-----+
10444 |   0 |   0 |   0 |
10445 +-----+-----+-----+
10446 |   0 |   1 |   0 |
10447 +-----+-----+-----+
10448 |   1 |   0 |   0 |
10449 +-----+-----+-----+
10450 |   1 |   1 |   1 |
10451 +-----+-----+-----+
10456 .. code-block:: text
10458 <result> = and i32 4, %var ; yields i32:result = 4 & %var
10459 <result> = and i32 15, 40 ; yields i32:result = 8
10460 <result> = and i32 4, 8 ; yields i32:result = 0
10464 '``or``' Instruction
10465 ^^^^^^^^^^^^^^^^^^^^
10472 <result> = or <ty> <op1>, <op2> ; yields ty:result
10473 <result> = or disjoint <ty> <op1>, <op2> ; yields ty:result
10478 The '``or``' instruction returns the bitwise logical inclusive or of its two operands.
10484 The two arguments to the '``or``' instruction must be
10485 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10486 arguments must have identical types.
10491 The truth table used for the '``or``' instruction is:
10493 +-----+-----+-----+
10494 | In0 | In1 | Out |
10495 +-----+-----+-----+
10496 |   0 |   0 |   0 |
10497 +-----+-----+-----+
10498 |   0 |   1 |   1 |
10499 +-----+-----+-----+
10500 |   1 |   0 |   1 |
10501 +-----+-----+-----+
10502 |   1 |   1 |   1 |
10503 +-----+-----+-----+
10505 ``disjoint`` means that for each bit, that bit is zero in at least one of the
10506 inputs. This allows the Or to be treated as an Add since no carry can occur from
10507 any bit. If the disjoint keyword is present, the result value of the ``or`` is a
10508 :ref:`poison value <poisonvalues>` if both inputs have a one in the same bit
10509 position. For vectors, only the element containing the bit is poison.
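
A typical situation in which ``disjoint`` holds by construction is combining
two bit fields that occupy different positions. The sketch below uses an
illustrative function name and packs two 4-bit values into one byte; because
no bit position is set in both operands, the ``or`` could equally be written
as an ``add``.

.. code-block:: llvm

   define i8 @pack_nibbles(i8 %hi, i8 %lo) {
     %hi4 = shl i8 %hi, 4                  ; low four bits are zero
     %lo4 = and i8 %lo, 15                 ; high four bits are zero
     %packed = or disjoint i8 %hi4, %lo4   ; never poison: no common set bit
     ret i8 %packed
   }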
10516 <result> = or i32 4, %var ; yields i32:result = 4 | %var
10517 <result> = or i32 15, 40 ; yields i32:result = 47
10518 <result> = or i32 4, 8 ; yields i32:result = 12
10522 '``xor``' Instruction
10523 ^^^^^^^^^^^^^^^^^^^^^
10530 <result> = xor <ty> <op1>, <op2> ; yields ty:result
10535 The '``xor``' instruction returns the bitwise logical exclusive or of
10536 its two operands. The ``xor`` is used to implement the "one's
10537 complement" operation, which is the "~" operator in C.
10542 The two arguments to the '``xor``' instruction must be
10543 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10544 arguments must have identical types.
10549 The truth table used for the '``xor``' instruction is:
10551 +-----+-----+-----+
10552 | In0 | In1 | Out |
10553 +-----+-----+-----+
10554 |   0 |   0 |   0 |
10555 +-----+-----+-----+
10556 |   0 |   1 |   1 |
10557 +-----+-----+-----+
10558 |   1 |   0 |   1 |
10559 +-----+-----+-----+
10560 |   1 |   1 |   0 |
10561 +-----+-----+-----+
10566 .. code-block:: text
10568 <result> = xor i32 4, %var ; yields i32:result = 4 ^ %var
10569 <result> = xor i32 15, 40 ; yields i32:result = 39
10570 <result> = xor i32 4, 8 ; yields i32:result = 12
10571 <result> = xor i32 %V, -1 ; yields i32:result = ~%V
10576 LLVM supports several instructions to represent vector operations in a
10577 target-independent manner. These instructions cover the element-access
10578 and vector-specific operations needed to process vectors effectively.
10579 While LLVM does directly support these vector operations, many
10580 sophisticated algorithms will want to use target-specific intrinsics to
10581 take full advantage of a specific target.
10583 .. _i_extractelement:
10585 '``extractelement``' Instruction
10586 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10593 <result> = extractelement <n x <ty>> <val>, <ty2> <idx> ; yields <ty>
10594 <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>
10599 The '``extractelement``' instruction extracts a single scalar element
10600 from a vector at a specified index.
10605 The first operand of an '``extractelement``' instruction is a value of
10606 :ref:`vector <t_vector>` type. The second operand is an index indicating
10607 the position from which to extract the element. The index may be a
10608 variable of any integer type, and will be treated as an unsigned integer.
10613 The result is a scalar of the same type as the element type of ``val``.
10614 Its value is the value at position ``idx`` of ``val``. If ``idx``
10615 exceeds the length of ``val`` for a fixed-length vector, the result is a
10616 :ref:`poison value <poisonvalues>`. For a scalable vector, if the value
10617 of ``idx`` exceeds the runtime length of the vector, the result is a
10618 :ref:`poison value <poisonvalues>`.
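
When the index is only known at run time, an out-of-range value can be
filtered out after the extraction, since ``select`` does not propagate poison
from the operand it does not choose. A minimal sketch; the function name and
the fallback value 0 are assumptions made for illustration:

.. code-block:: llvm

   define i32 @lane_or_zero(<4 x i32> %v, i32 %idx) {
     %ok = icmp ult i32 %idx, 4            ; the index is treated as unsigned
     %e = extractelement <4 x i32> %v, i32 %idx   ; poison if %idx >= 4
     %r = select i1 %ok, i32 %e, i32 0
     ret i32 %r
   }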
10623 .. code-block:: text
10625 <result> = extractelement <4 x i32> %vec, i32 0 ; yields i32
10627 .. _i_insertelement:
10629 '``insertelement``' Instruction
10630 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10637 <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <n x <ty>>
10638 <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>
10643 The '``insertelement``' instruction inserts a scalar element into a
10644 vector at a specified index.
10649 The first operand of an '``insertelement``' instruction is a value of
10650 :ref:`vector <t_vector>` type. The second operand is a scalar value whose
10651 type must equal the element type of the first operand. The third operand
10652 is an index indicating the position at which to insert the value. The
10653 index may be a variable of any integer type, and will be treated as an unsigned integer.
10659 The result is a vector of the same type as ``val``. Its element values
10660 are those of ``val`` except at position ``idx``, where it gets the value
10661 ``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector,
10662 the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
10663 if the value of ``idx`` exceeds the runtime length of the vector, the result
10664 is a :ref:`poison value <poisonvalues>`.
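
Since every ``insertelement`` yields a new vector value, a vector is commonly
built by chaining insertions, starting from ``poison``. A short sketch with an
illustrative function name:

.. code-block:: llvm

   define i32 @second_lane(i32 %a, i32 %b) {
     %v0 = insertelement <2 x i32> poison, i32 %a, i32 0   ; <%a, poison>
     %v1 = insertelement <2 x i32> %v0, i32 %b, i32 1      ; <%a, %b>
     %e = extractelement <2 x i32> %v1, i32 1              ; %e = %b
     ret i32 %e
   }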
10669 .. code-block:: text
10671 <result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>
10673 .. _i_shufflevector:
10675 '``shufflevector``' Instruction
10676 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10683 <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>>
10684 <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask> ; yields <vscale x m x <ty>>
10689 The '``shufflevector``' instruction constructs a permutation of elements
10690 from two input vectors, returning a vector with the same element type as
10691 the input and length that is the same as the shuffle mask.
10696 The first two operands of a '``shufflevector``' instruction are vectors
10697 with the same type. The third argument is a shuffle mask vector constant
10698 whose element type is ``i32``. The mask vector elements must be constant
10699 integers or ``poison`` values. The result of the instruction is a vector
10700 whose length is the same as the shuffle mask and whose element type is the
10701 same as the element type of the first two operands.
10706 The elements of the two input vectors are numbered from left to right
10707 across both of the vectors. For each element of the result vector, the
10708 shuffle mask selects an element from one of the input vectors to copy
10709 to the result. Non-negative elements in the mask represent an index
10710 into the concatenated pair of input vectors.
10712 A ``poison`` element in the mask vector specifies that the resulting element is ``poison``.
10714 For backwards-compatibility reasons, LLVM temporarily also accepts ``undef``
10715 mask elements, which will be interpreted the same way as ``poison`` elements.
10716 If the shuffle mask selects an ``undef`` element from one of the input
10717 vectors, the resulting element is ``undef``.
10719 For scalable vectors, the only valid mask values at present are
10720 ``zeroinitializer``, ``undef`` and ``poison``, since we cannot write all indices as
10721 literals for a vector with a length unknown at compile time.
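
The sketch below illustrates the mask numbering with a few common patterns.
The function names are illustrative. With two ``<4 x i32>`` inputs, mask
values 0 to 3 select from the first operand and 4 to 7 from the second; the
``zeroinitializer`` mask broadcasts element 0 and is the form that also works
for scalable vectors.

.. code-block:: llvm

   define <4 x i32> @reverse(<4 x i32> %v) {
     %r = shufflevector <4 x i32> %v, <4 x i32> poison,
                        <4 x i32> <i32 3, i32 2, i32 1, i32 0>
     ret <4 x i32> %r
   }

   ; Takes the low half of %a followed by the low half of %b.
   define <4 x i32> @low_halves(<4 x i32> %a, <4 x i32> %b) {
     %r = shufflevector <4 x i32> %a, <4 x i32> %b,
                        <4 x i32> <i32 0, i32 1, i32 4, i32 5>
     ret <4 x i32> %r
   }

   ; Broadcasts %x to every element of a scalable vector.
   define <vscale x 4 x i32> @splat(i32 %x) {
     %ins = insertelement <vscale x 4 x i32> poison, i32 %x, i32 0
     %s = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> poison,
                        <vscale x 4 x i32> zeroinitializer
     ret <vscale x 4 x i32> %s
   }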
10726 .. code-block:: text
10728 <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
10729 <4 x i32> <i32 0, i32 4, i32 1, i32 5> ; yields <4 x i32>
10730 <result> = shufflevector <4 x i32> %v1, <4 x i32> poison,
10731 <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle.
10732 <result> = shufflevector <8 x i32> %v1, <8 x i32> poison,
10733 <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32>
10734 <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
10735 <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 > ; yields <8 x i32>
10737 Aggregate Operations
10738 --------------------
10740 LLVM supports several instructions for working with
10741 :ref:`aggregate <t_aggregate>` values.
10743 .. _i_extractvalue:
10745 '``extractvalue``' Instruction
10746 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10753 <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
10758 The '``extractvalue``' instruction extracts the value of a member field
10759 from an :ref:`aggregate <t_aggregate>` value.
10764 The first operand of an '``extractvalue``' instruction is a value of
10765 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
10766 constant indices to specify which value to extract in a similar manner
10767 as indices in a '``getelementptr``' instruction.
10769 The major differences to ``getelementptr`` indexing are:
10771 - Since the value being indexed is not a pointer, the first index is
10772 omitted and assumed to be zero.
10773 - At least one index must be specified.
10774 - Not only struct indices but also array indices must be in bounds.
10779 The result is the value at the position in the aggregate specified by
10780 the index operands.
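
Several indices walk into nested aggregates in a single instruction. In the
following sketch (the function name and the struct layout are chosen for
illustration) the indices ``1, 0`` select the inner struct and then its first
field:

.. code-block:: llvm

   define float @inner_float({ i32, { float, [2 x i8] } } %agg) {
     ; Index 1 selects the nested struct, index 0 its float member.
     %f = extractvalue { i32, { float, [2 x i8] } } %agg, 1, 0
     ret float %f
   }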
10785 .. code-block:: text
10787 <result> = extractvalue {i32, float} %agg, 0 ; yields i32
10791 '``insertvalue``' Instruction
10792 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10799 <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}* ; yields <aggregate type>
10804 The '``insertvalue``' instruction inserts a value into a member field in
10805 an :ref:`aggregate <t_aggregate>` value.
10810 The first operand of an '``insertvalue``' instruction is a value of
10811 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
10812 a first-class value to insert. The following operands are constant
10813 indices indicating the position at which to insert the value in a
10814 similar manner as indices in an '``extractvalue``' instruction. The value
10815 to insert must have the same type as the value identified by the indices.
10821 The result is an aggregate of the same type as ``val``. Its value is
10822 that of ``val`` except that the value at the position specified by the
10823 indices is that of ``elt``.
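
One common use is assembling a multi-value return, one field at a time. A
minimal sketch with an illustrative function name:

.. code-block:: llvm

   define { i32, i1 } @make_pair(i32 %v, i1 %flag) {
     %p0 = insertvalue { i32, i1 } poison, i32 %v, 0      ; { %v, poison }
     %p1 = insertvalue { i32, i1 } %p0, i1 %flag, 1       ; { %v, %flag }
     ret { i32, i1 } %p1
   }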
10828 .. code-block:: llvm
10830 %agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef}
10831 %agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val}
10832 %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0 ; yields {i32 undef, {float %val}}
10836 Memory Access and Addressing Operations
10837 ---------------------------------------
10839 A key design point of an SSA-based representation is how it represents
10840 memory. In LLVM, no memory locations are in SSA form, which makes things
10841 very simple. This section describes how to read, write, and allocate memory in LLVM.
10846 '``alloca``' Instruction
10847 ^^^^^^^^^^^^^^^^^^^^^^^^
10854 <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)] ; yields type addrspace(num)*:result
10859 The '``alloca``' instruction allocates memory on the stack frame of the
10860 currently executing function, to be automatically released when this
10861 function returns to its caller. If the address space is not explicitly
10862 specified, the object is allocated in the alloca address space from the
10863 :ref:`datalayout string<langref_datalayout>`.
10868 The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
10869 bytes of memory on the runtime stack, returning a pointer of the
10870 appropriate type to the program. If "NumElements" is specified, it is
10871 the number of elements allocated, otherwise "NumElements" defaults to one.
10874 If a constant alignment is specified, the value result of the
10875 allocation is guaranteed to be aligned to at least that boundary. The
10876 alignment may not be greater than ``1 << 32``.
10878 The alignment is only optional when parsing textual IR; for in-memory IR,
10879 it is always present. If not specified, the target can choose to align the
10880 allocation on any convenient boundary compatible with the type.
10882 '``type``' may be any sized type.
10884 Structs containing scalable vectors cannot be used in allocas unless all
10885 fields are the same scalable vector type (e.g. ``{<vscale x 2 x i32>,
10886 <vscale x 2 x i32>}`` contains the same type while ``{<vscale x 2 x i32>,
10887 <vscale x 2 x i64>}`` doesn't).
10892 Memory is allocated; a pointer is returned. The allocated memory is
10893 uninitialized, and loading from uninitialized memory produces an undefined
10894 value. The operation itself is undefined if there is insufficient stack
10895 space for the allocation. '``alloca``'d memory is automatically released
10896 when the function returns. The '``alloca``' instruction is commonly used
10897 to represent automatic variables that must have an address available. When
10898 the function returns (either with the ``ret`` or ``resume`` instructions),
10899 the memory is reclaimed. Allocating zero bytes is legal, but the returned
10900 pointer may not be unique. The order in which memory is allocated (i.e.,
10901 which way the stack grows) is not specified.
10903 Note that '``alloca``' outside of the alloca address space from the
10904 :ref:`datalayout string<langref_datalayout>` is meaningful only if the
10905 target has assigned it a semantics.
10907 If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
10908 the returned object is initially dead.
10909 See :ref:`llvm.lifetime.start <int_lifestart>` and
10910 :ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
10911 lifetime-manipulating intrinsics.
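
As a sketch of the usual pattern for an automatic variable that needs an
address, the function below (illustrative name) reserves a two-element array
on the stack, initializes both slots before reading them, and lets the memory
be reclaimed on return:

.. code-block:: llvm

   define i32 @sum_first_two() {
   entry:
     %buf = alloca [2 x i32], align 4      ; uninitialized stack memory
     %p0 = getelementptr inbounds [2 x i32], ptr %buf, i64 0, i64 0
     %p1 = getelementptr inbounds [2 x i32], ptr %buf, i64 0, i64 1
     store i32 10, ptr %p0, align 4
     store i32 32, ptr %p1, align 4
     %a = load i32, ptr %p0, align 4
     %b = load i32, ptr %p1, align 4
     %s = add i32 %a, %b                   ; 42
     ret i32 %s                            ; %buf is released here
   }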
10916 .. code-block:: llvm
10918 %ptr = alloca i32 ; yields ptr
10919 %ptr = alloca i32, i32 4 ; yields ptr
10920 %ptr = alloca i32, i32 4, align 1024 ; yields ptr
10921 %ptr = alloca i32, align 1024 ; yields ptr
10925 '``load``' Instruction
10926 ^^^^^^^^^^^^^^^^^^^^^^
10933 <result> = load [volatile] <ty>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
10934 <result> = load atomic [volatile] <ty>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
10935 !<nontemp_node> = !{ i32 1 }
10936 !<empty_node> = !{}
10937 !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
10938 !<align_node> = !{ i64 <value_alignment> }
10943 The '``load``' instruction is used to read from memory.
10948 The argument to the ``load`` instruction specifies the memory address from which
10949 to load. The type specified must be a :ref:`first class <t_firstclass>` type of
10950 known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
10951 the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
10952 modify the number or order of execution of this ``load`` with other
10953 :ref:`volatile operations <volatile>`.
10955 If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
10956 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
10957 ``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
10958 Atomic loads produce :ref:`defined <memmodel>` results when they may see
10959 multiple atomic stores. The type of the pointee must be an integer, pointer, or
10960 floating-point type whose bit width is a power of two greater than or equal to
10961 eight and less than or equal to a target-specific size limit. ``align`` must be
10962 explicitly specified on atomic loads. Note: if the alignment is not greater than or
10963 equal to the size of the ``<ty>`` type, the atomic operation is likely to
10964 require a lock and have poor performance. ``!nontemporal`` does not have any
10965 defined semantics for atomic loads.
10967 The optional constant ``align`` argument specifies the alignment of the
10968 operation (that is, the alignment of the memory address). It is the
10969 responsibility of the code emitter to ensure that the alignment information is
10970 correct. Overestimating the alignment results in undefined behavior.
10971 Underestimating the alignment may produce less efficient code. An alignment of
10972 1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
10973 value higher than the size of the loaded type implies memory up to the
10974 alignment value bytes can be safely loaded without trapping in the default
10975 address space. Access of the high bytes can interfere with debugging tools, so
10976 should not be accessed if the function has the ``sanitize_thread`` or
10977 ``sanitize_address`` attributes.
10979 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10980 always present. An omitted ``align`` argument means that the operation has the
10981 ABI alignment for the target.
10983 The optional ``!nontemporal`` metadata must reference a single
10984 metadata name ``<nontemp_node>`` corresponding to a metadata node with one
10985 ``i32`` entry of value 1. The existence of the ``!nontemporal``
10986 metadata on the instruction tells the optimizer and code generator
10987 that this load is not expected to be reused in the cache. The code
10988 generator may select special instructions to save cache bandwidth, such
10989 as the ``MOVNT`` instruction on x86.
10991 The optional ``!invariant.load`` metadata must reference a single
10992 metadata name ``<empty_node>`` corresponding to a metadata node with no
10993 entries. If a load instruction tagged with the ``!invariant.load``
10994 metadata is executed, the memory location referenced by the load has
10995 to contain the same value at all points in the program where the
10996 memory location is dereferenceable; otherwise, the behavior is undefined.
10999 The optional ``!invariant.group`` metadata must reference a single metadata name
11000 ``<empty_node>`` corresponding to a metadata node with no entries.
11001 See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.
11003 The optional ``!nonnull`` metadata must reference a single
11004 metadata name ``<empty_node>`` corresponding to a metadata node with no
11005 entries. The existence of the ``!nonnull`` metadata on the
11006 instruction tells the optimizer that the value loaded is known to
11007 never be null. If the value is null at runtime, a poison value is returned
11008 instead. This is analogous to the ``nonnull`` attribute on parameters and
11009 return values. This metadata can only be applied to loads of a pointer type.
11011 The optional ``!dereferenceable`` metadata must reference a single metadata
11012 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64`` entry.
11014 See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.
11016 The optional ``!dereferenceable_or_null`` metadata must reference a single
11017 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64`` entry.
11019 See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
11020 <md_dereferenceable_or_null>`.
11022 The optional ``!align`` metadata must reference a single metadata name
11023 ``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
11024 The existence of the ``!align`` metadata on the instruction tells the
11025 optimizer that the value loaded is known to be aligned to a boundary specified
11026 by the integer value in the metadata node. The alignment must be a power of 2.
11027 This is analogous to the ``align`` attribute on parameters and return values.
11028 This metadata can only be applied to loads of a pointer type. If the returned
11029 value is not appropriately aligned at runtime, a poison value is returned instead.
11032 The optional ``!noundef`` metadata must reference a single metadata name
11033 ``<empty_node>`` corresponding to a node with no entries. The existence of
11034 ``!noundef`` metadata on the instruction tells the optimizer that the value
11035 loaded is known to be :ref:`well defined <welldefinedvalues>`.
11036 If the value isn't well defined, the behavior is undefined. If the ``!noundef``
11037 metadata is combined with poison-generating metadata like ``!nonnull``,
11038 violation of that metadata constraint will also result in undefined behavior.
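
The metadata forms described above can be combined on a single load. The
following is a sketch only: the function name, the metadata node numbers, and
the particular byte counts are assumptions made for illustration.

.. code-block:: llvm

   define i32 @load_with_metadata(ptr %pp) {
     ; The loaded pointer is asserted to be non-null, 4-byte aligned, and to
     ; point to at least 4 dereferenceable bytes. Because !noundef is also
     ; present, violating !nonnull or !align is undefined behavior rather
     ; than producing a poison value.
     %p = load ptr, ptr %pp, align 8, !nonnull !0, !align !1, !dereferenceable !1, !noundef !0
     ; Hint that the loaded data is not expected to be reused in the cache.
     %v = load i32, ptr %p, align 4, !nontemporal !2
     ret i32 %v
   }

   !0 = !{}          ; <empty_node>
   !1 = !{i64 4}     ; used as both <align_node> and <deref_bytes_node>
   !2 = !{i32 1}     ; <nontemp_node>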
11043 The location of memory pointed to is loaded. If the value being loaded
11044 is of scalar type then the number of bytes read does not exceed the
11045 minimum number of bytes needed to hold all bits of the type. For
11046 example, loading an ``i24`` reads at most three bytes. When loading a
11047 value of a type like ``i20`` with a size that is not an integral number
11048 of bytes, the result is undefined if the value was not originally
11049 written using a store of the same type.
11050 If the value being loaded is of aggregate type, the bytes that correspond to
11051 padding may be accessed but are ignored, because it is impossible to observe
11052 padding from the loaded aggregate value.
11053 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
11058 .. code-block:: llvm
11060 %ptr = alloca i32 ; yields ptr
11061 store i32 3, ptr %ptr ; yields void
11062 %val = load i32, ptr %ptr ; yields i32:val = i32 3
11066 '``store``' Instruction
11067 ^^^^^^^^^^^^^^^^^^^^^^^
11074 store [volatile] <ty> <value>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>] ; yields void
11075 store atomic [volatile] <ty> <value>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
11076 !<nontemp_node> = !{ i32 1 }
11077 !<empty_node> = !{}
11082 The '``store``' instruction is used to write to memory.
11087 There are two arguments to the ``store`` instruction: a value to store and an
11088 address at which to store it. The type of the ``<pointer>`` operand must be a
11089 pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
11090 operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
11091 allowed to modify the number or order of execution of this ``store`` with other
11092 :ref:`volatile operations <volatile>`. Only values of :ref:`first class
11093 <t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
11094 structural type <t_opaque>`) can be stored.
11096 If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
11097 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
11098 ``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
11099 Atomic loads produce :ref:`defined <memmodel>` results when they may see
11100 multiple atomic stores. The type of the pointee must be an integer, pointer, or
11101 floating-point type whose bit width is a power of two greater than or equal to
11102 eight and less than or equal to a target-specific size limit. ``align`` must be
11103 explicitly specified on atomic stores. Note: if the alignment is not greater than or
11104 equal to the size of the ``<value>`` type, the atomic operation is likely to
11105 require a lock and have poor performance. ``!nontemporal`` does not have any
11106 defined semantics for atomic stores.
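
A ``release`` store is typically paired with an ``acquire`` load of the same
location. The sketch below shows that pairing; the function names and the use
of a separate ``i32`` flag are illustrative assumptions, not a prescribed
protocol.

.. code-block:: llvm

   ; Writer: publish the data, then set the flag with release ordering.
   define void @publish(ptr %data, ptr %flag) {
     store i32 42, ptr %data, align 4
     store atomic i32 1, ptr %flag release, align 4
     ret void
   }

   ; Reader: wait for the flag with acquire ordering, then read the data.
   define i32 @consume(ptr %data, ptr %flag) {
   entry:
     br label %spin

   spin:
     %f = load atomic i32, ptr %flag acquire, align 4
     %ready = icmp eq i32 %f, 1
     br i1 %ready, label %done, label %spin

   done:
     %v = load i32, ptr %data, align 4     ; ordered after the acquire load
     ret i32 %v
   }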
11108 The optional constant ``align`` argument specifies the alignment of the
11109 operation (that is, the alignment of the memory address). It is the
11110 responsibility of the code emitter to ensure that the alignment information is
11111 correct. Overestimating the alignment results in undefined behavior.
11112 Underestimating the alignment may produce less efficient code. An alignment of
11113 1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
11114 value higher than the size of the stored type implies memory up to the
11115 alignment value bytes can be safely loaded without trapping in the default
11116 address space. Access of the high bytes can interfere with debugging tools, so
11117 should not be accessed if the function has the ``sanitize_thread`` or
11118 ``sanitize_address`` attributes.
11120 The alignment is only optional when parsing textual IR; for in-memory IR, it is
11121 always present. An omitted ``align`` argument means that the operation has the
11122 ABI alignment for the target.
11124 The optional ``!nontemporal`` metadata must reference a single metadata
11125 name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
11126 of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that the stored data is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.
11132 The optional ``!invariant.group`` metadata must reference a
11133 single metadata name ``<empty_node>``. See ``invariant.group`` metadata.
11138 The contents of memory are updated to contain ``<value>`` at the
11139 location specified by the ``<pointer>`` operand. If ``<value>`` is
11140 of scalar type then the number of bytes written does not exceed the
11141 minimum number of bytes needed to hold all bits of the type. For
11142 example, storing an ``i24`` writes at most three bytes. When writing a
11143 value of a type like ``i20`` with a size that is not an integral number
11144 of bytes, it is unspecified what happens to the extra bits that do not
11145 belong to the type, but they will typically be overwritten.
11146 If ``<value>`` is of aggregate type, padding is filled with
11147 :ref:`undef <undefvalues>`.
11148 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
11153 .. code-block:: llvm
11155 %ptr = alloca i32 ; yields ptr
11156 store i32 3, ptr %ptr ; yields void
11157 %val = load i32, ptr %ptr ; yields i32:val = i32 3
11161 '``fence``' Instruction
11162 ^^^^^^^^^^^^^^^^^^^^^^^
11169 fence [syncscope("<target-scope>")] <ordering> ; yields void
11174 The '``fence``' instruction is used to introduce happens-before edges
11175 between operations.
11180 '``fence``' instructions take an :ref:`ordering <ordering>` argument which
11181 defines what *synchronizes-with* edges they add. They can only be given
11182 ``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
11187 A fence A which has (at least) ``release`` ordering semantics
11188 *synchronizes with* a fence B with (at least) ``acquire`` ordering
11189 semantics if and only if there exist atomic operations X and Y, both
11190 operating on some atomic object M, such that A is sequenced before X, X
11191 modifies M (either directly or through some side effect of a sequence
11192 headed by X), Y is sequenced before B, and Y observes M. This provides a
11193 *happens-before* dependency between A and B. Rather than an explicit
11194 ``fence``, one (but not both) of the atomic operations X or Y might
11195 provide a ``release`` or ``acquire`` (resp.) ordering constraint and
11196 still *synchronize-with* the explicit ``fence`` and establish the
11197 *happens-before* edge.
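As a rough illustration of the fence-to-fence rule above, the following C11
sketch publishes a plain value through a relaxed atomic flag, with
``atomic_thread_fence`` playing the roles of fences A and B. It assumes the
usual Clang mapping of C11 release/acquire fences to ``fence release`` and
``fence acquire`` and of ``memory_order_relaxed`` to ``monotonic``; that mapping
is not stated in this section. The names ``payload``, ``ready``, ``publish``,
and ``try_consume`` are hypothetical.

.. code-block:: c

   #include <stdatomic.h>
   #include <stdbool.h>

   static int payload;        /* plain (non-atomic) data */
   static atomic_bool ready;  /* the atomic object M; starts out false */

   /* Writer: fence A (release) is sequenced before X, the store to M. */
   void publish(int value) {
       payload = value;
       atomic_thread_fence(memory_order_release);                  /* fence A */
       atomic_store_explicit(&ready, true, memory_order_relaxed);  /* X */
   }

   /* Reader: Y, the load of M, is sequenced before fence B (acquire). */
   bool try_consume(int *out) {
       if (!atomic_load_explicit(&ready, memory_order_relaxed))    /* Y */
           return false;
       atomic_thread_fence(memory_order_acquire);                  /* fence B */
       *out = payload;  /* ordered after the writer's store to payload */
       return true;
   }

If ``try_consume`` returns ``true`` in another thread, the read of ``payload``
is ordered after the write in ``publish`` by the resulting *happens-before*
edge.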
11199 A ``fence`` which has ``seq_cst`` ordering, in addition to having both
11200 ``acquire`` and ``release`` semantics specified above, participates in
11201 the global program order of other ``seq_cst`` operations and/or
11202 fences. Furthermore, the global ordering created by a ``seq_cst``
11203 fence must be compatible with the individual total orders of
11204 ``monotonic`` (or stronger) memory accesses occurring before and after
such a fence. The exact semantics of this interaction are somewhat
complicated; see the C++ standard's `[atomics.order]
<https://wg21.link/atomics.order>`_ section for more details.
11209 A ``fence`` instruction can also take an optional
11210 ":ref:`syncscope <syncscope>`" argument.
11215 .. code-block:: text
11217 fence acquire ; yields void
11218 fence syncscope("singlethread") seq_cst ; yields void
11219 fence syncscope("agent") seq_cst ; yields void
11223 '``cmpxchg``' Instruction
11224 ^^^^^^^^^^^^^^^^^^^^^^^^^
11231 cmpxchg [weak] [volatile] ptr <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields { ty, i1 }
11236 The '``cmpxchg``' instruction is used to atomically modify memory. It
11237 loads a value in memory and compares it to a given value. If they are
11238 equal, it tries to store a new value into the memory.
11243 There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
11245 address, and a new value to place at that address if the compared values
11246 are equal. The type of '<cmp>' must be an integer or pointer type whose
11247 bit width is a power of two greater than or equal to eight and less
11248 than or equal to a target-specific size limit. '<cmp>' and '<new>' must
11249 have the same type, and the type of '<pointer>' must be a pointer to
11250 that type. If the ``cmpxchg`` is marked as ``volatile``, then the
11251 optimizer is not allowed to modify the number or order of execution of
11252 this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
11254 The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``; the failure ordering cannot be either
``release`` or ``acq_rel``.
11259 A ``cmpxchg`` instruction can also take an optional
11260 ":ref:`syncscope <syncscope>`" argument.
Note: if the alignment is not greater than or equal to the size of the ``<value>``
type, the atomic operation is likely to require a lock and have poor
performance.
11266 The alignment is only optional when parsing textual IR; for in-memory IR, it is
11267 always present. If unspecified, the alignment is assumed to be equal to the
size of the '<value>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when ``align``
isn't specified.
11272 The pointer passed into cmpxchg must have alignment greater than or
11273 equal to the size in memory of the operand.
The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
11280 written to the location. The original value at the location is returned,
11281 together with a flag indicating success (true) or failure (false).
11283 If the cmpxchg operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.
11287 If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
11288 if the value loaded equals ``cmp``.
11290 A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
11291 identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
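The retry pattern that these semantics enable can be sketched in C11. The
function below atomically squares a value, in the spirit of the IR example in
this section. It is a sketch under assumptions: ``atomic_square`` is a
hypothetical name, and the correspondence between
``atomic_compare_exchange_weak_explicit`` and ``cmpxchg weak`` (with
``memory_order_relaxed`` standing in for ``monotonic``) is the usual Clang
lowering rather than something this section specifies.

.. code-block:: c

   #include <stdatomic.h>

   /* Atomically replaces *p with (*p) * (*p) and returns the previous value. */
   unsigned atomic_square(_Atomic unsigned *p) {
       unsigned expected = atomic_load_explicit(p, memory_order_relaxed);
       /* On failure, `expected` is overwritten with the value actually
          observed, so each retry uses fresh input. This also covers a
          spurious failure of the weak form. */
       while (!atomic_compare_exchange_weak_explicit(
                  p, &expected, expected * expected,
                  memory_order_acq_rel,     /* success ordering */
                  memory_order_relaxed)) {  /* failure ordering */
       }
       return expected;
   }

The failure ordering is weaker than the success ordering here because a failed
attempt only acts as a load whose result feeds the next iteration.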
11297 .. code-block:: llvm
11300 %orig = load atomic i32, ptr %ptr unordered, align 4 ; yields i32
11304 %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
11305 %squared = mul i32 %cmp, %cmp
11306 %val_success = cmpxchg ptr %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 }
11307 %value_loaded = extractvalue { i32, i1 } %val_success, 0
11308 %success = extractvalue { i32, i1 } %val_success, 1
11309 br i1 %success, label %done, label %loop
11316 '``atomicrmw``' Instruction
11317 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
11324 atomicrmw [volatile] <operation> ptr <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>] ; yields ty
11329 The '``atomicrmw``' instruction is used to atomically modify memory.
11334 There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
operation. The operation must be one of the following keywords:
11358 For most of these operations, the type of '<value>' must be an integer
11359 type whose bit width is a power of two greater than or equal to eight
11360 and less than or equal to a target-specific size limit. For xchg, this
may also be a floating-point or a pointer type with the same size constraints
11362 as integers. For fadd/fsub/fmax/fmin, this must be a floating-point
11363 or fixed vector of floating-point type. The type of the '``<pointer>``'
11364 operand must be a pointer to that type. If the ``atomicrmw`` is marked
11365 as ``volatile``, then the optimizer is not allowed to modify the
11366 number or order of execution of this ``atomicrmw`` with other
11367 :ref:`volatile operations <volatile>`.
Note: if the alignment is not greater than or equal to the size of the ``<value>``
type, the atomic operation is likely to require a lock and have poor
performance.
11373 The alignment is only optional when parsing textual IR; for in-memory IR, it is
11374 always present. If unspecified, the alignment is assumed to be equal to the
size of the '<value>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when ``align``
isn't specified.
An ``atomicrmw`` instruction can also take an optional
11380 ":ref:`syncscope <syncscope>`" argument.
11385 The contents of memory at the location specified by the '``<pointer>``'
11386 operand are atomically read, modified, and written back. The original
11387 value at the location is returned. The modification is specified by the
11388 operation argument:
11390 - xchg: ``*ptr = val``
11391 - add: ``*ptr = *ptr + val``
11392 - sub: ``*ptr = *ptr - val``
11393 - and: ``*ptr = *ptr & val``
11394 - nand: ``*ptr = ~(*ptr & val)``
11395 - or: ``*ptr = *ptr | val``
11396 - xor: ``*ptr = *ptr ^ val``
11397 - max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
11398 - min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
11399 - umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
11400 - umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
11401 - fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
11402 - fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
- fmax: ``*ptr = maxnum(*ptr, val)`` (matches the ``llvm.maxnum.*`` intrinsic)
- fmin: ``*ptr = minnum(*ptr, val)`` (matches the ``llvm.minnum.*`` intrinsic)
11405 - uinc_wrap: ``*ptr = (*ptr u>= val) ? 0 : (*ptr + 1)`` (increment value with wraparound to zero when incremented above input value)
11406 - udec_wrap: ``*ptr = ((*ptr == 0) || (*ptr u> val)) ? val : (*ptr - 1)`` (decrement with wraparound to input value when decremented below zero).
11407 - usub_cond: ``*ptr = (*ptr u>= val) ? *ptr - val : *ptr`` (subtract only if no unsigned overflow).
11408 - usub_sat: ``*ptr = (*ptr u>= val) ? *ptr - val : 0`` (subtract with unsigned clamping to zero).
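The less familiar operations in this list can be restated as ordinary C
functions. This is only a sketch of the value computation for a 32-bit operand:
it does not model atomicity, and the ``rmw_*`` function names are hypothetical.

.. code-block:: c

   #include <stdint.h>

   /* Each function returns the value that would be written back to *ptr,
      given the old contents `old` and the operand `val`. */

   uint32_t rmw_nand(uint32_t old, uint32_t val) {
       return ~(old & val);
   }

   uint32_t rmw_uinc_wrap(uint32_t old, uint32_t val) {
       return old >= val ? 0 : old + 1;
   }

   uint32_t rmw_udec_wrap(uint32_t old, uint32_t val) {
       return (old == 0 || old > val) ? val : old - 1;
   }

   uint32_t rmw_usub_cond(uint32_t old, uint32_t val) {
       return old >= val ? old - val : old;
   }

   uint32_t rmw_usub_sat(uint32_t old, uint32_t val) {
       return old >= val ? old - val : 0;
   }

For instance, with ``val`` equal to 3, repeated ``uinc_wrap`` steps cycle
through 0, 1, 2, 3 and back to 0.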
11414 .. code-block:: llvm
11416 %old = atomicrmw add ptr %ptr, i32 1 acquire ; yields i32
11418 .. _i_getelementptr:
11420 '``getelementptr``' Instruction
11421 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11428 <result> = getelementptr <ty>, ptr <ptrval>{, <ty> <idx>}*
11429 <result> = getelementptr inbounds <ty>, ptr <ptrval>{, <ty> <idx>}*
11430 <result> = getelementptr nusw <ty>, ptr <ptrval>{, <ty> <idx>}*
11431 <result> = getelementptr nuw <ty>, ptr <ptrval>{, <ty> <idx>}*
11432 <result> = getelementptr inrange(S,E) <ty>, ptr <ptrval>{, <ty> <idx>}*
11433 <result> = getelementptr <ty>, <N x ptr> <ptrval>, <vector index type> <idx>
11438 The '``getelementptr``' instruction is used to get the address of a
11439 subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
11440 address calculation only and does not access memory. The instruction can also
11441 be used to calculate a vector of such addresses.
11446 The first argument is always a type used as the basis for the calculations.
11447 The second argument is always a pointer or a vector of pointers, and is the
11448 base address to start from. The remaining arguments are indices
11449 that indicate which of the elements of the aggregate object are indexed.
11450 The interpretation of each index is dependent on the type being indexed
11451 into. The first index always indexes the pointer value given as the
11452 second argument, the second index indexes a value of the type pointed to
11453 (not necessarily the value directly pointed to, since the first index
11454 can be non-zero), etc. The first type indexed into must be a pointer
11455 value, subsequent types can be arrays, vectors, and structs. Note that
11456 subsequent types being indexed into can never be pointers, since that
11457 would require loading the pointer before continuing calculation.
11459 The type of each index argument depends on the type it is indexing into.
11460 When indexing into a (optionally packed) structure, only ``i32`` integer
11461 **constants** are allowed (when using a vector of indices they must all
11462 be the **same** ``i32`` integer constant). When indexing into an array,
11463 pointer or vector, integers of any width are allowed, and they are not
11464 required to be constant. These integers are treated as signed values
11467 For example, let's consider a C code fragment and how it gets compiled
11483 int *foo(struct ST *s) {
11484 return &s[1].Z.B[5][13];
11487 The LLVM code generated by Clang is approximately:
11489 .. code-block:: llvm
11491 %struct.RT = type { i8, [10 x [20 x i32]], i8 }
11492 %struct.ST = type { i32, double, %struct.RT }
11494 define ptr @foo(ptr %s) {
11496 %arrayidx = getelementptr inbounds %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13
11503 In the example above, the first index is indexing into the
11504 '``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
11505 = '``{ i32, double, %struct.RT }``' type, a structure. The second index
11506 indexes into the third element of the structure, yielding a
11507 '``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
11508 structure. The third index indexes into the second element of the
11509 structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
11510 dimensions of the array are subscripted into, yielding an '``i32``'
11511 type. The '``getelementptr``' instruction returns a pointer to this
11514 Note that it is perfectly legal to index partially through a structure,
11515 returning a pointer to an inner element. Because of this, the LLVM code
11516 for the given testcase is equivalent to:
11518 .. code-block:: llvm
11520 define ptr @foo(ptr %s) {
11521 %t1 = getelementptr %struct.ST, ptr %s, i32 1
11522 %t2 = getelementptr %struct.ST, ptr %t1, i32 0, i32 2
11523 %t3 = getelementptr %struct.RT, ptr %t2, i32 0, i32 1
11524 %t4 = getelementptr [10 x [20 x i32]], ptr %t3, i32 0, i32 5
11525 %t5 = getelementptr [20 x i32], ptr %t4, i32 0, i32 13
11529 The indices are first converted to offsets in the pointer's index type. If the
11530 currently indexed type is a struct type, the struct offset corresponding to the
11531 index is sign-extended or truncated to the pointer index type. Otherwise, the
11532 index itself is sign-extended or truncated, and then multiplied by the type
11533 allocation size (that is, the size rounded up to the ABI alignment) of the
11534 currently indexed type.
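The offset computation described above can be reproduced in C with ``sizeof``
and ``offsetof``. The sketch below mirrors ``%struct.RT`` and ``%struct.ST``
as C structs and sums the per-index offsets of the five-index
``getelementptr`` from the example. The field names ``A``, ``C``, ``X``, and
``Y`` and the function name ``gep_offset`` are made up for illustration, and
the sketch assumes the C compiler lays the structs out the same way as the
target's data layout would.

.. code-block:: c

   #include <stddef.h>

   /* C mirrors of %struct.RT and %struct.ST. */
   struct RT { char A; int B[10][20]; char C; };
   struct ST { int X; double Y; struct RT Z; };

   /* Byte offset added to %s by:
      getelementptr %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13 */
   size_t gep_offset(void) {
       return 1 * sizeof(struct ST)   /* i64 1:  index times allocation size */
            + offsetof(struct ST, Z)  /* i32 2:  struct field offset */
            + offsetof(struct RT, B)  /* i32 1:  struct field offset */
            + 5 * sizeof(int[20])     /* i64 5:  index times size of [20 x i32] */
            + 13 * sizeof(int);       /* i64 13: index times size of i32 */
   }

The result equals the byte distance between ``s`` and ``&s[1].Z.B[5][13]``,
which is the address the C function ``foo`` returns.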
11536 The offsets are then added to the low bits of the base address up to the index
11537 type width, with silently-wrapping two's complement arithmetic. If the pointer
11538 size is larger than the index size, this means that the bits outside the index
11539 type width will not be affected.
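A small C model may help with the case where the index type is narrower than
the pointer. The widths below (a 64-bit pointer with a 32-bit index type) are
assumptions picked for illustration, and ``gep_add`` is a hypothetical name.

.. code-block:: c

   #include <stdint.h>

   /* Adds `offset` to the low 32 bits of a 64-bit address. */
   uint64_t gep_add(uint64_t addr, int32_t offset) {
       /* Two's complement addition that wraps silently modulo 2^32. */
       uint32_t low = (uint32_t)addr + (uint32_t)offset;
       /* Bits outside the index type width are carried over unchanged. */
       return (addr & UINT64_C(0xFFFFFFFF00000000)) | low;
   }

Note that a carry out of the low 32 bits is discarded rather than propagated
into the high bits.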
11541 The result value of the ``getelementptr`` may be outside the object pointed
11542 to by the base pointer. The result value may not necessarily be used to access
11543 memory though, even if it happens to point into allocated storage. See the
11544 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
11547 The ``getelementptr`` instruction may have a number of attributes that impose
11548 additional rules. If any of the rules are violated, the result value is a
11549 :ref:`poison value <poisonvalues>`. In cases where the base is a vector of
11550 pointers, the attributes apply to each computation element-wise.
11552 For ``nusw`` (no unsigned signed wrap):
11554 * If the type of an index is larger than the pointer index type, the
11555 truncation to the pointer index type preserves the signed value
11557 * The multiplication of an index by the type size does not wrap the pointer
11558 index type in a signed sense (``mul nsw``).
11559 * The successive addition of each offset (without adding the base address)
11560 does not wrap the pointer index type in a signed sense (``add nsw``).
11561 * The successive addition of the current address, truncated to the pointer
11562 index type and interpreted as an unsigned number, and each offset,
11563 interpreted as a signed number, does not wrap the pointer index type.
11565 For ``nuw`` (no unsigned wrap):
11567 * If the type of an index is larger than the pointer index type, the
11568 truncation to the pointer index type preserves the unsigned value
11570 * The multiplication of an index by the type size does not wrap the pointer
11571 index type in an unsigned sense (``mul nuw``).
11572 * The successive addition of each offset (without adding the base address)
11573 does not wrap the pointer index type in an unsigned sense (``add nuw``).
11574 * The successive addition of the current address, truncated to the pointer
11575 index type and interpreted as an unsigned number, and each offset, also
11576 interpreted as an unsigned number, does not wrap the pointer index type
11579 For ``inbounds`` all rules of the ``nusw`` attribute apply. Additionally,
11580 if the ``getelementptr`` has any non-zero indices, the following rules apply:
11582 * The base pointer has an *in bounds* address of the allocated object that it
11583 is :ref:`based <pointeraliasing>` on. This means that it points into that
11584 allocated object, or to its end. Note that the object does not have to be
11585 live anymore; being in-bounds of a deallocated object is sufficient.
11586 * During the successive addition of offsets to the address, the resulting
11587 pointer must remain *in bounds* of the allocated object at each step.
11589 Note that ``getelementptr`` with all-zero indices is always considered to be
11590 ``inbounds``, even if the base pointer does not point to an allocated object.
11591 As a corollary, the only pointer in bounds of the null pointer in the default
11592 address space is the null pointer itself.
11594 These rules are based on the assumption that no allocated object may cross
11595 the unsigned address space boundary, and no allocated object may be larger
11596 than half the pointer index type space.
11598 If ``inbounds`` is present on a ``getelementptr`` instruction, the ``nusw``
11599 attribute will be automatically set as well. For this reason, the ``nusw``
11600 will also not be printed in textual IR if ``inbounds`` is already present.
11602 If the ``inrange(Start, End)`` attribute is present, loading from or
11603 storing to any pointer derived from the ``getelementptr`` has undefined
11604 behavior if the load or store would access memory outside the half-open range
11605 ``[Start, End)`` from the ``getelementptr`` expression result. The result of
11606 a pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
11607 involving memory) involving a pointer derived from a ``getelementptr`` with
11608 the ``inrange`` keyword is undefined, with the exception of comparisons
11609 in the case where both operands are in the closed range ``[Start, End]``.
11610 Note that the ``inrange`` keyword is currently only allowed
11611 in constant ``getelementptr`` expressions.
11613 The getelementptr instruction is often confusing. For some more insight
11614 into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
11619 .. code-block:: llvm
11621 %aptr = getelementptr {i32, [12 x i8]}, ptr %saptr, i64 0, i32 1
11622 %vptr = getelementptr {i32, <2 x i8>}, ptr %svptr, i64 0, i32 1, i32 1
11623 %eptr = getelementptr [12 x i8], ptr %aptr, i64 0, i32 1
11624 %iptr = getelementptr [10 x i32], ptr @arr, i16 0, i16 0
11626 Vector of pointers:
11627 """""""""""""""""""
11629 The ``getelementptr`` returns a vector of pointers, instead of a single address,
11630 when one or more of its arguments is a vector. In such cases, all vector
11631 arguments should have the same number of elements, and every scalar argument
11632 will be effectively broadcast into a vector during address calculation.
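The broadcasting rule can be pictured lane by lane in C. The sketch below
assumes four lanes and an ``i8`` element type; the names ``LANES``,
``gep_vec_vec``, and ``gep_scalar_vec`` are hypothetical.

.. code-block:: c

   #include <stddef.h>

   enum { LANES = 4 };

   /* All arguments are vectors: out[i] = ptrs[i] + offsets[i]. */
   void gep_vec_vec(char *out[LANES], char *ptrs[LANES],
                    const ptrdiff_t offsets[LANES]) {
       for (int i = 0; i < LANES; ++i)
           out[i] = ptrs[i] + offsets[i];
   }

   /* Scalar base, vector offsets: the base is broadcast to every lane. */
   void gep_scalar_vec(char *out[LANES], char *ptr,
                       const ptrdiff_t offsets[LANES]) {
       for (int i = 0; i < LANES; ++i)
           out[i] = ptr + offsets[i];
   }

In both cases the result has one pointer per lane, matching the vector result
type of the instruction.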
11634 .. code-block:: llvm
11636 ; All arguments are vectors:
11637 ; A[i] = ptrs[i] + offsets[i]*sizeof(i8)
      %A = getelementptr i8, <4 x ptr> %ptrs, <4 x i64> %offsets
11640 ; Add the same scalar offset to each pointer of a vector:
11641 ; A[i] = ptrs[i] + offset*sizeof(i8)
11642 %A = getelementptr i8, <4 x ptr> %ptrs, i64 %offset
11644 ; Add distinct offsets to the same pointer:
11645 ; A[i] = ptr + offsets[i]*sizeof(i8)
11646 %A = getelementptr i8, ptr %ptr, <4 x i64> %offsets
11648 ; In all cases described above the type of the result is <4 x ptr>
11650 The two following instructions are equivalent:
11652 .. code-block:: llvm
      getelementptr %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
        <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
        <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
        <4 x i32> %ind4,
        <4 x i64> <i64 13, i64 13, i64 13, i64 13>

      getelementptr %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
        i32 2, i32 1, <4 x i32> %ind4, i64 13
11663 Let's look at the C code, where the vector version of ``getelementptr``
11668 // Let's assume that we vectorize the following loop:
11669 double *A, *B; int *C;
11670 for (int i = 0; i < size; ++i) {
11674 .. code-block:: llvm
11676 ; get pointers for 8 elements from array B
11677 %ptrs = getelementptr double, ptr %B, <8 x i32> %C
11678 ; load 8 elements from array B into A
      %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0(<8 x ptr> %ptrs,
          i32 8, <8 x i1> %mask, <8 x double> %passthru)
11682 Conversion Operations
11683 ---------------------
11685 The instructions in this category are the conversion instructions
11686 (casting) which all take a single operand and a type. They perform
11687 various bit conversions on the operand.
11691 '``trunc .. to``' Instruction
11692 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11699 <result> = trunc <ty> <value> to <ty2> ; yields ty2
11700 <result> = trunc nsw <ty> <value> to <ty2> ; yields ty2
11701 <result> = trunc nuw <ty> <value> to <ty2> ; yields ty2
11702 <result> = trunc nuw nsw <ty> <value> to <ty2> ; yields ty2
11707 The '``trunc``' instruction truncates its operand to the type ``ty2``.
The '``trunc``' instruction takes a value to truncate, and a type to truncate
it to. Both types must be :ref:`integer <t_integer>` types, or vectors
of the same number of integers. The bit size of the ``value`` must be
11715 larger than the bit size of the destination type, ``ty2``. Equal sized
11716 types are not allowed.
11721 The '``trunc``' instruction truncates the high order bits in ``value``
11722 and converts the remaining bits to ``ty2``. Since the source size must
11723 be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
11724 It will always truncate bits.
11726 If the ``nuw`` keyword is present, and any of the truncated bits are non-zero,
11727 the result is a :ref:`poison value <poisonvalues>`. If the ``nsw`` keyword
11728 is present, and any of the truncated bits are not the same as the top bit
11729 of the truncation result, the result is a :ref:`poison value <poisonvalues>`.
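The flag rules can be checked mechanically. The following C sketch models
``trunc [nuw] [nsw] i32 ... to i8`` and reports whether the result would be
poison; the struct and function names are hypothetical, and poison is
represented by a boolean only for the sake of the illustration.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* When `poison` is set, `value` carries no meaning. */
   struct trunc_result { uint8_t value; bool poison; };

   struct trunc_result trunc_i32_to_i8(uint32_t x, bool nuw, bool nsw) {
       struct trunc_result r = { (uint8_t)(x & 0xFFu), false };
       uint32_t dropped = x >> 8;  /* the 24 truncated bits */
       /* What the dropped bits must look like to match the result's top bit. */
       uint32_t sign_fill = (r.value & 0x80u) ? 0xFFFFFFu : 0u;
       if (nuw && dropped != 0)
           r.poison = true;        /* some truncated bit is non-zero */
       if (nsw && dropped != sign_fill)
           r.poison = true;        /* some truncated bit differs from the top bit */
       return r;
   }

For example, truncating the ``i32`` value 128 is fine with ``nuw`` but poison
with ``nsw``, because the ``i8`` result has its top bit set while the
truncated bits are all zero.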
11734 .. code-block:: llvm
11736 %X = trunc i32 257 to i8 ; yields i8:1
11737 %Y = trunc i32 123 to i1 ; yields i1:true
11738 %Z = trunc i32 122 to i1 ; yields i1:false
11739 %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
11743 '``zext .. to``' Instruction
11744 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11751 <result> = zext <ty> <value> to <ty2> ; yields ty2
11756 The '``zext``' instruction zero extends its operand to type ``ty2``.
11758 The ``nneg`` (non-negative) flag, if present, specifies that the operand is
11759 non-negative. This property may be used by optimization passes to later
11760 convert the ``zext`` into a ``sext``.
11765 The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
11767 the same number of integers. The bit size of the ``value`` must be
11768 smaller than the bit size of the destination type, ``ty2``.
11773 The ``zext`` fills the high order bits of the ``value`` with zero bits
11774 until it reaches the size of the destination type, ``ty2``.
11776 When zero extending from i1, the result will always be either 0 or 1.
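A minimal C model of ``zext [nneg] i8 ... to i16`` is sketched below. The
names are hypothetical, and the boolean ``poison`` field is only a stand-in
for a poison result when the ``nneg`` promise from the overview is broken.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* When `poison` is set, `value` carries no meaning. */
   struct zext_result { uint16_t value; bool poison; };

   struct zext_result zext_i8_to_i16(uint8_t v, bool nneg) {
       struct zext_result r = { (uint16_t)v, false };  /* high 8 bits are zero */
       if (nneg && (v & 0x80u))
           r.poison = true;  /* operand is negative when read as a signed i8 */
       return r;
   }

The bit pattern ``0xFF`` (``i8 -1``) therefore extends to 255 without the
flag and is flagged as poison with it.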
If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result
is a poison value.
11784 .. code-block:: llvm
11786 %X = zext i32 257 to i64 ; yields i64:257
11787 %Y = zext i1 true to i32 ; yields i32:1
11788 %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
11790 %a = zext nneg i8 127 to i16 ; yields i16 127
11791 %b = zext nneg i8 -1 to i16 ; yields i16 poison
11795 '``sext .. to``' Instruction
11796 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11803 <result> = sext <ty> <value> to <ty2> ; yields ty2
11808 The '``sext``' sign extends ``value`` to the type ``ty2``.
11813 The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
11815 the same number of integers. The bit size of the ``value`` must be
11816 smaller than the bit size of the destination type, ``ty2``.
11821 The '``sext``' instruction performs a sign extension by copying the sign
11822 bit (highest order bit) of the ``value`` until it reaches the bit size
11823 of the type ``ty2``.
11825 When sign extending from i1, the extension always results in -1 or 0.
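Sign extension can be written out on raw bit patterns in C, which avoids
relying on the C rules for signed conversions. The function names below are
hypothetical, and the widths are chosen to match the examples in this section.

.. code-block:: c

   #include <stdint.h>

   /* sext i8 %v to i16: copy bit 7 into bits 8..15. */
   uint16_t sext_i8_to_i16(uint8_t v) {
       return (v & 0x80u) ? (uint16_t)(0xFF00u | v) : (uint16_t)v;
   }

   /* sext i1 %b to i32: the only bit is the sign bit, so 1 becomes all ones. */
   uint32_t sext_i1_to_i32(unsigned b) {
       return (b & 1u) ? 0xFFFFFFFFu : 0u;
   }

The all-ones pattern is -1 when interpreted as a signed integer, which is why
``sext i1 true`` yields -1 rather than 1.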
11830 .. code-block:: llvm
      %X = sext i8 -1 to i16 ; yields i16:-1
11833 %Y = sext i1 true to i32 ; yields i32:-1
11834 %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
11836 '``fptrunc .. to``' Instruction
11837 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11844 <result> = fptrunc <ty> <value> to <ty2> ; yields ty2
11849 The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
11854 The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
11855 value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
11856 The size of ``value`` must be larger than the size of ``ty2``. This
11857 implies that ``fptrunc`` cannot be used to make a *no-op cast*.
11862 The '``fptrunc``' instruction casts a ``value`` from a larger
11863 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
11864 <t_floating>` type.
11865 This instruction is assumed to execute in the default :ref:`floating-point
11866 environment <floatenv>`.
11868 NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that _if_ a
11869 NaN payload is propagated from the input ("Quieting NaN propagation" or
11870 "Unchanged NaN propagation" cases), then the low order bits of the NaN payload
11871 which cannot fit in the resulting type are discarded. Note that if discarding
11872 the low order bits leads to an all-0 payload, this cannot be represented as a
11873 signaling NaN (it would represent an infinity instead), so in that case
11874 "Unchanged NaN propagation" is not possible.
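For finite inputs, the rounding performed by ``fptrunc`` under the default
floating-point environment corresponds to a C cast from ``double`` to
``float`` on a target with IEEE-754 types. The sketch below assumes such a
target; ``fptrunc_f64_to_f32`` is a hypothetical name.

.. code-block:: c

   /* Models `fptrunc double %x to float` for finite, in-range inputs. */
   float fptrunc_f64_to_f32(double x) {
       return (float)x;  /* rounds to nearest, ties to even by default */
   }

The input 16777217.0 (2\ :sup:`24` + 1) lies exactly halfway between two
adjacent ``float`` values and rounds to the one with an even significand,
16777216.0.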
11879 .. code-block:: llvm
11881 %X = fptrunc double 16777217.0 to float ; yields float:16777216.0
11882 %Y = fptrunc double 1.0E+300 to half ; yields half:+infinity
11884 '``fpext .. to``' Instruction
11885 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11892 <result> = fpext <ty> <value> to <ty2> ; yields ty2
11897 The '``fpext``' extends a floating-point ``value`` to a larger floating-point
11903 The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
11904 ``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
11905 to. The source type must be smaller than the destination type.
11910 The '``fpext``' instruction extends the ``value`` from a smaller
11911 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
11912 <t_floating>` type. The ``fpext`` cannot be used to make a
11913 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
11914 *no-op cast* for a floating-point cast.
11916 NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that _if_ a
11917 NaN payload is propagated from the input ("Quieting NaN propagation" or
11918 "Unchanged NaN propagation" cases), then it is copied to the high order bits of
11919 the resulting payload, and the remaining low order bits are zero.
11924 .. code-block:: llvm
11926 %X = fpext float 3.125 to double ; yields double:3.125000e+00
11927 %Y = fpext double %X to fp128 ; yields fp128:0xL00000000000000004000900000000000
11929 '``fptoui .. to``' Instruction
11930 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11937 <result> = fptoui <ty> <value> to <ty2> ; yields ty2
11942 The '``fptoui``' converts a floating-point ``value`` to its unsigned
11943 integer equivalent of type ``ty2``.
11948 The '``fptoui``' instruction takes a value to cast, which must be a
11949 scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
11957 The '``fptoui``' instruction converts its :ref:`floating-point
11958 <t_floating>` operand into the nearest (rounding towards zero)
11959 unsigned integer value. If the value cannot fit in ``ty2``, the result
11960 is a :ref:`poison value <poisonvalues>`.
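The range rule can be sketched in C for an ``i8`` destination. The names below
are hypothetical, the boolean ``poison`` field merely stands in for a poison
result, and treating NaN as a value that does not fit is this sketch's reading
of the rule above.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* When `poison` is set, `value` carries no meaning. */
   struct fptoui_result { uint8_t value; bool poison; };

   /* Models `fptoui double %x to i8`. */
   struct fptoui_result fptoui_f64_to_i8(double x) {
       struct fptoui_result r = { 0, true };
       /* After rounding toward zero the value must lie in [0, 255]. The
          comparison is false for NaN, which is therefore reported as poison. */
       if (x > -1.0 && x < 256.0) {
           r.value = (uint8_t)x;  /* the C conversion also truncates toward zero */
           r.poison = false;
       }
       return r;
   }

Inputs strictly between -1.0 and 0.0 round toward zero and so produce 0 rather
than poison.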
11965 .. code-block:: llvm
11967 %X = fptoui double 123.0 to i32 ; yields i32:123
      %Y = fptoui float 1.0E+300 to i1 ; yields poison (value does not fit in i1)
      %Z = fptoui float 1.04E+17 to i8 ; yields poison (value does not fit in i8)
11971 '``fptosi .. to``' Instruction
11972 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11979 <result> = fptosi <ty> <value> to <ty2> ; yields ty2
11984 The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
11985 ``value`` to type ``ty2``.
11990 The '``fptosi``' instruction takes a value to cast, which must be a
11991 scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
11999 The '``fptosi``' instruction converts its :ref:`floating-point
12000 <t_floating>` operand into the nearest (rounding towards zero)
12001 signed integer value. If the value cannot fit in ``ty2``, the result
12002 is a :ref:`poison value <poisonvalues>`.
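The signed variant differs only in the accepted range. The C sketch below
models an ``i8`` destination; the names are hypothetical, and the boolean
``poison`` field is just a stand-in for a poison result.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* When `poison` is set, `value` carries no meaning. */
   struct fptosi_result { int8_t value; bool poison; };

   /* Models `fptosi double %x to i8`. */
   struct fptosi_result fptosi_f64_to_i8(double x) {
       struct fptosi_result r = { 0, true };
       /* After rounding toward zero the value must lie in [-128, 127]. */
       if (x > -129.0 && x < 128.0) {
           r.value = (int8_t)x;  /* the C conversion truncates toward zero */
           r.poison = false;
       }
       return r;
   }

Because rounding is toward zero, -128.9 still converts to -128, while 128.0 is
already out of range.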
12007 .. code-block:: llvm
12009 %X = fptosi double -123.0 to i32 ; yields i32:-123
12010 %Y = fptosi float 1.0E-247 to i1 ; yields undefined:1
      %Z = fptosi float 1.04E+17 to i8 ; yields poison (value does not fit in i8)
12013 '``uitofp .. to``' Instruction
12014 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12021 <result> = uitofp <ty> <value> to <ty2> ; yields ty2
12026 The '``uitofp``' instruction regards ``value`` as an unsigned integer
12027 and converts that value to the ``ty2`` type.
12029 The ``nneg`` (non-negative) flag, if present, specifies that the
12030 operand is non-negative. This property may be used by optimization
12031 passes to later convert the ``uitofp`` into a ``sitofp``.
12036 The '``uitofp``' instruction takes a value to cast, which must be a
12037 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
12045 The '``uitofp``' instruction interprets its operand as an unsigned
12046 integer quantity and converts it to the corresponding floating-point
12047 value. If the value cannot be exactly represented, it is rounded using
12048 the default rounding mode.
12050 If the ``nneg`` flag is set, and the ``uitofp`` argument is negative,
12051 the result is a poison value.
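Both the rounding and the ``nneg`` rule can be modelled in a few lines of C.
The sketch assumes an IEEE-754 ``float`` and the default rounding mode; the
names are hypothetical, and the boolean ``poison`` field is only a stand-in
for a poison result.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* When `poison` is set, `value` carries no meaning. */
   struct uitofp_result { float value; bool poison; };

   /* Models `uitofp [nneg] i32 %v to float`. */
   struct uitofp_result uitofp_i32_to_f32(uint32_t bits, bool nneg) {
       struct uitofp_result r = { (float)bits, false };  /* read as unsigned */
       if (nneg && (bits & 0x80000000u))
           r.poison = true;  /* operand is negative when read as a signed i32 */
       return r;
   }

The all-ones pattern (``i32 -1``) is read as 4294967295, which is not exactly
representable in ``float`` and rounds to 4294967296.0.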
.. code-block:: llvm

      %X = uitofp i32 257 to float         ; yields float:257.0
      %Y = uitofp i8 -1 to double          ; yields double:255.0

      %a = uitofp nneg i32 256 to float    ; yields float:256.0
      %b = uitofp nneg i32 -256 to float   ; yields float poison

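
The rounding and ``nneg`` rules can be sketched as follows. The function
names are hypothetical, and the rounded value in the comment assumes that
the default rounding mode is round-to-nearest, ties-to-even:

.. code-block:: llvm

      define float @inexact() {
        ; 16777217 = 2^24 + 1 needs 25 significant bits, one more than
        ; float provides, so the result is rounded (to 16777216.0 under
        ; round-to-nearest, ties-to-even).
        %f = uitofp i32 16777217 to float
        ret float %f
      }

      define float @u2f_nneg(i32 %x) {
        ; nneg asserts that %x is non-negative when read as a signed
        ; integer. If %x is negative, %f is poison. When the assertion
        ; holds, uitofp and sitofp produce the same value.
        %f = uitofp nneg i32 %x to float
        ret float %f
      }
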
12065 '``sitofp .. to``' Instruction
12066 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12073 <result> = sitofp <ty> <value> to <ty2> ; yields ty2
12078 The '``sitofp``' instruction regards ``value`` as a signed integer and
12079 converts that value to the ``ty2`` type.
The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating-point value. If the
value cannot be exactly represented, it is rounded using the default rounding
mode.

12101 .. code-block:: llvm
12103 %X = sitofp i32 257 to float ; yields float:257.0
12104 %Y = sitofp i8 -1 to double ; yields double:-1.0
12108 '``ptrtoint .. to``' Instruction
12109 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12116 <result> = ptrtoint <ty> <value> to <ty2> ; yields ty2
12121 The '``ptrtoint``' instruction converts the pointer or a vector of
12122 pointers ``value`` to the integer (or vector of integers) type ``ty2``.
12127 The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
12128 a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
12129 type to cast it to ``ty2``, which must be an :ref:`integer <t_integer>` or
12130 a vector of integers type.
The '``ptrtoint``' instruction converts ``value`` to integer type
``ty2`` by interpreting the pointer value as an integer and either
truncating or zero extending that value to the size of the integer type.
If ``value`` is smaller than ``ty2`` then a zero extension is done. If
``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.

12146 .. code-block:: llvm
12148 %X = ptrtoint ptr %P to i8 ; yields truncation on 32-bit architecture
12149 %Y = ptrtoint ptr %P to i64 ; yields zero extension on 32-bit architecture
12150 %Z = ptrtoint <4 x ptr> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture
12154 '``inttoptr .. to``' Instruction
12155 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12162 <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>] ; yields ty2
12167 The '``inttoptr``' instruction converts an integer ``value`` to a
12168 pointer type, ``ty2``.
The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata.

12190 The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
12191 applying either a zero extension or a truncation depending on the size
12192 of the integer ``value``. If ``value`` is larger than the size of a
12193 pointer then a truncation is done. If ``value`` is smaller than the size
12194 of a pointer then a zero extension is done. If they are the same size,
12195 nothing is done (*no-op cast*).
.. code-block:: llvm

      %X = inttoptr i32 255 to ptr            ; yields zero extension on 64-bit architecture
      %Y = inttoptr i32 255 to ptr            ; yields no-op on 32-bit architecture
      %Z = inttoptr i64 0 to ptr              ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x ptr> ; converts each element of vector G to a pointer

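
To show how the size rules of ``ptrtoint`` and ``inttoptr`` combine, here is
a small sketch. It assumes a target with 64-bit pointers in the default
address space; the function names are hypothetical:

.. code-block:: llvm

      define ptr @round_trip(ptr %p) {
        ; With 64-bit pointers, i64 has the same size as ptr, so neither
        ; cast truncates or extends the address bits.
        %i = ptrtoint ptr %p to i64
        %q = inttoptr i64 %i to ptr
        ret ptr %q
      }

      define i8 @low_byte(ptr %p) {
        ; i8 is smaller than the pointer, so the address is truncated to
        ; its low 8 bits.
        %b = ptrtoint ptr %p to i8
        ret i8 %b
      }
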
12209 '``bitcast .. to``' Instruction
12210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12217 <result> = bitcast <ty> <value> to <ty2> ; yields ty2
The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.

12228 The '``bitcast``' instruction takes a value to cast, which must be a
12229 non-aggregate first class value, and a type to cast it to, which must
12230 also be a non-aggregate :ref:`first class <t_firstclass>` type. The
12231 bit sizes of ``value`` and the destination type, ``ty2``, must be
12232 identical. If the source type is a pointer, the destination type must
12233 also be a pointer of the same size. This instruction supports bitwise
12234 conversion of vectors to integers and to vectors of other types (as
12235 long as they have the same size).
12240 The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
12241 is always a *no-op cast* because no bits change with this
12242 conversion. The conversion is done as if the ``value`` had been stored
12243 to memory and read back as type ``ty2``. Pointer (or vector of
12244 pointers) types may only be converted to other pointer (or vector of
12245 pointers) types with the same address space through this instruction.
12246 To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
12247 or :ref:`ptrtoint <i_ptrtoint>` instructions first.
12249 There is a caveat for bitcasts involving vector types in relation to
12250 endianness. For example ``bitcast <2 x i8> <value> to i16`` puts element zero
12251 of the vector in the least significant bits of the i16 for little-endian while
12252 element zero ends up in the most significant bits for big-endian.
.. code-block:: text

      %X = bitcast i8 255 to i8                 ; yields i8 :-1
      %Y = bitcast i32* %x to i16*              ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64          ; yields i64: %V (depends on endianness)
      %W = bitcast <2 x i32*> %V to <2 x i64*>  ; yields <2 x i64*>

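
The endianness caveat can be made concrete with a short sketch. The function
names and the sample input ``<i8 1, i8 2>`` are illustrative assumptions,
not part of the reference text:

.. code-block:: llvm

      define i16 @pack(<2 x i8> %v) {
        ; For %v = <i8 1, i8 2>:
        ;   little-endian: element 0 lands in the low byte,  %r = 0x0201
        ;   big-endian:    element 0 lands in the high byte, %r = 0x0102
        %r = bitcast <2 x i8> %v to i16
        ret i16 %r
      }

      define i32 @float_bits(float %f) {
        ; Reinterprets the bits of %f as an integer without changing them.
        ; For %f = 1.0 this is the IEEE-754 encoding 0x3F800000.
        %bits = bitcast float %f to i32
        ret i32 %bits
      }
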
12264 .. _i_addrspacecast:
12266 '``addrspacecast .. to``' Instruction
12267 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12274 <result> = addrspacecast <pty> <ptrval> to <pty2> ; yields pty2
12279 The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
12280 address space ``n`` to type ``pty2`` in address space ``m``.
The '``addrspacecast``' instruction takes a pointer or vector of pointers value
to cast and a pointer type to cast it to, which must have a different
address space.

12292 The '``addrspacecast``' instruction converts the pointer value
12293 ``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
12294 value modification, depending on the target and the address space
12295 pair. Pointer conversions within the same address space must be
12296 performed with the ``bitcast`` instruction. Note that if the address
12297 space conversion produces a dereferenceable result then both result
12298 and operand refer to the same memory location. The conversion must
12299 have no side effects, and must not capture the value of the pointer.
12301 If the source is :ref:`poison <poisonvalues>`, the result is
12302 :ref:`poison <poisonvalues>`.
12304 If the source is not :ref:`poison <poisonvalues>`, and both source and
12305 destination are :ref:`integral pointers <nointptrtype>`, and the
12306 result pointer is dereferenceable, the cast is assumed to be
12307 reversible (i.e. casting the result back to the original address space
12308 should yield the original bit pattern).
12313 .. code-block:: llvm
12315 %X = addrspacecast ptr %x to ptr addrspace(1)
12316 %Y = addrspacecast ptr addrspace(1) %y to ptr addrspace(2)
12317 %Z = addrspacecast <4 x ptr> %z to <4 x ptr addrspace(3)>
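
The reversibility assumption described above can be sketched as a round
trip. The function name is hypothetical, and the meaning of address space 1
is target specific, so this only illustrates the shape of the IR:

.. code-block:: llvm

      define ptr @round_trip_as(ptr %p) {
        ; Cast from the default address space (0) to address space 1.
        %g = addrspacecast ptr %p to ptr addrspace(1)
        ; If %p is not poison, both pointer types are integral, and %g is
        ; dereferenceable, casting back is assumed to yield the original
        ; bit pattern of %p.
        %back = addrspacecast ptr addrspace(1) %g to ptr
        ret ptr %back
      }
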
12324 The instructions in this category are the "miscellaneous" instructions,
12325 which defy better classification.
12329 '``icmp``' Instruction
12330 ^^^^^^^^^^^^^^^^^^^^^^
12337 <result> = icmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
12338 <result> = icmp samesign <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
12343 The '``icmp``' instruction returns a boolean value or a vector of
12344 boolean values based on comparison of its two integer, integer vector,
12345 pointer, or pointer vector operands.
12350 The '``icmp``' instruction takes three operands. The first operand is
12351 the condition code indicating the kind of comparison to perform. It is
12352 not a value, just a keyword. The possible condition codes are:
#. ``eq``: equal
#. ``ne``: not equal
12358 #. ``ugt``: unsigned greater than
12359 #. ``uge``: unsigned greater or equal
12360 #. ``ult``: unsigned less than
12361 #. ``ule``: unsigned less or equal
12362 #. ``sgt``: signed greater than
12363 #. ``sge``: signed greater or equal
12364 #. ``slt``: signed less than
12365 #. ``sle``: signed less or equal
The remaining two arguments must be :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` typed, or a :ref:`vector <t_vector>` of integers
or pointers. They must also be identical types.

12374 The '``icmp``' compares ``op1`` and ``op2`` according to the condition
12375 code given as ``cond``. The comparison performed always yields either an
12376 :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
12378 .. _icmp_md_cc_sem:
12380 #. ``eq``: yields ``true`` if the operands are equal, ``false``
12381 otherwise. No sign interpretation is necessary or performed.
12382 #. ``ne``: yields ``true`` if the operands are unequal, ``false``
12383 otherwise. No sign interpretation is necessary or performed.
12384 #. ``ugt``: interprets the operands as unsigned values and yields
12385 ``true`` if ``op1`` is greater than ``op2``.
12386 #. ``uge``: interprets the operands as unsigned values and yields
12387 ``true`` if ``op1`` is greater than or equal to ``op2``.
12388 #. ``ult``: interprets the operands as unsigned values and yields
12389 ``true`` if ``op1`` is less than ``op2``.
12390 #. ``ule``: interprets the operands as unsigned values and yields
12391 ``true`` if ``op1`` is less than or equal to ``op2``.
12392 #. ``sgt``: interprets the operands as signed values and yields ``true``
12393 if ``op1`` is greater than ``op2``.
12394 #. ``sge``: interprets the operands as signed values and yields ``true``
12395 if ``op1`` is greater than or equal to ``op2``.
12396 #. ``slt``: interprets the operands as signed values and yields ``true``
12397 if ``op1`` is less than ``op2``.
12398 #. ``sle``: interprets the operands as signed values and yields ``true``
12399 if ``op1`` is less than or equal to ``op2``.
12401 If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
12402 are compared as if they were integers.
12404 If the operands are integer vectors, then they are compared element by
12405 element. The result is an ``i1`` vector with the same number of elements
12406 as the values being compared. Otherwise, the result is an ``i1``.
12408 If the ``samesign`` keyword is present and the operands are not of the
12409 same sign then the result is a :ref:`poison value <poisonvalues>`.
12414 .. code-block:: text
12416 <result> = icmp eq i32 4, 5 ; yields: result=false
12417 <result> = icmp ne ptr %X, %X ; yields: result=false
12418 <result> = icmp ult i16 4, 5 ; yields: result=true
12419 <result> = icmp sgt i16 4, 5 ; yields: result=false
12420 <result> = icmp ule i16 -4, 5 ; yields: result=false
12421 <result> = icmp sge i16 4, 5 ; yields: result=false
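
The vector and ``samesign`` forms can be sketched as follows; the function
names are hypothetical and used only for illustration:

.. code-block:: llvm

      define <4 x i1> @lanes_less(<4 x i32> %a, <4 x i32> %b) {
        ; Lane-wise signed comparison: lane i of %m is true if lane i of
        ; %a is less than lane i of %b.
        %m = icmp slt <4 x i32> %a, %b
        ret <4 x i1> %m
      }

      define i1 @same_sign_less(i32 %a, i32 %b) {
        ; samesign makes the result poison if the sign bits of %a and %b
        ; differ. When they agree, ult and slt give the same answer.
        %c = icmp samesign ult i32 %a, %b
        ret i1 %c
      }
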
12425 '``fcmp``' Instruction
12426 ^^^^^^^^^^^^^^^^^^^^^^
12433 <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
12438 The '``fcmp``' instruction returns a boolean value or vector of boolean
12439 values based on comparison of its operands.
12441 If the operands are floating-point scalars, then the result type is a
12442 boolean (:ref:`i1 <t_integer>`).
If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

12451 The '``fcmp``' instruction takes three operands. The first operand is
12452 the condition code indicating the kind of comparison to perform. It is
12453 not a value, just a keyword. The possible condition codes are:
12455 #. ``false``: no comparison, always returns false
12456 #. ``oeq``: ordered and equal
12457 #. ``ogt``: ordered and greater than
12458 #. ``oge``: ordered and greater than or equal
12459 #. ``olt``: ordered and less than
12460 #. ``ole``: ordered and less than or equal
12461 #. ``one``: ordered and not equal
12462 #. ``ord``: ordered (no nans)
12463 #. ``ueq``: unordered or equal
12464 #. ``ugt``: unordered or greater than
12465 #. ``uge``: unordered or greater than or equal
12466 #. ``ult``: unordered or less than
12467 #. ``ule``: unordered or less than or equal
12468 #. ``une``: unordered or not equal
12469 #. ``uno``: unordered (either nans)
12470 #. ``true``: no comparison, always returns true
12472 *Ordered* means that neither operand is a QNAN while *unordered* means
12473 that either operand may be a QNAN.
Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
<t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
They must have identical types.

12482 The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
12483 condition code given as ``cond``. If the operands are vectors, then the
12484 vectors are compared element by element. Each comparison performed
12485 always yields an :ref:`i1 <t_integer>` result, as follows:
12487 #. ``false``: always yields ``false``, regardless of operands.
12488 #. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
12489 is equal to ``op2``.
12490 #. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
12491 is greater than ``op2``.
12492 #. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
12493 is greater than or equal to ``op2``.
12494 #. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
12495 is less than ``op2``.
12496 #. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
12497 is less than or equal to ``op2``.
12498 #. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
12499 is not equal to ``op2``.
12500 #. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
   equal to ``op2``.
12503 #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
12504 greater than ``op2``.
12505 #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
12506 greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than ``op2``.
12509 #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
12510 less than or equal to ``op2``.
12511 #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
12512 not equal to ``op2``.
12513 #. ``uno``: yields ``true`` if either operand is a QNAN.
12514 #. ``true``: always yields ``true``, regardless of operands.
12516 The ``fcmp`` instruction can also optionally take any number of
12517 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
12518 otherwise unsafe floating-point optimizations.
Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.

12528 .. code-block:: text
12530 <result> = fcmp oeq float 4.0, 5.0 ; yields: result=false
12531 <result> = fcmp one float 4.0, 5.0 ; yields: result=true
12532 <result> = fcmp olt float 4.0, 5.0 ; yields: result=true
12533 <result> = fcmp ueq double 1.0, 2.0 ; yields: result=false
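
The ordered and unordered predicates differ only when a NaN is involved. A
minimal sketch, with hypothetical function names:

.. code-block:: llvm

      define i1 @is_nan(double %x) {
        ; uno is true when either operand is a NaN, so comparing a value
        ; with itself tests whether that value is a NaN.
        %r = fcmp uno double %x, %x
        ret i1 %r
      }

      define i1 @not_less(double %a, double %b) {
        ; uge (unordered or greater than or equal) is the logical negation
        ; of olt (ordered and less than).
        %r = fcmp uge double %a, %b
        ret i1 %r
      }
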
12537 '``phi``' Instruction
12538 ^^^^^^^^^^^^^^^^^^^^^
12545 <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
12550 The '``phi``' instruction is used to implement the φ node in the SSA
12551 graph representing the function.
The type of the incoming values is specified with the first type field.
After this, the '``phi``' instruction takes a list of pairs as
arguments, with one pair for each predecessor basic block of the current
block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.

There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e. PHI instructions must be first in a basic
block.

12567 For the purposes of the SSA form, the use of each incoming value is
12568 deemed to occur on the edge from the corresponding predecessor block to
12569 the current block (but after any definition of an '``invoke``'
12570 instruction's return value on the same edge).
12572 The optional ``fast-math-flags`` marker indicates that the phi has one
12573 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
12574 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
12575 are only valid for phis that return :ref:`supported floating-point types
12576 <fastmath_return_types>`.
12581 At runtime, the '``phi``' instruction logically takes on the value
12582 specified by the pair corresponding to the predecessor basic block that
12583 executed just prior to the current block.
.. code-block:: llvm

    Loop:       ; Infinite loop that counts from 0 on up...
      %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
      %nextindvar = add i32 %indvar, 1
      br label %Loop

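
For a terminating loop, a complete function might look like the sketch
below. The function ``@sum_below`` is hypothetical; it adds the integers
``0 .. %n - 1`` and always runs the loop body at least once:

.. code-block:: llvm

      define i32 @sum_below(i32 %n) {
      entry:
        br label %loop

      loop:
        ; Each phi takes its first value on the edge from %entry and its
        ; second value on the back edge from %loop.
        %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
        %acc.next = add i32 %acc, %i
        %i.next = add i32 %i, 1
        %done = icmp uge i32 %i.next, %n
        br i1 %done, label %exit, label %loop

      exit:
        ret i32 %acc.next
      }
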
12597 '``select``' Instruction
12598 ^^^^^^^^^^^^^^^^^^^^^^^^
12605 <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty
12607 selty is either i1 or {<N x i1>}
12612 The '``select``' instruction is used to choose one value based on a
12613 condition, without IR-level branching.
12618 The '``select``' instruction requires an 'i1' value or a vector of 'i1'
12619 values indicating the condition, and two values of the same :ref:`first
12620 class <t_firstclass>` type.
12622 #. The optional ``fast-math flags`` marker indicates that the select has one or more
12623 :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
12624 otherwise unsafe floating-point optimizations. Fast-math flags are only valid
12625 for selects that return :ref:`supported floating-point types
12626 <fastmath_return_types>`.
If the condition is an i1 and it evaluates to 1, the instruction returns
the first value argument; otherwise, it returns the second value
argument.

12635 If the condition is a vector of i1, then the value arguments must be
12636 vectors of the same size, and the selection is done element by element.
12638 If the condition is an i1 and the value arguments are vectors of the
12639 same size, then an entire vector is selected.
12644 .. code-block:: llvm
12646 %X = select i1 true, i8 17, i8 42 ; yields i8:17
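
The two vector cases described above can be sketched as follows; the
function names are hypothetical:

.. code-block:: llvm

      define <4 x i32> @lane_max(<4 x i32> %a, <4 x i32> %b) {
        ; Vector condition: each lane of %gt picks between the matching
        ; lanes of %a and %b.
        %gt = icmp sgt <4 x i32> %a, %b
        %m = select <4 x i1> %gt, <4 x i32> %a, <4 x i32> %b
        ret <4 x i32> %m
      }

      define <4 x i32> @pick_whole(i1 %c, <4 x i32> %a, <4 x i32> %b) {
        ; Scalar condition: the whole vector %a or the whole vector %b is
        ; selected.
        %r = select i1 %c, <4 x i32> %a, <4 x i32> %b
        ret <4 x i32> %r
      }
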
12651 '``freeze``' Instruction
12652 ^^^^^^^^^^^^^^^^^^^^^^^^
12659 <result> = freeze ty <val> ; yields ty:result
12664 The '``freeze``' instruction is used to stop propagation of
12665 :ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
12670 The '``freeze``' instruction takes a single argument.
12675 If the argument is ``undef`` or ``poison``, '``freeze``' returns an
12676 arbitrary, but fixed, value of type '``ty``'.
12677 Otherwise, this instruction is a no-op and returns the input argument.
12678 All uses of a value returned by the same '``freeze``' instruction are
12679 guaranteed to always observe the same value, while different '``freeze``'
12680 instructions may yield different values.
12682 While ``undef`` and ``poison`` pointers can be frozen, the result is a
12683 non-dereferenceable pointer. See the
12684 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
12685 If an aggregate value or vector is frozen, the operand is frozen element-wise.
12686 The padding of an aggregate isn't considered, since it isn't visible
12687 without storing it into memory and loading it with a different type.
.. code-block:: text

      %w = i32 undef
      %x = freeze i32 %w
      %y = add i32 %w, %w         ; undef
      %z = add i32 %x, %x         ; even number because all uses of %x observe
                                  ; the same value
      %x2 = freeze i32 %w
      %cmp = icmp eq i32 %x, %x2  ; can be true or false

      ; example with vectors
      %v = <2 x i32> <i32 undef, i32 poison>
      %a = extractelement <2 x i32> %v, i32 0    ; undef
      %b = extractelement <2 x i32> %v, i32 1    ; poison
      %add = add i32 %a, %a                      ; undef

      %v.fr = freeze <2 x i32> %v                ; element-wise freeze
      %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
      %add.f = add i32 %d, %d                    ; even number

      ; branching on frozen value
      %poison = add nsw i1 %k, undef   ; poison
      %c = freeze i1 %poison
      br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar

12721 '``call``' Instruction
12722 ^^^^^^^^^^^^^^^^^^^^^^
12729 <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
12730 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
12735 The '``call``' instruction represents a simple function call.
12740 This instruction requires several arguments:
#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
   should perform tail call optimization. The ``tail`` marker is a hint that
   `can be ignored <CodeGenerator.html#tail-call-optimization>`_. The
   ``musttail`` marker means that the call must be tail call optimized in order
   for the program to be correct. This is true even in the presence of
   attributes like "disable-tail-calls". The ``musttail`` marker provides these
   guarantees:

12750 #. The call will not cause unbounded stack growth if it is part of a
12751 recursive cycle in the call graph.
12752 #. Arguments with the :ref:`inalloca <attr_inalloca>` or
12753 :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
12754 #. If the musttail call appears in a function with the ``"thunk"`` attribute
12755 and the caller and callee both have varargs, then any unprototyped
12756 arguments in register or memory are forwarded to the callee. Similarly,
12757 the return value of the callee is returned to the caller's caller, even
12758 if a void return type is in use.
12760 Both markers imply that the callee does not access allocas, va_args, or
12761 byval arguments from the caller. As an exception to that, an alloca or byval
12762 argument may be passed to the callee as a byval argument, which can be
12763 dereferenced inside the callee. For example:
.. code-block:: llvm

    declare void @take_byval(ptr byval(i64))
    declare void @take_ptr(ptr)

    ; Invalid (assuming @take_ptr dereferences the pointer), because %local
    ; may be de-allocated before the call to @take_ptr.
    define void @invalid_alloca() {
    entry:
      %local = alloca i64
      tail call void @take_ptr(ptr %local)
      ret void
    }

    ; Valid, the byval attribute causes the memory allocated by %local to be
    ; copied into @take_byval's stack frame.
    define void @byval_alloca() {
    entry:
      %local = alloca i64
      tail call void @take_byval(ptr byval(i64) %local)
      ret void
    }

    ; Invalid, because @use_global_va_list uses the variadic arguments from
    ; @invalid_va_list.
    %struct.va_list = type { ptr }
    @va_list = external global %struct.va_list
    define void @use_global_va_list() {
    entry:
      %arg = va_arg ptr @va_list, i64
      ret void
    }
    define void @invalid_va_list(i32 %a, ...) {
    entry:
      call void @llvm.va_start.p0(ptr @va_list)
      tail call void @use_global_va_list()
      ret void
    }

    ; Valid, byval argument forwarded to tail call as another byval argument.
    define void @forward_byval(ptr byval(i64) %x) {
    entry:
      tail call void @take_byval(ptr byval(i64) %x)
      ret void
    }

    ; Invalid (assuming @take_ptr dereferences the pointer), byval argument
    ; passed to tail callee as non-byval ptr.
    define void @invalid_byval(ptr byval(i64) %x) {
    entry:
      tail call void @take_ptr(ptr %x)
      ret void
    }

12820 Calls marked ``musttail`` must obey the following additional rules:
12822 - The call must immediately precede a :ref:`ret <i_ret>` instruction,
12823 or a pointer bitcast followed by a ret instruction.
12824 - The ret instruction must return the (possibly bitcasted) value
12825 produced by the call, undef, or void.
12826 - The calling conventions of the caller and callee must match.
12827 - The callee must be varargs iff the caller is varargs. Bitcasting a
12828 non-varargs function to the appropriate varargs type is legal so
12829 long as the non-varargs prefixes obey the other rules.
- The return type must not undergo automatic conversion to an ``sret`` pointer.

In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:

12834 - All ABI-impacting function attributes, such as sret, byval, inreg,
12835 returned, and inalloca, must match.
12836 - The caller and callee prototypes must match. Pointer types of parameters
12837 or return types may differ in pointee type, but not in address space.
On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:

- Only these ABI-impacting attributes are allowed: sret, byval,
  swiftself, and swiftasync.
- Prototypes are not required to match.

12845 Tail call optimization for calls marked ``tail`` is guaranteed to occur if
12846 the following conditions are met:
12848 - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
12849 - The call is in tail position (ret immediately follows call and ret
12850 uses value of call or is void).
- Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is
  ``true``, or the calling convention is ``tailcc``.
12854 - `Platform-specific constraints are
12855 met. <CodeGenerator.html#tail-call-optimization>`_
12857 #. The optional ``notail`` marker indicates that the optimizers should not add
12858 ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
12859 call optimization from being performed on the call.
12861 #. The optional ``fast-math flags`` marker indicates that the call has one or more
12862 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
12863 otherwise unsafe floating-point optimizations. Fast-math flags are only valid
12864 for calls that return :ref:`supported floating-point types <fastmath_return_types>`.
12866 #. The optional "cconv" marker indicates which :ref:`calling
12867 convention <callingconv>` the call should use. If none is
12868 specified, the call defaults to using C calling conventions. The
12869 calling convention of the call must match the calling convention of
12870 the target function, or else the behavior is undefined.
12871 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
12872 values. Only '``zeroext``', '``signext``', '``noext``', and '``inreg``'
12873 attributes are valid here.
12874 #. The optional addrspace attribute can be used to indicate the address space
12875 of the called function. If it is not specified, the program address space
12876 from the :ref:`datalayout string<langref_datalayout>` will be used.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
12880 #. '``fnty``': shall be the signature of the function being called. The
12881 argument types must match the types implied by this signature. This
12882 type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect ``call`` instructions are just as possible, calling an
   arbitrary pointer to function value.
12887 #. '``function args``': argument list whose types match the function
12888 signature argument types and parameter attributes. All arguments must
12889 be of :ref:`first class <t_firstclass>` type. If the function signature
12890 indicates the function accepts a variable number of arguments, the
12891 extra arguments can be specified.
12892 #. The optional :ref:`function attributes <fnattrs>` list.
12893 #. The optional :ref:`operand bundles <opbundles>` list.
12898 The '``call``' instruction is used to cause control flow to transfer to
12899 a specified function, with its incoming arguments bound to the specified
12900 values. Upon a '``ret``' instruction in the called function, control
12901 flow continues with the instruction after the function call, and the
12902 return value of the function is bound to the result argument.
.. code-block:: llvm

      %retval = call i32 @test(i32 %argc)
      call i32 (ptr, ...) @printf(ptr %msg, i32 12, i8 42)  ; yields i32
      %X = tail call i32 @foo()                             ; yields i32
      %Y = tail call fastcc i32 @foo()                      ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero-extended

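
As a further illustration, the sketch below shows a ``musttail`` call that
satisfies the rules listed earlier (matching prototypes and calling
conventions, call immediately followed by ``ret`` of its value), plus an
indirect call through a function pointer. The names ``@callee``,
``@forward``, and ``@apply`` are hypothetical:

.. code-block:: llvm

      declare i32 @callee(i32, ptr)

      define i32 @forward(i32 %x, ptr %p) {
        ; Caller and callee have the same prototype and the default C
        ; calling convention, and the ret returns the call's result.
        %r = musttail call i32 @callee(i32 %x, ptr %p)
        ret i32 %r
      }

      define i32 @apply(ptr %fn, i32 %x) {
        ; Indirect call: %fn is an arbitrary pointer to a function that
        ; is assumed to have type i32 (i32).
        %r = call i32 %fn(i32 %x)
        ret i32 %r
      }
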
LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.

12930 '``va_arg``' Instruction
12931 ^^^^^^^^^^^^^^^^^^^^^^^^
12938 <resultval> = va_arg <va_list*> <arglist>, <argty>
12943 The '``va_arg``' instruction is used to access arguments passed through
12944 the "variable argument" area of a function call. It is used to implement
12945 the ``va_arg`` macro in C.
12950 This instruction takes a ``va_list*`` value and the type of the
12951 argument. It returns a value of the specified argument type and
12952 increments the ``va_list`` to point to the next argument. The actual
12953 type of ``va_list`` is target specific.
12958 The '``va_arg``' instruction loads an argument of the specified type
12959 from the specified ``va_list`` and causes the ``va_list`` to point to
12960 the next argument. For more information, see the variable argument
12961 handling :ref:`Intrinsic Functions <int_varargs>`.
It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

12967 ``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
12968 function <intrinsics>` because it takes a type as an argument.
12973 See the :ref:`variable argument processing <int_varargs>` section.
12975 Note that the code generator does not yet fully support va\_arg on many
12976 targets. Also, it does not currently support va\_arg with aggregate
12977 types on any target.
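
A minimal sketch of ``va_arg`` used together with the variable argument
intrinsics is shown below. The layout of ``%struct.va_list`` is target
specific; the single-pointer layout here is an assumption, and the function
name ``@first_vararg`` is hypothetical:

.. code-block:: llvm

      ; Assumed layout; the real va_list type differs between targets.
      %struct.va_list = type { ptr }

      declare void @llvm.va_start.p0(ptr)
      declare void @llvm.va_end.p0(ptr)

      define i32 @first_vararg(i32 %count, ...) {
        ; Initialize variable argument processing.
        %ap = alloca %struct.va_list
        call void @llvm.va_start.p0(ptr %ap)

        ; Read one i32 argument and advance %ap to the next argument.
        %v = va_arg ptr %ap, i32

        ; Clean up before returning.
        call void @llvm.va_end.p0(ptr %ap)
        ret i32 %v
      }
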
12981 '``landingpad``' Instruction
12982 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12989 <resultval> = landingpad <resultty> <clause>+
12990 <resultval> = landingpad <resultty> cleanup <clause>*
12992 <clause> := catch <type> <value>
12993 <clause> := filter <array constant type> <array constant>
12998 The '``landingpad``' instruction is used by `LLVM's exception handling
12999 system <ExceptionHandling.html#overview>`_ to specify that a basic block
13000 is a landing pad --- one where the exception lands, and corresponds to the
13001 code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
13002 defines values supplied by the :ref:`personality function <personalityfn>` upon
13003 re-entry to the function. The ``resultval`` has the type ``resultty``.
The optional ``cleanup`` flag indicates that the landing pad block is a cleanup.
13011 A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
13012 contains the global variable representing the "type" that may be caught
13013 or filtered respectively. Unlike the ``catch`` clause, the ``filter``
13014 clause takes an array constant as its argument. Use
13015 "``[0 x ptr] undef``" for a filter which cannot throw. The
13016 '``landingpad``' instruction must contain *at least* one ``clause`` or
13017 the ``cleanup`` flag.
13022 The '``landingpad``' instruction defines the values which are set by the
13023 :ref:`personality function <personalityfn>` upon re-entry to the function, and
13024 therefore the "result type" of the ``landingpad`` instruction. As with
13025 calling conventions, how the personality function results are
13026 represented in LLVM IR is target specific.
13028 The clauses are applied in order from top to bottom. If two
13029 ``landingpad`` instructions are merged together through inlining, the
13030 clauses from the calling function are appended to the list of clauses.
13031 When the call stack is being unwound due to an exception being thrown,
13032 the exception is compared against each ``clause`` in turn. If it doesn't
13033 match any of the clauses, and the ``cleanup`` flag is not set, then
13034 unwinding continues further up the call stack.
13036 The ``landingpad`` instruction has several restrictions:
- A landing pad block is a basic block which is the unwind destination
  of an '``invoke``' instruction.
- A landing pad block must have a '``landingpad``' instruction as its
  first non-PHI instruction.
- There can be only one '``landingpad``' instruction within the landing
  pad block.
- A basic block that is not a landing pad block may not include a
  '``landingpad``' instruction.
.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { ptr, i32 }
               catch ptr @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { ptr, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { ptr, i32 }
               catch ptr @_ZTIi
               filter [1 x ptr] [ptr @_ZTId]

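
As a rough sketch of how these pieces fit together, the following function
invokes a callee and unwinds to a landing pad that is both a cleanup and an
integer catch. The names ``@may_throw`` and ``@caller`` are hypothetical, and
the choice of the Itanium C++ personality ``@__gxx_personality_v0`` with a
``{ ptr, i32 }`` result type is an assumption; other personalities may use a
different result type.

.. code-block:: llvm

    @_ZTIi = external constant ptr

    declare void @may_throw()                ; hypothetical callee
    declare i32 @__gxx_personality_v0(...)

    define void @caller() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; The landingpad is the first non-PHI instruction of the unwind destination.
      %lp = landingpad { ptr, i32 }
              cleanup
              catch ptr @_ZTIi
      ; A real handler would inspect the selector value here; this sketch
      ; simply continues unwinding.
      resume { ptr, i32 } %lp
    }
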
13065 '``catchpad``' Instruction
13066 ^^^^^^^^^^^^^^^^^^^^^^^^^^
13073 <resultval> = catchpad within <catchswitch> [<args>*]
13078 The '``catchpad``' instruction is used by `LLVM's exception handling
13079 system <ExceptionHandling.html#overview>`_ to specify that a basic block
13080 begins a catch handler --- one where a personality routine attempts to transfer
13081 control to catch an exception.
13086 The ``catchswitch`` operand must always be a token produced by a
13087 :ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
13088 ensures that each ``catchpad`` has exactly one predecessor block, and it always
13089 terminates in a ``catchswitch``.
The ``args`` correspond to whatever information the personality routine
requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.

The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.
13103 When the call stack is being unwound due to an exception being thrown, the
13104 exception is compared against the ``args``. If it doesn't match, control will
13105 not reach the ``catchpad`` instruction. The representation of ``args`` is
13106 entirely target and personality function-specific.
13108 Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
13109 instruction must be the first non-phi of its parent basic block.
13111 The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
13112 instructions is described in the
13113 `Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
13115 When a ``catchpad`` has been "entered" but not yet "exited" (as
13116 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
13117 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
13118 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [ptr @_ZTIi]

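
The funclet rules above can be sketched in a complete function. This is a
minimal illustration, not a canonical lowering: the functions ``@may_throw``,
``@handle`` and ``@f`` are hypothetical, and the MSVC C++ personality
``@__CxxFrameHandler3`` with the catch-all argument list
``[ptr null, i32 64, ptr null]`` is an assumption about the target.

.. code-block:: llvm

    declare void @may_throw()                ; hypothetical callee
    declare void @handle()                   ; hypothetical handler body
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller

    handler0:
      ; The catchpad is the first non-PHI instruction of its block.
      %tok = catchpad within %cs [ptr null, i32 64, ptr null]
      ; Calls made while the catchpad is entered carry a "funclet" bundle.
      call void @handle() [ "funclet"(token %tok) ]
      catchret from %tok to label %done

    done:
      ret void
    }
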
13133 '``cleanuppad``' Instruction
13134 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13141 <resultval> = cleanuppad within <parent> [<args>*]
13146 The '``cleanuppad``' instruction is used by `LLVM's exception handling
13147 system <ExceptionHandling.html#overview>`_ to specify that a basic block
13148 is a cleanup block --- one where a personality routine attempts to
13149 transfer control to run cleanup actions.
13150 The ``args`` correspond to whatever additional
13151 information the :ref:`personality function <personalityfn>` requires to
13152 execute the cleanup.
13153 The ``resultval`` has the type :ref:`token <t_token>` and is used to
13154 match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
13155 The ``parent`` argument is the token of the funclet that contains the
13156 ``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
13157 this operand may be the token ``none``.
13162 The instruction takes a list of arbitrary values which are interpreted
13163 by the :ref:`personality function <personalityfn>`.
13168 When the call stack is being unwound due to an exception being thrown,
13169 the :ref:`personality function <personalityfn>` transfers control to the
13170 ``cleanuppad`` with the aid of the personality-specific arguments.
13171 As with calling conventions, how the personality function results are
13172 represented in LLVM IR is target specific.
13174 The ``cleanuppad`` instruction has several restrictions:
- A cleanup block is a basic block which is the unwind destination of
  an exceptional instruction.
- A cleanup block must have a '``cleanuppad``' instruction as its
  first non-PHI instruction.
- There can be only one '``cleanuppad``' instruction within the
  cleanup block.
- A basic block that is not a cleanup block may not include a
  '``cleanuppad``' instruction.
13185 When a ``cleanuppad`` has been "entered" but not yet "exited" (as
13186 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
13187 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
13188 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
13193 .. code-block:: text
13195 %tok = cleanuppad within %cs []
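
For context, a cleanup funclet might be laid out as in the sketch below. The
functions ``@may_throw``, ``@run_dtor`` and ``@g`` are hypothetical, and the use
of the ``@__CxxFrameHandler3`` personality is an assumption. The
``cleanuppad`` is not nested in another funclet, so its parent is ``none``.

.. code-block:: llvm

    declare void @may_throw()                ; hypothetical callee
    declare void @run_dtor()                 ; hypothetical cleanup action
    declare i32 @__CxxFrameHandler3(...)

    define void @g() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      %cp = cleanuppad within none []
      ; Calls inside the funclet carry a "funclet" bundle naming the pad.
      call void @run_dtor() [ "funclet"(token %cp) ]
      ; Leave the cleanup and keep unwinding into the caller.
      cleanupret from %cp unwind to caller

    done:
      ret void
    }
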
Debug Records
-------------
13202 Debug records appear interleaved with instructions, but are not instructions;
13203 they are used only to define debug information, and have no effect on generated
code. They are distinguished from instructions by the use of a leading ``#`` and
13205 an extra level of indentation. As an example:
13207 .. code-block:: llvm
13209 %inst1 = op1 %a, %b
13210 #dbg_value(%inst1, !10, !DIExpression(), !11)
13211 %inst2 = op2 %inst1, %c
13213 These debug records replace the prior :ref:`debug intrinsics<dbg_intrinsics>`.
13214 Debug records will be disabled if ``--write-experimental-debuginfo=false`` is
13215 passed to LLVM; it is an error for both records and intrinsics to appear in the
same module. More information about debug records can be found in the `LLVM
Source Level Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
13222 Intrinsic Functions
13223 ===================
13225 LLVM supports the notion of an "intrinsic function". These functions
13226 have well known names and semantics and are required to follow certain
13227 restrictions. Overall, these intrinsics represent an extension mechanism
13228 for the LLVM language that does not require changing all of the
13229 transformations in LLVM when adding to the language (or the bitcode
13230 reader/writer, the parser, etc...).
13232 Intrinsic function names must all start with an "``llvm.``" prefix. This
13233 prefix is reserved in LLVM for intrinsic names; thus, function names may
13234 not begin with this prefix. Intrinsic functions must always be external
13235 functions: you cannot define the body of intrinsic functions. Intrinsic
13236 functions may only be used in call or invoke instructions: it is illegal
13237 to take the address of an intrinsic function. Additionally, because
13238 intrinsic functions are part of the LLVM language, it is required if any
13239 are added that they be documented here.
13241 Some intrinsic functions can be overloaded, i.e., the intrinsic
13242 represents a family of functions that perform the same operation but on
13243 different data types. Because LLVM can represent over 8 million
13244 different integer types, overloading is used commonly to allow an
13245 intrinsic function to operate on any integer type. One or more of the
13246 argument types or the result type can be overloaded to accept any
13247 integer type. Argument types may also be defined as exactly matching a
13248 previous argument's type or the result type. This allows an intrinsic
13249 function which accepts multiple arguments, but needs all of them to be
13250 of the same type, to only be overloaded with respect to a single
13251 argument or the result.
Overloaded intrinsics have the names of their overloaded argument
types encoded into their function names, each preceded by a period. Only
13255 those types which are overloaded result in a name suffix. Arguments
13256 whose type is matched against another type do not. For example, the
13257 ``llvm.ctpop`` function can take an integer of any width and returns an
13258 integer of exactly the same integer width. This leads to a family of
13259 functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
13260 ``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.
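
The naming rule can be seen in a few declarations. This is an illustrative
sketch; ``@count_bits`` is a hypothetical function. ``llvm.ctpop`` has one
overloaded type and therefore one suffix, while ``llvm.memcpy`` is overloaded
on both pointer types and the length type and therefore carries three.

.. code-block:: llvm

    ; One overloaded type (the result); the argument type is matched against it.
    declare i8 @llvm.ctpop.i8(i8)
    declare i32 @llvm.ctpop.i32(i32)
    declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

    ; Three overloaded types: destination pointer, source pointer, length.
    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define i32 @count_bits(i32 %x) {
      %n = call i32 @llvm.ctpop.i32(i32 %x)
      ret i32 %n
    }
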
:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. An overloaded intrinsic
that depends on an unnamed type in one of its overloaded argument types gets an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
13268 different unnamed types as arguments. (For example:
13269 ``llvm.ssa.copy.p0s_s.2(%42*)``) The number is tracked in the LLVM module and
13270 it ensures unique names in the module. While linking together two modules, it is
13271 still possible to get a name clash. In that case one of the names will be
13272 changed by getting a new number.
13274 For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
13276 integer or floating point types should not be relied upon for correct
13277 code generation. In such cases, the recommended approach for target
13278 maintainers when defining intrinsics is to create separate integer and
13279 FP intrinsics rather than rely on overloading. For example, if different
13280 codegen is required for ``llvm.target.foo(<4 x i32>)`` and
13281 ``llvm.target.foo(<4 x float>)`` then these should be split into
13282 different intrinsics.
13284 To learn how to add an intrinsic function, please see the `Extending
13285 LLVM Guide <ExtendingLLVM.html>`_.
13289 Variable Argument Handling Intrinsics
13290 -------------------------------------
13292 Variable argument support is defined in LLVM with the
13293 :ref:`va_arg <i_va_arg>` instruction and these three intrinsic
13294 functions. These functions are related to the similarly named macros
13295 defined in the ``<stdarg.h>`` header file.
13297 All of these functions take as arguments pointers to a target-specific
13298 value type "``va_list``". The LLVM assembly language reference manual
13299 does not define what this type is, so all transformations should be
13300 prepared to handle these functions regardless of the type used. The intrinsics
13301 are overloaded, and can be used for pointers to different address spaces.
13303 This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
13304 variable argument handling intrinsic functions are used.
.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely a ptr.
    %struct.va_list = type { ptr }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, ptr, ptr }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      call void @llvm.va_start.p0(ptr %ap)

      ; Read a single integer argument
      %tmp = va_arg ptr %ap, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca %struct.va_list
      call void @llvm.va_copy.p0(ptr %aq, ptr %ap)
      call void @llvm.va_end.p0(ptr %aq)

      ; Stop processing of arguments.
      call void @llvm.va_end.p0(ptr %ap)
      ret i32 %tmp
    }

    declare void @llvm.va_start.p0(ptr)
    declare void @llvm.va_copy.p0(ptr, ptr)
    declare void @llvm.va_end.p0(ptr)

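
For completeness, a call site for the variadic ``@test`` function above might
look like the following sketch. The function ``@caller`` is hypothetical. At a
call to a variadic function, the full function type is written out before the
callee.

.. code-block:: llvm

    declare i32 @test(i32, ...)

    define i32 @caller() {
      ; The first argument is the fixed parameter %X; the second is read
      ; by the va_arg instruction inside @test.
      %r = call i32 (i32, ...) @test(i32 1, i32 42)
      ret i32 %r
    }
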
13339 '``llvm.va_start``' Intrinsic
13340 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13347 declare void @llvm.va_start.p0(ptr <arglist>)
13348 declare void @llvm.va_start.p5(ptr addrspace(5) <arglist>)
13353 The '``llvm.va_start``' intrinsic initializes ``<arglist>`` for
13354 subsequent use by ``va_arg``.
13359 The argument is a pointer to a ``va_list`` element to initialize.
13364 The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
13365 available in C. In a target-dependent way, it initializes the
13366 ``va_list`` element to which the argument points, so that the next call
13367 to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.
13372 '``llvm.va_end``' Intrinsic
13373 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13380 declare void @llvm.va_end.p0(ptr <arglist>)
13381 declare void @llvm.va_end.p5(ptr addrspace(5) <arglist>)
13386 The '``llvm.va_end``' intrinsic destroys ``<arglist>``, which has been
13387 initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
13392 The argument is a pointer to a ``va_list`` to destroy.
13397 The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
13398 available in C. In a target-dependent way, it destroys the ``va_list``
13399 element to which the argument points. Calls to
13400 :ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.
13406 '``llvm.va_copy``' Intrinsic
13407 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13414 declare void @llvm.va_copy.p0(ptr <destarglist>, ptr <srcarglist>)
13415 declare void @llvm.va_copy.p5(ptr addrspace(5) <destarglist>, ptr addrspace(5) <srcarglist>)
13420 The '``llvm.va_copy``' intrinsic copies the current argument position
13421 from the source argument list to the destination argument list.
13426 The first argument is a pointer to a ``va_list`` element to initialize.
13427 The second argument is a pointer to a ``va_list`` element to copy from.
13428 The address spaces of the two arguments must match.
13433 The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
13434 available in C. In a target-dependent way, it copies the source
13435 ``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
13437 arbitrarily complex and require, for example, memory allocation.
13439 Accurate Garbage Collection Intrinsics
13440 --------------------------------------
13442 LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
13443 (GC) requires the frontend to generate code containing appropriate intrinsic
13444 calls and select an appropriate GC strategy which knows how to lower these
13445 intrinsics in a manner which is appropriate for the target collector.
13447 These intrinsics allow identification of :ref:`GC roots on the
13448 stack <int_gcroot>`, as well as garbage collector implementations that
13449 require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
13450 Frontends for type-safe garbage collected languages should generate
13451 these intrinsics to make use of the LLVM garbage collectors. For more
13452 details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
LLVM provides a second experimental set of intrinsics for describing garbage
13455 collection safepoints in compiled code. These intrinsics are an alternative
13456 to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
13457 :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
13458 differences in approach are covered in the `Garbage Collection with LLVM
13459 <GarbageCollection.html>`_ documentation. The intrinsics themselves are
13460 described in :doc:`Statepoints`.
13464 '``llvm.gcroot``' Intrinsic
13465 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13472 declare void @llvm.gcroot(ptr %ptrloc, ptr %metadata)
13477 The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
13478 the code generator, and allows some metadata to be associated with it.
13483 The first argument specifies the address of a stack object that contains
13484 the root pointer. The second pointer (which must be either a constant or
a global value address) contains the metadata to be associated with the
root.
13491 At runtime, a call to this intrinsic stores a null pointer into the
13492 "ptrloc" location. At compile-time, the code generator generates
13493 information to allow the runtime to find the pointer at GC safe points.
13494 The '``llvm.gcroot``' intrinsic may only be used in a function which
13495 :ref:`specifies a GC algorithm <gc>`.
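
A minimal sketch of registering a root is shown below. The allocator
``@my_alloc`` and the function ``@use_root`` are hypothetical, and the
``"shadow-stack"`` strategy is chosen only as an example of a function that
specifies a GC algorithm. No metadata is attached, so the second argument is
null.

.. code-block:: llvm

    declare void @llvm.gcroot(ptr, ptr)
    declare ptr @my_alloc(i64)               ; hypothetical runtime allocator

    define void @use_root() gc "shadow-stack" {
    entry:
      ; The root is a stack slot that holds the managed pointer.
      %root = alloca ptr
      call void @llvm.gcroot(ptr %root, ptr null)

      ; Store a freshly allocated object into the root so that the
      ; collector can find it at safe points.
      %obj = call ptr @my_alloc(i64 16)
      store ptr %obj, ptr %root
      ret void
    }
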
13499 '``llvm.gcread``' Intrinsic
13500 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
13507 declare ptr @llvm.gcread(ptr %ObjPtr, ptr %Ptr)
13512 The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.
13519 The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
13521 pointer to the start of the referenced object, if needed by the language
13522 runtime (otherwise null).
13527 The '``llvm.gcread``' intrinsic has the same semantics as a load
13528 instruction, but may be replaced with substantially more complex code by
13529 the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.
13535 '``llvm.gcwrite``' Intrinsic
13536 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13543 declare void @llvm.gcwrite(ptr %P1, ptr %Obj, ptr %P2)
13548 The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
13549 locations, allowing garbage collector implementations that require write
13550 barriers (such as generational or reference counting collectors).
13555 The first argument is the reference to store, the second is the start of
13556 the object to store it to, and the third is the address of the field of
13557 Obj to store to. If the runtime does not require a pointer to the
13558 object, Obj may be null.
13563 The '``llvm.gcwrite``' intrinsic has the same semantics as a store
13564 instruction, but may be replaced with substantially more complex code by
13565 the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.
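
The read and write barrier intrinsics can be combined as in the sketch below,
which copies a reference field from one heap object to another. The function
``@copy_field``, the field offset of 8 bytes and the ``"shadow-stack"``
strategy are assumptions made for illustration only.

.. code-block:: llvm

    declare ptr @llvm.gcread(ptr, ptr)
    declare void @llvm.gcwrite(ptr, ptr, ptr)

    define void @copy_field(ptr %src_obj, ptr %dst_obj) gc "shadow-stack" {
      ; Compute the addresses of the (assumed) reference field in each object.
      %src_field = getelementptr i8, ptr %src_obj, i64 8
      %dst_field = getelementptr i8, ptr %dst_obj, i64 8

      ; gcread: object start first, then the address to read from.
      %ref = call ptr @llvm.gcread(ptr %src_obj, ptr %src_field)

      ; gcwrite: reference to store, object start, then the field address.
      call void @llvm.gcwrite(ptr %ref, ptr %dst_obj, ptr %dst_field)
      ret void
    }
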
13572 '``llvm.experimental.gc.statepoint``' Intrinsic
13573 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      declare token
        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
                       ptr elementtype(func_type) <target>,
                       i64 <#call args>, i64 <flags>,
                       ... (call parameters),
                       i64 0, i64 0)
The statepoint intrinsic represents a call which is parse-able by the
runtime.
13596 The 'id' operand is a constant integer that is reported as the ID
13597 field in the generated stackmap. LLVM does not interpret this
13598 parameter in any way and its meaning is up to the statepoint user to
13599 decide. Note that LLVM is free to duplicate code containing
13600 statepoint calls, and this may transform IR that had a unique 'id' per
13601 lexical call to statepoint to IR that does not.
13603 If 'num patch bytes' is non-zero then the call instruction
13604 corresponding to the statepoint is not emitted and LLVM emits 'num
13605 patch bytes' bytes of nops in its place. LLVM will emit code to
13606 prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
sequence and the latter after the nop sequence. It is expected that
the user will patch over the 'num patch bytes' bytes of nops with a
calling sequence specific to their runtime before executing the
generated machine code. There are no guarantees with respect to the
alignment of the nop sequence. Unlike :doc:`StackMaps`, statepoints do
13613 not have a concept of shadow bytes. Note that semantically the
13614 statepoint still represents a call or invoke to 'target', and the nop
13615 sequence after patching is expected to represent an operation
13616 equivalent to a call or invoke to 'target'.
13618 The 'target' operand is the function actually being called. The operand
13619 must have an :ref:`elementtype <attr_elementtype>` attribute specifying
13620 the function type of the target. The target can be specified as either
13621 a symbolic LLVM function, or as an arbitrary Value of pointer type. Note
13622 that the function type must match the signature of the callee and the
13623 types of the 'call parameters' arguments.
13625 The '#call args' operand is the number of arguments to the actual
13626 call. It must exactly match the number of arguments passed in the
13627 'call parameters' variable length section.
13629 The 'flags' operand is used to specify extra information about the
13630 statepoint. This is currently only used to mark certain statepoints
13631 as GC transitions. This operand is a 64-bit integer with the following
13632 layout, where bit 0 is the least significant bit:
+-------+---------------------------------------------------+
| Bit # | Usage                                             |
+=======+===================================================+
|     0 | Set if the statepoint is a GC transition, cleared |
|       | otherwise.                                        |
+-------+---------------------------------------------------+
|  1-63 | Reserved for future use; must be cleared.         |
+-------+---------------------------------------------------+
13643 The 'call parameters' arguments are simply the arguments which need to
13644 be passed to the call target. They will be lowered according to the
13645 specified calling convention and otherwise handled like a normal call
13646 instruction. The number of arguments must exactly match what is
specified in '#call args'. The types must match the signature of
'target'.

The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
13653 entirely replaced with the corresponding operand bundles. In a future
13654 revision, these now redundant arguments will be removed.
13659 A statepoint is assumed to read and write all memory. As a result,
memory operations cannot be reordered past a statepoint. It is
illegal to mark a statepoint as being either 'readonly' or 'readnone'.

Note that legal IR cannot perform any memory operation on a 'gc
13664 pointer' argument of the statepoint in a location statically reachable
13665 from the statepoint. Instead, the explicitly relocated value (from a
13666 ``gc.relocate``) must be used.
13668 '``llvm.experimental.gc.result``' Intrinsic
13669 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13677 @llvm.experimental.gc.result(token %statepoint_token)
13682 ``gc.result`` extracts the result of the original call instruction
13683 which was replaced by the ``gc.statepoint``. The ``gc.result``
13684 intrinsic is actually a family of three intrinsics due to an
13685 implementation limitation. Other than the type of the return value,
13686 the semantics are the same.
13691 The first and only argument is the ``gc.statepoint`` which starts
13692 the safepoint sequence of which this ``gc.result`` is a part.
13693 Despite the typing of this as a generic token, *only* the value defined
13694 by a ``gc.statepoint`` is legal here.
13699 The ``gc.result`` represents the return value of the call target of
13700 the ``statepoint``. The type of the ``gc.result`` must exactly match
13701 the type of the target. If the call target returns void, there will
13702 be no ``gc.result``.
13704 A ``gc.result`` is modeled as a 'readnone' pure function. It has no
13705 side effects since it is just a projection of the return value of the
13706 previous call represented by the ``gc.statepoint``.
13708 '``llvm.experimental.gc.relocate``' Intrinsic
13709 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      declare <pointer type>
        @llvm.experimental.gc.relocate(token %statepoint_token,
                                       i32 %base_offset,
                                       i32 %pointer_offset)
A ``gc.relocate`` returns the potentially relocated value of a pointer
at the safepoint.
13730 The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
13732 Despite the typing of this as a generic token, *only* the value defined
13733 by a ``gc.statepoint`` is legal here.
13735 The second and third arguments are both indices into operands of the
13736 corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
13738 The second argument is an index which specifies the allocation for the pointer
13739 being relocated. The associated value must be within the object with which the
13740 pointer being relocated is associated. The optimizer is free to change *which*
13741 interior derived pointer is reported, provided that it does not replace an
13742 actual base pointer with another interior derived pointer. Collectors are
allowed to rely on the base pointer operand remaining an actual base pointer if
so constructed.

The third argument is an index which specifies the (potentially) derived pointer
13747 being relocated. It is legal for this index to be the same as the second
13748 argument if-and-only-if a base pointer is being relocated.
13753 The return value of ``gc.relocate`` is the potentially relocated value
13754 of the pointer specified by its arguments. It is unspecified how the
13755 value of the returned pointer relates to the argument to the
13756 ``gc.statepoint`` other than that a) it points to the same source
13757 language object with the same offset, and b) the 'based-on'
13758 relationship of the newly relocated pointers is a projection of the
13759 unrelocated pointers. In particular, the integer value of the pointer
13760 returned is unspecified.
13762 A ``gc.relocate`` is modeled as a ``readnone`` pure function. It has no
13763 side effects since it is just a way to extract information about work
13764 done during the actual call modeled by the ``gc.statepoint``.
13766 .. _gc.get.pointer.base:
13768 '``llvm.experimental.gc.get.pointer.base``' Intrinsic
13769 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13776 declare <pointer type>
13777 @llvm.experimental.gc.get.pointer.base(
13778 <pointer type> readnone nocapture %derived_ptr)
13779 nounwind willreturn memory(none)
13784 ``gc.get.pointer.base`` for a derived pointer returns its base pointer.
13789 The only argument is a pointer which is based on some object with
13790 an unknown offset from the base of said object.
13795 This intrinsic is used in the abstract machine model for GC to represent
13796 the base pointer for an arbitrary derived pointer.
This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the base pointer value of the
derived pointer. The replacement is done as part of the lowering to the
13801 explicit statepoint model.
13803 The return pointer type must be the same as the type of the parameter.
13806 '``llvm.experimental.gc.get.pointer.offset``' Intrinsic
13807 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13815 @llvm.experimental.gc.get.pointer.offset(
13816 <pointer type> readnone nocapture %derived_ptr)
13817 nounwind willreturn memory(none)
``gc.get.pointer.offset`` for a derived pointer returns the offset from its
base pointer.
13828 The only argument is a pointer which is based on some object with
13829 an unknown offset from the base of said object.
13834 This intrinsic is used in the abstract machine model for GC to represent
13835 the offset of an arbitrary derived pointer from its base pointer.
13837 This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
13838 replacing all uses of this callsite with the offset of a derived pointer from
13839 its base pointer value. The replacement is done as part of the lowering to the
13840 explicit statepoint model.
In effect, this call computes the difference between the derived pointer and
its base pointer (see :ref:`gc.get.pointer.base`), both cast with ``ptrtoint``.
However, performing this cast outside the :ref:`RewriteStatepointsForGC` pass
could cause the pointers to be lost during further lowering from the abstract
model to the explicit physical one.
13848 Code Generator Intrinsics
13849 -------------------------
13851 These intrinsics are provided by LLVM to expose special features that
13852 may only be implemented with code generator support.
13854 '``llvm.returnaddress``' Intrinsic
13855 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13862 declare ptr @llvm.returnaddress(i32 <level>)
13867 The '``llvm.returnaddress``' intrinsic attempts to compute a
13868 target-specific value indicating the return address of the current
13869 function or one of its callers.
13874 The argument to this intrinsic indicates which function to return the
13875 address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.
13882 The '``llvm.returnaddress``' intrinsic either returns a pointer
13883 indicating the return address of the specified call frame, or zero if it
13884 cannot be identified. The value returned by this intrinsic is likely to
13885 be incorrect or 0 for arguments other than zero, so it should only be
13886 used for debugging purposes.
13888 Note that calling this intrinsic does not prevent function inlining or
13889 other aggressive transformations, so the value returned may not be that
13890 of the obvious source-language caller.
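
A minimal use of the intrinsic is sketched below; the wrapper function
``@my_return_address`` is hypothetical. The level argument is the constant
zero, the only value for which the result is likely to be meaningful.

.. code-block:: llvm

    declare ptr @llvm.returnaddress(i32)

    define ptr @my_return_address() {
      ; Level 0 asks for the return address of the function making this call.
      %ra = call ptr @llvm.returnaddress(i32 0)
      ret ptr %ra
    }

Because inlining is not prevented, the value observed may belong to a
different frame than the source-level caller if ``@my_return_address`` is
inlined.
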
13892 '``llvm.addressofreturnaddress``' Intrinsic
13893 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13900 declare ptr @llvm.addressofreturnaddress()
13905 The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
13906 pointer to the place in the stack frame where the return address of the
13907 current function is stored.
13912 Note that calling this intrinsic does not prevent function inlining or
13913 other aggressive transformations, so the value returned may not be that
13914 of the obvious source-language caller.
13916 This intrinsic is only implemented for x86 and aarch64.
13918 '``llvm.sponentry``' Intrinsic
13919 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13926 declare ptr @llvm.sponentry()
13931 The '``llvm.sponentry``' intrinsic returns the stack pointer value at
13932 the entry of the current function calling this intrinsic.
13937 Note this intrinsic is only verified on AArch64 and ARM.
13939 '``llvm.frameaddress``' Intrinsic
13940 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13947 declare ptr @llvm.frameaddress(i32 <level>)
13952 The '``llvm.frameaddress``' intrinsic attempts to return the
13953 target-specific frame pointer value for the specified stack frame.
13958 The argument to this intrinsic indicates which function to return the
13959 frame pointer for. Zero indicates the calling function, one indicates
13960 its caller, etc. The argument is **required** to be a constant integer
13966 The '``llvm.frameaddress``' intrinsic either returns a pointer
13967 indicating the frame address of the specified call frame, or zero if it
13968 cannot be identified. The value returned by this intrinsic is likely to
13969 be incorrect or 0 for arguments other than zero, so it should only be
13970 used for debugging purposes.
13972 Note that calling this intrinsic does not prevent function inlining or
13973 other aggressive transformations, so the value returned may not be that
13974 of the obvious source-language caller.
13976 '``llvm.swift.async.context.addr``' Intrinsic
13977 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13984 declare ptr @llvm.swift.async.context.addr()
13989 The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
13990 the part of the extended frame record containing the asynchronous
13991 context of a Swift execution.
13996 If the caller has a ``swiftasync`` parameter, that argument will initially
13997 be stored at the returned address. If not, it will be initialized to null.
13999 '``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
14000 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14007 declare void @llvm.localescape(...)
14008 declare ptr @llvm.localrecover(ptr %func, ptr %fp, i32 %idx)
14013 The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
14014 allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
14015 live frame pointer to recover the address of the allocation. The offset is
14016 computed during frame layout of the caller of ``llvm.localescape``.
14021 All arguments to '``llvm.localescape``' must be pointers to static allocas or
14022 casts of static allocas. Each function can only call '``llvm.localescape``'
14023 once, and it can only do so from the entry block.
14025 The ``func`` argument to '``llvm.localrecover``' must be a constant
14026 bitcasted pointer to a function defined in the current module. The code
14027 generator cannot determine the frame allocation offset of functions defined in
14030 The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
14031 call frame that is currently live. The return value of '``llvm.localaddress``'
14032 is one way to produce such a value, but various runtimes also expose a suitable
14033 pointer in platform-specific ways.
14035 The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
14036 '``llvm.localescape``' to recover. It is zero-indexed.
14041 These intrinsics allow a group of functions to share access to a set of local
14042 stack allocations of one parent function. The parent function may call the
14043 '``llvm.localescape``' intrinsic once from the function entry block, and the
14044 child functions can use '``llvm.localrecover``' to access the escaped allocas.
14045 The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
14046 the escaped allocas are allocated, which would break attempts to use
14047 '``llvm.localrecover``'.
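
A minimal sketch of the parent/child pattern described above. The function
names ``@parent`` and ``@child`` are hypothetical, and the frame pointer is
obtained with ``llvm.localaddress``, which is only one of the possible ways to
produce it:

.. code-block:: llvm

   declare void @llvm.localescape(...)
   declare ptr @llvm.localrecover(ptr, ptr, i32)
   declare ptr @llvm.localaddress()

   define void @parent() {
   entry:
     %a = alloca i32
     ; Escape the static alloca; it becomes index 0.
     call void (...) @llvm.localescape(ptr %a)
     %fp = call ptr @llvm.localaddress()
     call void @child(ptr %fp)
     ret void
   }

   define internal void @child(ptr %fp) {
   entry:
     ; Recover the address of @parent's escaped alloca number 0.
     %a = call ptr @llvm.localrecover(ptr @parent, ptr %fp, i32 0)
     store i32 42, ptr %a
     ret void
   }

The child must only be called while the frame of ``@parent`` identified by
``%fp`` is still live.
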
14049 '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
14050 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14057 declare void @llvm.seh.try.begin()
14058 declare void @llvm.seh.try.end()
14063 The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
14064 the boundary of a _try region for Windows SEH Asynchronous Exception Handling.
14069 When a C function is compiled with the Windows SEH Asynchronous Exception option,
14070 -feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark the _try
14071 boundary and to prevent potential exceptions from being moved across that boundary.
14072 Any set of operations can then be confined to the region by reading their leaf
14073 inputs via volatile loads and writing their root outputs via volatile stores.
14075 '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
14076 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14083 declare void @llvm.seh.scope.begin()
14084 declare void @llvm.seh.scope.end()
14089 The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
14090 the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
14091 Handling (MSVC option -EHa).
14096 LLVM's ordinary exception-handling representation associates EH cleanups and
14097 handlers only with ``invoke`` instructions, which normally correspond only to call sites. To
14098 support arbitrary faulting instructions, it must be possible to recover the current
14099 EH scope for any instruction. Turning every operation in LLVM that could fault
14100 into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
14101 large number of intrinsics, impede optimization of those operations, and make
14102 compilation slower by introducing many extra basic blocks. These intrinsics can
14103 be used instead to mark the region protected by a cleanup, such as for a local
14104 C++ object with a non-trivial destructor. ``llvm.seh.scope.begin`` is used to mark
14105 the start of the region; it is always called with ``invoke``, with the unwind block
14106 being the desired unwind destination for any potentially-throwing instructions
14107 within the region. ``llvm.seh.scope.end`` is used to mark when the scope ends
14108 and the EH cleanup is no longer required (e.g. because the destructor is being
14111 .. _int_read_register:
14112 .. _int_read_volatile_register:
14113 .. _int_write_register:
14115 '``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
14116 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14123 declare i32 @llvm.read_register.i32(metadata)
14124 declare i64 @llvm.read_register.i64(metadata)
14125 declare i32 @llvm.read_volatile_register.i32(metadata)
14126 declare i64 @llvm.read_volatile_register.i64(metadata)
14127 declare void @llvm.write_register.i32(metadata, i32 %value)
14128 declare void @llvm.write_register.i64(metadata, i64 %value)
14134 The '``llvm.read_register``', '``llvm.read_volatile_register``', and
14135 '``llvm.write_register``' intrinsics provide access to the named register.
14136 The register must be valid on the architecture being compiled to. The type
14137 needs to be compatible with the register being read.
14142 The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
14143 return the current value of the register, where possible. The
14144 '``llvm.write_register``' intrinsic sets the current value of the register,
14147 A call to '``llvm.read_volatile_register``' is assumed to have side-effects
14148 and possibly return a different value each time (e.g. for a timer register).
14150 This is useful to implement named register global variables that need
14151 to always be mapped to a specific register, as is common practice on
14152 bare-metal programs including OS kernels.
14154 The compiler doesn't check for register availability or use of the used
14155 register in surrounding code, including inline assembly. Because of that,
14156 allocatable registers are not supported.
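
A minimal sketch of reading the stack pointer, assuming an x86_64 target where
the register is named ``rsp``. The function name ``@get_stack_pointer`` is
hypothetical, and the register name must be adjusted for other architectures:

.. code-block:: llvm

   declare i64 @llvm.read_register.i64(metadata)

   define i64 @get_stack_pointer() {
   entry:
     ; The register is named by a metadata node holding a single string.
     %sp = call i64 @llvm.read_register.i64(metadata !0)
     ret i64 %sp
   }

   !0 = !{!"rsp"}
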
14158 Warning: So far it only works with the stack pointer on selected
14159 architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
14160 work is needed to support other registers and even more so, allocatable
14165 '``llvm.stacksave``' Intrinsic
14166 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14173 declare ptr @llvm.stacksave.p0()
14174 declare ptr addrspace(5) @llvm.stacksave.p5()
14179 The '``llvm.stacksave``' intrinsic is used to remember the current state
14180 of the function stack, for use with
14181 :ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
14182 implementing language features like scoped automatic variable sized
14188 This intrinsic returns an opaque pointer value that can be passed to
14189 :ref:`llvm.stackrestore <int_stackrestore>`. When an
14190 ``llvm.stackrestore`` intrinsic is executed with a value saved from
14191 ``llvm.stacksave``, it effectively restores the state of the stack to
14192 the state it was in when the ``llvm.stacksave`` intrinsic executed. In
14193 practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack
14194 that were allocated after the ``llvm.stacksave`` was executed. The
14195 address space should typically be the
14196 :ref:`alloca address space <alloca_addrspace>`.
14198 .. _int_stackrestore:
14200 '``llvm.stackrestore``' Intrinsic
14201 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14208 declare void @llvm.stackrestore.p0(ptr %ptr)
14209 declare void @llvm.stackrestore.p5(ptr addrspace(5) %ptr)
14214 The '``llvm.stackrestore``' intrinsic is used to restore the state of
14215 the function stack to the state it was in when the corresponding
14216 :ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
14217 useful for implementing language features like scoped automatic
14218 variable sized arrays in C99. The address space should typically be
14219 the :ref:`alloca address space <alloca_addrspace>`.
14224 See the description for :ref:`llvm.stacksave <int_stacksave>`.
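
As a sketch of how a frontend might bracket a scoped variable-sized array
(``@scoped_vla`` and ``@use`` are hypothetical names, and address space 0 is
assumed to be the alloca address space):

.. code-block:: llvm

   declare ptr @llvm.stacksave.p0()
   declare void @llvm.stackrestore.p0(ptr)
   declare void @use(ptr)

   define void @scoped_vla(i64 %n) {
   entry:
     ; Remember the stack state before entering the scope.
     %saved = call ptr @llvm.stacksave.p0()
     %vla = alloca i32, i64 %n, align 4
     call void @use(ptr %vla)
     ; Leaving the scope: pop %vla and anything allocated after the save.
     call void @llvm.stackrestore.p0(ptr %saved)
     ret void
   }
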
14226 .. _int_get_dynamic_area_offset:
14228 '``llvm.get.dynamic.area.offset``' Intrinsic
14229 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14236 declare i32 @llvm.get.dynamic.area.offset.i32()
14237 declare i64 @llvm.get.dynamic.area.offset.i64()
14242 The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
14243 get the offset from the native stack pointer to the address of the most
14244 recent dynamic alloca on the caller's stack. These intrinsics are
14245 intended for use in combination with
14246 :ref:`llvm.stacksave <int_stacksave>` to get a
14247 pointer to the most recent dynamic alloca. This is useful, for example,
14248 for AddressSanitizer's stack unpoisoning routines.
14253 These intrinsics return a non-negative integer value that can be used to
14254 get the address of the most recent dynamic alloca, allocated by :ref:`alloca <i_alloca>`
14255 on the caller's stack. In particular, for targets where stack grows downwards,
14256 adding this offset to the native stack pointer would get the address of the most
14257 recent dynamic alloca. For targets where stack grows upwards, the situation is a bit more
14258 complicated, because subtracting this value from stack pointer would get the address
14259 one past the end of the most recent dynamic alloca.
14261 Although for most targets :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
14262 returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
14263 compile-time-known constant value.
14265 The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
14266 must match the target's default address space's (address space 0) pointer type.
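
The combination with ``llvm.stacksave`` could look roughly like the following
sketch. It assumes a target with 64-bit pointers in address space 0, a stack
that grows downwards, and that the value returned by ``llvm.stacksave`` equals
the native stack pointer; the function name is hypothetical:

.. code-block:: llvm

   declare ptr @llvm.stacksave.p0()
   declare i64 @llvm.get.dynamic.area.offset.i64()

   define ptr @most_recent_dynamic_alloca() {
   entry:
     %sp = call ptr @llvm.stacksave.p0()
     %off = call i64 @llvm.get.dynamic.area.offset.i64()
     ; Stack grows downwards: add the offset to the stack pointer.
     %addr = getelementptr i8, ptr %sp, i64 %off
     ret ptr %addr
   }
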
14268 '``llvm.prefetch``' Intrinsic
14269 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14276 declare void @llvm.prefetch(ptr <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
14281 The '``llvm.prefetch``' intrinsic is a hint to the code generator to
14282 insert a prefetch instruction if supported; otherwise, it is a no-op.
14283 Prefetches have no effect on the behavior of the program but can change
14284 its performance characteristics.
14289 ``address`` is the address to be prefetched, ``rw`` is the specifier
14290 determining if the fetch should be for a read (0) or write (1), and
14291 ``locality`` is a temporal locality specifier ranging from (0) - no
14292 locality, to (3) - extremely local keep in cache. The ``cache type``
14293 specifies whether the prefetch is performed on the data (1) or
14294 instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
14295 arguments must be constant integers.
14300 This intrinsic does not modify the behavior of the program. In
14301 particular, prefetches cannot trap and do not produce a value. On
14302 targets that support this intrinsic, the prefetch can provide hints to
14303 the processor cache for better performance.
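
For illustration, the following sketch requests a read prefetch into the data
cache with the highest temporal locality. The function name ``@warm`` is
hypothetical, and the declaration uses the unsuffixed spelling shown above;
some LLVM versions may print the name with a pointer-type suffix instead:

.. code-block:: llvm

   declare void @llvm.prefetch(ptr, i32, i32, i32)

   define void @warm(ptr %p) {
   entry:
     ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
     call void @llvm.prefetch(ptr %p, i32 0, i32 3, i32 1)
     ret void
   }
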
14305 '``llvm.pcmarker``' Intrinsic
14306 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14313 declare void @llvm.pcmarker(i32 <id>)
14318 The '``llvm.pcmarker``' intrinsic is a method to export a Program
14319 Counter (PC) in a region of code to simulators and other tools. The
14320 method is target specific, but it is expected that the marker will use
14321 exported symbols to transmit the PC of the marker. The marker makes no
14322 guarantees that it will remain with any specific instruction after
14323 optimizations. It is possible that the presence of a marker will inhibit
14324 optimizations. The intended use is to be inserted after optimizations to
14325 allow correlations of simulation runs.
14330 ``id`` is a numerical id identifying the marker.
14335 This intrinsic does not modify the behavior of the program. Backends
14336 that do not support this intrinsic may ignore it.
14338 '``llvm.readcyclecounter``' Intrinsic
14339 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14346 declare i64 @llvm.readcyclecounter()
14351 The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
14352 counter register (or similar low latency, high accuracy clocks) on those
14353 targets that support it. On X86, it should map to RDTSC. On Alpha, it
14354 should map to RPCC. As the backing counters overflow quickly (on the
14355 order of 9 seconds on Alpha), this should only be used for small
14361 When directly supported, reading the cycle counter should not modify any
14362 memory. Implementations are allowed to either return an application
14363 specific value or a system wide value. On backends without support, this
14364 is lowered to a constant 0.
14366 Note that runtime support may be conditional on the privilege level at which
14367 the code is running and on the host platform.
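
A short timing sketch, with hypothetical names ``@time_work`` and ``@work``.
The difference is only meaningful if the counter does not overflow between the
two reads and the target actually supports the intrinsic:

.. code-block:: llvm

   declare i64 @llvm.readcyclecounter()
   declare void @work()

   define i64 @time_work() {
   entry:
     %start = call i64 @llvm.readcyclecounter()
     call void @work()
     %end = call i64 @llvm.readcyclecounter()
     ; Elapsed cycles; always 0 on backends that lower the counter to 0.
     %elapsed = sub i64 %end, %start
     ret i64 %elapsed
   }
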
14369 '``llvm.clear_cache``' Intrinsic
14370 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14377 declare void @llvm.clear_cache(ptr, ptr)
14382 The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
14383 in the specified range to the execution unit of the processor. On
14384 targets with non-unified instruction and data cache, the implementation
14385 flushes the instruction cache.
14390 On platforms with coherent instruction and data caches (e.g. x86), this
14391 intrinsic is a nop. On platforms with non-coherent instruction and data
14392 cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
14393 instructions or a system call, if cache flushing requires special
14396 The default behavior is to emit a call to ``__clear_cache`` from the run
14399 This intrinsic does *not* empty the instruction pipeline. Modifications
14400 of the current function are outside the scope of the intrinsic.
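
A sketch of how a code generator might make freshly written machine code
visible before executing it. The function name ``@publish_code`` is
hypothetical, and the two arguments are assumed to be the start and
one-past-the-end addresses of the range, mirroring ``__clear_cache``:

.. code-block:: llvm

   declare void @llvm.clear_cache(ptr, ptr)

   define void @publish_code(ptr %buf, i64 %len) {
   entry:
     ; Compute the end of the modified range.
     %end = getelementptr i8, ptr %buf, i64 %len
     call void @llvm.clear_cache(ptr %buf, ptr %end)
     ret void
   }
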
14402 '``llvm.instrprof.increment``' Intrinsic
14403 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14410 declare void @llvm.instrprof.increment(ptr <name>, i64 <hash>,
14411 i32 <num-counters>, i32 <index>)
14416 The '``llvm.instrprof.increment``' intrinsic can be emitted by a
14417 frontend for use with instrumentation based profiling. These will be
14418 lowered by the ``-instrprof`` pass to generate execution counts of a
14419 program at runtime.
14424 The first argument is a pointer to a global variable containing the
14425 name of the entity being instrumented. This should generally be the
14426 (mangled) function name for a set of counters.
14428 The second argument is a hash value that can be used by the consumer
14429 of the profile data to detect changes to the instrumented source, and
14430 the third is the number of counters associated with ``name``. It is an
14431 error if ``hash`` or ``num-counters`` differ between two instances of
14432 ``instrprof.increment`` that refer to the same name.
14434 The last argument refers to which of the counters for ``name`` should
14435 be incremented. It should be at least 0 and less than ``num-counters``.
14440 This intrinsic represents an increment of a profiling counter. It will
14441 cause the ``-instrprof`` pass to generate the appropriate data
14442 structures and the code to increment the appropriate value, in a
14443 format that can be written out by a compiler runtime and consumed via
14444 the ``llvm-profdata`` tool.
14446 .. FIXME: write complete doc on contextual instrumentation and link from here
14447 .. and from llvm.instrprof.callsite.
14449 The intrinsic is lowered differently for contextual profiling by the
14450 ``-ctx-instr-lower`` pass. Here:
14452 * the entry basic block increment counter is lowered as a call to compiler-rt,
14453 to either ``__llvm_ctx_profile_start_context`` or
14454 ``__llvm_ctx_profile_get_context``. Either returns a pointer to a context object
14455 which contains a buffer into which counter increments can happen. Note that the
14456 pointer value returned by compiler-rt may have its LSB set - counter increments
14457 happen offset from the address with the LSB cleared.
14459 * all the other lowerings of ``llvm.instrprof.increment[.step]`` happen within
14462 * the context is assumed to be a local value to the function, and no concurrency
14463 concerns need to be handled by LLVM.
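
As a sketch of what a frontend might emit for a function with two counters
(the global name ``@__profn_foo`` and the hash value ``0`` are placeholders
chosen for this example):

.. code-block:: llvm

   @__profn_foo = private constant [3 x i8] c"foo"

   declare void @llvm.instrprof.increment(ptr, i64, i32, i32)

   define void @foo(i1 %c) {
   entry:
     ; Counter 0 of 2: function entry.
     call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 2, i32 0)
     br i1 %c, label %then, label %exit

   then:
     ; Counter 1 of 2: the conditional region.
     call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 2, i32 1)
     br label %exit

   exit:
     ret void
   }

Both calls use the same ``hash`` and ``num-counters`` values because they refer
to the same name.
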
14465 '``llvm.instrprof.increment.step``' Intrinsic
14466 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14473 declare void @llvm.instrprof.increment.step(ptr <name>, i64 <hash>,
14474 i32 <num-counters>,
14475 i32 <index>, i64 <step>)
14480 The '``llvm.instrprof.increment.step``' intrinsic is an extension to
14481 the '``llvm.instrprof.increment``' intrinsic with an additional fifth
14482 argument to specify the step of the increment.
14486 The first four arguments are the same as '``llvm.instrprof.increment``'
14489 The last argument specifies the value of the increment of the counter variable.
14493 See description of '``llvm.instrprof.increment``' intrinsic.
14495 '``llvm.instrprof.callsite``' Intrinsic
14496 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14503 declare void @llvm.instrprof.callsite(ptr <name>, i64 <hash>,
14504 i32 <num-counters>,
14505 i32 <index>, ptr <callsite>)
14510 The '``llvm.instrprof.callsite``' intrinsic should be emitted before a callsite
14511 that's not to a "fake" callee (like another intrinsic or asm). It is used by
14512 contextual profiling and has side-effects. Its lowering happens in IR, and
14513 target-specific backends should never encounter it.
14517 The first 4 arguments are similar to ``llvm.instrprof.increment``. The indexing
14518 is specific to callsites, meaning callsites are indexed from 0, independent from
14519 the indexes used by the other intrinsics (such as
14520 ``llvm.instrprof.increment[.step]``).
14522 The last argument is the called value of the callsite this intrinsic precedes.
14527 This is lowered by contextual profiling. In contextual profiling, functions get,
14528 from compiler-rt, a pointer to a context object. The context object consists of
14529 a buffer LLVM can use to perform counter increments (i.e. the lowering of
14530 ``llvm.instrprof.increment[.step]``). The address range following the counter
14531 buffer, ``<num-counters>`` x ``sizeof(ptr)`` - sized, is expected to contain
14532 pointers to contexts of functions called from this function ("subcontexts").
14533 LLVM does not dereference into that memory region, just calculates GEPs.
14535 The lowering of ``llvm.instrprof.callsite`` consists of:
14537 * write to ``__llvm_ctx_profile_expected_callee`` the ``<callsite>`` value;
14539 * write to ``__llvm_ctx_profile_callsite`` the address into this function's
14540 context of the ``<index>`` position into the subcontexts region.
14543 ``__llvm_ctx_profile_{expected_callee|callsite}`` are initialized by compiler-rt
14544 and are TLS. They are both vectors of pointers of size 2. The index into each is
14545 determined when the current function obtains the pointer to its context from
14546 compiler-rt. The pointer's LSB gives the index.
14549 '``llvm.instrprof.timestamp``' Intrinsic
14550 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14557 declare void @llvm.instrprof.timestamp(ptr <name>, i64 <hash>,
14558 i32 <num-counters>, i32 <index>)
14563 The '``llvm.instrprof.timestamp``' intrinsic is used to implement temporal
14568 The arguments are the same as '``llvm.instrprof.increment``'. The ``index`` is
14569 expected to always be zero.
14573 Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores a
14574 timestamp representing when this function was executed for the first time.
14576 '``llvm.instrprof.cover``' Intrinsic
14577 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14584 declare void @llvm.instrprof.cover(ptr <name>, i64 <hash>,
14585 i32 <num-counters>, i32 <index>)
14590 The '``llvm.instrprof.cover``' intrinsic is used to implement coverage
14595 The arguments are the same as the first four arguments of
14596 '``llvm.instrprof.increment``'.
14600 Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores zero to
14601 the profiling variable to signify that the function has been covered. We store
14602 zero because this is more efficient on some targets.
14604 '``llvm.instrprof.value.profile``' Intrinsic
14605 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14612 declare void @llvm.instrprof.value.profile(ptr <name>, i64 <hash>,
14613 i64 <value>, i32 <value_kind>,
14619 The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
14620 frontend for use with instrumentation based profiling. This will be
14621 lowered by the ``-instrprof`` pass to find out the target values that
14622 instrumented expressions take in a program at runtime.
14627 The first argument is a pointer to a global variable containing the
14628 name of the entity being instrumented. ``name`` should generally be the
14629 (mangled) function name for a set of counters.
14631 The second argument is a hash value that can be used by the consumer
14632 of the profile data to detect changes to the instrumented source. It
14633 is an error if ``hash`` differs between two instances of
14634 ``llvm.instrprof.*`` that refer to the same name.
14636 The third argument is the value of the expression being profiled. The profiled
14637 expression's value should be representable as an unsigned 64-bit value. The
14638 fourth argument represents the kind of value profiling that is being done. The
14639 supported value profiling kinds are enumerated through the
14640 ``InstrProfValueKind`` type declared in the
14641 ``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
14642 index of the instrumented expression within ``name``. It should be >= 0.
14647 This intrinsic represents the point where a call to a runtime routine
14648 should be inserted for value profiling of target expressions. The ``-instrprof``
14649 pass will generate the appropriate data structures and replace the
14650 ``llvm.instrprof.value.profile`` intrinsic with the call to the profile
14651 runtime library with proper arguments.
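
A sketch of profiling the target of an indirect call. The names
``@__profn_caller`` and ``@caller`` and the hash value are placeholders, and
the value kind ``0`` is assumed here to denote indirect call targets in
``InstrProfValueKind``; check the header for the kinds supported by a given
LLVM version:

.. code-block:: llvm

   @__profn_caller = private constant [6 x i8] c"caller"

   declare void @llvm.instrprof.value.profile(ptr, i64, i64, i32, i32)

   define void @caller(ptr %callee) {
   entry:
     ; The profiled value must be representable as an unsigned 64-bit value.
     %target = ptrtoint ptr %callee to i64
     call void @llvm.instrprof.value.profile(ptr @__profn_caller, i64 0,
                                             i64 %target, i32 0, i32 0)
     call void %callee()
     ret void
   }
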
14653 '``llvm.instrprof.mcdc.parameters``' Intrinsic
14654 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14661 declare void @llvm.instrprof.mcdc.parameters(ptr <name>, i64 <hash>,
14667 The '``llvm.instrprof.mcdc.parameters``' intrinsic is used to initiate MC/DC
14668 code coverage instrumentation for a function.
14673 The first argument is a pointer to a global variable containing the
14674 name of the entity being instrumented. This should generally be the
14675 (mangled) function name for a set of counters.
14677 The second argument is a hash value that can be used by the consumer
14678 of the profile data to detect changes to the instrumented source.
14680 The third argument is the number of bitmap bits required by the function to
14681 record the number of test vectors executed for each boolean expression.
14686 This intrinsic represents basic MC/DC parameters initiating one or more MC/DC
14687 instrumentation sequences in a function. It will cause the ``-instrprof`` pass
14688 to generate the appropriate data structures and the code to instrument MC/DC
14689 test vectors in a format that can be written out by a compiler runtime and
14690 consumed via the ``llvm-profdata`` tool.
14692 '``llvm.instrprof.mcdc.tvbitmap.update``' Intrinsic
14693 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14700 declare void @llvm.instrprof.mcdc.tvbitmap.update(ptr <name>, i64 <hash>,
14701 i32 <bitmap-index>,
14702 ptr <mcdc-temp-addr>)
14707 The '``llvm.instrprof.mcdc.tvbitmap.update``' intrinsic is used to track MC/DC
14708 test vector execution after each boolean expression has been fully executed.
14709 The overall value of the condition bitmap, after it has been successively
14710 updated with the true or false evaluation of each condition, uniquely identifies
14711 an executed MC/DC test vector and is used as a bit index into the global test
14717 The first argument is a pointer to a global variable containing the
14718 name of the entity being instrumented. This should generally be the
14719 (mangled) function name for a set of counters.
14721 The second argument is a hash value that can be used by the consumer
14722 of the profile data to detect changes to the instrumented source.
14724 The third argument is the bit index into the global test vector bitmap
14725 corresponding to the function.
14727 The fourth argument is the address of the condition bitmap, which contains a
14728 value representing an executed MC/DC test vector. It is loaded and used as the
14729 bit index of the test vector bitmap.
14734 This intrinsic represents the final operation of an MC/DC instrumentation
14735 sequence and will cause the ``-instrprof`` pass to generate the code to
14736 instrument an update of a function's global test vector bitmap to indicate that
14737 a test vector has been executed. The global test vector bitmap can be consumed
14738 by the ``llvm-profdata`` and ``llvm-cov`` tools.
14740 '``llvm.thread.pointer``' Intrinsic
14741 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14748 declare ptr @llvm.thread.pointer()
14753 The '``llvm.thread.pointer``' intrinsic returns the value of the thread
14759 The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
14760 for the current thread. The exact semantics of this value are target
14761 specific: it may point to the start of TLS area, to the end, or somewhere
14762 in the middle. Depending on the target, this intrinsic may read a register,
14763 call a helper function, read from an alternate memory space, or perform
14764 other operations necessary to locate the TLS area. Not all targets support
14767 '``llvm.call.preallocated.setup``' Intrinsic
14768 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14775 declare token @llvm.call.preallocated.setup(i32 %num_args)
14780 The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
14781 be used with a call's ``"preallocated"`` operand bundle to indicate that
14782 certain arguments are allocated and initialized before the call.
14787 The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
14788 associated with at most one call. The token can be passed to
14789 '``@llvm.call.preallocated.arg``' to get a pointer to the
14790 corresponding argument. The token must be the parameter to a
14791 ``"preallocated"`` operand bundle for the corresponding call.
14793 Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
14794 be properly nested. e.g.
14796 .. code-block:: llvm
14798 %t1 = call token @llvm.call.preallocated.setup(i32 0)
14799 %t2 = call token @llvm.call.preallocated.setup(i32 0)
14800 call void @foo() ["preallocated"(token %t2)]
14801 call void @foo() ["preallocated"(token %t1)]
14803 is allowed, but not
14805 .. code-block:: llvm
14807 %t1 = call token @llvm.call.preallocated.setup(i32 0)
14808 %t2 = call token @llvm.call.preallocated.setup(i32 0)
14809 call void @foo() ["preallocated"(token %t1)]
14810 call void @foo() ["preallocated"(token %t2)]
14812 .. _int_call_preallocated_arg:
14814 '``llvm.call.preallocated.arg``' Intrinsic
14815 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14822 declare ptr @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
14827 The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
14828 corresponding preallocated argument for the preallocated call.
14833 The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
14834 ``%arg_index``\ th argument with the ``preallocated`` attribute for
14835 the call associated with the ``%setup_token``, which must be from
14836 '``llvm.call.preallocated.setup``'.
14838 A call to '``llvm.call.preallocated.arg``' must have a call site
14839 ``preallocated`` attribute. The type of the ``preallocated`` attribute must
14840 match the type used by the ``preallocated`` attribute of the corresponding
14841 argument at the preallocated call. The type is used in the case that an
14842 ``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
14843 to DCE), where otherwise we cannot know how large the arguments are.
14845 It is undefined behavior if this is called with a token from an
14846 '``llvm.call.preallocated.setup``' if another
14847 '``llvm.call.preallocated.setup``' has already been called or if the
14848 preallocated call corresponding to the '``llvm.call.preallocated.setup``'
14849 has already been called.
14851 .. _int_call_preallocated_teardown:
14853 '``llvm.call.preallocated.teardown``' Intrinsic
14854 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14861 declare void @llvm.call.preallocated.teardown(token %setup_token)
14866 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
14867 created by a '``llvm.call.preallocated.setup``'.
14872 The token argument must be a '``llvm.call.preallocated.setup``'.
14874 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
14875 allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
14876 one of this or the preallocated call must be called to prevent stack leaks.
14877 It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
14878 and the preallocated call for a given '``llvm.call.preallocated.setup``'.
14880 For example, if the stack is allocated for a preallocated call by a
14881 '``llvm.call.preallocated.setup``', then an initializer function called on an
14882 allocated argument throws an exception, there should be a
14883 '``llvm.call.preallocated.teardown``' in the exception handler to prevent
14886 Following the nesting rules in '``llvm.call.preallocated.setup``', nested
14887 calls to '``llvm.call.preallocated.setup``' and
14888 '``llvm.call.preallocated.teardown``' are allowed but must be properly
14894 .. code-block:: llvm
14896 %cs = call token @llvm.call.preallocated.setup(i32 1)
14897 %x = call ptr @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
14898 invoke void @constructor(ptr %x) to label %conta unwind label %contb
14900 call void @foo1(ptr preallocated(i32) %x) ["preallocated"(token %cs)]
14903 %s = catchswitch within none [label %catch] unwind to caller
14905 %p = catchpad within %s []
14906 call void @llvm.call.preallocated.teardown(token %cs)
14909 Standard C/C++ Library Intrinsics
14910 ---------------------------------
14912 LLVM provides intrinsics for a few important standard C/C++ library
14913 functions. These intrinsics allow source-language front-ends to pass
14914 information about the alignment of the pointer arguments to the code
14915 generator, providing opportunity for more efficient code generation.
14919 '``llvm.abs.*``' Intrinsic
14920 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14925 This is an overloaded intrinsic. You can use ``llvm.abs`` on any
14926 integer bit width or any vector of integer elements.
14930 declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
14931 declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
The '``llvm.abs``' family of intrinsic functions returns the absolute value
of an argument.
14942 The first argument is the value for which the absolute value is to be returned.
14943 This argument may be of any integer type or a vector with integer element type.
14944 The return type must match the first argument type.
14946 The second argument must be a constant and is a flag to indicate whether the
14947 result value of the '``llvm.abs``' intrinsic is a
14948 :ref:`poison value <poisonvalues>` if the first argument is statically or
14949 dynamically an ``INT_MIN`` value.
The '``llvm.abs``' intrinsic returns the magnitude (always positive) of the
first argument or each element of a vector argument. If the first argument is
``INT_MIN``, then the result is also ``INT_MIN`` if ``is_int_min_poison == 0``
and ``poison`` otherwise.
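
The effect of the second argument can be sketched as follows. The function
names ``@abs_wrapping`` and ``@abs_poison`` are illustrative only and are not
part of LLVM:

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)

      define i32 @abs_wrapping(i32 %x) {
        ; is_int_min_poison is false: an INT_MIN input yields INT_MIN.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define i32 @abs_poison(i32 %x) {
        ; is_int_min_poison is true: an INT_MIN input yields poison.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 true)
        ret i32 %r
      }
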
14962 '``llvm.smax.*``' Intrinsic
14963 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14968 This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
14969 integer bit width or any vector of integer elements.
14973 declare i32 @llvm.smax.i32(i32 %a, i32 %b)
14974 declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
14979 Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
14980 Vector intrinsics operate on a per-element basis. The larger element of ``%a``
14981 and ``%b`` at a given index is returned for that index.
14986 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14987 integer element type. The argument types must match each other, and the return
14988 type must match the argument type.
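
As a small sketch of the per-element behavior, the following hypothetical
function ``@relu`` replaces every negative lane of a vector with zero:

.. code-block:: llvm

      declare <4 x i32> @llvm.smax.v4i32(<4 x i32>, <4 x i32>)

      define <4 x i32> @relu(<4 x i32> %v) {
        ; Each lane is compared, as a signed integer, against 0.
        %r = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %v, <4 x i32> zeroinitializer)
        ret <4 x i32> %r
      }
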
14993 '``llvm.smin.*``' Intrinsic
14994 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14999 This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
15000 integer bit width or any vector of integer elements.
15004 declare i32 @llvm.smin.i32(i32 %a, i32 %b)
15005 declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
15010 Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
15011 Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
15012 and ``%b`` at a given index is returned for that index.
15017 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15018 integer element type. The argument types must match each other, and the return
15019 type must match the argument type.
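
One common use, sketched here with the hypothetical function name
``@clamp_0_255``, is to pair ``llvm.smin`` with ``llvm.smax`` to clamp a
signed value to a range:

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare i32 @llvm.smin.i32(i32, i32)

      define i32 @clamp_0_255(i32 %x) {
        ; Raise values below 0 up to 0.
        %lo = call i32 @llvm.smax.i32(i32 %x, i32 0)
        ; Lower values above 255 down to 255.
        %r = call i32 @llvm.smin.i32(i32 %lo, i32 255)
        ret i32 %r
      }
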
15024 '``llvm.umax.*``' Intrinsic
15025 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15030 This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
15031 integer bit width or any vector of integer elements.
15035 declare i32 @llvm.umax.i32(i32 %a, i32 %b)
15036 declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
15041 Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
15042 integers. Vector intrinsics operate on a per-element basis. The larger element
15043 of ``%a`` and ``%b`` at a given index is returned for that index.
15048 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15049 integer element type. The argument types must match each other, and the return
15050 type must match the argument type.
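
The difference from a signed comparison shows up when the sign bit is set.
In this illustrative sketch (the function name ``@umax_example`` is made up),
the constant ``i8 -1`` is the bit pattern 255 when read as unsigned:

.. code-block:: llvm

      declare i8 @llvm.umax.i8(i8, i8)

      define i8 @umax_example() {
        ; As unsigned values, 255 > 1, so the result is i8 -1 (that is, 255).
        ; A signed maximum of the same operands would be 1.
        %r = call i8 @llvm.umax.i8(i8 -1, i8 1)
        ret i8 %r
      }
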
15055 '``llvm.umin.*``' Intrinsic
15056 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15061 This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
15062 integer bit width or any vector of integer elements.
15066 declare i32 @llvm.umin.i32(i32 %a, i32 %b)
15067 declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
15072 Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
15073 integers. Vector intrinsics operate on a per-element basis. The smaller element
15074 of ``%a`` and ``%b`` at a given index is returned for that index.
15079 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15080 integer element type. The argument types must match each other, and the return
15081 type must match the argument type.
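
A minimal sketch of a typical use, limiting a requested size to a capacity.
The function and value names are illustrative:

.. code-block:: llvm

      declare i64 @llvm.umin.i64(i64, i64)

      define i64 @bounded_len(i64 %requested, i64 %capacity) {
        ; Both operands are compared as unsigned integers.
        %n = call i64 @llvm.umin.i64(i64 %requested, i64 %capacity)
        ret i64 %n
      }
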
15085 '``llvm.scmp.*``' Intrinsic
15086 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15091 This is an overloaded intrinsic. You can use ``@llvm.scmp`` on any
15092 integer bit width or any vector of integer elements.
15096 declare i2 @llvm.scmp.i2.i32(i32 %a, i32 %b)
15097 declare <4 x i32> @llvm.scmp.v4i32.v4i32(<4 x i32> %a, <4 x i32> %b)
15102 Return ``-1`` if ``%a`` is signed less than ``%b``, ``0`` if they are equal, and
15103 ``1`` if ``%a`` is signed greater than ``%b``. Vector intrinsics operate on a per-element basis.
15108 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15109 integer element type. The argument types must match each other, and the return
15110 type must be at least as wide as ``i2``, to hold the three possible return values.
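
A sketch of a three-way comparison returning ``i8``. The name
``@three_way_signed`` is illustrative, and the ``.i8.i32`` suffix is assumed to
follow the same result-type then argument-type pattern as the declarations
above:

.. code-block:: llvm

      declare i8 @llvm.scmp.i8.i32(i32, i32)

      define i8 @three_way_signed(i32 %a, i32 %b) {
        ; Yields -1, 0, or 1. For example, %a = -1 and %b = 1 gives -1.
        %r = call i8 @llvm.scmp.i8.i32(i32 %a, i32 %b)
        ret i8 %r
      }
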
15114 '``llvm.ucmp.*``' Intrinsic
15115 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15120 This is an overloaded intrinsic. You can use ``@llvm.ucmp`` on any
15121 integer bit width or any vector of integer elements.
15125 declare i2 @llvm.ucmp.i2.i32(i32 %a, i32 %b)
15126 declare <4 x i32> @llvm.ucmp.v4i32.v4i32(<4 x i32> %a, <4 x i32> %b)
15131 Return ``-1`` if ``%a`` is unsigned less than ``%b``, ``0`` if they are equal, and
15132 ``1`` if ``%a`` is unsigned greater than ``%b``. Vector intrinsics operate on a per-element basis.
15137 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
15138 integer element type. The argument types must match each other, and the return
15139 type must be at least as wide as ``i2``, to hold the three possible return values.
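
The unsigned variant can give the opposite answer for the same bit patterns.
In this illustrative sketch (the function name is made up, and the ``.i8.i32``
suffix is assumed to follow the pattern of the declarations above), ``i32 -1``
is the largest unsigned 32-bit value:

.. code-block:: llvm

      declare i8 @llvm.ucmp.i8.i32(i32, i32)

      define i8 @ucmp_example() {
        ; As unsigned values, 4294967295 > 1, so the result is 1.
        %r = call i8 @llvm.ucmp.i8.i32(i32 -1, i32 1)
        ret i8 %r
      }
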
15143 '``llvm.memcpy``' Intrinsic
15144 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15149 This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
15150 integer bit width and for different address spaces. Not all targets
15151 support all bit widths however.
15155 declare void @llvm.memcpy.p0.p0.i32(ptr <dest>, ptr <src>,
15156 i32 <len>, i1 <isvolatile>)
15157 declare void @llvm.memcpy.p0.p0.i64(ptr <dest>, ptr <src>,
15158 i64 <len>, i1 <isvolatile>)
15163 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
15164 source location to the destination location.
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
15173 The first argument is a pointer to the destination, the second is a
15174 pointer to the source. The third argument is an integer argument
15175 specifying the number of bytes to copy, and the fourth is a
15176 boolean indicating a volatile access.
15178 The :ref:`align <attr_align>` parameter attribute can be provided
15179 for the first and second arguments.
15181 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
15182 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
15183 very cleanly specified and it is unwise to depend on it.
15188 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
15189 location to the destination location, which must either be equal or
non-overlapping. It copies ``<len>`` bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an attribute on the
argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
15196 If ``<len>`` is not a well-defined value, the behavior is undefined.
15197 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
15198 otherwise the behavior is undefined.
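
A minimal sketch of a non-volatile 16-byte copy with alignment attributes on
both pointers. The function name ``@copy16`` and the chosen alignments are
assumptions made for illustration:

.. code-block:: llvm

      declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy16(ptr %dst, ptr %src) {
        ; %dst and %src must be equal or must not overlap.
        call void @llvm.memcpy.p0.p0.i64(ptr align 8 %dst, ptr align 4 %src,
                                         i64 16, i1 false)
        ret void
      }
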
15200 .. _int_memcpy_inline:
15202 '``llvm.memcpy.inline``' Intrinsic
15203 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15208 This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
15209 integer bit width and for different address spaces. Not all targets
15210 support all bit widths however.
15214 declare void @llvm.memcpy.inline.p0.p0.i32(ptr <dest>, ptr <src>,
15215 i32 <len>, i1 <isvolatile>)
15216 declare void @llvm.memcpy.inline.p0.p0.i64(ptr <dest>, ptr <src>,
15217 i64 <len>, i1 <isvolatile>)
The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
15233 The first argument is a pointer to the destination, the second is a
15234 pointer to the source. The third argument is an integer argument
15235 specifying the number of bytes to copy, and the fourth is a
15236 boolean indicating a volatile access.
15238 The :ref:`align <attr_align>` parameter attribute can be provided
15239 for the first and second arguments.
15241 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
15242 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
15243 very cleanly specified and it is unwise to depend on it.
15248 The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
15249 source location to the destination location, which are not allowed to
overlap. It copies ``<len>`` bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an attribute on
the argument.
15253 The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
15254 '``llvm.memcpy.*``', but the generated code is guaranteed not to call any
15255 external functions.
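
A minimal sketch, using a small constant length. The function name
``@copy_header`` is illustrative:

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy_header(ptr %dst, ptr %src) {
        ; The copy is expanded in place; no call to an external memcpy is emitted.
        call void @llvm.memcpy.inline.p0.p0.i64(ptr %dst, ptr %src, i64 32, i1 false)
        ret void
      }
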
15259 '``llvm.memmove``' Intrinsic
15260 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.
15271 declare void @llvm.memmove.p0.p0.i32(ptr <dest>, ptr <src>,
15272 i32 <len>, i1 <isvolatile>)
15273 declare void @llvm.memmove.p0.p0.i64(ptr <dest>, ptr <src>,
15274 i64 <len>, i1 <isvolatile>)
The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. They are similar to the
'``llvm.memcpy``' intrinsic but allow the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
15291 The first argument is a pointer to the destination, the second is a
15292 pointer to the source. The third argument is an integer argument
15293 specifying the number of bytes to copy, and the fourth is a
15294 boolean indicating a volatile access.
15296 The :ref:`align <attr_align>` parameter attribute can be provided
15297 for the first and second arguments.
15299 If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
15300 is a :ref:`volatile operation <volatile>`. The detailed access behavior is
15301 not very cleanly specified and it is unwise to depend on it.
15306 The '``llvm.memmove.*``' intrinsics copy a block of memory from the
15307 source location to the destination location, which may overlap. It
copies ``<len>`` bytes of memory over. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
15314 If ``<len>`` is not a well-defined value, the behavior is undefined.
15315 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
15316 otherwise the behavior is undefined.
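
As a sketch of an overlapping move, the hypothetical function below shifts 12
bytes of a buffer down by 4 bytes, so the source and destination ranges
overlap:

.. code-block:: llvm

      declare void @llvm.memmove.p0.p0.i64(ptr, ptr, i64, i1)

      define void @shift_down_by_4(ptr %buf) {
        ; The source starts 4 bytes into the buffer.
        %src = getelementptr inbounds i8, ptr %buf, i64 4
        ; Source bytes [4, 16) move to [0, 12); the overlap is permitted.
        call void @llvm.memmove.p0.p0.i64(ptr %buf, ptr %src, i64 12, i1 false)
        ret void
      }
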
15320 '``llvm.memset.*``' Intrinsics
15321 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
15327 bit width and for different address spaces. However, not all targets
15328 support all bit widths.
15332 declare void @llvm.memset.p0.i32(ptr <dest>, i8 <val>,
15333 i32 <len>, i1 <isvolatile>)
15334 declare void @llvm.memset.p0.i64(ptr <dest>, i8 <val>,
15335 i64 <len>, i1 <isvolatile>)
15340 The '``llvm.memset.*``' intrinsics fill a block of memory with a
15341 particular byte value.
15343 Note that, unlike the standard libc function, the ``llvm.memset``
15344 intrinsic does not return a value and takes an extra volatile
15345 argument. Also, the destination can be in an arbitrary address space.
15350 The first argument is a pointer to the destination to fill, the second
15351 is the byte value with which to fill it, the third argument is an
15352 integer argument specifying the number of bytes to fill, and the fourth
15353 is a boolean indicating a volatile access.
15355 The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
15358 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
15359 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
15360 very cleanly specified and it is unwise to depend on it.
15365 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
15372 If ``<len>`` is not a well-defined value, the behavior is undefined.
15373 If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
15374 behavior is undefined.
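
A minimal sketch that zeroes 64 bytes. The function name ``@zero64`` and the
16-byte alignment are assumptions made for illustration:

.. code-block:: llvm

      declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

      define void @zero64(ptr %p) {
        ; Fill 64 bytes starting at %p with the byte value 0, non-volatile.
        call void @llvm.memset.p0.i64(ptr align 16 %p, i8 0, i64 64, i1 false)
        ret void
      }
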
15376 .. _int_memset_inline:
15378 '``llvm.memset.inline``' Intrinsic
15379 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15384 This is an overloaded intrinsic. You can use ``llvm.memset.inline`` on any
15385 integer bit width and for different address spaces. Not all targets
15386 support all bit widths however.
      declare void @llvm.memset.inline.p0.i32(ptr <dest>, i8 <val>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.inline.p0.i64(ptr <dest>, i8 <val>,
                                              i64 <len>, i1 <isvolatile>)
The '``llvm.memset.inline.*``' intrinsics fill a block of memory with a
particular byte value and guarantee that no external functions are called.
15401 Note that, unlike the standard libc function, the ``llvm.memset.inline.*``
15402 intrinsics do not return a value, take an extra isvolatile argument and the
15403 pointer can be in specified address spaces.
15408 The first argument is a pointer to the destination to fill, the second
15409 is the byte value with which to fill it, the third argument is a constant
15410 integer argument specifying the number of bytes to fill, and the fourth
15411 is a boolean indicating a volatile access.
15413 The :ref:`align <attr_align>` parameter attribute can be provided
15414 for the first argument.
15416 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset.inline`` call is
15417 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
15418 very cleanly specified and it is unwise to depend on it.
15423 The '``llvm.memset.inline.*``' intrinsics fill "len" bytes of memory starting
at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
15430 If ``<len>`` is not a well-defined value, the behavior is undefined.
15431 If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
15432 behavior is undefined.
15434 The behavior of '``llvm.memset.inline.*``' is equivalent to the behavior of
15435 '``llvm.memset.*``', but the generated code is guaranteed not to call any
15436 external functions.
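
A minimal sketch with a constant length. The function name ``@clear_small`` is
illustrative, and the ``.p0.i64`` suffix is assumed to mirror the mangling
used by ``llvm.memset`` (destination pointer type, then length type):

.. code-block:: llvm

      declare void @llvm.memset.inline.p0.i64(ptr, i8, i64, i1)

      define void @clear_small(ptr %p) {
        ; Expanded in place; no call to an external memset is emitted.
        call void @llvm.memset.inline.p0.i64(ptr %p, i8 0, i64 16, i1 false)
        ret void
      }
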
15438 .. _int_experimental_memset_pattern:
15440 '``llvm.experimental.memset.pattern``' Intrinsic
15441 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15446 This is an overloaded intrinsic. You can use
15447 ``llvm.experimental.memset.pattern`` on any integer bit width and for
15448 different address spaces. Not all targets support all bit widths however.
15452 declare void @llvm.experimental.memset.pattern.p0.i128.i64(ptr <dest>, i128 <val>,
15453 i64 <count>, i1 <isvolatile>)
15458 The '``llvm.experimental.memset.pattern.*``' intrinsics fill a block of memory
15459 with a particular value. This may be expanded to an inline loop, a sequence of
15460 stores, or a libcall depending on what is available for the target and the
15461 expected performance and code size impact.
15466 The first argument is a pointer to the destination to fill, the second
15467 is the value with which to fill it, the third argument is an integer
15468 argument specifying the number of times to fill the value, and the fourth is a
15469 boolean indicating a volatile access.
15471 The :ref:`align <attr_align>` parameter attribute can be provided
15472 for the first argument.
15474 If the ``isvolatile`` parameter is ``true``, the
15475 ``llvm.experimental.memset.pattern`` call is a :ref:`volatile operation
15476 <volatile>`. The detailed access behavior is not very cleanly specified and it
15477 is unwise to depend on it.
The '``llvm.experimental.memset.pattern.*``' intrinsics fill memory starting at
15483 the destination location with the given pattern ``<count>`` times,
15484 incrementing by the allocation size of the type each time. The stores follow
15485 the usual semantics of store instructions, including regarding endianness and
15486 padding. If the argument is known to be aligned to some boundary, this can be
15487 specified as an attribute on the argument.

If ``<count>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
15491 If ``<count>`` is not a well-defined value, the behavior is undefined.
15492 If ``<count>`` is not zero, ``<dest>`` should be well-defined, otherwise the
15493 behavior is undefined.
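
The relationship between ``<count>`` and the number of bytes written can be
sketched as follows. The function name is illustrative, and the
``.p0.i32.i64`` suffix is an assumption that follows the pattern of the
``.p0.i128.i64`` declaration above:

.. code-block:: llvm

      declare void @llvm.experimental.memset.pattern.p0.i32.i64(ptr, i32, i64, i1)

      define void @fill_with_ones(ptr %p) {
        ; Stores the i32 pattern 8 times. With a 4-byte allocation size for
        ; i32, this writes 32 bytes starting at %p.
        call void @llvm.experimental.memset.pattern.p0.i32.i64(ptr align 4 %p,
                                                               i32 -1, i64 8, i1 false)
        ret void
      }
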
15497 '``llvm.sqrt.*``' Intrinsic
15498 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15503 This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15509 declare float @llvm.sqrt.f32(float %Val)
15510 declare double @llvm.sqrt.f64(double %Val)
15511 declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val)
15512 declare fp128 @llvm.sqrt.f128(fp128 %Val)
15513 declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
15518 The '``llvm.sqrt``' intrinsics return the square root of the specified value.
15523 The argument and return value are floating-point numbers of the same type.
15528 Return the same value as a corresponding libm '``sqrt``' function but without
15529 trapping or setting ``errno``. For types specified by IEEE-754, the result
15530 matches a conforming libm implementation.
15532 When specified with the fast-math-flag 'afn', the result may be approximated
15533 using a less accurate calculation.
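
A brief sketch of an exact call and an ``afn`` call. The function names are
illustrative:

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)

      define float @length2d(float %a, float %b) {
        %aa = fmul float %a, %a
        %bb = fmul float %b, %b
        %sum = fadd float %aa, %bb
        %r = call float @llvm.sqrt.f32(float %sum)
        ret float %r
      }

      define float @approx_sqrt(float %x) {
        ; The afn flag permits a less accurate approximation.
        %r = call afn float @llvm.sqrt.f32(float %x)
        ret float %r
      }
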
15535 '``llvm.powi.*``' Intrinsic
15536 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15541 This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

Generally, the only supported type for the exponent is the one matching
the C type ``int``.
15550 declare float @llvm.powi.f32.i32(float %Val, i32 %power)
15551 declare double @llvm.powi.f64.i16(double %Val, i16 %power)
15552 declare x86_fp80 @llvm.powi.f80.i32(x86_fp80 %Val, i32 %power)
15553 declare fp128 @llvm.powi.f128.i32(fp128 %Val, i32 %power)
15554 declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128 %Val, i32 %power)
15559 The '``llvm.powi.*``' intrinsics return the first operand raised to the
15560 specified (positive or negative) power. The order of evaluation of
15561 multiplications is not defined. When a vector of floating-point type is
15562 used, the second argument remains a scalar integer value.
15567 The second argument is an integer power, and the first is a value to
15568 raise to that power.
15573 This function returns the first value raised to the second power with an
15574 unspecified sequence of rounding operations.
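
A short sketch showing a scalar call and a vector call. In the vector form the
exponent stays a scalar integer. The function names are illustrative, and the
``.v4f32.i32`` suffix is assumed to follow the pattern of the scalar
declarations above:

.. code-block:: llvm

      declare double @llvm.powi.f64.i32(double, i32)
      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      define double @cube(double %x) {
        %r = call double @llvm.powi.f64.i32(double %x, i32 3)
        ret double %r
      }

      define <4 x float> @inverse_square(<4 x float> %v) {
        ; A negative power is allowed; every lane is raised to the power -2.
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 -2)
        ret <4 x float> %r
      }
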
15578 '``llvm.sin.*``' Intrinsic
15579 ^^^^^^^^^^^^^^^^^^^^^^^^^^
15584 This is an overloaded intrinsic. You can use ``llvm.sin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15590 declare float @llvm.sin.f32(float %Val)
15591 declare double @llvm.sin.f64(double %Val)
15592 declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val)
15593 declare fp128 @llvm.sin.f128(fp128 %Val)
15594 declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)
15599 The '``llvm.sin.*``' intrinsics return the sine of the operand.
15604 The argument and return value are floating-point numbers of the same type.
15609 Return the same value as a corresponding libm '``sin``' function but without
15610 trapping or setting ``errno``.
15612 When specified with the fast-math-flag 'afn', the result may be approximated
15613 using a less accurate calculation.
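
A minimal sketch of a vector call and an ``afn`` scalar call. The function
names are illustrative, and the ``.v4f32`` suffix is assumed to follow the
usual vector mangling:

.. code-block:: llvm

      declare float @llvm.sin.f32(float)
      declare <4 x float> @llvm.sin.v4f32(<4 x float>)

      define <4 x float> @vector_sin(<4 x float> %v) {
        ; The sine is computed independently for each lane.
        %r = call <4 x float> @llvm.sin.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }

      define float @approx_sin(float %x) {
        %r = call afn float @llvm.sin.f32(float %x)
        ret float %r
      }
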
15617 '``llvm.cos.*``' Intrinsic
15618 ^^^^^^^^^^^^^^^^^^^^^^^^^^
15623 This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15629 declare float @llvm.cos.f32(float %Val)
15630 declare double @llvm.cos.f64(double %Val)
15631 declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val)
15632 declare fp128 @llvm.cos.f128(fp128 %Val)
15633 declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)
15638 The '``llvm.cos.*``' intrinsics return the cosine of the operand.
15643 The argument and return value are floating-point numbers of the same type.
15648 Return the same value as a corresponding libm '``cos``' function but without
15649 trapping or setting ``errno``.
15651 When specified with the fast-math-flag 'afn', the result may be approximated
15652 using a less accurate calculation.
15654 '``llvm.tan.*``' Intrinsic
15655 ^^^^^^^^^^^^^^^^^^^^^^^^^^
15660 This is an overloaded intrinsic. You can use ``llvm.tan`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15666 declare float @llvm.tan.f32(float %Val)
15667 declare double @llvm.tan.f64(double %Val)
15668 declare x86_fp80 @llvm.tan.f80(x86_fp80 %Val)
15669 declare fp128 @llvm.tan.f128(fp128 %Val)
15670 declare ppc_fp128 @llvm.tan.ppcf128(ppc_fp128 %Val)
15675 The '``llvm.tan.*``' intrinsics return the tangent of the operand.
15680 The argument and return value are floating-point numbers of the same type.
15685 Return the same value as a corresponding libm '``tan``' function but without
15686 trapping or setting ``errno``.
15688 When specified with the fast-math-flag 'afn', the result may be approximated
15689 using a less accurate calculation.
15691 '``llvm.asin.*``' Intrinsic
15692 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15697 This is an overloaded intrinsic. You can use ``llvm.asin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15703 declare float @llvm.asin.f32(float %Val)
15704 declare double @llvm.asin.f64(double %Val)
15705 declare x86_fp80 @llvm.asin.f80(x86_fp80 %Val)
15706 declare fp128 @llvm.asin.f128(fp128 %Val)
15707 declare ppc_fp128 @llvm.asin.ppcf128(ppc_fp128 %Val)
15712 The '``llvm.asin.*``' intrinsics return the arcsine of the operand.
15717 The argument and return value are floating-point numbers of the same type.
15722 Return the same value as a corresponding libm '``asin``' function but without
15723 trapping or setting ``errno``.
15725 When specified with the fast-math-flag 'afn', the result may be approximated
15726 using a less accurate calculation.
15728 '``llvm.acos.*``' Intrinsic
15729 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15734 This is an overloaded intrinsic. You can use ``llvm.acos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15740 declare float @llvm.acos.f32(float %Val)
15741 declare double @llvm.acos.f64(double %Val)
15742 declare x86_fp80 @llvm.acos.f80(x86_fp80 %Val)
15743 declare fp128 @llvm.acos.f128(fp128 %Val)
15744 declare ppc_fp128 @llvm.acos.ppcf128(ppc_fp128 %Val)
15749 The '``llvm.acos.*``' intrinsics return the arccosine of the operand.
15754 The argument and return value are floating-point numbers of the same type.
15759 Return the same value as a corresponding libm '``acos``' function but without
15760 trapping or setting ``errno``.
15762 When specified with the fast-math-flag 'afn', the result may be approximated
15763 using a less accurate calculation.
15765 '``llvm.atan.*``' Intrinsic
15766 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15771 This is an overloaded intrinsic. You can use ``llvm.atan`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15777 declare float @llvm.atan.f32(float %Val)
15778 declare double @llvm.atan.f64(double %Val)
15779 declare x86_fp80 @llvm.atan.f80(x86_fp80 %Val)
15780 declare fp128 @llvm.atan.f128(fp128 %Val)
15781 declare ppc_fp128 @llvm.atan.ppcf128(ppc_fp128 %Val)
15786 The '``llvm.atan.*``' intrinsics return the arctangent of the operand.
15791 The argument and return value are floating-point numbers of the same type.
15796 Return the same value as a corresponding libm '``atan``' function but without
15797 trapping or setting ``errno``.
15799 When specified with the fast-math-flag 'afn', the result may be approximated
15800 using a less accurate calculation.
15802 '``llvm.atan2.*``' Intrinsic
15803 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15808 This is an overloaded intrinsic. You can use ``llvm.atan2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15814 declare float @llvm.atan2.f32(float %Y, float %X)
15815 declare double @llvm.atan2.f64(double %Y, double %X)
15816 declare x86_fp80 @llvm.atan2.f80(x86_fp80 %Y, x86_fp80 %X)
15817 declare fp128 @llvm.atan2.f128(fp128 %Y, fp128 %X)
15818 declare ppc_fp128 @llvm.atan2.ppcf128(ppc_fp128 %Y, ppc_fp128 %X)
The '``llvm.atan2.*``' intrinsics return the arctangent of ``Y/X`` accounting
for the quadrant.
15829 The arguments and return value are floating-point numbers of the same type.
15834 Return the same value as a corresponding libm '``atan2``' function but without
15835 trapping or setting ``errno``.
15837 When specified with the fast-math-flag 'afn', the result may be approximated
15838 using a less accurate calculation.
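
The argument order is easy to get wrong, so here is a small sketch. The
function name ``@angle_of`` and its parameter order are illustrative:

.. code-block:: llvm

      declare double @llvm.atan2.f64(double, double)

      define double @angle_of(double %x, double %y) {
        ; The intrinsic takes Y first and X second.
        %theta = call double @llvm.atan2.f64(double %y, double %x)
        ret double %theta
      }
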
15840 '``llvm.sinh.*``' Intrinsic
15841 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15846 This is an overloaded intrinsic. You can use ``llvm.sinh`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15852 declare float @llvm.sinh.f32(float %Val)
15853 declare double @llvm.sinh.f64(double %Val)
15854 declare x86_fp80 @llvm.sinh.f80(x86_fp80 %Val)
15855 declare fp128 @llvm.sinh.f128(fp128 %Val)
15856 declare ppc_fp128 @llvm.sinh.ppcf128(ppc_fp128 %Val)
15861 The '``llvm.sinh.*``' intrinsics return the hyperbolic sine of the operand.
15866 The argument and return value are floating-point numbers of the same type.
15871 Return the same value as a corresponding libm '``sinh``' function but without
15872 trapping or setting ``errno``.
15874 When specified with the fast-math-flag 'afn', the result may be approximated
15875 using a less accurate calculation.
15877 '``llvm.cosh.*``' Intrinsic
15878 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15883 This is an overloaded intrinsic. You can use ``llvm.cosh`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15889 declare float @llvm.cosh.f32(float %Val)
15890 declare double @llvm.cosh.f64(double %Val)
15891 declare x86_fp80 @llvm.cosh.f80(x86_fp80 %Val)
15892 declare fp128 @llvm.cosh.f128(fp128 %Val)
15893 declare ppc_fp128 @llvm.cosh.ppcf128(ppc_fp128 %Val)
15898 The '``llvm.cosh.*``' intrinsics return the hyperbolic cosine of the operand.
15903 The argument and return value are floating-point numbers of the same type.
15908 Return the same value as a corresponding libm '``cosh``' function but without
15909 trapping or setting ``errno``.
15911 When specified with the fast-math-flag 'afn', the result may be approximated
15912 using a less accurate calculation.
15914 '``llvm.tanh.*``' Intrinsic
15915 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15920 This is an overloaded intrinsic. You can use ``llvm.tanh`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15926 declare float @llvm.tanh.f32(float %Val)
15927 declare double @llvm.tanh.f64(double %Val)
15928 declare x86_fp80 @llvm.tanh.f80(x86_fp80 %Val)
15929 declare fp128 @llvm.tanh.f128(fp128 %Val)
15930 declare ppc_fp128 @llvm.tanh.ppcf128(ppc_fp128 %Val)
15935 The '``llvm.tanh.*``' intrinsics return the hyperbolic tangent of the operand.
15940 The argument and return value are floating-point numbers of the same type.
15945 Return the same value as a corresponding libm '``tanh``' function but without
15946 trapping or setting ``errno``.
15948 When specified with the fast-math-flag 'afn', the result may be approximated
15949 using a less accurate calculation.
15952 '``llvm.sincos.*``' Intrinsic
15953 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15958 This is an overloaded intrinsic. You can use ``llvm.sincos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15964 declare { float, float } @llvm.sincos.f32(float %Val)
15965 declare { double, double } @llvm.sincos.f64(double %Val)
15966 declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val)
15967 declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val)
15968 declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val)
15969 declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val)
The '``llvm.sincos.*``' intrinsics return the sine and cosine of the operand.
15979 The argument is a :ref:`floating-point <t_floating>` value or
15980 :ref:`vector <t_vector>` of floating-point values. Returns two values matching
15981 the argument type in a struct.
This intrinsic is equivalent to calling both :ref:`llvm.sin <t_llvm_sin>`
and :ref:`llvm.cos <t_llvm_cos>` on the argument.

The first result is the sine of the argument and the second result is the cosine
of the argument.
15992 When specified with the fast-math-flag 'afn', the result may be approximated
15993 using a less accurate calculation.
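
The two results are read out of the returned struct with ``extractvalue``. A
minimal sketch, with an illustrative function name:

.. code-block:: llvm

      declare { float, float } @llvm.sincos.f32(float)

      define float @sin_plus_cos(float %x) {
        %sc = call { float, float } @llvm.sincos.f32(float %x)
        ; Element 0 is the sine, element 1 is the cosine.
        %s = extractvalue { float, float } %sc, 0
        %c = extractvalue { float, float } %sc, 1
        %r = fadd float %s, %c
        ret float %r
      }
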
15995 '``llvm.pow.*``' Intrinsic
15996 ^^^^^^^^^^^^^^^^^^^^^^^^^^
16001 This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
16007 declare float @llvm.pow.f32(float %Val, float %Power)
16008 declare double @llvm.pow.f64(double %Val, double %Power)
16009 declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
16010 declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)
16016 The '``llvm.pow.*``' intrinsics return the first operand raised to the
16017 specified (positive or negative) power.
16022 The arguments and return value are floating-point numbers of the same type.
16027 Return the same value as a corresponding libm '``pow``' function but without
16028 trapping or setting ``errno``.
16030 When specified with the fast-math-flag 'afn', the result may be approximated
16031 using a less accurate calculation.
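
Unlike ``llvm.powi``, the exponent here is a floating-point value of the same
type as the base. A minimal sketch, with an illustrative function name:

.. code-block:: llvm

      declare double @llvm.pow.f64(double, double)

      define double @pow_2_5(double %x) {
        ; Raises %x to the power 2.5.
        %r = call double @llvm.pow.f64(double %x, double 2.5)
        ret double %r
      }
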
16035 '``llvm.exp.*``' Intrinsic
16036 ^^^^^^^^^^^^^^^^^^^^^^^^^^
16041 This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
16047 declare float @llvm.exp.f32(float %Val)
16048 declare double @llvm.exp.f64(double %Val)
16049 declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val)
16050 declare fp128 @llvm.exp.f128(fp128 %Val)
16051 declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)
The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.
16062 The argument and return value are floating-point numbers of the same type.
16067 Return the same value as a corresponding libm '``exp``' function but without
16068 trapping or setting ``errno``.
16070 When specified with the fast-math-flag 'afn', the result may be approximated
16071 using a less accurate calculation.
16075 '``llvm.exp2.*``' Intrinsic
16076 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16081 This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
16082 floating-point or vector of floating-point type. Not all targets support
16087 declare float @llvm.exp2.f32(float %Val)
16088 declare double @llvm.exp2.f64(double %Val)
16089 declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val)
16090 declare fp128 @llvm.exp2.f128(fp128 %Val)
16091 declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)
16096 The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
16102 The argument and return value are floating-point numbers of the same type.
16107 Return the same value as a corresponding libm '``exp2``' function but without
16108 trapping or setting ``errno``.
16110 When specified with the fast-math-flag 'afn', the result may be approximated
16111 using a less accurate calculation.
16115 '``llvm.exp10.*``' Intrinsic
16116 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16121 This is an overloaded intrinsic. You can use ``llvm.exp10`` on any
16122 floating-point or vector of floating-point type. Not all targets support
16127 declare float @llvm.exp10.f32(float %Val)
16128 declare double @llvm.exp10.f64(double %Val)
16129 declare x86_fp80 @llvm.exp10.f80(x86_fp80 %Val)
16130 declare fp128 @llvm.exp10.f128(fp128 %Val)
16131 declare ppc_fp128 @llvm.exp10.ppcf128(ppc_fp128 %Val)
16136 The '``llvm.exp10.*``' intrinsics compute the base-10 exponential of the
16142 The argument and return value are floating-point numbers of the same type.
16147 Return the same value as a corresponding libm '``exp10``' function but without
16148 trapping or setting ``errno``.
16150 When specified with the fast-math-flag 'afn', the result may be approximated
16151 using a less accurate calculation.
16154 '``llvm.ldexp.*``' Intrinsic
16155 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16160 This is an overloaded intrinsic. You can use ``llvm.ldexp`` on any
16161 floating-point or vector of floating-point type. Not all targets support
16166 declare float @llvm.ldexp.f32.i32(float %Val, i32 %Exp)
16167 declare double @llvm.ldexp.f64.i32(double %Val, i32 %Exp)
16168 declare x86_fp80 @llvm.ldexp.f80.i32(x86_fp80 %Val, i32 %Exp)
16169 declare fp128 @llvm.ldexp.f128.i32(fp128 %Val, i32 %Exp)
16170 declare ppc_fp128 @llvm.ldexp.ppcf128.i32(ppc_fp128 %Val, i32 %Exp)
16171 declare <2 x float> @llvm.ldexp.v2f32.v2i32(<2 x float> %Val, <2 x i32> %Exp)
16176 The '``llvm.ldexp.*``' intrinsics perform the ldexp function.
16181 The first argument and the return value are :ref:`floating-point
16182 <t_floating>` or :ref:`vector <t_vector>` of floating-point values of
16183 the same type. The second argument is an integer with the same number
16189 This function multiplies the first argument by 2 raised to the second
16190 argument's power. If the first argument is NaN or infinite, the same
16191 value is returned. If the result underflows, a zero with the same sign
16192 is returned. If the result overflows, the result is an infinity with
16197 '``llvm.frexp.*``' Intrinsic
16198 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16203 This is an overloaded intrinsic. You can use ``llvm.frexp`` on any
16204 floating-point or vector of floating-point type. Not all targets support
16209 declare { float, i32 } @llvm.frexp.f32.i32(float %Val)
16210 declare { double, i32 } @llvm.frexp.f64.i32(double %Val)
16211 declare { x86_fp80, i32 } @llvm.frexp.f80.i32(x86_fp80 %Val)
16212 declare { fp128, i32 } @llvm.frexp.f128.i32(fp128 %Val)
16213 declare { ppc_fp128, i32 } @llvm.frexp.ppcf128.i32(ppc_fp128 %Val)
16214 declare { <2 x float>, <2 x i32> } @llvm.frexp.v2f32.v2i32(<2 x float> %Val)
16219 The '``llvm.frexp.*``' intrinsics perform the frexp function.
16224 The argument is a :ref:`floating-point <t_floating>` or
16225 :ref:`vector <t_vector>` of floating-point values. Returns two values
16226 in a struct. The first struct field matches the argument type, and the
16227 second field is an integer or a vector of integer values with the same
16228 number of elements as the argument.
16233 This intrinsic splits a floating-point value into a normalized
16234 fractional component and integral exponent.
16236 For a non-zero argument, returns the argument multiplied by some power
16237 of two such that the absolute value of the returned value is in the
16238 range [0.5, 1.0), with the same sign as the argument. The second
16239 result is an integer such that the first result multiplied by 2 raised to
16240 the power of the second result is the input argument.
16242 If the argument is a zero, returns a zero with the same sign and a 0
16245 If the argument is a NaN, a NaN is returned and the returned exponent
16248 If the argument is an infinity, returns an infinity with the same sign
16249 and an unspecified exponent.
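The relationship between ``llvm.frexp`` and ``llvm.ldexp`` can be illustrated
with the corresponding libm-style functions. The following is a minimal Python
sketch, not LLVM's implementation. It assumes that Python's ``math.frexp`` and
``math.ldexp`` follow the same conventions for finite inputs; the helper name
``split_and_rebuild`` is hypothetical.

```python
import math


def split_and_rebuild(x: float):
    """Split x into (fraction, exponent) and rebuild it with ldexp."""
    # For a non-zero finite x, abs(fraction) lies in [0.5, 1.0).
    fraction, exponent = math.frexp(x)
    # ldexp computes fraction * 2**exponent, which recovers x.
    rebuilt = math.ldexp(fraction, exponent)
    return fraction, exponent, rebuilt
```

For example, ``8.0`` splits into the fraction ``0.5`` and the exponent ``4``,
and scaling ``0.5`` by 2 raised to the power 4 gives ``8.0`` again.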
16253 '``llvm.log.*``' Intrinsic
16254 ^^^^^^^^^^^^^^^^^^^^^^^^^^
16259 This is an overloaded intrinsic. You can use ``llvm.log`` on any
16260 floating-point or vector of floating-point type. Not all targets support
16265 declare float @llvm.log.f32(float %Val)
16266 declare double @llvm.log.f64(double %Val)
16267 declare x86_fp80 @llvm.log.f80(x86_fp80 %Val)
16268 declare fp128 @llvm.log.f128(fp128 %Val)
16269 declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)
16274 The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
16280 The argument and return value are floating-point numbers of the same type.
16285 Return the same value as a corresponding libm '``log``' function but without
16286 trapping or setting ``errno``.
16288 When specified with the fast-math-flag 'afn', the result may be approximated
16289 using a less accurate calculation.
16293 '``llvm.log10.*``' Intrinsic
16294 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16299 This is an overloaded intrinsic. You can use ``llvm.log10`` on any
16300 floating-point or vector of floating-point type. Not all targets support
16305 declare float @llvm.log10.f32(float %Val)
16306 declare double @llvm.log10.f64(double %Val)
16307 declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val)
16308 declare fp128 @llvm.log10.f128(fp128 %Val)
16309 declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)
16314 The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
16320 The argument and return value are floating-point numbers of the same type.
16325 Return the same value as a corresponding libm '``log10``' function but without
16326 trapping or setting ``errno``.
16328 When specified with the fast-math-flag 'afn', the result may be approximated
16329 using a less accurate calculation.
16334 '``llvm.log2.*``' Intrinsic
16335 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16340 This is an overloaded intrinsic. You can use ``llvm.log2`` on any
16341 floating-point or vector of floating-point type. Not all targets support
16346 declare float @llvm.log2.f32(float %Val)
16347 declare double @llvm.log2.f64(double %Val)
16348 declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val)
16349 declare fp128 @llvm.log2.f128(fp128 %Val)
16350 declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)
16355 The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
16361 The argument and return value are floating-point numbers of the same type.
16366 Return the same value as a corresponding libm '``log2``' function but without
16367 trapping or setting ``errno``.
16369 When specified with the fast-math-flag 'afn', the result may be approximated
16370 using a less accurate calculation.
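The three logarithm intrinsics differ only in their base, so each can be
related to the others by a change of base. The following Python sketch shows
that relationship using the standard ``math`` module; the helper names are
hypothetical. Note one assumption that does not carry over: Python's ``math``
functions raise exceptions on domain errors, whereas the intrinsics are
described as not trapping or setting ``errno``.

```python
import math


def log2_via_ln(x: float) -> float:
    """Base-2 logarithm expressed through the natural logarithm."""
    return math.log(x) / math.log(2.0)


def log10_via_ln(x: float) -> float:
    """Base-10 logarithm expressed through the natural logarithm."""
    return math.log(x) / math.log(10.0)
```

The results agree with the direct functions up to rounding error, which is why
the comparison below uses a tolerance rather than exact equality.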
16374 '``llvm.fma.*``' Intrinsic
16375 ^^^^^^^^^^^^^^^^^^^^^^^^^^
16380 This is an overloaded intrinsic. You can use ``llvm.fma`` on any
16381 floating-point or vector of floating-point type. Not all targets support
16386 declare float @llvm.fma.f32(float %a, float %b, float %c)
16387 declare double @llvm.fma.f64(double %a, double %b, double %c)
16388 declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
16389 declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
16390 declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
16395 The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
16400 The arguments and return value are floating-point numbers of the same type.
16405 Return the same value as the IEEE-754 fusedMultiplyAdd operation. This
16406 is assumed to not trap or set ``errno``.
16408 When specified with the fast-math-flag 'afn', the result may be approximated
16409 using a less accurate calculation.
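The difference between a fused multiply-add and a separate multiply followed by
an add is that the fused form rounds only once. The following Python sketch
models that behavior for finite inputs with exact rational arithmetic. It is an
illustration under that assumption, not how LLVM or any target computes the
result, and ``fma_model`` is a hypothetical name.

```python
from fractions import Fraction


def fma_model(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly, then round once to the nearest double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Unfused: the product 1 - 2**-60 is rounded to 1.0 before the addition.
unfused = a * b + c
# Fused model: the exact product is kept until the single final rounding.
fused = fma_model(a, b, c)
```

In this example the unfused expression loses the low-order term entirely, while
the fused model preserves it.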
16413 '``llvm.fabs.*``' Intrinsic
16414 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16419 This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
16420 floating-point or vector of floating-point type. Not all targets support
16425 declare float @llvm.fabs.f32(float %Val)
16426 declare double @llvm.fabs.f64(double %Val)
16427 declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val)
16428 declare fp128 @llvm.fabs.f128(fp128 %Val)
16429 declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
16434 The '``llvm.fabs.*``' intrinsics return the absolute value of the
16440 The argument and return value are floating-point numbers of the same
16446 This function returns the same values as the libm ``fabs`` functions
16447 would, and handles error conditions in the same way.
16448 The returned value is completely identical to the input except for the sign bit;
16449 in particular, if the input is a NaN, then the quiet/signaling bit and payload
16450 are perfectly preserved.
16452 .. _i_fminmax_family:
16454 '``llvm.min.*``' Intrinsics Comparison
16455 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16460 IEEE 754 and ISO C define several min/max operations that differ in how they
16461 handle qNaN/sNaN and +0.0/-0.0. The following table summarizes them:
16468 - fminimum/fmaximum
16469 - fminimum_num/fmaximum_num
16472 - minNum/maxNum (2008)
16473 - minimum/maximum (2019)
16474 - minimumNumber/maximumNumber (2019)
16476 * - ``+0.0 vs -0.0``
16481 * - ``NUM vs sNaN``
16482 - qNaN, invalid exception
16483 - qNaN, invalid exception
16484 - NUM, invalid exception
16486 * - ``qNaN vs sNaN``
16487 - qNaN, invalid exception
16488 - qNaN, invalid exception
16489 - qNaN, invalid exception
16491 * - ``NUM vs qNaN``
16492 - NUM, no exception
16493 - qNaN, no exception
16494 - NUM, no exception
16496 LLVM Implementation:
16497 """"""""""""""""""""
16499 LLVM implements all ISO C flavors as listed in this table, except that in the
16500 default floating-point environment exceptions are ignored. The constrained
16501 versions of the intrinsics respect the exception behavior.
16505 :widths: 16 28 28 28
16510 - minimumnum/maximumnum
16512 * - ``NUM vs qNaN``
16513 - NUM, no exception
16514 - qNaN, no exception
16515 - NUM, no exception
16517 * - ``NUM vs sNaN``
16518 - qNaN, invalid exception
16519 - qNaN, invalid exception
16520 - NUM, invalid exception
16522 * - ``qNaN vs sNaN``
16523 - qNaN, invalid exception
16524 - qNaN, invalid exception
16525 - qNaN, invalid exception
16527 * - ``sNaN vs sNaN``
16528 - qNaN, invalid exception
16529 - qNaN, invalid exception
16530 - qNaN, invalid exception
16532 * - ``+0.0 vs -0.0``
16534 - +0.0(max)/-0.0(min)
16535 - +0.0(max)/-0.0(min)
16538 - larger(max)/smaller(min)
16539 - larger(max)/smaller(min)
16540 - larger(max)/smaller(min)
16544 '``llvm.minnum.*``' Intrinsic
16545 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16550 This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
16551 floating-point or vector of floating-point type. Not all targets support
16556 declare float @llvm.minnum.f32(float %Val0, float %Val1)
16557 declare double @llvm.minnum.f64(double %Val0, double %Val1)
16558 declare x86_fp80 @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16559 declare fp128 @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
16560 declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16565 The '``llvm.minnum.*``' intrinsics return the minimum of the two
16572 The arguments and return value are floating-point numbers of the same
16578 Follows the IEEE-754 semantics for minNum, except for handling of
16579 signaling NaNs. This matches the behavior of libm's fmin.
16581 If either operand is a NaN, returns the other non-NaN operand. Returns
16582 NaN only if both operands are NaN. If the operands compare equal,
16583 returns either one of the operands. For example, this means that
16584 fmin(+0.0, -0.0) returns either operand.
16586 Unlike the IEEE-754 2008 behavior, this does not distinguish between
16587 signaling and quiet NaN inputs. If a target's implementation follows
16588 the standard and returns a quiet NaN if either input is a signaling
16589 NaN, the intrinsic lowering is responsible for quieting the inputs to
16590 correctly return the non-NaN input (e.g. by using the equivalent of
16591 ``llvm.canonicalize``).
16595 '``llvm.maxnum.*``' Intrinsic
16596 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16601 This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
16602 floating-point or vector of floating-point type. Not all targets support
16607 declare float @llvm.maxnum.f32(float %Val0, float %Val1)
16608 declare double @llvm.maxnum.f64(double %Val0, double %Val1)
16609 declare x86_fp80 @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16610 declare fp128 @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
16611 declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16616 The '``llvm.maxnum.*``' intrinsics return the maximum of the two
16623 The arguments and return value are floating-point numbers of the same
16628 Follows the IEEE-754 semantics for maxNum except for the handling of
16629 signaling NaNs. This matches the behavior of libm's fmax.
16631 If either operand is a NaN, returns the other non-NaN operand. Returns
16632 NaN only if both operands are NaN. If the operands compare equal,
16633 returns either one of the operands. For example, this means that
16634 fmax(+0.0, -0.0) returns either -0.0 or 0.0.
16636 Unlike the IEEE-754 2008 behavior, this does not distinguish between
16637 signaling and quiet NaN inputs. If a target's implementation follows
16638 the standard and returns a quiet NaN if either input is a signaling
16639 NaN, the intrinsic lowering is responsible for quieting the inputs to
16640 correctly return the non-NaN input (e.g. by using the equivalent of
16641 ``llvm.canonicalize``).
16645 '``llvm.minimum.*``' Intrinsic
16646 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16651 This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
16652 floating-point or vector of floating-point type. Not all targets support
16657 declare float @llvm.minimum.f32(float %Val0, float %Val1)
16658 declare double @llvm.minimum.f64(double %Val0, double %Val1)
16659 declare x86_fp80 @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16660 declare fp128 @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
16661 declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16666 The '``llvm.minimum.*``' intrinsics return the minimum of the two
16667 arguments, propagating NaNs and treating -0.0 as less than +0.0.
16673 The arguments and return value are floating-point numbers of the same
16678 If either operand is a NaN, returns NaN. Otherwise returns the lesser
16679 of the two arguments. -0.0 is considered to be less than +0.0 for this
16680 intrinsic. Note that these are the semantics specified in the draft of
16685 '``llvm.maximum.*``' Intrinsic
16686 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16691 This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
16692 floating-point or vector of floating-point type. Not all targets support
16697 declare float @llvm.maximum.f32(float %Val0, float %Val1)
16698 declare double @llvm.maximum.f64(double %Val0, double %Val1)
16699 declare x86_fp80 @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16700 declare fp128 @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
16701 declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16706 The '``llvm.maximum.*``' intrinsics return the maximum of the two
16707 arguments, propagating NaNs and treating -0.0 as less than +0.0.
16713 The arguments and return value are floating-point numbers of the same
16718 If either operand is a NaN, returns NaN. Otherwise returns the greater
16719 of the two arguments. -0.0 is considered to be less than +0.0 for this
16720 intrinsic. Note that these are the semantics specified in the draft of
16725 '``llvm.minimumnum.*``' Intrinsic
16726 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16731 This is an overloaded intrinsic. You can use ``llvm.minimumnum`` on any
16732 floating-point or vector of floating-point type. Not all targets support
16737 declare float @llvm.minimumnum.f32(float %Val0, float %Val1)
16738 declare double @llvm.minimumnum.f64(double %Val0, double %Val1)
16739 declare x86_fp80 @llvm.minimumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16740 declare fp128 @llvm.minimumnum.f128(fp128 %Val0, fp128 %Val1)
16741 declare ppc_fp128 @llvm.minimumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16746 The '``llvm.minimumnum.*``' intrinsics return the minimum of the two
16747 arguments, not propagating NaNs and treating -0.0 as less than +0.0.
16753 The arguments and return value are floating-point numbers of the same
16758 If both operands are NaNs (including sNaN), returns qNaN. If one operand
16759 is NaN (including sNaN) and the other operand is a number, returns the number.
16760 Otherwise returns the lesser of the two arguments. -0.0 is considered to
16761 be less than +0.0 for this intrinsic.
16763 Note that these are the semantics of minimumNumber specified in IEEE 754-2019.
16765 It differs from '``llvm.minnum.*``' in two ways:
16766 1) '``llvm.minnum.*``' will return qNaN if either operand is sNaN.
16767 2) '``llvm.minnum.*``' may return either operand when comparing +0.0 and -0.0.
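The NaN and signed-zero rules of the minimum-style intrinsics can be compared
side by side with a small reference model. The following Python sketch is an
illustration only: the function names are hypothetical, and because Python
floats do not expose the quiet/signaling distinction, the model treats every
NaN alike and ignores floating-point exceptions.

```python
import math


def _min_signed_zero(a: float, b: float) -> float:
    """Lesser of two non-NaN values, ordering -0.0 before +0.0."""
    if a == 0.0 and b == 0.0:
        return a if math.copysign(1.0, a) < 0.0 else b
    return a if a < b else b


def minnum_model(a: float, b: float) -> float:
    """Model of minnum: a NaN operand is ignored; equal operands may return either."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a <= b else b  # for equal operands this model happens to pick a


def minimum_model(a: float, b: float) -> float:
    """Model of minimum: any NaN operand propagates; -0.0 is less than +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return _min_signed_zero(a, b)


def minimumnum_model(a: float, b: float) -> float:
    """Model of minimumnum: NaN only if both are NaN; -0.0 is less than +0.0."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return _min_signed_zero(a, b)
```

The maximum-style intrinsics follow the same pattern with the comparison
reversed and +0.0 ordered after -0.0.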
16771 '``llvm.maximumnum.*``' Intrinsic
16772 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16777 This is an overloaded intrinsic. You can use ``llvm.maximumnum`` on any
16778 floating-point or vector of floating-point type. Not all targets support
16783 declare float @llvm.maximumnum.f32(float %Val0, float %Val1)
16784 declare double @llvm.maximumnum.f64(double %Val0, double %Val1)
16785 declare x86_fp80 @llvm.maximumnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
16786 declare fp128 @llvm.maximumnum.f128(fp128 %Val0, fp128 %Val1)
16787 declare ppc_fp128 @llvm.maximumnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
16792 The '``llvm.maximumnum.*``' intrinsics return the maximum of the two
16793 arguments, not propagating NaNs and treating -0.0 as less than +0.0.
16799 The arguments and return value are floating-point numbers of the same
16804 If both operands are NaNs (including sNaN), returns qNaN. If one operand
16805 is NaN (including sNaN) and the other operand is a number, returns the number.
16806 Otherwise returns the greater of the two arguments. -0.0 is considered to
16807 be less than +0.0 for this intrinsic.
16809 Note that these are the semantics of maximumNumber specified in IEEE 754-2019.
16811 It differs from '``llvm.maxnum.*``' in two ways:
16812 1) '``llvm.maxnum.*``' will return qNaN if either operand is sNaN.
16813 2) '``llvm.maxnum.*``' may return either operand when comparing +0.0 and -0.0.
16817 '``llvm.copysign.*``' Intrinsic
16818 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16823 This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
16824 floating-point or vector of floating-point type. Not all targets support
16829 declare float @llvm.copysign.f32(float %Mag, float %Sgn)
16830 declare double @llvm.copysign.f64(double %Mag, double %Sgn)
16831 declare x86_fp80 @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
16832 declare fp128 @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
16833 declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)
16838 The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
16839 first operand and the sign of the second operand.
16844 The arguments and return value are floating-point numbers of the same
16850 This function returns the same values as the libm ``copysign``
16851 functions would, and handles error conditions in the same way.
16852 The returned value is completely identical to the first operand except for the
16853 sign bit; in particular, if the input is a NaN, then the quiet/signaling bit and
16854 payload are perfectly preserved.
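Because ``llvm.fabs`` and ``llvm.copysign`` only touch the sign bit, they can
be modeled as pure bit operations. The following Python sketch works on the raw
64-bit pattern of an IEEE-754 double so that NaN payloads are visibly
untouched. It is a model under that encoding assumption, not LLVM's lowering,
and all names in it are hypothetical.

```python
import struct

SIGN_BIT = 1 << 63
MAGNITUDE_MASK = SIGN_BIT - 1


def double_to_bits(x: float) -> int:
    """Return the 64-bit IEEE-754 encoding of x as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def fabs_bits(bits: int) -> int:
    """Clear the sign bit and leave every other bit unchanged."""
    return bits & MAGNITUDE_MASK


def copysign_bits(mag_bits: int, sgn_bits: int) -> int:
    """Take all non-sign bits from mag_bits and the sign bit from sgn_bits."""
    return (mag_bits & MAGNITUDE_MASK) | (sgn_bits & SIGN_BIT)
```

Applying ``fabs_bits`` to the signaling-NaN pattern ``0xFFF0000000000001``
yields ``0x7FF0000000000001``: the sign is cleared, and the signaling bit and
payload are the same as in the input.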
16858 '``llvm.floor.*``' Intrinsic
16859 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16864 This is an overloaded intrinsic. You can use ``llvm.floor`` on any
16865 floating-point or vector of floating-point type. Not all targets support
16870 declare float @llvm.floor.f32(float %Val)
16871 declare double @llvm.floor.f64(double %Val)
16872 declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val)
16873 declare fp128 @llvm.floor.f128(fp128 %Val)
16874 declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)
16879 The '``llvm.floor.*``' intrinsics return the floor of the operand.
16884 The argument and return value are floating-point numbers of the same
16890 This function returns the same values as the libm ``floor`` functions
16891 would, and handles error conditions in the same way.
16895 '``llvm.ceil.*``' Intrinsic
16896 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16901 This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
16902 floating-point or vector of floating-point type. Not all targets support
16907 declare float @llvm.ceil.f32(float %Val)
16908 declare double @llvm.ceil.f64(double %Val)
16909 declare x86_fp80 @llvm.ceil.f80(x86_fp80 %Val)
16910 declare fp128 @llvm.ceil.f128(fp128 %Val)
16911 declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)
16916 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
16921 The argument and return value are floating-point numbers of the same
16927 This function returns the same values as the libm ``ceil`` functions
16928 would, and handles error conditions in the same way.
16931 .. _int_llvm_trunc:
16933 '``llvm.trunc.*``' Intrinsic
16934 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16939 This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
16940 floating-point or vector of floating-point type. Not all targets support
16945 declare float @llvm.trunc.f32(float %Val)
16946 declare double @llvm.trunc.f64(double %Val)
16947 declare x86_fp80 @llvm.trunc.f80(x86_fp80 %Val)
16948 declare fp128 @llvm.trunc.f128(fp128 %Val)
16949 declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)
16954 The '``llvm.trunc.*``' intrinsics return the operand rounded to the
16955 nearest integer not larger in magnitude than the operand.
16960 The argument and return value are floating-point numbers of the same
16966 This function returns the same values as the libm ``trunc`` functions
16967 would, and handles error conditions in the same way.
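The difference between floor, ceiling, and truncation is easiest to see on a
negative non-integer input. The following Python sketch uses the standard
``math`` module as a stand-in for the libm functions. Python returns integers
from these functions, so the hypothetical helper ``rounding_triplet`` converts
the results back to floating point to mirror the intrinsics' return type.

```python
import math


def rounding_triplet(x: float):
    """Return (floor, ceil, trunc) of a finite x as floating-point values."""
    return float(math.floor(x)), float(math.ceil(x)), float(math.trunc(x))
```

For ``-2.5`` the floor is ``-3.0``, while the ceiling and the truncation are
both ``-2.0``; for ``2.5`` the floor and the truncation agree instead.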
16971 '``llvm.rint.*``' Intrinsic
16972 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16977 This is an overloaded intrinsic. You can use ``llvm.rint`` on any
16978 floating-point or vector of floating-point type. Not all targets support
16983 declare float @llvm.rint.f32(float %Val)
16984 declare double @llvm.rint.f64(double %Val)
16985 declare x86_fp80 @llvm.rint.f80(x86_fp80 %Val)
16986 declare fp128 @llvm.rint.f128(fp128 %Val)
16987 declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)
16992 The '``llvm.rint.*``' intrinsics return the operand rounded to the
16993 nearest integer. They may raise an inexact floating-point exception if the
16994 operand isn't an integer.
16999 The argument and return value are floating-point numbers of the same
17005 This function returns the same values as the libm ``rint`` functions
17006 would, and handles error conditions in the same way. Since LLVM assumes the
17007 :ref:`default floating-point environment <floatenv>`, the rounding mode is
17008 assumed to be set to "nearest", so halfway cases are rounded to the even
17009 integer. Use :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`
17010 to avoid that assumption.
17014 '``llvm.nearbyint.*``' Intrinsic
17015 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17020 This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
17021 floating-point or vector of floating-point type. Not all targets support
17026 declare float @llvm.nearbyint.f32(float %Val)
17027 declare double @llvm.nearbyint.f64(double %Val)
17028 declare x86_fp80 @llvm.nearbyint.f80(x86_fp80 %Val)
17029 declare fp128 @llvm.nearbyint.f128(fp128 %Val)
17030 declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)
17035 The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
17041 The argument and return value are floating-point numbers of the same
17047 This function returns the same values as the libm ``nearbyint``
17048 functions would, and handles error conditions in the same way. Since LLVM
17049 assumes the :ref:`default floating-point environment <floatenv>`, the rounding
17050 mode is assumed to be set to "nearest", so halfway cases are rounded to the even
17051 integer. Use :ref:`Constrained Floating-Point Intrinsics <constrainedfp>` to
17052 avoid that assumption.
17056 '``llvm.round.*``' Intrinsic
17057 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17062 This is an overloaded intrinsic. You can use ``llvm.round`` on any
17063 floating-point or vector of floating-point type. Not all targets support
17068 declare float @llvm.round.f32(float %Val)
17069 declare double @llvm.round.f64(double %Val)
17070 declare x86_fp80 @llvm.round.f80(x86_fp80 %Val)
17071 declare fp128 @llvm.round.f128(fp128 %Val)
17072 declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)
17077 The '``llvm.round.*``' intrinsics return the operand rounded to the
17083 The argument and return value are floating-point numbers of the same
17089 This function returns the same values as the libm ``round``
17090 functions would, and handles error conditions in the same way.
17094 '``llvm.roundeven.*``' Intrinsic
17095 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17100 This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
17101 floating-point or vector of floating-point type. Not all targets support
17106 declare float @llvm.roundeven.f32(float %Val)
17107 declare double @llvm.roundeven.f64(double %Val)
17108 declare x86_fp80 @llvm.roundeven.f80(x86_fp80 %Val)
17109 declare fp128 @llvm.roundeven.f128(fp128 %Val)
17110 declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128 %Val)
17115 The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
17116 integer in floating-point format rounding halfway cases to even (that is, to the
17117 nearest value that is an even integer).
17122 The argument and return value are floating-point numbers of the same type.
17127 This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
17128 also behaves in the same way as C standard function ``roundeven``, except that
17129 it does not raise floating-point exceptions.
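The rounding intrinsics differ mainly in how they resolve halfway cases:
``llvm.roundeven`` (and ``llvm.rint`` and ``llvm.nearbyint`` under the default
rounding mode) round ties to the even integer, while libm ``round`` rounds ties
away from zero. The following Python sketch models both rules with the
``decimal`` module, which converts a float exactly before rounding. It is an
illustration with hypothetical function names, not LLVM's implementation, and
it does not model floating-point exceptions.

```python
import math
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def _round_with(x: float, mode: str) -> float:
    # NaN, infinities, and values that are already integral are returned as-is.
    if not math.isfinite(x) or x == math.floor(x):
        return x
    return float(Decimal(x).quantize(Decimal(1), rounding=mode))


def round_ties_even(x: float) -> float:
    """Round to the nearest integer, halfway cases to the even integer."""
    return _round_with(x, ROUND_HALF_EVEN)


def round_ties_away(x: float) -> float:
    """Round to the nearest integer, halfway cases away from zero."""
    return _round_with(x, ROUND_HALF_UP)
```

The two rules agree everywhere except on exact halfway values such as ``2.5``,
where ties-to-even gives ``2.0`` and ties-away gives ``3.0``.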
17132 '``llvm.lround.*``' Intrinsic
17133 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17138 This is an overloaded intrinsic. You can use ``llvm.lround`` on any
17139 floating-point type or vector of floating-point type. Not all targets
17140 support all types however.
17144 declare i32 @llvm.lround.i32.f32(float %Val)
17145 declare i32 @llvm.lround.i32.f64(double %Val)
17146 declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
17147 declare i32 @llvm.lround.i32.f128(fp128 %Val)
17148 declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)
17150 declare i64 @llvm.lround.i64.f32(float %Val)
17151 declare i64 @llvm.lround.i64.f64(double %Val)
17152 declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
17153 declare i64 @llvm.lround.i64.f128(fp128 %Val)
17154 declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
17159 The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
17160 integer with ties away from zero.
17166 The argument is a floating-point number and the return value is an integer
17172 This function returns the same values as the libm ``lround`` functions
17173 would, but without setting ``errno``. If the rounded value is too large to
17174 be stored in the result type, the return value is a non-deterministic
17175 value (equivalent to ``freeze poison``).
17177 '``llvm.llround.*``' Intrinsic
17178 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17183 This is an overloaded intrinsic. You can use ``llvm.llround`` on any
17184 floating-point type. Not all targets support all types however.
17188 declare i64 @llvm.llround.i64.f32(float %Val)
17189 declare i64 @llvm.llround.i64.f64(double %Val)
17190 declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
17191 declare i64 @llvm.llround.i64.f128(fp128 %Val)
17192 declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
17197 The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
17198 integer with ties away from zero.
17203 The argument is a floating-point number and the return value is an integer
17209 This function returns the same values as the libm ``llround``
17210 functions would, but without setting ``errno``. If the rounded value is
17211 too large to be stored in the result type, the return value is a
17212 non-deterministic value (equivalent to ``freeze poison``).
17216 '``llvm.lrint.*``' Intrinsic
17217 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17222 This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
17223 floating-point type or vector of floating-point type. Not all targets
17224 support all types however.
17228 declare i32 @llvm.lrint.i32.f32(float %Val)
17229 declare i32 @llvm.lrint.i32.f64(double %Val)
17230 declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
17231 declare i32 @llvm.lrint.i32.f128(fp128 %Val)
17232 declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)
17234 declare i64 @llvm.lrint.i64.f32(float %Val)
17235 declare i64 @llvm.lrint.i64.f64(double %Val)
17236 declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
17237 declare i64 @llvm.lrint.i64.f128(fp128 %Val)
17238 declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
17243 The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
17250 The argument is a floating-point number and the return value is an integer
17256 This function returns the same values as the libm ``lrint`` functions
17257 would, but without setting ``errno``. If the rounded value is too large to
17258 be stored in the result type, the return value is a non-deterministic
17259 value (equivalent to ``freeze poison``).
17263 '``llvm.llrint.*``' Intrinsic
17264 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17269 This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
17270 floating-point type or vector of floating-point type. Not all targets
17271 support all types however.
17275 declare i64 @llvm.llrint.i64.f32(float %Val)
17276 declare i64 @llvm.llrint.i64.f64(double %Val)
17277 declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
17278 declare i64 @llvm.llrint.i64.f128(fp128 %Val)
17279 declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
17284 The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
17290 The argument is a floating-point number and the return value is an integer
17296 This function returns the same values as the libm ``llrint`` functions
17297 would, but without setting ``errno``. If the rounded value is too large to
17298 be stored in the result type, the return value is a non-deterministic
17299 value (equivalent to ``freeze poison``).
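The integer-returning rounding intrinsics combine a rounding rule with a range
check against the result type. The following Python sketch models that
combination. It is an illustration under stated assumptions, not LLVM's
implementation: the name ``round_to_int_model`` is hypothetical, ``None`` stands
in for the non-deterministic result, and treating NaN and infinities as
out of range is an assumption of this model.

```python
import math
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_to_int_model(x: float, bits: int, ties_away: bool):
    """Round x to a signed integer of the given width, or None if it does not fit.

    ties_away=True models the lround/llround rule; ties_away=False models
    lrint/llrint under the default round-to-nearest-even mode.
    """
    if not math.isfinite(x):
        return None  # assumption of this model: no representable result
    if x == math.floor(x):
        rounded = int(x)  # already integral, no rounding needed
    else:
        mode = ROUND_HALF_UP if ties_away else ROUND_HALF_EVEN
        rounded = int(Decimal(x).quantize(Decimal(1), rounding=mode))
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return rounded if lowest <= rounded <= highest else None
```

With a 32-bit result, ``3e9`` is out of range and the model returns ``None``,
whereas the same input fits comfortably in a 64-bit result.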
17301 Bit Manipulation Intrinsics
17302 ---------------------------
17304 LLVM provides intrinsics for a few important bit manipulation
17305 operations. These allow efficient code generation for some algorithms.
17307 .. _int_bitreverse:
17309 '``llvm.bitreverse.*``' Intrinsics
17310 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17315 This is an overloaded intrinsic function. You can use bitreverse on any
17320 declare i16 @llvm.bitreverse.i16(i16 <id>)
17321 declare i32 @llvm.bitreverse.i32(i32 <id>)
17322 declare i64 @llvm.bitreverse.i64(i64 <id>)
17323 declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
17328 The '``llvm.bitreverse``' family of intrinsics is used to reverse the
17329 bitpattern of an integer value or vector of integer values; for example
17330 ``0b10110110`` becomes ``0b01101101``.
17335 The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
17336 ``M`` in the input moved to bit ``N-M-1`` in the output. The vector
17337 intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
17338 basis and the element order is not affected.
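
The bit mapping can be checked on a small constant. The sketch below uses an
``i8`` input of 22 (``0b00010110``), a value picked only for illustration;
with ``N = 8``, each bit ``M`` moves to bit ``7 - M``. The function name is
hypothetical.

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      define i8 @bitreverse_example() {
        ; 22 is 0b00010110; reversing the bits gives 0b01101000, which is 104.
        %r = call i8 @llvm.bitreverse.i8(i8 22)
        ret i8 %r
      }
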
17342 '``llvm.bswap.*``' Intrinsics
17343 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17348 This is an overloaded intrinsic function. You can use bswap on any
integer type whose width is an even number of bytes (i.e. ``BitWidth % 16 == 0``).
17353 declare i16 @llvm.bswap.i16(i16 <id>)
17354 declare i32 @llvm.bswap.i32(i32 <id>)
17355 declare i64 @llvm.bswap.i64(i64 <id>)
17356 declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
17361 The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
17362 value or vector of integer values with an even number of bytes (positive
17363 multiple of 16 bits).
17368 The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
17369 and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
17370 intrinsic returns an i32 value that has the four bytes of the input i32
17371 swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
17372 returned i32 will have its bytes in 3, 2, 1, 0 order. The
17373 ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
17374 concept to additional even-byte lengths (6 bytes, 8 bytes and more,
17375 respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
17376 operate on a per-element basis and the element order is not affected.
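
A short sketch of the ``i32`` case described above, using an arbitrary
constant whose bytes are easy to follow. The function name is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.bswap.i32(i32)

      define i32 @bswap_example() {
        ; 305419896 is 0x12345678. Reversing the byte order gives 0x78563412,
        ; which is 2018915346.
        %r = call i32 @llvm.bswap.i32(i32 305419896)
        ret i32 %r
      }
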
17380 '``llvm.ctpop.*``' Intrinsic
17381 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
17387 bit width, or on any vector with integer elements. Not all targets
17388 support all bit widths or vector types, however.
17392 declare i8 @llvm.ctpop.i8(i8 <src>)
17393 declare i16 @llvm.ctpop.i16(i16 <src>)
17394 declare i32 @llvm.ctpop.i32(i32 <src>)
17395 declare i64 @llvm.ctpop.i64(i64 <src>)
17396 declare i256 @llvm.ctpop.i256(i256 <src>)
17397 declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
17402 The '``llvm.ctpop``' family of intrinsics counts the number of bits set
17408 The only argument is the value to be counted. The argument may be of any
17409 integer type, or a vector with integer elements. The return type must
17410 match the argument type.
The '``llvm.ctpop``' intrinsic counts the set bits (1s) in a value, or within
each element of a vector.
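
One common use of a population count is testing whether exactly one bit is
set. The following is a sketch of that idiom; the function name
``@is_power_of_two`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)

      ; Returns true if exactly one bit of %x is set.
      define i1 @is_power_of_two(i32 %x) {
        %count = call i32 @llvm.ctpop.i32(i32 %x)
        %is_pow2 = icmp eq i32 %count, 1
        ret i1 %is_pow2
      }
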
17420 '``llvm.ctlz.*``' Intrinsic
17421 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
17426 This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
17427 integer bit width, or any vector whose elements are integers. Not all
17428 targets support all bit widths or vector types, however.
17432 declare i8 @llvm.ctlz.i8 (i8 <src>, i1 <is_zero_poison>)
17433 declare <2 x i37> @llvm.ctlz.v2i37(<2 x i37> <src>, i1 <is_zero_poison>)
17438 The '``llvm.ctlz``' family of intrinsic functions counts the number of
17439 leading zeros in a variable.
17444 The first argument is the value to be counted. This argument may be of
17445 any integer type, or a vector with integer element type. The return
17446 type must match the first argument type.
17448 The second argument is a constant flag that indicates whether the intrinsic
17449 returns a valid result if the first argument is zero. If the first
17450 argument is zero and the second argument is true, the result is poison.
17451 Historically some architectures did not provide a defined result for zero
17452 values as efficiently, and many algorithms are now predicated on avoiding
17458 The '``llvm.ctlz``' intrinsic counts the leading (most significant)
17459 zeros in a variable, or within each element of the vector. If
17460 ``src == 0`` then the result is the size in bits of the type of ``src``
17461 if ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
17462 ``llvm.ctlz(i32 2) = 30``.
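
As an illustration of the ``is_zero_poison == 0`` form, the sketch below
derives the index of the highest set bit from the leading-zero count. The
function name is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      ; Computes floor(log2(%x)) for nonzero %x as 31 - ctlz(%x).
      ; For %x == 0 the count is 32 (the flag is false), so the result is -1.
      define i32 @floor_log2(i32 %x) {
        %lz = call i32 @llvm.ctlz.i32(i32 %x, i1 false)
        %log = sub i32 31, %lz
        ret i32 %log
      }
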
17466 '``llvm.cttz.*``' Intrinsic
17467 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
17472 This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
17473 integer bit width, or any vector of integer elements. Not all targets
17474 support all bit widths or vector types, however.
17478 declare i42 @llvm.cttz.i42 (i42 <src>, i1 <is_zero_poison>)
17479 declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_poison>)
17484 The '``llvm.cttz``' family of intrinsic functions counts the number of
17490 The first argument is the value to be counted. This argument may be of
17491 any integer type, or a vector with integer element type. The return
17492 type must match the first argument type.
17494 The second argument is a constant flag that indicates whether the intrinsic
17495 returns a valid result if the first argument is zero. If the first
17496 argument is zero and the second argument is true, the result is poison.
17497 Historically some architectures did not provide a defined result for zero
17498 values as efficiently, and many algorithms are now predicated on avoiding
17504 The '``llvm.cttz``' intrinsic counts the trailing (least significant)
17505 zeros in a variable, or within each element of a vector. If ``src == 0``
17506 then the result is the size in bits of the type of ``src`` if
17507 ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
17508 ``llvm.cttz(2) = 1``.
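
When the zero case is handled separately, the poison-on-zero form can be used
safely. The sketch below shows one way to do this; the function name and the
choice of -1 as the "no bit set" result are assumptions made for illustration.

.. code-block:: llvm

      declare i64 @llvm.cttz.i64(i64, i1)

      define i64 @lowest_set_bit_index(i64 %mask) {
      entry:
        %is_zero = icmp eq i64 %mask, 0
        br i1 %is_zero, label %empty, label %nonzero

      nonzero:
        ; %mask is known to be nonzero here, so is_zero_poison may be true.
        %idx = call i64 @llvm.cttz.i64(i64 %mask, i1 true)
        ret i64 %idx

      empty:
        ret i64 -1
      }
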
17514 '``llvm.fshl.*``' Intrinsic
17515 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
17520 This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
17521 integer bit width or any vector of integer elements. Not all targets
17522 support all bit widths or vector types, however.
17526 declare i8 @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
17527 declare i64 @llvm.fshl.i64(i64 %a, i64 %b, i64 %c)
17528 declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
17533 The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
17534 the first two values are concatenated as { %a : %b } (%a is the most significant
17535 bits of the wide value), the combined value is shifted left, and the most
17536 significant bits are extracted to produce a result that is the same size as the
17537 original arguments. If the first 2 arguments are identical, this is equivalent
17538 to a rotate left operation. For vector types, the operation occurs for each
17539 element of the vector. The shift argument is treated as an unsigned amount
17540 modulo the element size of the arguments.
17545 The first two arguments are the values to be concatenated. The third
17546 argument is the shift amount. The arguments may be any integer type or a
17547 vector with integer element type. All arguments and the return value must
17548 have the same type.
17553 .. code-block:: text
17555 %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z) ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
17556 %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15) ; %r = i8: 128 (0b10000000)
17557 %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11) ; %r = i8: 120 (0b01111000)
17558 %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8) ; %r = i8: 0 (0b00000000)
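
Because identical first and second arguments make the funnel shift a rotate,
a rotate left can be sketched as follows. The function name ``@rotl32`` is
hypothetical.

.. code-block:: llvm

      declare i32 @llvm.fshl.i32(i32, i32, i32)

      ; Rotates %x left by %n bits; %n is taken modulo 32 by the intrinsic.
      define i32 @rotl32(i32 %x, i32 %n) {
        %r = call i32 @llvm.fshl.i32(i32 %x, i32 %x, i32 %n)
        ret i32 %r
      }
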
17562 '``llvm.fshr.*``' Intrinsic
17563 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
17568 This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
17569 integer bit width or any vector of integer elements. Not all targets
17570 support all bit widths or vector types, however.
17574 declare i8 @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
17575 declare i64 @llvm.fshr.i64(i64 %a, i64 %b, i64 %c)
17576 declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
17581 The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
17582 the first two values are concatenated as { %a : %b } (%a is the most significant
17583 bits of the wide value), the combined value is shifted right, and the least
17584 significant bits are extracted to produce a result that is the same size as the
17585 original arguments. If the first 2 arguments are identical, this is equivalent
17586 to a rotate right operation. For vector types, the operation occurs for each
17587 element of the vector. The shift argument is treated as an unsigned amount
17588 modulo the element size of the arguments.
17593 The first two arguments are the values to be concatenated. The third
17594 argument is the shift amount. The arguments may be any integer type or a
17595 vector with integer element type. All arguments and the return value must
17596 have the same type.
17601 .. code-block:: text
17603 %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z) ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
17604 %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15) ; %r = i8: 254 (0b11111110)
17605 %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11) ; %r = i8: 225 (0b11100001)
17606 %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8) ; %r = i8: 255 (0b11111111)
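
With distinct operands, the funnel shift right yields the low word of a
double-width logical shift. The sketch below assumes the 64-bit value is
supplied as two ``i32`` halves and that the shift amount is less than 32;
the function name is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.fshr.i32(i32, i32, i32)

      ; Returns the low 32 bits of ((%hi : %lo) >> %n) for 0 <= %n < 32.
      define i32 @lshr64_low_word(i32 %hi, i32 %lo, i32 %n) {
        %r = call i32 @llvm.fshr.i32(i32 %hi, i32 %lo, i32 %n)
        ret i32 %r
      }
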
17608 Arithmetic with Overflow Intrinsics
17609 -----------------------------------
17611 LLVM provides intrinsics for fast arithmetic overflow checking.
17613 Each of these intrinsics returns a two-element struct. The first
17614 element of this struct contains the result of the corresponding
17615 arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
17616 the result. Therefore, for example, the first element of the struct
17617 returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
17618 result of a 32-bit ``add`` instruction with the same operands, where
17619 the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
17621 The second element of the result is an ``i1`` that is 1 if the
17622 arithmetic operation overflowed and 0 otherwise. An operation
17623 overflows if, for any values of its operands ``A`` and ``B`` and for
17624 any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
17625 not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
17626 ``sext`` for signed overflow and ``zext`` for unsigned overflow, and
17627 ``op`` is the underlying arithmetic operation.
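
The overflow condition above can be written out directly in IR. The following
sketch applies it to signed 32-bit addition with ``N = 64``; the function name
is hypothetical, and the flag it computes is the one that the second element
of ``llvm.sadd.with.overflow.i32`` reports.

.. code-block:: llvm

      define i1 @sadd_overflows_i32(i32 %a, i32 %b) {
        ; Perform the narrow operation, then extend its result.
        %narrow = add i32 %a, %b
        %narrow.ext = sext i32 %narrow to i64
        ; Extend the operands, then perform the wide operation.
        %a.ext = sext i32 %a to i64
        %b.ext = sext i32 %b to i64
        %wide = add i64 %a.ext, %b.ext
        ; The addition overflowed exactly when the two results differ.
        %ov = icmp ne i64 %narrow.ext, %wide
        ret i1 %ov
      }
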
17629 The behavior of these intrinsics is well-defined for all argument
17632 '``llvm.sadd.with.overflow.*``' Intrinsics
17633 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17638 This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
17639 on any integer bit width or vectors of integers.
17643 declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
17644 declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
17645 declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
17646 declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17651 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
17652 a signed addition of the two arguments, and indicate whether an overflow
17653 occurred during the signed summation.
17658 The arguments (%a and %b) and the first element of the result structure
17659 may be of integer types of any bit width, but they must have the same
17660 bit width. The second element of the result structure must be of type
17661 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
17667 The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two arguments. They return a structure --- the
17669 first element of which is the signed summation, and the second element
17670 of which is a bit specifying if the signed summation resulted in an
17676 .. code-block:: llvm
17678 %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
17679 %sum = extractvalue {i32, i1} %res, 0
17680 %obit = extractvalue {i32, i1} %res, 1
17681 br i1 %obit, label %overflow, label %normal
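
For the vector form, the second element holds one overflow bit per lane. The
sketch below collapses those bits into a single flag using
``llvm.vector.reduce.or``, a separate intrinsic documented elsewhere in this
manual. The function name is hypothetical.

.. code-block:: llvm

      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32>, <4 x i32>)
      declare i1 @llvm.vector.reduce.or.v4i1(<4 x i1>)

      ; Returns true if the signed addition overflows in any lane.
      define i1 @any_lane_overflows(<4 x i32> %a, <4 x i32> %b) {
        %res = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
        %obits = extractvalue {<4 x i32>, <4 x i1>} %res, 1
        %any = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %obits)
        ret i1 %any
      }
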
17683 '``llvm.uadd.with.overflow.*``' Intrinsics
17684 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17689 This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
17690 on any integer bit width or vectors of integers.
17694 declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
17695 declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
17696 declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
17697 declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17702 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
17703 an unsigned addition of the two arguments, and indicate whether a carry
17704 occurred during the unsigned summation.
17709 The arguments (%a and %b) and the first element of the result structure
17710 may be of integer types of any bit width, but they must have the same
17711 bit width. The second element of the result structure must be of type
17712 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
17718 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
17719 an unsigned addition of the two arguments. They return a structure --- the
17720 first element of which is the sum, and the second element of which is a
17721 bit specifying if the unsigned summation resulted in a carry.
17726 .. code-block:: llvm
17728 %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
17729 %sum = extractvalue {i32, i1} %res, 0
17730 %obit = extractvalue {i32, i1} %res, 1
17731 br i1 %obit, label %carry, label %normal
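
The carry bit is what makes multi-word addition possible. As a sketch, the
function below computes the high word of a 128-bit sum whose operands are
passed as pairs of ``i64`` halves; the function and parameter names are
hypothetical.

.. code-block:: llvm

      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64, i64)

      define i64 @add128_hi(i64 %a.lo, i64 %a.hi, i64 %b.lo, i64 %b.hi) {
        ; Add the low words and capture the carry out.
        %lo = call {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a.lo, i64 %b.lo)
        %carry = extractvalue {i64, i1} %lo, 1
        %carry.ext = zext i1 %carry to i64
        ; Add the high words plus the carry; this wraps modulo 2^64.
        %hi.sum = add i64 %a.hi, %b.hi
        %hi = add i64 %hi.sum, %carry.ext
        ret i64 %hi
      }
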
17733 '``llvm.ssub.with.overflow.*``' Intrinsics
17734 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17739 This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
17740 on any integer bit width or vectors of integers.
17744 declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
17745 declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
17746 declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
17747 declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17752 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
17753 a signed subtraction of the two arguments, and indicate whether an
17754 overflow occurred during the signed subtraction.
17759 The arguments (%a and %b) and the first element of the result structure
17760 may be of integer types of any bit width, but they must have the same
17761 bit width. The second element of the result structure must be of type
17762 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
17768 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
17769 a signed subtraction of the two arguments. They return a structure --- the
first element of which is the difference, and the second element of
17771 which is a bit specifying if the signed subtraction resulted in an
17777 .. code-block:: llvm
17779 %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
%diff = extractvalue {i32, i1} %res, 0
17781 %obit = extractvalue {i32, i1} %res, 1
17782 br i1 %obit, label %overflow, label %normal
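
A checked negation is one small application: subtracting from zero overflows
only for the smallest signed value. The sketch below returns the result
structure unchanged; the function name is hypothetical.

.. code-block:: llvm

      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32, i32)

      ; Computes 0 - %x. The overflow bit is set only when %x is -2147483648.
      define {i32, i1} @checked_neg(i32 %x) {
        %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 0, i32 %x)
        ret {i32, i1} %res
      }
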
17784 '``llvm.usub.with.overflow.*``' Intrinsics
17785 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17790 This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
17791 on any integer bit width or vectors of integers.
17795 declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
17796 declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
17797 declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
17798 declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17803 The '``llvm.usub.with.overflow``' family of intrinsic functions perform
17804 an unsigned subtraction of the two arguments, and indicate whether an
17805 overflow occurred during the unsigned subtraction.
17810 The arguments (%a and %b) and the first element of the result structure
17811 may be of integer types of any bit width, but they must have the same
17812 bit width. The second element of the result structure must be of type
17813 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
17819 The '``llvm.usub.with.overflow``' family of intrinsic functions perform
17820 an unsigned subtraction of the two arguments. They return a structure ---
the first element of which is the difference, and the second element of
17822 which is a bit specifying if the unsigned subtraction resulted in an
17828 .. code-block:: llvm
17830 %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
%diff = extractvalue {i32, i1} %res, 0
17832 %obit = extractvalue {i32, i1} %res, 1
17833 br i1 %obit, label %overflow, label %normal
17835 '``llvm.smul.with.overflow.*``' Intrinsics
17836 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17841 This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
17842 on any integer bit width or vectors of integers.
17846 declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
17847 declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
17848 declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
17849 declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17854 The '``llvm.smul.with.overflow``' family of intrinsic functions perform
17855 a signed multiplication of the two arguments, and indicate whether an
17856 overflow occurred during the signed multiplication.
17861 The arguments (%a and %b) and the first element of the result structure
17862 may be of integer types of any bit width, but they must have the same
17863 bit width. The second element of the result structure must be of type
17864 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
17870 The '``llvm.smul.with.overflow``' family of intrinsic functions perform
17871 a signed multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second element
17873 of which is a bit specifying if the signed multiplication resulted in an
17879 .. code-block:: llvm
17881 %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
%prod = extractvalue {i32, i1} %res, 0
17883 %obit = extractvalue {i32, i1} %res, 1
17884 br i1 %obit, label %overflow, label %normal
17886 '``llvm.umul.with.overflow.*``' Intrinsics
17887 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17892 This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
17893 on any integer bit width or vectors of integers.
17897 declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
17898 declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
17899 declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
17900 declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
17905 The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
17907 overflow occurred during the unsigned multiplication.
17912 The arguments (%a and %b) and the first element of the result structure
17913 may be of integer types of any bit width, but they must have the same
17914 bit width. The second element of the result structure must be of type
17915 ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
17921 The '``llvm.umul.with.overflow``' family of intrinsic functions perform
17922 an unsigned multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second
17924 element of which is a bit specifying if the unsigned multiplication
17925 resulted in an overflow.
17930 .. code-block:: llvm
17932 %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
%prod = extractvalue {i32, i1} %res, 0
17934 %obit = extractvalue {i32, i1} %res, 1
17935 br i1 %obit, label %overflow, label %normal
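
A typical use of the unsigned form is guarding a size computation. The sketch
below returns the all-ones value on overflow; that convention, like the
function and parameter names, is an assumption made for illustration.

.. code-block:: llvm

      declare {i64, i1} @llvm.umul.with.overflow.i64(i64, i64)

      define i64 @checked_array_size(i64 %count, i64 %elt_size) {
        %res = call {i64, i1} @llvm.umul.with.overflow.i64(i64 %count, i64 %elt_size)
        %bytes = extractvalue {i64, i1} %res, 0
        %ov = extractvalue {i64, i1} %res, 1
        ; On overflow, substitute the largest unsigned value.
        %size = select i1 %ov, i64 -1, i64 %bytes
        ret i64 %size
      }
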
17937 Saturation Arithmetic Intrinsics
17938 ---------------------------------
17940 Saturation arithmetic is a version of arithmetic in which operations are
17941 limited to a fixed range between a minimum and maximum value. If the result of
17942 an operation is greater than the maximum value, the result is set (or
17943 "clamped") to this maximum. If it is below the minimum, it is clamped to this
17948 '``llvm.sadd.sat.*``' Intrinsics
17949 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17954 This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
17955 on any integer bit width or vectors of integers.
17959 declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
17960 declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
17961 declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
17962 declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
17967 The '``llvm.sadd.sat``' family of intrinsic functions perform signed
17968 saturating addition on the 2 arguments.
17973 The arguments (%a and %b) and the result may be of integer types of any bit
17974 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
17975 values that will undergo signed addition.
17980 The maximum value this operation can clamp to is the largest signed value
17981 representable by the bit width of the arguments. The minimum value is the
17982 smallest signed value representable by this bit width.
17988 .. code-block:: llvm
17990 %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2) ; %res = 3
17991 %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6) ; %res = 7
17992 %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2) ; %res = -2
17993 %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5) ; %res = -8
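
The clamping can also be expressed with ``llvm.sadd.with.overflow``. The
following is one possible expansion for ``i32``, given as a sketch and not
necessarily the sequence any particular target emits; the function name is
hypothetical.

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      define i32 @sadd_sat_expanded(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %ov = extractvalue {i32, i1} %res, 1
        ; Signed addition only overflows when both operands have the same
        ; sign, so the sign of %a selects the bound to clamp to.
        %a.neg = icmp slt i32 %a, 0
        %limit = select i1 %a.neg, i32 -2147483648, i32 2147483647
        %sat = select i1 %ov, i32 %limit, i32 %sum
        ret i32 %sat
      }
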
17998 '``llvm.uadd.sat.*``' Intrinsics
17999 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18004 This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
18005 on any integer bit width or vectors of integers.
18009 declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
18010 declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
18011 declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
18012 declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18017 The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
18018 saturating addition on the 2 arguments.
18023 The arguments (%a and %b) and the result may be of integer types of any bit
18024 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18025 values that will undergo unsigned addition.
18030 The maximum value this operation can clamp to is the largest unsigned value
18031 representable by the bit width of the arguments. Because this is an unsigned
18032 operation, the result will never saturate towards zero.
18038 .. code-block:: llvm
18040 %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2) ; %res = 3
18041 %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6) ; %res = 11
18042 %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8) ; %res = 15
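
For comparison, the unsigned case needs only a single bound. The sketch below
is one possible ``i32`` expansion built on ``llvm.uadd.with.overflow``; the
function name is hypothetical.

.. code-block:: llvm

      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32, i32)

      define i32 @uadd_sat_expanded(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %ov = extractvalue {i32, i1} %res, 1
        ; i32 -1 is the all-ones pattern, the largest unsigned 32-bit value.
        %sat = select i1 %ov, i32 -1, i32 %sum
        ret i32 %sat
      }
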
18047 '``llvm.ssub.sat.*``' Intrinsics
18048 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18053 This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
18054 on any integer bit width or vectors of integers.
18058 declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
18059 declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
18060 declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
18061 declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18066 The '``llvm.ssub.sat``' family of intrinsic functions perform signed
18067 saturating subtraction on the 2 arguments.
18072 The arguments (%a and %b) and the result may be of integer types of any bit
18073 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18074 values that will undergo signed subtraction.
18079 The maximum value this operation can clamp to is the largest signed value
18080 representable by the bit width of the arguments. The minimum value is the
18081 smallest signed value representable by this bit width.
18087 .. code-block:: llvm
18089 %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1) ; %res = 1
18090 %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6) ; %res = -4
18091 %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5) ; %res = -8
18092 %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5) ; %res = 7
18097 '``llvm.usub.sat.*``' Intrinsics
18098 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18103 This is an overloaded intrinsic. You can use ``llvm.usub.sat``
18104 on any integer bit width or vectors of integers.
18108 declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
18109 declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
18110 declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
18111 declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18116 The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
18117 saturating subtraction on the 2 arguments.
18122 The arguments (%a and %b) and the result may be of integer types of any bit
18123 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18124 values that will undergo unsigned subtraction.
18129 The minimum value this operation can clamp to is 0, which is the smallest
18130 unsigned value representable by the bit width of the unsigned arguments.
18131 Because this is an unsigned operation, the result will never saturate towards
18132 the largest possible value representable by this bit width.
18138 .. code-block:: llvm
18140 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1) ; %res = 1
18141 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6) ; %res = 0
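
One possible expansion of the unsigned saturating subtraction, shown as a
sketch for ``i32`` with a hypothetical function name:

.. code-block:: llvm

      define i32 @usub_sat_expanded(i32 %a, i32 %b) {
        ; If %a is not greater than %b, the true difference is at most zero,
        ; so the result clamps to 0.
        %gt = icmp ugt i32 %a, %b
        %diff = sub i32 %a, %b
        %sat = select i1 %gt, i32 %diff, i32 0
        ret i32 %sat
      }
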
18144 '``llvm.sshl.sat.*``' Intrinsics
18145 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18150 This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
18151 on integers or vectors of integers of any bit width.
18155 declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
18156 declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
18157 declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
18158 declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18163 The '``llvm.sshl.sat``' family of intrinsic functions perform signed
18164 saturating left shift on the first argument.
18169 The arguments (``%a`` and ``%b``) and the result may be of integer types of any
18170 bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
18181 The maximum value this operation can clamp to is the largest signed value
18182 representable by the bit width of the arguments. The minimum value is the
18183 smallest signed value representable by this bit width.
18189 .. code-block:: llvm
18191 %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1) ; %res = 4
18192 %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2) ; %res = 7
18193 %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1) ; %res = -8
18194 %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1) ; %res = -2
18197 '``llvm.ushl.sat.*``' Intrinsics
18198 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18203 This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
18204 on integers or vectors of integers of any bit width.
18208 declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
18209 declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
18210 declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
18211 declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
18216 The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
18217 saturating left shift on the first argument.
18222 The arguments (``%a`` and ``%b``) and the result may be of integer types of any
18223 bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
18233 The maximum value this operation can clamp to is the largest unsigned value
18234 representable by the bit width of the arguments.
18240 .. code-block:: llvm
18242 %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1) ; %res = 4
18243 %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3) ; %res = 15
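
One way to detect that an unsigned left shift lost bits is to shift back and
compare. The sketch below uses that check for ``i32``; it is an illustrative
expansion with a hypothetical function name, and, like the intrinsic, it
yields poison when ``%b`` is 32 or larger.

.. code-block:: llvm

      define i32 @ushl_sat_expanded(i32 %a, i32 %b) {
        %shl = shl i32 %a, %b
        ; Shifting back recovers %a only if no set bits were shifted out.
        %back = lshr i32 %shl, %b
        %lost = icmp ne i32 %back, %a
        %sat = select i1 %lost, i32 -1, i32 %shl
        ret i32 %sat
      }
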
18246 Fixed Point Arithmetic Intrinsics
18247 ---------------------------------
A fixed point number represents a real number with a fixed number of digits
after a radix point (equivalent to the decimal point '.'). The number of
digits after the radix point is referred to as the *scale*. These
18252 are useful for representing fractional values to a specific precision. The
18253 following intrinsics perform fixed point arithmetic operations on 2 operands
18254 of the same scale, specified as the third argument.
18256 The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
18257 of fixed point numbers through scaled integers. Therefore, fixed point
18258 multiplication can be represented as
.. code-block:: llvm

      %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)

      ; Expands to
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %mul = mul nsw i8 %a2, %b2
      %scale2 = trunc i32 %scale to i8
      %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
      %result = trunc i8 %r to i4

18272 The ``llvm.*div.fix`` family of intrinsic functions represents a division of
18273 fixed point numbers through scaled integers. Fixed point division can be
.. code-block:: llvm

      %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)

      ; Expands to
      %a2 = sext i4 %a to i8
      %b2 = sext i4 %b to i8
      %scale2 = trunc i32 %scale to i8
      %a3 = shl i8 %a2, %scale2
      %r = sdiv i8 %a3, %b2  ; this is for a target rounding towards zero
      %result = trunc i8 %r to i4

For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is not fixed
by the IR, since the preferred rounding may vary between targets; it is instead
specified through a target hook. Different pipelines should legalize or optimize
these intrinsics using the rounding specified by this hook if it is provided.
Operations like constant folding, instruction combining, KnownBits, and
ValueTracking should also use this hook, if provided, and not assume the
direction of rounding. A rounded result must always be within one unit of
precision from the true result. That is, the error between the returned result
and the true result must be less than ``1/2^(scale)``.
18300 '``llvm.smul.fix.*``' Intrinsics
18301 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18306 This is an overloaded intrinsic. You can use ``llvm.smul.fix``
18307 on any integer bit width or vectors of integers.
18311 declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
18312 declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
18313 declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
18314 declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18319 The '``llvm.smul.fix``' family of intrinsic functions perform signed
18320 fixed point multiplication on 2 arguments of the same scale.
18325 The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
integer vectors with the same length and element width. ``%a`` and ``%b`` are the two
18328 values that will undergo signed fixed point multiplication. The argument
18329 ``%scale`` represents the scale of both operands, and must be a constant
18335 This operation performs fixed point multiplication on the 2 arguments of a
18336 specified scale. The result will also be returned in the same scale specified
18337 in the third argument.
18339 If the result value cannot be precisely represented in the given scale, the
18340 value is rounded up or down to the closest representable value. The rounding
18341 direction is unspecified.
18343 It is undefined behavior if the result value does not fit within the range of
18344 the fixed point type.
18350 .. code-block:: llvm
18352 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
18353 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
18354 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5)
18356 ; The result in the following could be rounded up to -2 or down to -2.5
18357 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
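The rounding rule above can be sketched as a small reference model. This is an illustrative Python sketch, not LLVM's implementation: the helper name ``smul_fix`` and its ``bits`` parameter are assumptions made for the example. It returns every result the text permits, or ``None`` where the behavior is undefined.

```python
def smul_fix(a, b, scale, bits):
    """Hypothetical model of signed fixed-point multiply on `bits`-wide values.

    Returns the sorted list of permitted results (one value when the product
    is exact, two when rounding is needed), or None when the result does not
    fit, which the text declares undefined behavior.
    """
    product = a * b                  # exact product with 2*scale fractional bits
    down = product >> scale          # floor: Python's >> rounds toward -infinity
    up = -((-product) >> scale)      # ceiling of the same quotient
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if down < lo or up > hi:
        return None
    return sorted({down, up})


print(smul_fix(3, -3, 1, 4))  # [-5, -4]: 1.5 x -1.5 = -2.25 may round either way
```

Returning both candidates makes the unspecified rounding direction explicit; a caller that needs a single answer has to pick one.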
18360 '``llvm.umul.fix.*``' Intrinsics
18361 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18366 This is an overloaded intrinsic. You can use ``llvm.umul.fix``
18367 on any integer bit width or vectors of integers.
18371 declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
18372 declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
18373 declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
18374 declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18379 The '``llvm.umul.fix``' family of intrinsic functions performs unsigned
18380 fixed point multiplication on 2 arguments of the same scale.
18385 The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
18386 width, but they must have the same bit width. The arguments may also be
18387 vectors of integers with the same length and element width. ``%a`` and ``%b`` are the two
18388 values that will undergo unsigned fixed point multiplication. The argument
18389 ``%scale`` represents the scale of both operands, and must be a constant
18395 This operation performs unsigned fixed point multiplication on the 2 arguments of a
18396 specified scale. The result will also be returned in the same scale specified
18397 in the third argument.
18399 If the result value cannot be precisely represented in the given scale, the
18400 value is rounded up or down to the closest representable value. The rounding
18401 direction is unspecified.
18403 It is undefined behavior if the result value does not fit within the range of
18404 the fixed point type.
18410 .. code-block:: llvm
18412 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
18413 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
18415 ; The result in the following could be rounded down to 3.5 or up to 4
18416 %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1) ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
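The unsigned variant differs only in how the operand bits are interpreted. The following Python sketch (the name ``umul_fix`` and the ``bits`` parameter are hypothetical) masks both operands to their unsigned value before multiplying, so an all-ones ``i4`` counts as 15.

```python
def umul_fix(a, b, scale, bits):
    """Hypothetical model of unsigned fixed-point multiply.

    Returns the sorted list of permitted results, or None when the result
    does not fit in `bits` unsigned bits (undefined behavior per the text).
    """
    mask = (1 << bits) - 1
    a, b = a & mask, b & mask        # reinterpret the bit patterns as unsigned
    product = a * b
    down = product >> scale          # round down
    up = -((-product) >> scale)      # round up
    if up > mask:
        return None
    return sorted({down, up})


print(umul_fix(15, 1, 1, 4))  # [7, 8]: 7.5 x 0.5 = 3.75
```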
18419 '``llvm.smul.fix.sat.*``' Intrinsics
18420 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18425 This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
18426 on any integer bit width or vectors of integers.
18430 declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
18431 declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
18432 declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
18433 declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18438 The '``llvm.smul.fix.sat``' family of intrinsic functions performs signed
18439 fixed point saturating multiplication on 2 arguments of the same scale.
18444 The arguments (%a and %b) and the result may be of integer types of any bit
18445 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18446 values that will undergo signed fixed point multiplication. The argument
18447 ``%scale`` represents the scale of both operands, and must be a constant
18453 This operation performs fixed point multiplication on the 2 arguments of a
18454 specified scale. The result will also be returned in the same scale specified
18455 in the third argument.
18457 If the result value cannot be precisely represented in the given scale, the
18458 value is rounded up or down to the closest representable value. The rounding
18459 direction is unspecified.
18461 The maximum value this operation can clamp to is the largest signed value
18462 representable by the bit width of the first 2 arguments. The minimum value is the
18463 smallest signed value representable by this bit width.
18469 .. code-block:: llvm
18471 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
18472 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
18473 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5)
18475 ; The result in the following could be rounded up to -2 or down to -2.5
18476 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
18479 %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0) ; %res = 7
18480 %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2) ; %res = 7
18481 %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2) ; %res = -8
18482 %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1) ; %res = 7
18484 ; Scale can affect the saturation result
18485 %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0) ; %res = 7 (2 x 4 -> clamped to 7)
18486 %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1) ; %res = 4 (1 x 2 = 2)
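Saturation replaces the undefined-behavior case with clamping. A minimal Python sketch, assuming the clamp is applied to the rounded result (``smul_fix_sat`` and ``bits`` are names invented for this example):

```python
def smul_fix_sat(a, b, scale, bits):
    """Hypothetical model of signed saturating fixed-point multiply.

    Returns the sorted list of permitted results after clamping to the
    signed range of a `bits`-wide integer.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    product = a * b
    down = product >> scale              # round down
    up = -((-product) >> scale)          # round up

    def clamp(value):
        return max(lo, min(hi, value))

    return sorted({clamp(down), clamp(up)})


print(smul_fix_sat(-8, -2, 1, 4))  # [7]: -4 x -1 = 4, clamped to 3.5
print(smul_fix_sat(2, 4, 1, 4))    # [4]: 1 x 2 = 2, no clamping needed
```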
18489 '``llvm.umul.fix.sat.*``' Intrinsics
18490 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18495 This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
18496 on any integer bit width or vectors of integers.
18500 declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
18501 declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
18502 declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
18503 declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18508 The '``llvm.umul.fix.sat``' family of intrinsic functions performs unsigned
18509 fixed point saturating multiplication on 2 arguments of the same scale.
18514 The arguments (%a and %b) and the result may be of integer types of any bit
18515 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18516 values that will undergo unsigned fixed point multiplication. The argument
18517 ``%scale`` represents the scale of both operands, and must be a constant
18523 This operation performs fixed point multiplication on the 2 arguments of a
18524 specified scale. The result will also be returned in the same scale specified
18525 in the third argument.
18527 If the result value cannot be precisely represented in the given scale, the
18528 value is rounded up or down to the closest representable value. The rounding
18529 direction is unspecified.
18531 The maximum value this operation can clamp to is the largest unsigned value
18532 representable by the bit width of the first 2 arguments. The minimum value is the
18533 smallest unsigned value representable by this bit width (zero).
18539 .. code-block:: llvm
18541 %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
18542 %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
18544 ; The result in the following could be rounded down to 2 or up to 2.5
18545 %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1) ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)
18548 %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0) ; %res = 15 (8 x 2 -> clamped to 15)
18549 %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2) ; %res = 15 (2 x 2 -> clamped to 3.75)
18551 ; Scale can affect the result (8 fits in an unsigned i4, so neither call saturates)
18552 %res = call i4 @llvm.umul.fix.sat.i4(i4 2, i4 4, i32 0) ; %res = 8 (2 x 4 = 8)
18553 %res = call i4 @llvm.umul.fix.sat.i4(i4 2, i4 4, i32 1) ; %res = 4 (1 x 2 = 2)
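For the unsigned saturating form the clamp range is ``[0, 2^bits - 1]``. The sketch below is a hypothetical Python model (``umul_fix_sat`` is not an LLVM API) that treats the operands as unsigned and clamps the rounded result.

```python
def umul_fix_sat(a, b, scale, bits):
    """Hypothetical model of unsigned saturating fixed-point multiply."""
    hi = (1 << bits) - 1
    a, b = a & hi, b & hi                # operands are treated as unsigned
    product = a * b
    down = product >> scale              # round down
    up = -((-product) >> scale)          # round up
    # Results are never negative, so only the upper bound can be hit.
    return sorted({min(hi, down), min(hi, up)})


print(umul_fix_sat(8, 8, 2, 4))  # [15]: 2 x 2 = 4, clamped to 3.75
print(umul_fix_sat(3, 3, 1, 4))  # [4, 5]: 1.5 x 1.5 = 2.25 may round either way
```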
18556 '``llvm.sdiv.fix.*``' Intrinsics
18557 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18562 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
18563 on any integer bit width or vectors of integers.
18567 declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
18568 declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
18569 declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
18570 declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18575 The '``llvm.sdiv.fix``' family of intrinsic functions performs signed
18576 fixed point division on 2 arguments of the same scale.
18581 The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
18582 width, but they must have the same bit width. The arguments may also be
18583 vectors of integers with the same length and element width. ``%a`` and ``%b`` are the two
18584 values that will undergo signed fixed point division. The argument
18585 ``%scale`` represents the scale of both operands, and must be a constant
18591 This operation performs fixed point division on the 2 arguments of a
18592 specified scale. The result will also be returned in the same scale specified
18593 in the third argument.
18595 If the result value cannot be precisely represented in the given scale, the
18596 value is rounded up or down to the closest representable value. The rounding
18597 direction is unspecified.
18599 It is undefined behavior if the result value does not fit within the range of
18600 the fixed point type, or if the second argument is zero.
18606 .. code-block:: llvm
18608 %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
18609 %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
18610 %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
18612 ; The result in the following could be rounded up to 1 or down to 0.5
18613 %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
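Division keeps the result in the operand scale by scaling the numerator up before dividing. A hedged Python sketch of that arithmetic, with ``sdiv_fix`` and ``bits`` as hypothetical names:

```python
def sdiv_fix(a, b, scale, bits):
    """Hypothetical model of signed fixed-point division.

    Returns the sorted list of permitted results, or None in the two cases
    the text declares undefined: division by zero and a result that does
    not fit in a signed `bits`-wide integer.
    """
    if b == 0:
        return None
    numerator = a << scale               # pre-scale so the quotient keeps `scale`
    down = numerator // b                # floor division rounds toward -infinity
    up = -((-numerator) // b)            # ceiling of the same quotient
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if down < lo or up > hi:
        return None
    return sorted({down, up})


print(sdiv_fix(3, 4, 1, 4))  # [1, 2]: 1.5 / 2 = 0.75 may round to 0.5 or 1
```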
18616 '``llvm.udiv.fix.*``' Intrinsics
18617 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18622 This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
18623 on any integer bit width or vectors of integers.
18627 declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
18628 declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
18629 declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
18630 declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18635 The '``llvm.udiv.fix``' family of intrinsic functions performs unsigned
18636 fixed point division on 2 arguments of the same scale.
18641 The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
18642 width, but they must have the same bit width. The arguments may also be
18643 vectors of integers with the same length and element width. ``%a`` and ``%b`` are the two
18644 values that will undergo unsigned fixed point division. The argument
18645 ``%scale`` represents the scale of both operands, and must be a constant
18651 This operation performs fixed point division on the 2 arguments of a
18652 specified scale. The result will also be returned in the same scale specified
18653 in the third argument.
18655 If the result value cannot be precisely represented in the given scale, the
18656 value is rounded up or down to the closest representable value. The rounding
18657 direction is unspecified.
18659 It is undefined behavior if the result value does not fit within the range of
18660 the fixed point type, or if the second argument is zero.
18666 .. code-block:: llvm
18668 %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
18669 %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
18670 %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
18672 ; The result in the following could be rounded up to 1 or down to 0.5
18673 %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
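The ``i4 -8`` operand in the example above is the bit pattern ``1000``, which the unsigned intrinsic reads as 8. The following Python sketch (``udiv_fix`` is a hypothetical helper, not part of LLVM) makes that reinterpretation explicit.

```python
def udiv_fix(a, b, scale, bits):
    """Hypothetical model of unsigned fixed-point division.

    Returns the sorted list of permitted results, or None for division by
    zero or a result that does not fit (both undefined per the text).
    """
    mask = (1 << bits) - 1
    a, b = a & mask, b & mask            # i4 -8 becomes 8 when read as unsigned
    if b == 0:
        return None
    numerator = a << scale               # pre-scale the numerator
    down = numerator // b                # round down
    up = -((-numerator) // b)            # round up
    if up > mask:
        return None
    return sorted({down, up})


print(udiv_fix(1, -8, 4, 4))  # [2]: 0.0625 / 0.5 = 0.125
```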
18676 '``llvm.sdiv.fix.sat.*``' Intrinsics
18677 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18682 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
18683 on any integer bit width or vectors of integers.
18687 declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
18688 declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
18689 declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
18690 declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18695 The '``llvm.sdiv.fix.sat``' family of intrinsic functions performs signed
18696 fixed point saturating division on 2 arguments of the same scale.
18701 The arguments (%a and %b) and the result may be of integer types of any bit
18702 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18703 values that will undergo signed fixed point division. The argument
18704 ``%scale`` represents the scale of both operands, and must be a constant
18710 This operation performs fixed point division on the 2 arguments of a
18711 specified scale. The result will also be returned in the same scale specified
18712 in the third argument.
18714 If the result value cannot be precisely represented in the given scale, the
18715 value is rounded up or down to the closest representable value. The rounding
18716 direction is unspecified.
18718 The maximum value this operation can clamp to is the largest signed value
18719 representable by the bit width of the first 2 arguments. The minimum value is the
18720 smallest signed value representable by this bit width.
18722 It is undefined behavior if the second argument is zero.
18728 .. code-block:: llvm
18730 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
18731 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
18732 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
18734 ; The result in the following could be rounded up to 1 or down to 0.5
18735 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
18738 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0) ; %res = 7 (-8 / -1 = 8 => 7)
18739 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2) ; %res = 7 (1 / 0.5 = 2 => 1.75)
18740 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2) ; %res = -8 (-1 / 0.25 = -4 => -2)
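With saturation, only division by zero remains undefined; every overflowing quotient is clamped. A minimal Python sketch under the same assumptions as the earlier models (``sdiv_fix_sat`` is a hypothetical name, and the clamp is applied after rounding):

```python
def sdiv_fix_sat(a, b, scale, bits):
    """Hypothetical model of signed saturating fixed-point division.

    Returns the sorted list of permitted results, or None for division by
    zero, which stays undefined even in the saturating form.
    """
    if b == 0:
        return None
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    numerator = a << scale
    down = numerator // b                # round down
    up = -((-numerator) // b)            # round up

    def clamp(value):
        return max(lo, min(hi, value))

    return sorted({clamp(down), clamp(up)})


print(sdiv_fix_sat(-8, -1, 0, 4))  # [7]: -8 / -1 = 8, clamped to 7
print(sdiv_fix_sat(-4, 1, 2, 4))   # [-8]: -1 / 0.25 = -4, clamped to -2
```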
18743 '``llvm.udiv.fix.sat.*``' Intrinsics
18744 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18749 This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
18750 on any integer bit width or vectors of integers.
18754 declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
18755 declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
18756 declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
18757 declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
18762 The '``llvm.udiv.fix.sat``' family of intrinsic functions performs unsigned
18763 fixed point saturating division on 2 arguments of the same scale.
18768 The arguments (%a and %b) and the result may be of integer types of any bit
18769 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
18770 values that will undergo unsigned fixed point division. The argument
18771 ``%scale`` represents the scale of both operands, and must be a constant
18777 This operation performs fixed point division on the 2 arguments of a
18778 specified scale. The result will also be returned in the same scale specified
18779 in the third argument.
18781 If the result value cannot be precisely represented in the given scale, the
18782 value is rounded up or down to the closest representable value. The rounding
18783 direction is unspecified.
18785 The maximum value this operation can clamp to is the largest unsigned value
18786 representable by the bit width of the first 2 arguments. The minimum value is the
18787 smallest unsigned value representable by this bit width (zero).
18789 It is undefined behavior if the second argument is zero.
18794 .. code-block:: llvm
18796 %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
18797 %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
18799 ; The result in the following could be rounded down to 0.5 or up to 1
18800 %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1) ; %res = 1 (or 2) (1.5 / 2 = 0.75)
18803 %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2) ; %res = 15 (2 / 0.5 = 4 => 3.75)
18806 Specialized Arithmetic Intrinsics
18807 ---------------------------------
18809 .. _i_intr_llvm_canonicalize:
18811 '``llvm.canonicalize.*``' Intrinsic
18812 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18819 declare float @llvm.canonicalize.f32(float %a)
18820 declare double @llvm.canonicalize.f64(double %b)
18825 The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
18826 encoding of a floating-point number. This canonicalization is useful for
18827 implementing certain numeric primitives such as frexp. The canonical encoding is
18828 defined by IEEE-754-2008 to be:
18832 2.1.8 canonical encoding: The preferred encoding of a floating-point
18833 representation in a format. Applied to declets, significands of finite
18834 numbers, infinities, and NaNs, especially in decimal formats.
18836 This operation can also be considered equivalent to the IEEE-754-2008
18837 conversion of a floating-point value to the same format. NaNs are handled
18838 according to section 6.2.
18840 Examples of non-canonical encodings:
18842 - x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
18843 converted to a canonical representation per hardware-specific protocol.
18844 - Many normal decimal floating-point numbers have non-canonical alternative
18846 - Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
18847 These are treated as non-canonical encodings of zero and will be flushed to
18848 a zero of the same sign by this operation.
18850 Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
18851 default exception handling must signal an invalid exception, and produce a
18854 This function should always be implementable as multiplication by 1.0, provided
18855 that the compiler does not constant fold the operation. Likewise, division by
18856 1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
18857 -0.0 is also sufficient provided that the rounding mode is not -Infinity.
18859 ``@llvm.canonicalize`` must preserve the equality relation. That is:
18861 - ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
18862 - ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent
18865 Additionally, the sign of zero must be conserved:
18866 ``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
18868 The payload bits of a NaN must be conserved, with two exceptions.
18869 First, environments which use only a single canonical representation of NaN
18870 must perform said canonicalization. Second, SNaNs must be quieted per the
18873 The canonicalization operation may be optimized away if:
18875 - The input is known to be canonical. For example, it was produced by a
18876 floating-point operation that is required by the standard to be canonical.
18877 - The result is consumed only by (or fused with) other floating-point
18878 operations. That is, the bits of the floating-point value are not examined.
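The rules above can be illustrated on raw bit patterns. The Python sketch below makes several assumptions that are not stated in the text: the format is IEEE binary64, the target supports subnormals and multiple NaN encodings, and a signaling NaN is quieted by setting the most significant fraction bit. The helper names are hypothetical.

```python
import struct

EXP_MASK = 0x7FF << 52           # exponent field of a binary64 value
FRAC_MASK = (1 << 52) - 1        # fraction (trailing significand) field
QUIET_BIT = 1 << 51              # assumed quiet-NaN bit


def f64_bits(value):
    """Return the 64-bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def canonicalize_f64_bits(bits):
    """Model of canonicalization under the assumptions listed above.

    Every binary64 encoding is already canonical on such a target, so only
    signaling NaNs change: they are quieted and keep their payload bits.
    """
    is_nan = (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0
    return bits | QUIET_BIT if is_nan else bits


# The sign of zero is conserved.
assert canonicalize_f64_bits(f64_bits(-0.0)) == f64_bits(-0.0)
# A signaling NaN becomes quiet; the payload bit 0x1 survives.
print(hex(canonicalize_f64_bits(0x7FF0000000000001)))  # 0x7ff8000000000001
```

A target that flushes subnormals or uses a single NaN encoding would need extra cases; they are deliberately left out of this sketch.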
18882 '``llvm.fmuladd.*``' Intrinsic
18883 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18890 declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
18891 declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
18896 The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
18897 expressions that can be fused if the code generator determines that (a) the
18898 target instruction set has support for a fused operation, and (b) that the
18899 fused operation is more efficient than the equivalent, separate pair of mul
18900 and add instructions.
18905 The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
18906 multiplicands, a and b, and an addend c.
18915 %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
18917 is equivalent to the expression a \* b + c, except that it is unspecified
18918 whether rounding will be performed between the multiplication and addition
18919 steps. Fusion is not guaranteed, even if the target platform supports it.
18920 If a fused multiply-add is required, the corresponding
18921 :ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
18922 Like '``llvm.fma.*``', this intrinsic never sets errno.
18927 .. code-block:: llvm
18929 %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
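Whether the intermediate product is rounded can change the result. The Python sketch below illustrates this in double precision; it is not how LLVM lowers the intrinsic. The fused variant is emulated with exact rational arithmetic followed by a single rounding, and both function names are hypothetical.

```python
from fractions import Fraction


def mul_add_unfused(a, b, c):
    """Round after the multiplication and again after the addition."""
    return a * b + c


def mul_add_fused(a, b, c):
    """Compute a*b + c exactly, then round once (emulates a fused operation)."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)              # int/int true division is correctly rounded


a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0
# The exact product is 1 - 2**-54, which rounds to 1.0 as a double.
print(mul_add_unfused(a, b, c))      # 0.0
print(mul_add_fused(a, b, c) == -(2.0 ** -54))  # True
```

Because either outcome is allowed for ``llvm.fmuladd``, code that depends on one of them should use an explicit ``fmul``/``fadd`` pair or ``llvm.fma`` instead.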
18932 Hardware-Loop Intrinsics
18933 ------------------------
18935 LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
18936 hints to the backend, which is required to lower these intrinsics further to
18937 target-specific instructions, or revert the hardware-loop to a normal loop if
18938 target-specific restrictions are not met and a hardware-loop can't be generated.
18940 These intrinsics may be modified in the future and are not intended to be used
18941 outside the backend. Thus, front-end and mid-level optimizations should not be
18942 generating these intrinsics.
18945 '``llvm.set.loop.iterations.*``' Intrinsic
18946 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18951 This is an overloaded intrinsic.
18955 declare void @llvm.set.loop.iterations.i32(i32)
18956 declare void @llvm.set.loop.iterations.i64(i64)
18961 The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
18962 hardware-loop trip count. They are placed in the loop preheader basic block and
18963 are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
18969 The integer operand is the loop trip count of the hardware-loop, and thus
18970 not e.g. the loop back-edge taken count.
18975 The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
18976 on their operand. They are a hint to the backend, which can use the operand to set up the
18977 hardware-loop count with a target-specific instruction, usually a move of this
18978 value to a special register or a hardware-loop instruction.
18981 '``llvm.start.loop.iterations.*``' Intrinsic
18982 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18987 This is an overloaded intrinsic.
18991 declare i32 @llvm.start.loop.iterations.i32(i32)
18992 declare i64 @llvm.start.loop.iterations.i64(i64)
18997 The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
18998 '``llvm.set.loop.iterations.*``' intrinsics, used to specify the
18999 hardware-loop trip count but also produce a value identical to the input
19000 that can be used as the input to the loop. They are placed in the loop
19001 preheader basic block and the output is expected to be the input to the
19002 phi for the induction variable of the loop, decremented by the
19003 '``llvm.loop.decrement.reg.*``'.
19008 The integer operand is the loop trip count of the hardware-loop, and thus
19009 not e.g. the loop back-edge taken count.
19014 The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
19015 on their operand. They are a hint to the backend, which can use the operand to set up the
19016 hardware-loop count with a target-specific instruction, usually a move of this
19017 value to a special register or a hardware-loop instruction.
19019 '``llvm.test.set.loop.iterations.*``' Intrinsic
19020 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19025 This is an overloaded intrinsic.
19029 declare i1 @llvm.test.set.loop.iterations.i32(i32)
19030 declare i1 @llvm.test.set.loop.iterations.i64(i64)
19035 The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
19036 loop trip count, and also test that the given count is not zero, allowing
19037 it to control entry to a while-loop. They are placed in the loop preheader's
19038 predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
19039 optimizers duplicating these instructions.
19044 The integer operand is the loop trip count of the hardware-loop, and thus
19045 not e.g. the loop back-edge taken count.
19050 The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
19051 arithmetic on their operand. They are a hint to the backend, which can use the operand to
19052 set up the hardware-loop count with a target-specific instruction, usually a
19053 move of this value to a special register or a hardware-loop instruction.
19054 The result is the conditional value of whether the given count is not zero.
19057 '``llvm.test.start.loop.iterations.*``' Intrinsic
19058 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19063 This is an overloaded intrinsic.
19067 declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
19068 declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
19073 The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
19074 '``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
19075 intrinsics, used to specify the hardware-loop trip count, but also produce a
19076 value identical to the input that can be used as the input to the loop. The
19077 second i1 output controls entry to a while-loop.
19082 The integer operand is the loop trip count of the hardware-loop, and thus
19083 not e.g. the loop back-edge taken count.
19088 The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
19089 arithmetic on their operand. They are a hint to the backend, which can use the operand to
19090 set up the hardware-loop count with a target-specific instruction, usually a
19091 move of this value to a special register or a hardware-loop instruction.
19092 The result is a pair of the input and a conditional value of whether the
19093 given count is not zero.
19096 '``llvm.loop.decrement.reg.*``' Intrinsic
19097 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19102 This is an overloaded intrinsic.
19106 declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
19107 declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
19112 The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
19113 iteration counter and return an updated value that will be used in the next
19119 Both arguments must have identical integer types. The first operand is the
19120 loop iteration counter. The second operand is the maximum number of elements
19121 processed in an iteration.
19126 The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
19127 two operands, which is not allowed to wrap. They return the remaining number of
19128 iterations still to be executed, and can be used together with a ``PHI``,
19129 ``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
19130 optimizations are allowed to treat it as a ``SUB``, and it is supported by
19131 SCEV, so it's the backend's responsibility to handle cases where it may be
19132 optimized. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
19133 optimizers duplicating these instructions.
19136 '``llvm.loop.decrement.*``' Intrinsic
19137 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19142 This is an overloaded intrinsic.
19146 declare i1 @llvm.loop.decrement.i32(i32)
19147 declare i1 @llvm.loop.decrement.i64(i64)
19152 The HardwareLoops pass allows the loop decrement value to be specified with an
19153 option. It defaults to a loop decrement value of 1, but it can be an unsigned
19154 integer value provided by this option. The '``llvm.loop.decrement.*``'
19155 intrinsics decrement the loop iteration counter with this value, and return a
19156 false predicate if the loop should exit, and true otherwise.
19157 This is emitted if the loop counter is not updated via a ``PHI`` node, which
19158 can also be controlled with an option.
19163 The integer argument is the loop decrement value used to decrement the loop
19169 The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
19170 counter with the given loop decrement value, and return false if the loop
19171 should exit. This ``SUB`` is not allowed to wrap. The result is a condition
19172 that is used by the conditional branch controlling the loop.
19175 Vector Reduction Intrinsics
19176 ---------------------------
19178 Horizontal reductions of vectors can be expressed using the following
19179 intrinsics. Each one takes a vector operand as an input and applies its
19180 respective operation across all elements of the vector, returning a single
19181 scalar result of the same element type.
19183 .. _int_vector_reduce_add:
19185 '``llvm.vector.reduce.add.*``' Intrinsic
19186 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19193 declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
19194 declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
19199 The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
19200 reduction of a vector, returning the result as a scalar. The return type matches
19201 the element-type of the vector input.
19205 The argument to this intrinsic must be a vector of integer values.
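As a rough reference model, the integer reduction can be written as a loop over the elements. The Python sketch assumes the sum wraps modulo ``2^bits`` like the plain ``add`` instruction; ``vector_reduce_add`` and its ``bits`` parameter are hypothetical names.

```python
def vector_reduce_add(elements, bits):
    """Sum the elements as `bits`-wide integers with wrap-around."""
    mask = (1 << bits) - 1
    total = 0
    for element in elements:
        total = (total + element) & mask   # keep only the low `bits` bits
    return total


print(vector_reduce_add([1, 2, 3, 4], 32))      # 10
print(vector_reduce_add([0xFFFFFFFF, 1], 32))   # 0, the sum wraps around
```

Because wrapping integer addition is associative and commutative, the order in which elements are combined does not affect the result.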
19207 .. _int_vector_reduce_fadd:
19209 '``llvm.vector.reduce.fadd.*``' Intrinsic
19210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19217 declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
19218 declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
19223 The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
19224 ``ADD`` reduction of a vector, returning the result as a scalar. The return type
19225 matches the element-type of the vector input.
19227 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
19228 preserve the associativity of an equivalent scalarized counterpart. Otherwise
19229 the reduction will be *sequential*, thus implying that the operation respects
19230 the associativity of a scalarized reduction. That is, the reduction begins with
19231 the start value and performs an fadd operation with consecutively increasing
19232 vector element indices. See the following pseudocode:
19236 float sequential_fadd(start_value, input_vector)
19237 result = start_value
19238 for i = 0 to length(input_vector)
19239 result = result + input_vector[i]
19245 The first argument to this intrinsic is a scalar start value for the reduction.
19246 The type of the start value matches the element-type of the vector input.
19247 The second argument must be a vector of floating-point values.
19249 To ignore the start value, negative zero (``-0.0``) can be used, as it is
19250 the neutral value of floating point addition.
19257 %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
19258 %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
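The difference between the sequential and the relaxed form is observable because floating-point addition is not associative. The Python sketch below uses double precision; ``pairwise_fadd`` is a hypothetical helper showing just one of the many orders a ``reassoc`` reduction might use.

```python
import math


def sequential_fadd(start_value, input_vector):
    """Add the elements in increasing index order, starting from start_value."""
    result = start_value
    for element in input_vector:
        result = result + element
    return result


def pairwise_fadd(input_vector):
    """Add neighbouring pairs repeatedly (one possible reassociated order)."""
    values = list(input_vector)
    while len(values) > 1:
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])    # odd element carries over unchanged
        values = paired
    return values[0]


data = [1e16, 1.0, -1e16, 1.0]
print(sequential_fadd(-0.0, data))   # 1.0
print(pairwise_fadd(data))           # 0.0

# -0.0 is neutral even for a negative-zero input, whereas +0.0 is not.
print(math.copysign(1.0, sequential_fadd(-0.0, [-0.0])))  # -1.0
print(math.copysign(1.0, sequential_fadd(0.0, [-0.0])))   # 1.0
```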
19261 .. _int_vector_reduce_mul:
19263 '``llvm.vector.reduce.mul.*``' Intrinsic
19264 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19271 declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
19272 declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
19277 The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
19278 reduction of a vector, returning the result as a scalar. The return type matches
19279 the element-type of the vector input.
19283 The argument to this intrinsic must be a vector of integer values.
19285 .. _int_vector_reduce_fmul:
19287 '``llvm.vector.reduce.fmul.*``' Intrinsic
19288 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19295 declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
19296 declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
19301 The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
19302 ``MUL`` reduction of a vector, returning the result as a scalar. The return type
19303 matches the element-type of the vector input.
19305 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
19306 preserve the associativity of an equivalent scalarized counterpart. Otherwise
19307 the reduction will be *sequential*, thus implying that the operation respects
19308 the associativity of a scalarized reduction. That is, the reduction begins with
19309 the start value and performs an fmul operation with consecutively increasing
19310 vector element indices. See the following pseudocode:
19314 float sequential_fmul(start_value, input_vector)
19315 result = start_value
19316 for i = 0 to length(input_vector)
19317 result = result * input_vector[i]
19323 The first argument to this intrinsic is a scalar start value for the reduction.
19324 The type of the start value matches the element-type of the vector input.
19325 The second argument must be a vector of floating-point values.
19327 To ignore the start value, one (``1.0``) can be used, as it is the neutral
19328 value of floating point multiplication.
19335 %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
19336 %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
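
As a rough illustration of the two modes, the following C++ sketch models the
sequential reduction next to one ordering that a ``reassoc`` call would be free
to use. The function names and the pairwise tree order are assumptions made for
this example; a target may choose any other association.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    // Sequential form: ((start * v[0]) * v[1]) * ... in increasing index order.
    float sequential_fmul(float start, const std::vector<float> &v) {
      float result = start;
      for (std::size_t i = 0; i < v.size(); ++i)
        result *= v[i];
      return result;
    }

    // One ordering a 'reassoc' call may use (an assumption of this sketch):
    // multiply adjacent pairs as a tree, then fold in the start value.
    // Assumes a non-empty input.
    float pairwise_fmul(float start, std::vector<float> v) {
      while (v.size() > 1) {
        std::vector<float> next;
        for (std::size_t i = 0; i + 1 < v.size(); i += 2)
          next.push_back(v[i] * v[i + 1]);
        if (v.size() % 2 != 0)
          next.push_back(v.back());
        v = next;
      }
      return start * v[0];
    }

For most inputs both functions agree up to rounding, but the results can differ
substantially when an intermediate product overflows or underflows.
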
19338 .. _int_vector_reduce_and:
19340 '``llvm.vector.reduce.and.*``' Intrinsic
19341 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19348 declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
19353 The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
19354 reduction of a vector, returning the result as a scalar. The return type matches
19355 the element-type of the vector input.
19359 The argument to this intrinsic must be a vector of integer values.
19361 .. _int_vector_reduce_or:
19363 '``llvm.vector.reduce.or.*``' Intrinsic
19364 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19371 declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
19376 The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
19377 of a vector, returning the result as a scalar. The return type matches the
19378 element-type of the vector input.
19382 The argument to this intrinsic must be a vector of integer values.
19384 .. _int_vector_reduce_xor:
19386 '``llvm.vector.reduce.xor.*``' Intrinsic
19387 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19394 declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
19399 The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
19400 reduction of a vector, returning the result as a scalar. The return type matches
19401 the element-type of the vector input.
19405 The argument to this intrinsic must be a vector of integer values.
19407 .. _int_vector_reduce_smax:
19409 '``llvm.vector.reduce.smax.*``' Intrinsic
19410 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19417 declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
19422 The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
19423 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
19424 matches the element-type of the vector input.
19428 The argument to this intrinsic must be a vector of integer values.
19430 .. _int_vector_reduce_smin:
19432 '``llvm.vector.reduce.smin.*``' Intrinsic
19433 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19440 declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
19445 The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
19446 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
19447 matches the element-type of the vector input.
19451 The argument to this intrinsic must be a vector of integer values.
19453 .. _int_vector_reduce_umax:
19455 '``llvm.vector.reduce.umax.*``' Intrinsic
19456 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19463 declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
19468 The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
19469 integer ``MAX`` reduction of a vector, returning the result as a scalar. The
19470 return type matches the element-type of the vector input.
19474 The argument to this intrinsic must be a vector of integer values.
19476 .. _int_vector_reduce_umin:
19478 '``llvm.vector.reduce.umin.*``' Intrinsic
19479 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19486 declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
19491 The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
19492 integer ``MIN`` reduction of a vector, returning the result as a scalar. The
19493 return type matches the element-type of the vector input.
19497 The argument to this intrinsic must be a vector of integer values.
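
The signed and unsigned integer ``MIN``/``MAX`` reductions above differ only in
how the lane bits are interpreted. The following C++ sketch illustrates this
for the two ``MAX`` variants; the function names are hypothetical and the
input is assumed to be non-empty.

.. code-block:: cpp

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Signed maximum: lanes are compared as two's-complement values.
    std::int32_t reduce_smax(const std::vector<std::int32_t> &v) {
      return *std::max_element(v.begin(), v.end());
    }

    // Unsigned maximum: the same lane bits are compared as unsigned values.
    std::uint32_t reduce_umax(const std::vector<std::int32_t> &v) {
      std::uint32_t result = 0;
      for (std::int32_t lane : v)
        result = std::max(result, static_cast<std::uint32_t>(lane));
      return result;
    }

For the lanes ``<1, -1, 7>``, the signed reduction in this sketch yields ``7``
while the unsigned one yields the all-ones value contributed by the ``-1``
lane.
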
19499 .. _int_vector_reduce_fmax:
19501 '``llvm.vector.reduce.fmax.*``' Intrinsic
19502 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19509 declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
19510 declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
19515 The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
19516 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
19517 matches the element-type of the vector input.
19519 This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
19520 intrinsic. That is, the result will always be a number unless all elements of
19521 the vector are NaN. For a vector with maximum element magnitude 0.0 and
19522 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
19524 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
19525 assume that NaNs are not present in the input vector.
19529 The argument to this intrinsic must be a vector of floating-point values.
19531 .. _int_vector_reduce_fmin:
19533 '``llvm.vector.reduce.fmin.*``' Intrinsic
19534 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19538 This is an overloaded intrinsic.
19542 declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
19543 declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
19548 The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
19549 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
19550 matches the element-type of the vector input.
19552 This instruction has the same comparison semantics as the '``llvm.minnum.*``'
19553 intrinsic. That is, the result will always be a number unless all elements of
19554 the vector are NaN. For a vector with minimum element magnitude 0.0 and
19555 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
19557 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
19558 assume that NaNs are not present in the input vector.
19562 The argument to this intrinsic must be a vector of floating-point values.
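
The NaN handling shared by the ``fmax`` and ``fmin`` reductions can be
sketched in C++ for the ``MAX`` case. This is only an approximation for quiet
NaN inputs: it relies on ``std::fmax`` returning the non-NaN operand when
exactly one operand is NaN. The name ``reduce_fmax`` is hypothetical and the
input is assumed to be non-empty.

.. code-block:: cpp

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // The result is a number unless every lane is NaN.
    float reduce_fmax(const std::vector<float> &v) {
      float result = v[0];
      for (std::size_t i = 1; i < v.size(); ++i)
        result = std::fmax(result, v[i]);
      return result;
    }
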
19564 .. _int_vector_reduce_fmaximum:
19566 '``llvm.vector.reduce.fmaximum.*``' Intrinsic
19567 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19571 This is an overloaded intrinsic.
19575 declare float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %a)
19576 declare double @llvm.vector.reduce.fmaximum.v2f64(<2 x double> %a)
19581 The '``llvm.vector.reduce.fmaximum.*``' intrinsics do a floating-point
19582 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
19583 matches the element-type of the vector input.
19585 This instruction has the same comparison semantics as the '``llvm.maximum.*``'
19586 intrinsic. That is, this intrinsic propagates NaNs and +0.0 is considered
19587 greater than -0.0. If any element of the vector is a NaN, the result is NaN.
19591 The argument to this intrinsic must be a vector of floating-point values.
19593 .. _int_vector_reduce_fminimum:
19595 '``llvm.vector.reduce.fminimum.*``' Intrinsic
19596 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19600 This is an overloaded intrinsic.
19604 declare float @llvm.vector.reduce.fminimum.v4f32(<4 x float> %a)
19605 declare double @llvm.vector.reduce.fminimum.v2f64(<2 x double> %a)
19610 The '``llvm.vector.reduce.fminimum.*``' intrinsics do a floating-point
19611 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
19612 matches the element-type of the vector input.
19614 This instruction has the same comparison semantics as the '``llvm.minimum.*``'
19615 intrinsic. That is, this intrinsic propagates NaNs and -0.0 is considered less
19616 than +0.0. If any element of the vector is a NaN, the result is NaN.
19620 The argument to this intrinsic must be a vector of floating-point values.
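
In contrast to the ``fmax``/``fmin`` reductions, the ``fmaximum`` and
``fminimum`` reductions propagate NaNs and order the two zeros. A minimal C++
sketch of the ``MAX`` case follows; the helper names are hypothetical and the
input is assumed to be non-empty.

.. code-block:: cpp

    #include <cmath>
    #include <cstddef>
    #include <limits>
    #include <vector>

    // Maximum of two values where NaN propagates and +0.0 > -0.0.
    float maximum(float a, float b) {
      if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<float>::quiet_NaN();
      if (a == b)  // equal values differ at most in the sign of zero
        return std::signbit(a) ? b : a;
      return a > b ? a : b;
    }

    float reduce_fmaximum(const std::vector<float> &v) {
      float result = v[0];
      for (std::size_t i = 1; i < v.size(); ++i)
        result = maximum(result, v[i]);
      return result;
    }
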
19622 '``llvm.vector.insert``' Intrinsic
19623 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19627 This is an overloaded intrinsic.
19631 ; Insert fixed type into scalable type
19632 declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 <idx>)
19633 declare <vscale x 2 x double> @llvm.vector.insert.nxv2f64.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 <idx>)
19635 ; Insert scalable type into scalable type
declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.nxv2f32(<vscale x 4 x float> %vec, <vscale x 2 x float> %subvec, i64 <idx>)
19638 ; Insert fixed type into fixed type
19639 declare <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %vec, <2 x double> %subvec, i64 <idx>)
19644 The '``llvm.vector.insert.*``' intrinsics insert a vector into another vector
19645 starting from a given index. The return type matches the type of the vector we
insert into. Conceptually, this can be used to build a scalable vector out of
non-scalable vectors; however, this intrinsic can also be used on purely fixed
types.
19650 Scalable vectors can only be inserted into other scalable vectors.
19655 The ``vec`` is the vector which ``subvec`` will be inserted into.
19656 The ``subvec`` is the vector that will be inserted.
19658 ``idx`` represents the starting element number at which ``subvec`` will be
19659 inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
19660 vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
19661 the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
19662 ``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
19663 num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
19664 cannot be determined statically but is false at runtime, then the result vector
19665 is a :ref:`poison value <poisonvalues>`.
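
For fixed-width vectors, the overwrite described above can be modeled by the
following C++ sketch. The name ``vector_insert`` is hypothetical, and the
sketch does not model scalable vectors or the poison result for an invalid
range.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    // Overwrites vec[idx .. idx + subvec.size() - 1] with subvec.
    // Precondition (not checked here): idx is a multiple of subvec.size()
    // and the whole range lies inside vec.
    std::vector<int> vector_insert(std::vector<int> vec,
                                   const std::vector<int> &subvec,
                                   std::size_t idx) {
      for (std::size_t i = 0; i < subvec.size(); ++i)
        vec[idx + i] = subvec[i];
      return vec;
    }
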
19668 '``llvm.vector.extract``' Intrinsic
19669 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19673 This is an overloaded intrinsic.
19677 ; Extract fixed type from scalable type
19678 declare <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
19679 declare <2 x double> @llvm.vector.extract.v2f64.nxv2f64(<vscale x 2 x double> %vec, i64 <idx>)
19681 ; Extract scalable type from scalable type
19682 declare <vscale x 2 x float> @llvm.vector.extract.nxv2f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
19684 ; Extract fixed type from fixed type
19685 declare <2 x double> @llvm.vector.extract.v2f64.v4f64(<4 x double> %vec, i64 <idx>)
19690 The '``llvm.vector.extract.*``' intrinsics extract a vector from within another
19691 vector starting from a given index. The return type must be explicitly
specified. Conceptually, this can be used to decompose a scalable vector into
non-scalable parts; however, this intrinsic can also be used on purely fixed
types.
19696 Scalable vectors can only be extracted from other scalable vectors.
19701 The ``vec`` is the vector from which we will extract a subvector.
19703 The ``idx`` specifies the starting element number within ``vec`` from which a
19704 subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
19705 vector length of the result type. If the result type is a scalable vector,
19706 ``idx`` is first scaled by the result type's runtime scaling factor. Elements
19707 ``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
19708 indices. If this condition cannot be determined statically but is false at
19709 runtime, then the result vector is a :ref:`poison value <poisonvalues>`. The
19710 ``idx`` parameter must be a vector index constant type (for most targets this
19711 will be an integer pointer type).
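
The scaling of ``idx`` for a scalable result type can be illustrated with a
small C++ model. Here ``vscale`` is passed explicitly as a hypothetical
runtime parameter, the source vector holds ``vscale`` times its minimum number
of lanes, and ``m`` is the known-minimum length of the result type. The sketch
assumes the extracted range is valid.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    // Models extracting <vscale x m x i32> from a wider scalable vector.
    // idx is expressed in unscaled lanes and must be a multiple of m.
    std::vector<int> extract_scalable(const std::vector<int> &vec,
                                      std::size_t idx, std::size_t m,
                                      std::size_t vscale) {
      const std::size_t start = idx * vscale;  // idx scaled by vscale
      const std::size_t len = m * vscale;      // runtime length of the result
      return std::vector<int>(vec.begin() + start, vec.begin() + start + len);
    }

With ``vscale`` equal to 2, an 8-lane model of ``<vscale x 4 x i32>``, and
``idx`` equal to 2, this sketch returns the upper four lanes.
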
19713 '``llvm.vector.reverse``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19718 This is an overloaded intrinsic.
19722 declare <2 x i8> @llvm.vector.reverse.v2i8(<2 x i8> %a)
19723 declare <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
19728 The '``llvm.vector.reverse.*``' intrinsics reverse a vector.
19729 The intrinsic takes a single vector and returns a vector of matching type but
19730 with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic supports all vector types,
the recommended way to express this operation for fixed-width vectors is
still to use a shufflevector, as that may allow for more optimization
opportunities.
19739 The argument to this intrinsic must be a vector.
19741 '``llvm.vector.deinterleave2``' Intrinsic
19742 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19746 This is an overloaded intrinsic.
19750 declare {<2 x double>, <2 x double>} @llvm.vector.deinterleave2.v4f64(<4 x double> %vec1)
19751 declare {<vscale x 4 x i32>, <vscale x 4 x i32>} @llvm.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> %vec1)
19756 The '``llvm.vector.deinterleave2``' intrinsic constructs two
19757 vectors by deinterleaving the even and odd lanes of the input vector.
19759 This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
19761 fixed-width vectors is still to use a shufflevector, as that may allow for more
19762 optimization opportunities.

.. code-block:: text

     {<2 x i64>, <2 x i64>} llvm.vector.deinterleave2.v4i64(<4 x i64> <i64 0, i64 1, i64 2, i64 3>); ==> {<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>}

19773 The argument is a vector whose type corresponds to the logical concatenation of
19774 the two result types.
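
The even/odd split can be modeled with the following C++ sketch for a vector
with an even number of lanes. The function name is hypothetical.

.. code-block:: cpp

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Returns {even lanes, odd lanes} of the input.
    std::pair<std::vector<int>, std::vector<int>>
    deinterleave2(const std::vector<int> &vec) {
      std::vector<int> even, odd;
      for (std::size_t i = 0; i + 1 < vec.size(); i += 2) {
        even.push_back(vec[i]);
        odd.push_back(vec[i + 1]);
      }
      return {even, odd};
    }
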
19776 '``llvm.vector.interleave2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19781 This is an overloaded intrinsic.
19785 declare <4 x double> @llvm.vector.interleave2.v4f64(<2 x double> %vec1, <2 x double> %vec2)
19786 declare <vscale x 8 x i32> @llvm.vector.interleave2.nxv8i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2)
19791 The '``llvm.vector.interleave2``' intrinsic constructs a vector
19792 by interleaving two input vectors.
19794 This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
19796 fixed-width vectors is still to use a shufflevector, as that may allow for more
19797 optimization opportunities.

.. code-block:: text

     <4 x i64> llvm.vector.interleave2.v4i64(<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>); ==> <4 x i64> <i64 0, i64 1, i64 2, i64 3>

19807 Both arguments must be vectors of the same type whereby their logical
19808 concatenation matches the result type.
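
A minimal C++ model of the interleaving, assuming both inputs have the same
length, could look as follows. The function name is hypothetical.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    // Result lane 2*i comes from a[i], result lane 2*i + 1 from b[i].
    std::vector<int> interleave2(const std::vector<int> &a,
                                 const std::vector<int> &b) {
      std::vector<int> out;
      for (std::size_t i = 0; i < a.size(); ++i) {
        out.push_back(a[i]);
        out.push_back(b[i]);
      }
      return out;
    }
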
19810 '``llvm.experimental.cttz.elts``' Intrinsic
19811 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.experimental.cttz.elts``
on any vector of integer elements, both fixed width and scalable.
19821 declare i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1> <src>, i1 <is_zero_poison>)
19826 The '``llvm.experimental.cttz.elts``' intrinsic counts the number of trailing
19827 zero elements of a vector.
19832 The first argument is the vector to be counted. This argument must be a vector
19833 with integer element type. The return type must also be an integer type which is
19834 wide enough to hold the maximum number of elements of the source vector. The
19835 behavior of this intrinsic is undefined if the return type is not wide enough
19836 for the number of elements in the input vector.
19838 The second argument is a constant flag that indicates whether the intrinsic
19839 returns a valid result if the first argument is all zero. If the first argument
19840 is all zero and the second argument is true, the result is poison.
The '``llvm.experimental.cttz.elts``' intrinsic counts the trailing (least
significant) zero elements in a vector. If all elements of ``src`` are zero
and ``is_zero_poison`` is false, the result is the number of elements in the
input vector.
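
The counting rule can be sketched in C++ as follows. The function name is
hypothetical, and an empty ``std::optional`` is used here only as a stand-in
for a poison result.

.. code-block:: cpp

    #include <cstddef>
    #include <optional>
    #include <vector>

    // Counts zero lanes starting at lane 0 until the first non-zero lane.
    std::optional<std::size_t> cttz_elts(const std::vector<int> &src,
                                         bool is_zero_poison) {
      for (std::size_t i = 0; i < src.size(); ++i)
        if (src[i] != 0)
          return i;
      if (is_zero_poison)
        return std::nullopt;  // stand-in for poison
      return src.size();
    }
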
19849 '``llvm.vector.splice``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19854 This is an overloaded intrinsic.
19858 declare <2 x double> @llvm.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
19859 declare <vscale x 4 x i32> @llvm.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
19864 The '``llvm.vector.splice.*``' intrinsics construct a vector by
19865 concatenating elements from the first input vector with elements of the second
19866 input vector, returning a vector of the same type as the input vectors. The
19867 signed immediate, modulo the number of elements in the vector, is the index
19868 into the first vector from which to extract the result value. This means
19869 conceptually that for a positive immediate, a vector is extracted from
19870 ``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
19871 immediate, it extracts ``-imm`` trailing elements from the first vector, and
19872 the remaining elements from ``%vec2``.
19874 These intrinsics work for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
19876 fixed-width vectors is still to use a shufflevector, as that may allow for more
19877 optimization opportunities.

.. code-block:: text

     llvm.vector.splice(<A,B,C,D>, <E,F,G,H>, 1);  ==> <B, C, D, E>  ; index
     llvm.vector.splice(<A,B,C,D>, <E,F,G,H>, -3); ==> <B, C, D, E>  ; trailing elements

19890 The first two operands are vectors with the same type. The start index is imm
19891 modulo the runtime number of elements in the source vector. For a fixed-width
19892 vector <N x eltty>, imm is a signed integer constant in the range
19893 -N <= imm < N. For a scalable vector <vscale x N x eltty>, imm is a signed
19894 integer constant in the range -X <= imm < X where X=vscale_range_min * N.
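
For fixed-width vectors, the handling of positive and negative immediates can
be modeled by the following C++ sketch. The function name is hypothetical, and
``imm`` is assumed to lie in the range ``-N <= imm < N``.

.. code-block:: cpp

    #include <vector>

    // Extracts N lanes from concat(v1, v2). A negative imm selects the last
    // -imm lanes of v1 followed by the leading lanes of v2.
    std::vector<char> splice(const std::vector<char> &v1,
                             const std::vector<char> &v2, int imm) {
      std::vector<char> concat(v1);
      concat.insert(concat.end(), v2.begin(), v2.end());
      const int n = static_cast<int>(v1.size());
      const int start = imm >= 0 ? imm : n + imm;
      return std::vector<char>(concat.begin() + start,
                               concat.begin() + start + n);
    }
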
19896 '``llvm.stepvector``' Intrinsic
19897 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19899 This is an overloaded intrinsic. You can use ``llvm.stepvector``
19900 to generate a vector whose lane values comprise the linear sequence
19901 <0, 1, 2, ...>. It is primarily intended for scalable vectors.
19905 declare <vscale x 4 x i32> @llvm.stepvector.nxv4i32()
19906 declare <vscale x 8 x i16> @llvm.stepvector.nxv8i16()
19908 The '``llvm.stepvector``' intrinsics are used to create vectors
19909 of integers whose elements contain a linear sequence of values starting from 0
19910 with a step of 1. This intrinsic can only be used for vectors with integer
elements that are at least 8 bits in size. If the sequence value exceeds
the allowed limit for the element type then the result for that lane is
a poison value.
19915 These intrinsics work for both fixed and scalable vectors. While this intrinsic
19916 supports all vector types, the recommended way to express this operation for
19917 fixed-width vectors is still to generate a constant vector instead.
19926 '``llvm.experimental.get.vector.length``' Intrinsic
19927 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19931 This is an overloaded intrinsic.
19935 declare i32 @llvm.experimental.get.vector.length.i32(i32 %cnt, i32 immarg %vf, i1 immarg %scalable)
19936 declare i32 @llvm.experimental.get.vector.length.i64(i64 %cnt, i32 immarg %vf, i1 immarg %scalable)
The '``llvm.experimental.get.vector.length.*``' intrinsics take a number of
elements to process and return how many of the elements can be processed
with the requested vectorization factor.
19948 The first argument is an unsigned value of any scalar integer type and specifies
19949 the total number of elements to be processed. The second argument is an i32
19950 immediate for the vectorization factor. The third argument indicates if the
19951 vectorization factor should be multiplied by vscale.
19956 Returns a non-negative i32 value (explicit vector length) that is unknown at compile
19957 time and depends on the hardware specification.
19958 If the result value does not fit in the result type, then the result is
19959 a :ref:`poison value <poisonvalues>`.
19961 This intrinsic is intended to be used by loop vectorization with VP intrinsics
19962 in order to get the number of elements to process on each loop iteration. The
19963 result should be used to decrease the count for the next iteration until the
19964 count reaches zero.
19966 Let ``%max_lanes`` be the number of lanes in the type described by ``%vf`` and
19967 ``%scalable``, here are the constraints on the returned value:

- If ``%cnt`` equals 0, returns 0.
- The returned value is always less than or equal to ``%max_lanes``.
- The returned value is always greater than or equal to ``ceil(%cnt / ceil(%cnt / %max_lanes))``,
  if ``%cnt`` is non-zero.
- The returned values are monotonically non-increasing in each loop iteration. That is,
  the returned value of an iteration is at least as large as that of any later
  iteration.

Note that this has the following implications:

- For a loop that uses this intrinsic, the number of iterations is equal to
  ``ceil(%C / %max_lanes)`` where ``%C`` is the initial ``%cnt`` value.
- If ``%cnt`` is non-zero, the return value is non-zero as well.
- If ``%cnt`` is less than or equal to ``%max_lanes``, the return value is equal to ``%cnt``.
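
One result that satisfies all of the constraints above is
``min(%cnt, %max_lanes)``. The C++ sketch below uses that choice purely for
illustration; a target is free to return other conforming values. The
``vscale`` parameter and both function names are hypothetical, and ``vf`` is
assumed to be positive.

.. code-block:: cpp

    #include <algorithm>
    #include <cstdint>

    // One conforming result: min(cnt, max_lanes).
    std::uint32_t get_vector_length(std::uint64_t cnt, std::uint32_t vf,
                                    bool scalable, std::uint32_t vscale) {
      const std::uint64_t max_lanes =
          scalable ? std::uint64_t{vf} * vscale : vf;
      return static_cast<std::uint32_t>(std::min(cnt, max_lanes));
    }

    // Strip-mined loop: counts the iterations needed to consume cnt elements.
    std::uint64_t count_iterations(std::uint64_t cnt, std::uint32_t vf,
                                   bool scalable, std::uint32_t vscale) {
      std::uint64_t iterations = 0;
      while (cnt != 0) {
        cnt -= get_vector_length(cnt, vf, scalable, vscale);
        ++iterations;
      }
      return iterations;
    }

In this model, processing 10 elements with a fixed vectorization factor of 4
takes ``ceil(10 / 4)`` iterations.
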
19984 '``llvm.experimental.vector.partial.reduce.add.*``' Intrinsic
19985 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19989 This is an overloaded intrinsic.
19993 declare <4 x i32> @llvm.experimental.vector.partial.reduce.add.v4i32.v4i32.v8i32(<4 x i32> %a, <8 x i32> %b)
19994 declare <4 x i32> @llvm.experimental.vector.partial.reduce.add.v4i32.v4i32.v16i32(<4 x i32> %a, <16 x i32> %b)
19995 declare <vscale x 4 x i32> @llvm.experimental.vector.partial.reduce.add.nxv4i32.nxv4i32.nxv8i32(<vscale x 4 x i32> %a, <vscale x 8 x i32> %b)
19996 declare <vscale x 4 x i32> @llvm.experimental.vector.partial.reduce.add.nxv4i32.nxv4i32.nxv16i32(<vscale x 4 x i32> %a, <vscale x 16 x i32> %b)
The '``llvm.experimental.vector.partial.reduce.add.*``' intrinsics reduce the
20002 concatenation of the two vector operands down to the number of elements dictated
20003 by the result type. The result type is a vector type that matches the type of the
20004 first operand vector.
Both arguments must be vectors of matching element types. The first argument type must
match the result type, while the second argument type must have a vector length that is a
positive integer multiple of the vector length of the first argument/result type. The
arguments must be either both fixed or both scalable vectors.
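
The text above does not fix which input lanes contribute to which result lane.
The C++ sketch below therefore shows just one possible choice, in which wide
lane ``i`` is added to accumulator lane ``i % n``; this mapping and the
function name are assumptions of the example. Unsigned arithmetic is used so
that overflow wraps.

.. code-block:: cpp

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // acc models the first operand, wide the second; wide.size() is a
    // positive multiple of acc.size().
    std::vector<std::uint32_t>
    partial_reduce_add(std::vector<std::uint32_t> acc,
                       const std::vector<std::uint32_t> &wide) {
      for (std::size_t i = 0; i < wide.size(); ++i)
        acc[i % acc.size()] += wide[i];
      return acc;
    }

Whatever mapping is used, the sum over all result lanes equals the sum over
all lanes of both operands, which is what a later full reduction relies on.
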
20015 '``llvm.experimental.vector.histogram.*``' Intrinsic
20016 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20018 These intrinsics are overloaded.
20020 These intrinsics represent histogram-like operations; that is, updating values
20021 in memory that may not be contiguous, and where multiple elements within a
20022 single vector may be updating the same value in memory.
The update operation must be specified as part of the intrinsic name. For a
simple histogram like the following, the ``add`` operation would be used.

.. code-block:: c

    void simple_histogram(int *restrict buckets, unsigned *indices, int N, int inc) {
      for (int i = 0; i < N; ++i)
        buckets[indices[i]] += inc;
    }

20034 More update operation types may be added in the future.
20038 declare void @llvm.experimental.vector.histogram.add.v8p0.i32(<8 x ptr> %ptrs, i32 %inc, <8 x i1> %mask)
20039 declare void @llvm.experimental.vector.histogram.add.nxv2p0.i64(<vscale x 2 x ptr> %ptrs, i64 %inc, <vscale x 2 x i1> %mask)
20044 The first argument is a vector of pointers to the memory locations to be
20045 updated. The second argument is a scalar used to update the value from
20046 memory; it must match the type of value to be updated. The final argument
20047 is a mask value to exclude locations from being modified.
20052 The '``llvm.experimental.vector.histogram.*``' intrinsics are used to perform
20053 updates on potentially overlapping values in memory. The intrinsics represent
the following sequence of operations:

1. Gather load from the ``ptrs`` operand, with element type matching that of
   the ``inc`` operand.
2. Update of the values loaded from memory. In the case of the ``add``
   update operation, this means:

   1. Perform a cross-vector histogram operation on the ``ptrs`` operand.
   2. Multiply the result by the ``inc`` operand.
   3. Add the result to the values loaded from memory.

3. Scatter the result of the update operation to the memory locations from
   the ``ptrs`` operand.

20067 The ``mask`` operand will apply to at least the gather and scatter operations.
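
The sequence of operations for the ``add`` update can be sketched in C++ as
follows. Memory is modeled as a vector of buckets and pointers as indices into
it; both are simplifications made for this example, as is the function name.
The sketch also assumes that masked-off lanes are excluded from the histogram
count.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    std::vector<int> histogram_add(std::vector<int> memory,
                                   const std::vector<std::size_t> &ptrs,
                                   int inc, const std::vector<bool> &mask) {
      const std::size_t n = ptrs.size();
      std::vector<int> loaded(n, 0), counts(n, 0);
      for (std::size_t i = 0; i < n; ++i) {
        if (!mask[i])
          continue;
        loaded[i] = memory[ptrs[i]];          // 1. gather
        for (std::size_t j = 0; j < n; ++j)   // 2.1 count lanes sharing ptrs[i]
          if (mask[j] && ptrs[j] == ptrs[i])
            ++counts[i];
      }
      for (std::size_t i = 0; i < n; ++i)
        if (mask[i])                          // 2.2, 2.3 update and 3. scatter
          memory[ptrs[i]] = loaded[i] + counts[i] * inc;
      return memory;
    }

Lanes that refer to the same location all scatter the same updated value, so
the result does not depend on the order of the stores.
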
20069 '``llvm.experimental.vector.extract.last.active``' Intrinsic
20070 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20072 This is an overloaded intrinsic.
20076 declare i32 @llvm.experimental.vector.extract.last.active.v4i32(<4 x i32> %data, <4 x i1> %mask, i32 %passthru)
20077 declare i16 @llvm.experimental.vector.extract.last.active.nxv8i16(<vscale x 8 x i16> %data, <vscale x 8 x i1> %mask, i16 %passthru)
20082 The first argument is the data vector to extract a lane from. The second is a
mask vector controlling the extraction. The third argument is a passthru
value.
20086 The two input vectors must have the same number of elements, and the type of
20087 the passthru value must match that of the elements of the data vector.
20092 The '``llvm.experimental.vector.extract.last.active``' intrinsic will extract an
20093 element from the data vector at the index matching the highest active lane of
the mask vector. If no mask lanes are active then the passthru value is
returned instead.
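
A scalar C++ model of this selection, with a hypothetical function name,
could look as follows.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    // Scans from the highest lane down and returns the first active element.
    int extract_last_active(const std::vector<int> &data,
                            const std::vector<bool> &mask, int passthru) {
      for (std::size_t i = data.size(); i-- > 0;)
        if (mask[i])
          return data[i];
      return passthru;
    }
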
20097 .. _int_vector_compress:
20099 '``llvm.experimental.vector.compress.*``' Intrinsics
20100 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20102 LLVM provides an intrinsic for compressing data within a vector based on a selection mask.
20103 Semantically, this is similar to :ref:`llvm.masked.compressstore <int_compressstore>` but with weaker assumptions
20104 and without storing the results to memory, i.e., the data remains in the vector.
20108 This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected
20109 from an input vector and placed adjacently within the result vector. A mask defines which elements to collect from the vector.
20110 The remaining lanes are filled with values from ``passthru``.

.. code-block:: llvm

      declare <8 x i32> @llvm.experimental.vector.compress.v8i32(<8 x i32> <value>, <8 x i1> <mask>, <8 x i32> <passthru>)
      declare <16 x float> @llvm.experimental.vector.compress.v16f32(<16 x float> <value>, <16 x i1> <mask>, <16 x float> undef)

20120 Selects elements from input vector ``value`` according to the ``mask``.
20121 All selected elements are written into adjacent lanes in the result vector,
20122 from lower to higher.
The mask holds an entry for each vector lane, and is used to select elements
to be kept.
20125 If a ``passthru`` vector is given, all remaining lanes are filled with the
20126 corresponding lane's value from ``passthru``.
The main difference to :ref:`llvm.masked.compressstore <int_compressstore>` is
that we do not need to guard against memory access for unselected lanes.
This allows for branchless code and better optimization for all targets that
either do not support the explicit semantics of
:ref:`llvm.masked.compressstore <int_compressstore>` or implement them
inefficiently, but still have some form of compress operation.
20134 The result vector can be written with a similar effect, as all the selected
20135 values are at the lower positions of the vector, but without requiring
20136 branches to avoid writes where the mask is ``false``.
20141 The first operand is the input vector, from which elements are selected.
20142 The second operand is the mask, a vector of boolean values.
20143 The third operand is the passthru vector, from which elements are filled
20144 into remaining lanes.
20145 The mask and the input vector must have the same number of vector elements.
20146 The input and passthru vectors must have the same type.
20151 The ``llvm.experimental.vector.compress`` intrinsic compresses data within a vector.
20152 It collects elements from possibly non-adjacent lanes of a vector and places
20153 them contiguously in the result vector based on a selection mask, filling the
20154 remaining lanes with values from ``passthru``.
20155 This intrinsic performs the logic of the following C++ example.
20156 All values in ``out`` after the last selected one are undefined if
20157 ``passthru`` is undefined.
20158 If all entries in the ``mask`` are 0, the ``out`` vector is ``passthru``.
20159 If any element of the mask is poison, all elements of the result are poison.
20160 Otherwise, if any element of the mask is undef, all elements of the result are undef.
If ``passthru`` is undefined, the number of valid lanes is equal to the number
of ``true`` entries in the mask, i.e., all lanes >= number-of-selected-values
are undefined.

.. code-block:: cpp

    // Consecutively place selected values in a vector.
    using VecT __attribute__((vector_size(N))) = int;
    VecT compress(VecT vec, VecT mask, VecT passthru) {
      VecT out;
      int idx = 0;
      for (int i = 0; i < N / sizeof(int); ++i) {
        out[idx] = vec[i];
        idx += static_cast<bool>(mask[i]);
      }
      for (; idx < N / sizeof(int); ++idx) {
        out[idx] = passthru[idx];
      }
      return out;
    }

20183 '``llvm.experimental.vector.match.*``' Intrinsic
20184 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20189 This is an overloaded intrinsic.
20193 declare <<n> x i1> @llvm.experimental.vector.match(<<n> x <ty>> %op1, <<m> x <ty>> %op2, <<n> x i1> %mask)
20194 declare <vscale x <n> x i1> @llvm.experimental.vector.match(<vscale x <n> x <ty>> %op1, <<m> x <ty>> %op2, <vscale x <n> x i1> %mask)
20199 Find active elements of the first argument matching any elements of the second.
20204 The first argument is the search vector, the second argument the vector of
20205 elements we are searching for (i.e. for which we consider a match successful),
20206 and the third argument is a mask that controls which elements of the first
20207 argument are active. The first two arguments must be vectors of matching
20208 integer element types. The first and third arguments and the result type must
20209 have matching element counts (fixed or scalable). The second argument must be a
20210 fixed vector, but its length may be different from the remaining arguments.
20215 The '``llvm.experimental.vector.match``' intrinsic compares each active element
20216 in the first argument against the elements of the second argument, placing
20217 ``1`` in the corresponding element of the output vector if any equality
20218 comparison is successful, and ``0`` otherwise. Inactive elements in the mask
20219 are set to ``0`` in the output.
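
For fixed-width operands, the comparison can be modeled with the following
C++ sketch. The function name is hypothetical.

.. code-block:: cpp

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // out[i] is true if lane i is active and op1[i] equals any lane of op2.
    std::vector<bool> vector_match(const std::vector<int> &op1,
                                   const std::vector<int> &op2,
                                   const std::vector<bool> &mask) {
      std::vector<bool> out(op1.size(), false);
      for (std::size_t i = 0; i < op1.size(); ++i)
        out[i] = mask[i] &&
                 std::find(op2.begin(), op2.end(), op1[i]) != op2.end();
      return out;
    }
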
20224 Operations on matrixes requiring shape information (like number of rows/columns
20225 or the memory layout) can be expressed using the matrix intrinsics. These
20226 intrinsics require matrix dimensions to be passed as immediate arguments, and
matrixes are passed and returned as vectors. This means that for an ``R`` x
20228 ``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
20229 corresponding vector, with indices starting at 0. Currently column-major layout
20230 is assumed. The intrinsics support both integer and floating point matrixes.
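
The index computation for the column-major layout can be written out as a
short C++ helper. The function name is hypothetical.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    // Returns element i of column j of a matrix with the given number of
    // rows, stored column-major in the flat vector m.
    int matrix_element(const std::vector<int> &m, std::size_t i,
                       std::size_t j, std::size_t rows) {
      return m[j * rows + i];
    }

For a 2 x 3 matrix stored as ``<1, 2, 3, 4, 5, 6>``, the columns are
``(1, 2)``, ``(3, 4)`` and ``(5, 6)``, so element 1 of column 2 is found at
index ``2 * 2 + 1``.
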
20233 '``llvm.matrix.transpose.*``' Intrinsic
20234 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20238 This is an overloaded intrinsic.
20242 declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
20247 The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
20248 <Cols>`` matrix and return the transposed matrix in the result vector.
20253 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
20254 <Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
20255 number of rows and columns, respectively, and must be positive, constant
20256 integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
20257 the same float or integer element type as ``%In``.
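
A reference model of the transpose on the flattened column-major
representation might look like the following C++ sketch. The function name is
hypothetical.

.. code-block:: cpp

    #include <cstddef>
    #include <vector>

    // in is a rows x cols matrix; the result is a cols x rows matrix.
    // Both are stored column-major.
    std::vector<int> matrix_transpose(const std::vector<int> &in,
                                      std::size_t rows, std::size_t cols) {
      std::vector<int> out(rows * cols);
      for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
          // in(i, j) is at j * rows + i; out(j, i) is at i * cols + j.
          out[i * cols + j] = in[j * rows + i];
      return out;
    }
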
20259 '``llvm.matrix.multiply.*``' Intrinsic
20260 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20264 This is an overloaded intrinsic.
20268 declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
20273 The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
20274 <Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.
20280 The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
20281 <Inner>`` elements, and the second argument ``%B`` to a matrix with
20282 ``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
20283 ``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
20284 returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
20285 Vectors ``%A``, ``%B``, and the returned vector all have the same float or
20286 integer element type.
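
A sketch of multiplying a ``2`` x ``3`` matrix by a ``3`` x ``2`` matrix
follows. The operand names are hypothetical, and the suffix
``.v4f32.v6f32.v6f32`` assumes the overload is mangled on the result type
followed by the two operand types:

.. code-block:: llvm

      declare <4 x float> @llvm.matrix.multiply.v4f32.v6f32.v6f32(<6 x float>, <6 x float>, i32, i32, i32)

      ; %a is 2 x 3 (OuterRows = 2, Inner = 3) and %b is 3 x 2 (OuterColumns = 2),
      ; so the result has 2 * 2 = 4 elements.
      %c = call <4 x float> @llvm.matrix.multiply.v4f32.v6f32.v6f32(<6 x float> %a, <6 x float> %b, i32 2, i32 3, i32 2)
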
20289 '``llvm.matrix.column.major.load.*``' Intrinsic
20290 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20294 This is an overloaded intrinsic.
20298 declare vectorty @llvm.matrix.column.major.load.*(
20299 ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
20304 The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
20305 matrix using a stride of ``%Stride`` to compute the start address of the
20306 different columns. The offset is computed using ``%Stride``'s bitwidth. This
20307 allows for convenient loading of sub matrixes. If ``<IsVolatile>`` is true, the
20308 intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
20309 matrix is returned in the result vector. If the ``%Ptr`` argument is known to
be aligned to some boundary, this can be specified as an attribute on the
argument.
The first argument ``%Ptr`` is a pointer to the returned vector type, and
corresponds to the start address to load from. The second argument ``%Stride``
is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
``<IsVolatile>`` is a boolean value. The fourth and fifth arguments,
20322 ``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
20323 respectively, and must be positive, constant integers. The returned vector must
20324 have ``<Rows> * <Cols>`` elements.
20326 The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
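
The sketch below loads a ``2`` x ``3`` sub-matrix out of a larger column-major
matrix that has 4 rows, which is why the stride is 4 rather than 2. The value
names, the ``.v6f64.i64`` suffix, and the assumption that the stride counts
elements are illustrative only:

.. code-block:: llvm

      declare <6 x double> @llvm.matrix.column.major.load.v6f64.i64(ptr, i64, i1, i32, i32)

      ; Column C of the sub-matrix starts at %ptr + C * 4.
      ; The align attribute records that %ptr is known to be 8-byte aligned.
      %sub = call <6 x double> @llvm.matrix.column.major.load.v6f64.i64(ptr align 8 %ptr, i64 4, i1 false, i32 2, i32 3)
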
20330 '``llvm.matrix.column.major.store.*``' Intrinsic
20331 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20338 declare void @llvm.matrix.column.major.store.*(
20339 vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
20344 The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
20345 <Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
20346 columns. The offset is computed using ``%Stride``'s bitwidth. If
20347 ``<IsVolatile>`` is true, the intrinsic is considered a
20348 :ref:`volatile memory access <volatile>`.
20350 If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
20351 specified as an attribute on the argument.
20356 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
20357 <Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
20358 pointer to the vector type of ``%In``, and is the start address of the matrix
20359 in memory. The third argument ``%Stride`` is a positive, constant integer with
20360 ``%Stride >= <Rows>``. ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
20362 with ``%Ptr + C * %Stride``. The fourth argument ``<IsVolatile>`` is a boolean
20363 value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
20364 and columns, respectively, and must be positive, constant integers.
20366 The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.
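
A matching store might look like the following sketch, which writes a ``2`` x
``3`` matrix into a larger column-major matrix with 4 rows. As with the load,
the value names and the ``.v6f64.i64`` suffix are assumptions:

.. code-block:: llvm

      declare void @llvm.matrix.column.major.store.v6f64.i64(<6 x double>, ptr, i64, i1, i32, i32)

      ; Column C of %m is written starting at %ptr + C * 4; the access is not volatile.
      call void @llvm.matrix.column.major.store.v6f64.i64(<6 x double> %m, ptr align 8 %ptr, i64 4, i1 false, i32 2, i32 3)
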
20370 Half Precision Floating-Point Intrinsics
20371 ----------------------------------------
20373 For most target platforms, half precision floating-point is a
20374 storage-only format. This means that it is a dense encoding (in memory)
20375 but does not support computation in the format.
20377 This means that code must first load the half-precision floating-point
20378 value as an i16, then convert it to float with
20379 :ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
20380 then be performed on the float value (including extending to double
20381 etc). To store the value back to memory, it is first converted to float
20382 if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, then stored as an
i16 value.
20386 .. _int_convert_to_fp16:
20388 '``llvm.convert.to.fp16``' Intrinsic
20389 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20396 declare i16 @llvm.convert.to.fp16.f32(float %a)
20397 declare i16 @llvm.convert.to.fp16.f64(double %a)
20402 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
20403 conventional floating-point type to half precision floating-point format.
The intrinsic function takes a single argument: the value to be
converted.
20414 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
20415 conventional floating-point format to half precision floating-point format. The
20416 return value is an ``i16`` which contains the converted number.

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, ptr @x, align 2

20426 .. _int_convert_from_fp16:
20428 '``llvm.convert.from.fp16``' Intrinsic
20429 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20436 declare float @llvm.convert.from.fp16.f32(i16 %a)
20437 declare double @llvm.convert.from.fp16.f64(i16 %a)
20442 The '``llvm.convert.from.fp16``' intrinsic function performs a
20443 conversion from half precision floating-point format to single precision
20444 floating-point format.
The intrinsic function takes a single argument: the value to be
converted.
The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.

.. code-block:: llvm

      %a = load i16, ptr @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

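
Putting the two intrinsics together, the load/compute/store sequence described
at the start of this section could be sketched as follows. The global ``@x``
is taken from the examples above; the doubling step and the value names are
made up for illustration:

.. code-block:: llvm

      %h   = load i16, ptr @x, align 2                        ; raw half-precision bits
      %f   = call float @llvm.convert.from.fp16.f32(i16 %h)   ; widen to float
      %dbl = fmul float %f, 2.0                               ; compute in float
      %res = call i16 @llvm.convert.to.fp16.f32(float %dbl)   ; narrow back to half
      store i16 %res, ptr @x, align 2
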
20468 Saturating floating-point to integer conversions
20469 ------------------------------------------------
20471 The ``fptoui`` and ``fptosi`` instructions return a
20472 :ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
20473 representable by the result type. These intrinsics provide an alternative
20474 conversion, which will saturate towards the smallest and largest representable
20475 integer values instead.
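
The difference can be seen by converting an out-of-range constant both ways.
This is only a sketch; the value names are hypothetical:

.. code-block:: llvm

      %p = fptosi float 999.0 to i8                       ; poison: 999 is not representable in i8
      %s = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)   ; yields i8: 127
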
20477 '``llvm.fptoui.sat.*``' Intrinsic
20478 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20483 This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
20484 floating-point argument type and any integer result type, or vectors thereof.
20485 Not all targets may support all types, however.
20489 declare i32 @llvm.fptoui.sat.i32.f32(float %f)
20490 declare i19 @llvm.fptoui.sat.i19.f64(double %f)
20491 declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
This intrinsic converts the argument into an unsigned integer using saturating
semantics.
20502 The argument may be any floating-point or vector of floating-point type. The
20503 return value may be any integer or vector of integer type. The number of vector
20504 elements in argument and return must be the same.
20509 The conversion to integer is performed subject to the following rules:
20511 - If the argument is any NaN, zero is returned.
- If the argument is smaller than zero (this includes negative infinity),
  zero is returned.
20514 - If the argument is larger than the largest representable unsigned integer of
20515 the result type (this includes positive infinity), the largest representable
20516 unsigned integer is returned.
20517 - Otherwise, the result of rounding the argument towards zero is returned.
20522 .. code-block:: text
20524 %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.875) ; yields i8: 123
20525 %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.75) ; yields i8: 0
20526 %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0) ; yields i8: 255
20527 %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0
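
The same rules apply lane by lane to vectors. The sketch below feeds the four
scalar inputs above through a single vector call; the ``.v4i8.v4f32`` suffix
assumes the mangling pattern shown in the declarations, and the last lane uses
a quiet NaN bit pattern:

.. code-block:: llvm

      %v = call <4 x i8> @llvm.fptoui.sat.v4i8.v4f32(<4 x float> <float 123.875, float -5.75, float 377.0, float 0x7FF8000000000000>)
      ; lanes, read as unsigned values: 123, 0, 255, 0
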
20529 '``llvm.fptosi.sat.*``' Intrinsic
20530 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20535 This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
20536 floating-point argument type and any integer result type, or vectors thereof.
20537 Not all targets may support all types, however.
20541 declare i32 @llvm.fptosi.sat.i32.f32(float %f)
20542 declare i19 @llvm.fptosi.sat.i19.f64(double %f)
20543 declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
This intrinsic converts the argument into a signed integer using saturating
semantics.
20554 The argument may be any floating-point or vector of floating-point type. The
20555 return value may be any integer or vector of integer type. The number of vector
20556 elements in argument and return must be the same.
20561 The conversion to integer is performed subject to the following rules:
20563 - If the argument is any NaN, zero is returned.
20564 - If the argument is smaller than the smallest representable signed integer of
20565 the result type (this includes negative infinity), the smallest
20566 representable signed integer is returned.
20567 - If the argument is larger than the largest representable signed integer of
20568 the result type (this includes positive infinity), the largest representable
20569 signed integer is returned.
20570 - Otherwise, the result of rounding the argument towards zero is returned.
20575 .. code-block:: text
20577 %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.875) ; yields i8: 23
20578 %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.75) ; yields i8: -128
20579 %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0) ; yields i8: 127
20580 %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0
20582 Convergence Intrinsics
20583 ----------------------
20585 The LLVM convergence intrinsics for controlling the semantics of ``convergent``
20586 operations, which all start with the ``llvm.experimental.convergence.``
20587 prefix, are described in the :doc:`ConvergentOperations` document.
20589 .. _dbg_intrinsics:
20591 Debugger Intrinsics
20592 -------------------
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
20599 Exception Handling Intrinsics
20600 -----------------------------
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
20606 Pointer Authentication Intrinsics
20607 ---------------------------------
The LLVM pointer authentication intrinsics (which all start with the
``llvm.ptrauth.`` prefix) are described in the `Pointer Authentication
<PointerAuth.html#intrinsics>`_ document.
20613 .. _int_trampoline:
20615 Trampoline Intrinsics
20616 ---------------------
20618 These intrinsics make it possible to excise one parameter, marked with
20619 the :ref:`nest <nest>` attribute, from a function. The result is a
20620 callable function pointer lacking the nest parameter - the caller does
20621 not need to provide a value for it. Instead, the value to use is stored
20622 in advance in a "trampoline", a block of memory usually allocated on the
20623 stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.
20627 For example, if the function is ``i32 f(ptr nest %c, i32 %x, i32 %y)``
20628 then the resulting function pointer has signature ``i32 (i32, i32)``.
20629 It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
      %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)

20637 The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(ptr %nval, i32 %x, i32 %y)``.
20642 '``llvm.init.trampoline``' Intrinsic
20643 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20650 declare void @llvm.init.trampoline(ptr <tramp>, ptr <func>, ptr <nval>)
20655 This fills the memory pointed to by ``tramp`` with executable code,
20656 turning it into a trampoline.
20661 The ``llvm.init.trampoline`` intrinsic takes three arguments, all
20662 pointers. The ``tramp`` argument must point to a sufficiently large and
20663 sufficiently aligned block of memory; this memory is written to by the
20664 intrinsic. Note that the size and the alignment are target-specific -
20665 LLVM currently provides no portable way of determining them, so a
20666 front-end that generates this intrinsic needs to have some
20667 target-specific knowledge. The ``func`` argument must hold a function.
20672 The block of memory pointed to by ``tramp`` is filled with target
20673 dependent code, turning it into a function. Then ``tramp`` needs to be
20674 passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
20675 be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
20676 function's signature is the same as that of ``func`` with any arguments
20677 marked with the ``nest`` attribute removed. At most one such ``nest``
20678 argument is allowed, and it must be of pointer type. Calling the new
20679 function is equivalent to calling ``func`` with the same argument list,
20680 but with ``nval`` used for the missing ``nest`` argument. If, after
20681 calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
20682 modified, then the effect of any later call to the returned function
20683 pointer is undefined.
20687 '``llvm.adjust.trampoline``' Intrinsic
20688 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20695 declare ptr @llvm.adjust.trampoline(ptr <tramp>)
20700 This performs any required machine-specific adjustment to the address of
20701 a trampoline (passed as ``tramp``).
20706 ``tramp`` must point to a block of memory which already has trampoline
20707 code filled in by a previous call to
20708 :ref:`llvm.init.trampoline <int_it>`.
20713 On some architectures the address of the code to be executed needs to be
20714 different than the address where the trampoline is actually stored. This
20715 intrinsic returns the executable address corresponding to ``tramp``
20716 after performing the required machine specific adjustments. The pointer
20717 returned can then be :ref:`bitcast and executed <int_trampoline>`.
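
For reference, the two intrinsics can be combined into a complete module
along the following lines. This is a sketch under stated assumptions: the body
of ``@f``, the function ``@caller``, and the constant arguments are made up
for illustration, and the trampoline size and alignment are the X86 values
used in the example at the start of this section.

.. code-block:: llvm

      declare void @llvm.init.trampoline(ptr, ptr, ptr)
      declare ptr @llvm.adjust.trampoline(ptr)

      ; The nested function: %c is the nest parameter supplied by the trampoline.
      define i32 @f(ptr nest %c, i32 %x, i32 %y) {
        %base = load i32, ptr %c
        %sum = add i32 %x, %y
        %res = add i32 %sum, %base
        ret i32 %res
      }

      define i32 @caller(ptr %nval) {
        %tramp = alloca [10 x i8], align 4   ; target-specific; X86 values assumed
        call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
        %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)
        ; Equivalent to: call i32 @f(ptr %nval, i32 1, i32 2)
        %val = call i32 %fp(i32 1, i32 2)
        ret i32 %val
      }
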
20722 Vector Predication Intrinsics
20723 -----------------------------
20724 VP intrinsics are intended for predicated SIMD/vector code. A typical VP
20725 operation takes a vector mask and an explicit vector length parameter as in:
20729 <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
The vector mask parameter (``%mask``) always has a vector of ``i1`` type, for example
``<32 x i1>``. The explicit vector length parameter always has the type ``i32`` and
is an unsigned integer value. The explicit vector length parameter (``%evl``) is in
the range:
20738 0 <= %evl <= W, where W is the number of vector elements
20740 Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
20741 length of the vector.
20743 The VP intrinsic has undefined behavior if ``%evl > W``. The explicit vector
20744 length (%evl) creates a mask, %EVLmask, with all elements ``0 <= i < %evl`` set
20745 to True, and all other lanes ``%evl <= i < W`` to False. A new mask %M is
20746 calculated with an element-wise AND from %mask and %EVLmask:
20750 M = %mask AND %EVLmask
20752 A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
      A <opcode> B =  {  A[i] <opcode> B[i]   if M[i] = True, and
                      {  poison               otherwise
20762 Some targets, such as AVX512, do not support the %evl parameter in hardware.
20763 The use of an effective %evl is discouraged for those targets. The function
20764 ``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
20765 has native support for %evl.
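
The combined mask described above can be written out in plain IR. The
following is a minimal sketch for a ``<4 x i32>`` operation, assuming
``%mask`` and ``%evl`` are the operands of the VP intrinsic; all other value
names are hypothetical:

.. code-block:: llvm

      ; Build %EVLmask: lane i is true exactly when i < %evl.
      %evl.ins   = insertelement <4 x i32> poison, i32 %evl, i32 0
      %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
      %evlmask   = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
      ; M = %mask AND %EVLmask
      %m         = and <4 x i1> %mask, %evlmask
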
20769 '``llvm.vp.select.*``' Intrinsics
20770 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20774 This is an overloaded intrinsic.
20778 declare <16 x i32> @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
20779 declare <vscale x 4 x i64> @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)
20784 The '``llvm.vp.select``' intrinsic is used to choose one value based on a
20785 condition vector, without IR-level branching.
20790 The first argument is a vector of ``i1`` and indicates the condition. The
20791 second argument is the value that is selected where the condition vector is
20792 true. The third argument is the value that is selected where the condition
20793 vector is false. The vectors must be of the same size. The fourth argument is
20794 the explicit vector length.
20796 #. The optional ``fast-math flags`` marker indicates that the select has one or
20797 more :ref:`fast-math flags <fastmath>`. These are optimization hints to
20798 enable otherwise unsafe floating-point optimizations. Fast-math flags are
20799 only valid for selects that return :ref:`supported floating-point types
20800 <fastmath_return_types>`.
The intrinsic selects lanes from the second and third argument depending on a
condition vector.
All result lanes at positions greater than or equal to ``%evl`` are undefined.
For all lanes below ``%evl`` where the condition vector is true the lane is
taken from the second argument. Otherwise, the lane is taken from the third
argument.
20816 .. code-block:: llvm
20818 %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)
20821 ;; Any result is legal on lanes at and above %evl.
20822 %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false
20827 '``llvm.vp.merge.*``' Intrinsics
20828 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20832 This is an overloaded intrinsic.
20836 declare <16 x i32> @llvm.vp.merge.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <pivot>)
20837 declare <vscale x 4 x i64> @llvm.vp.merge.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <pivot>)
20842 The '``llvm.vp.merge``' intrinsic is used to choose one value based on a
20843 condition vector and an index argument, without IR-level branching.
20848 The first argument is a vector of ``i1`` and indicates the condition. The
20849 second argument is the value that is merged where the condition vector is true.
20850 The third argument is the value that is selected where the condition vector is
false or the lane position is greater than or equal to the pivot. The fourth
argument is the pivot.
20854 #. The optional ``fast-math flags`` marker indicates that the merge has one or
20855 more :ref:`fast-math flags <fastmath>`. These are optimization hints to
20856 enable otherwise unsafe floating-point optimizations. Fast-math flags are
20857 only valid for merges that return :ref:`supported floating-point types
20858 <fastmath_return_types>`.
20863 The intrinsic selects lanes from the second and third argument depending on a
20864 condition vector and pivot value.
20866 For all lanes where the condition vector is true and the lane position is less
20867 than ``%pivot`` the lane is taken from the second argument. Otherwise, the lane
20868 is taken from the third argument.
20873 .. code-block:: llvm
20875 %r = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %pivot)
20878 ;; Lanes at and above %pivot are taken from %on_false
      %atfirst = insertelement <4 x i32> poison, i32 %pivot, i32 0
20880 %splat = shufflevector <4 x i32> %atfirst, <4 x i32> poison, <4 x i32> zeroinitializer
20881 %pivotmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32> %splat
20882 %mergemask = and <4 x i1> %cond, <4 x i1> %pivotmask
20883 %also.r = select <4 x i1> %mergemask, <4 x i32> %on_true, <4 x i32> %on_false
20889 '``llvm.vp.add.*``' Intrinsics
20890 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20894 This is an overloaded intrinsic.
20898 declare <16 x i32> @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20899 declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20900 declare <256 x i64> @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20905 Predicated integer addition of two vectors of integers.
20911 The first two arguments and the result have the same vector of integer type. The
20912 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
20919 The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
20920 of the first and second vector arguments on each enabled lane. The result on
20921 disabled lanes is a :ref:`poison value <poisonvalues>`.
20926 .. code-block:: llvm
20928 %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
20929 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20931 %t = add <4 x i32> %a, %b
20932 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
20936 '``llvm.vp.sub.*``' Intrinsics
20937 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20941 This is an overloaded intrinsic.
20945 declare <16 x i32> @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20946 declare <vscale x 4 x i32> @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20947 declare <256 x i64> @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20952 Predicated integer subtraction of two vectors of integers.
20958 The first two arguments and the result have the same vector of integer type. The
20959 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
20966 The '``llvm.vp.sub``' intrinsic performs integer subtraction
20967 (:ref:`sub <i_sub>`) of the first and second vector arguments on each enabled
20968 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
20973 .. code-block:: llvm
20975 %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
20976 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20978 %t = sub <4 x i32> %a, %b
20979 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
20985 '``llvm.vp.mul.*``' Intrinsics
20986 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20990 This is an overloaded intrinsic.
20994 declare <16 x i32> @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20996 declare <256 x i64> @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21001 Predicated integer multiplication of two vectors of integers.
21007 The first two arguments and the result have the same vector of integer type. The
21008 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21014 The '``llvm.vp.mul``' intrinsic performs integer multiplication
21015 (:ref:`mul <i_mul>`) of the first and second vector arguments on each enabled
21016 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21021 .. code-block:: llvm
21023 %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21024 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21026 %t = mul <4 x i32> %a, %b
21027 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21032 '``llvm.vp.sdiv.*``' Intrinsics
21033 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21037 This is an overloaded intrinsic.
21041 declare <16 x i32> @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21042 declare <vscale x 4 x i32> @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21043 declare <256 x i64> @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21048 Predicated, signed division of two vectors of integers.
21054 The first two arguments and the result have the same vector of integer type. The
21055 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21062 The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
21063 of the first and second vector arguments on each enabled lane. The result on
21064 disabled lanes is a :ref:`poison value <poisonvalues>`.
21069 .. code-block:: llvm
21071 %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21072 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21074 %t = sdiv <4 x i32> %a, %b
21075 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21080 '``llvm.vp.udiv.*``' Intrinsics
21081 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21085 This is an overloaded intrinsic.
21089 declare <16 x i32> @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21090 declare <vscale x 4 x i32> @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21091 declare <256 x i64> @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21096 Predicated, unsigned division of two vectors of integers.
21102 The first two arguments and the result have the same vector of integer type. The
21103 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21110 The '``llvm.vp.udiv``' intrinsic performs unsigned division
21111 (:ref:`udiv <i_udiv>`) of the first and second vector arguments on each enabled
21112 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21117 .. code-block:: llvm
21119 %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21120 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21122 %t = udiv <4 x i32> %a, %b
21123 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21129 '``llvm.vp.srem.*``' Intrinsics
21130 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21134 This is an overloaded intrinsic.
21138 declare <16 x i32> @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21139 declare <vscale x 4 x i32> @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21140 declare <256 x i64> @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
Predicated computation of the signed remainder of two integer vectors.
21151 The first two arguments and the result have the same vector of integer type. The
21152 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21159 The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
21160 (:ref:`srem <i_srem>`) of the first and second vector arguments on each enabled
21161 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21166 .. code-block:: llvm
21168 %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21169 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21171 %t = srem <4 x i32> %a, %b
21172 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21178 '``llvm.vp.urem.*``' Intrinsics
21179 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21183 This is an overloaded intrinsic.
21187 declare <16 x i32> @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21188 declare <vscale x 4 x i32> @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21189 declare <256 x i64> @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21194 Predicated computation of the unsigned remainder of two integer vectors.
21200 The first two arguments and the result have the same vector of integer type. The
21201 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21208 The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
21209 (:ref:`urem <i_urem>`) of the first and second vector arguments on each enabled
21210 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21215 .. code-block:: llvm
21217 %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21218 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21220 %t = urem <4 x i32> %a, %b
21221 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21226 '``llvm.vp.ashr.*``' Intrinsics
21227 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21231 This is an overloaded intrinsic.
21235 declare <16 x i32> @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21236 declare <vscale x 4 x i32> @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21237 declare <256 x i64> @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21242 Vector-predicated arithmetic right-shift.
21248 The first two arguments and the result have the same vector of integer type. The
21249 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21256 The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
21257 (:ref:`ashr <i_ashr>`) of the first argument by the second argument on each
21258 enabled lane. The result on disabled lanes is a
21259 :ref:`poison value <poisonvalues>`.
21264 .. code-block:: llvm
21266 %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21267 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21269 %t = ashr <4 x i32> %a, %b
21270 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21276 '``llvm.vp.lshr.*``' Intrinsics
21277 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21281 This is an overloaded intrinsic.
21285 declare <16 x i32> @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21286 declare <vscale x 4 x i32> @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21287 declare <256 x i64> @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21292 Vector-predicated logical right-shift.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21306 The '``llvm.vp.lshr``' intrinsic computes the logical right shift
21307 (:ref:`lshr <i_lshr>`) of the first argument by the second argument on each
21308 enabled lane. The result on disabled lanes is a
21309 :ref:`poison value <poisonvalues>`.
21314 .. code-block:: llvm
21316 %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21317 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21319 %t = lshr <4 x i32> %a, %b
21320 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
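
As an illustration of how this differs from '``llvm.vp.ashr``', the following
sketch applies both shifts to the same negative input with every lane enabled.
The function names and constant operands are hypothetical and chosen only for
this example.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      ; Arithmetic shift: copies of the sign bit are shifted in.
      ; Each lane computes -8 >> 1, so the result is <-4, -4, -4, -4>.
      define <4 x i32> @example_ashr() {
        %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> <i32 -8, i32 -8, i32 -8, i32 -8>, <4 x i32> <i32 1, i32 1, i32 1, i32 1>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x i32> %r
      }

      ; Logical shift: zero bits are shifted in.
      ; Each lane computes 0xFFFFFFF8 >> 1 = 0x7FFFFFFC, i.e. 2147483644.
      define <4 x i32> @example_lshr() {
        %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> <i32 -8, i32 -8, i32 -8, i32 -8>, <4 x i32> <i32 1, i32 1, i32 1, i32 1>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x i32> %r
      }
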
21325 '``llvm.vp.shl.*``' Intrinsics
21326 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21330 This is an overloaded intrinsic.
21334 declare <16 x i32> @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21335 declare <vscale x 4 x i32> @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21336 declare <256 x i64> @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21341 Vector-predicated left shift.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21355 The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
21356 the first argument by the second argument on each enabled lane. The result on
21357 disabled lanes is a :ref:`poison value <poisonvalues>`.
21362 .. code-block:: llvm
21364 %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21365 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21367 %t = shl <4 x i32> %a, %b
21368 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
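
Because each enabled lane behaves like a scalar :ref:`shl <i_shl>`, a lane
whose shift amount is equal to or larger than the element bit width yields a
:ref:`poison value <poisonvalues>` even though the lane is enabled. The sketch
below uses hypothetical constant operands and a hypothetical function name to
show this.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.shl.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      ; All four lanes are enabled (all-true mask, vector length 4).
      ;   lane 0: 1 << 0  = 1
      ;   lane 1: 1 << 4  = 16
      ;   lane 2: 1 << 31 = -2147483648 (only the sign bit is set)
      ;   lane 3: shift amount 32 is not smaller than the bit width: poison
      define <4 x i32> @example_shl() {
        %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> <i32 1, i32 1, i32 1, i32 1>, <4 x i32> <i32 0, i32 4, i32 31, i32 32>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x i32> %r
      }
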
21373 '``llvm.vp.or.*``' Intrinsics
21374 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21378 This is an overloaded intrinsic.
21382 declare <16 x i32> @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21383 declare <vscale x 4 x i32> @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21384 declare <256 x i64> @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21389 Vector-predicated or.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21403 The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
21404 first two arguments on each enabled lane. The result on disabled lanes is
21405 a :ref:`poison value <poisonvalues>`.
21410 .. code-block:: llvm
21412 %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21413 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21415 %t = or <4 x i32> %a, %b
21416 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
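
To make the interaction of the mask and the explicit vector length concrete,
the sketch below calls the intrinsic with constant operands. A lane is enabled
only if its mask bit is set and its index is below the vector length. The
function name and all constants are hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.or.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      ; Mask is <true, false, true, true> and the vector length is 3.
      ;   lane 0: enabled, 1 | 16 = 17
      ;   lane 1: mask bit is false: poison
      ;   lane 2: enabled, 4 | 16 = 20
      ;   lane 3: index 3 is not below the vector length 3: poison
      define <4 x i32> @example_or() {
        %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> <i32 1, i32 2, i32 4, i32 8>, <4 x i32> <i32 16, i32 16, i32 16, i32 16>, <4 x i1> <i1 true, i1 false, i1 true, i1 true>, i32 3)
        ret <4 x i32> %r
      }
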
21421 '``llvm.vp.and.*``' Intrinsics
21422 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21426 This is an overloaded intrinsic.
21430 declare <16 x i32> @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21431 declare <vscale x 4 x i32> @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21432 declare <256 x i64> @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21437 Vector-predicated and.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
the first two arguments on each enabled lane. The result on disabled lanes is
a :ref:`poison value <poisonvalues>`.
21458 .. code-block:: llvm
21460 %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21461 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21463 %t = and <4 x i32> %a, %b
21464 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21469 '``llvm.vp.xor.*``' Intrinsics
21470 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21474 This is an overloaded intrinsic.
21478 declare <16 x i32> @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21479 declare <vscale x 4 x i32> @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21480 declare <256 x i64> @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21485 Vector-predicated, bitwise xor.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21499 The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
21500 the first two arguments on each enabled lane.
21501 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21506 .. code-block:: llvm
21508 %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21509 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21511 %t = xor <4 x i32> %a, %b
21512 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21516 '``llvm.vp.abs.*``' Intrinsics
21517 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21521 This is an overloaded intrinsic.
21525 declare <16 x i32> @llvm.vp.abs.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
21526 declare <vscale x 4 x i32> @llvm.vp.abs.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
21527 declare <256 x i64> @llvm.vp.abs.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
21532 Predicated abs of a vector of integers.
21538 The first argument and the result have the same vector of integer type. The
21539 second argument is the vector mask and has the same number of elements as the
21540 result vector type. The third argument is the explicit vector length of the
21541 operation. The fourth argument must be a constant and is a flag to indicate
21542 whether the result value of the '``llvm.vp.abs``' intrinsic is a
21543 :ref:`poison value <poisonvalues>` if the first argument is statically or
21544 dynamically an ``INT_MIN`` value.
The '``llvm.vp.abs``' intrinsic computes the absolute value
(:ref:`abs <int_abs>`) of the first argument on each enabled lane. The result
on disabled lanes is a :ref:`poison value <poisonvalues>`.
21555 .. code-block:: llvm
21557 %r = call <4 x i32> @llvm.vp.abs.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
21558 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21560 %t = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %a, i1 false)
21561 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
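
The effect of the ``is_int_min_poison`` flag can be seen by passing ``INT_MIN``
in an enabled lane. The following sketch, with hypothetical function names and
constant operands, calls the intrinsic once with the flag cleared and once with
it set.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.abs.v4i32(<4 x i32>, <4 x i1>, i32, i1)

      ; Flag is false: INT_MIN is returned unchanged.
      ; The result is <7, 7, 0, -2147483648>.
      define <4 x i32> @example_abs_wrap() {
        %r = call <4 x i32> @llvm.vp.abs.v4i32(<4 x i32> <i32 -7, i32 7, i32 0, i32 -2147483648>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4, i1 false)
        ret <4 x i32> %r
      }

      ; Flag is true: the lane holding INT_MIN becomes poison.
      ; The result is <7, 7, 0, poison>.
      define <4 x i32> @example_abs_poison() {
        %r = call <4 x i32> @llvm.vp.abs.v4i32(<4 x i32> <i32 -7, i32 7, i32 0, i32 -2147483648>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4, i1 true)
        ret <4 x i32> %r
      }
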
21567 '``llvm.vp.smax.*``' Intrinsics
21568 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21572 This is an overloaded intrinsic.
21576 declare <16 x i32> @llvm.vp.smax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21577 declare <vscale x 4 x i32> @llvm.vp.smax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21578 declare <256 x i64> @llvm.vp.smax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21583 Predicated integer signed maximum of two vectors of integers.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21597 The '``llvm.vp.smax``' intrinsic performs integer signed maximum (:ref:`smax <int_smax>`)
21598 of the first and second vector arguments on each enabled lane. The result on
21599 disabled lanes is a :ref:`poison value <poisonvalues>`.
21604 .. code-block:: llvm
21606 %r = call <4 x i32> @llvm.vp.smax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21607 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21609 %t = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
21610 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21615 '``llvm.vp.smin.*``' Intrinsics
21616 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21620 This is an overloaded intrinsic.
21624 declare <16 x i32> @llvm.vp.smin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21625 declare <vscale x 4 x i32> @llvm.vp.smin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21626 declare <256 x i64> @llvm.vp.smin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21631 Predicated integer signed minimum of two vectors of integers.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21645 The '``llvm.vp.smin``' intrinsic performs integer signed minimum (:ref:`smin <int_smin>`)
21646 of the first and second vector arguments on each enabled lane. The result on
21647 disabled lanes is a :ref:`poison value <poisonvalues>`.
21652 .. code-block:: llvm
21654 %r = call <4 x i32> @llvm.vp.smin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21655 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21657 %t = call <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
21658 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21663 '``llvm.vp.umax.*``' Intrinsics
21664 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21668 This is an overloaded intrinsic.
21672 declare <16 x i32> @llvm.vp.umax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21673 declare <vscale x 4 x i32> @llvm.vp.umax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21674 declare <256 x i64> @llvm.vp.umax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21679 Predicated integer unsigned maximum of two vectors of integers.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21693 The '``llvm.vp.umax``' intrinsic performs integer unsigned maximum (:ref:`umax <int_umax>`)
21694 of the first and second vector arguments on each enabled lane. The result on
21695 disabled lanes is a :ref:`poison value <poisonvalues>`.
21700 .. code-block:: llvm
21702 %r = call <4 x i32> @llvm.vp.umax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21703 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21705 %t = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
21706 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
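
The signed and unsigned maxima differ only in how the operand bits are
interpreted. The sketch below feeds the same hypothetical constants to
'``llvm.vp.smax``' and '``llvm.vp.umax``' with every lane enabled; the function
names are likewise hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.smax.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.umax.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      ; Signed comparison: -1 is less than 1.
      ; The result is <1, 5, 0, -8>.
      define <4 x i32> @example_smax() {
        %r = call <4 x i32> @llvm.vp.smax.v4i32(<4 x i32> <i32 -1, i32 5, i32 0, i32 -8>, <4 x i32> <i32 1, i32 3, i32 0, i32 -9>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x i32> %r
      }

      ; Unsigned comparison: -1 is 0xFFFFFFFF, the largest i32 bit pattern.
      ; The result is <-1, 5, 0, -8>.
      define <4 x i32> @example_umax() {
        %r = call <4 x i32> @llvm.vp.umax.v4i32(<4 x i32> <i32 -1, i32 5, i32 0, i32 -8>, <4 x i32> <i32 1, i32 3, i32 0, i32 -9>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x i32> %r
      }
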
21711 '``llvm.vp.umin.*``' Intrinsics
21712 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21716 This is an overloaded intrinsic.
21720 declare <16 x i32> @llvm.vp.umin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21721 declare <vscale x 4 x i32> @llvm.vp.umin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21722 declare <256 x i64> @llvm.vp.umin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21727 Predicated integer unsigned minimum of two vectors of integers.
The first two arguments and the result have the same vector of integer type. The
third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
21741 The '``llvm.vp.umin``' intrinsic performs integer unsigned minimum (:ref:`umin <int_umin>`)
21742 of the first and second vector arguments on each enabled lane. The result on
21743 disabled lanes is a :ref:`poison value <poisonvalues>`.
21748 .. code-block:: llvm
21750 %r = call <4 x i32> @llvm.vp.umin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
21751 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21753 %t = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
21754 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
21757 .. _int_vp_copysign:
21759 '``llvm.vp.copysign.*``' Intrinsics
21760 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21764 This is an overloaded intrinsic.
21768 declare <16 x float> @llvm.vp.copysign.v16f32 (<16 x float> <mag_op>, <16 x float> <sign_op>, <16 x i1> <mask>, i32 <vector_length>)
21769 declare <vscale x 4 x float> @llvm.vp.copysign.nxv4f32 (<vscale x 4 x float> <mag_op>, <vscale x 4 x float> <sign_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21770 declare <256 x double> @llvm.vp.copysign.v256f64 (<256 x double> <mag_op>, <256 x double> <sign_op>, <256 x i1> <mask>, i32 <vector_length>)
21775 Predicated floating-point copysign of two vectors of floating-point values.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
21789 The '``llvm.vp.copysign``' intrinsic performs floating-point copysign (:ref:`copysign <int_copysign>`)
21790 of the first and second vector arguments on each enabled lane. The result on
21791 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
21792 performed in the default floating-point environment.
21797 .. code-block:: llvm
21799 %r = call <4 x float> @llvm.vp.copysign.v4f32(<4 x float> %mag, <4 x float> %sign, <4 x i1> %mask, i32 %evl)
21800 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21802 %t = call <4 x float> @llvm.copysign.v4f32(<4 x float> %mag, <4 x float> %sign)
21803 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
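
As a small worked case, the sketch below uses hypothetical constant operands
and a hypothetical function name. Each enabled lane takes its magnitude from
the first argument and its sign from the second, including the sign of a
negative zero.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.copysign.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      ; All four lanes are enabled.
      ;   lane 0: magnitude  1.0, sign of -0.0 -> -1.0
      ;   lane 1: magnitude -2.0, sign of  5.0 ->  2.0
      ;   lane 2: magnitude  3.0, sign of -1.0 -> -3.0
      ;   lane 3: magnitude -4.0, sign of -6.0 -> -4.0
      define <4 x float> @example_copysign() {
        %r = call <4 x float> @llvm.vp.copysign.v4f32(<4 x float> <float 1.0, float -2.0, float 3.0, float -4.0>, <4 x float> <float -0.0, float 5.0, float -1.0, float -6.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x float> %r
      }
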
21808 '``llvm.vp.minnum.*``' Intrinsics
21809 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21813 This is an overloaded intrinsic.
21817 declare <16 x float> @llvm.vp.minnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21818 declare <vscale x 4 x float> @llvm.vp.minnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21819 declare <256 x double> @llvm.vp.minnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21824 Predicated floating-point IEEE-754 minNum of two vectors of floating-point values.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
21838 The '``llvm.vp.minnum``' intrinsic performs floating-point minimum (:ref:`minnum <i_minnum>`)
21839 of the first and second vector arguments on each enabled lane. The result on
21840 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
21841 performed in the default floating-point environment.
21846 .. code-block:: llvm
21848 %r = call <4 x float> @llvm.vp.minnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
21849 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21851 %t = call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %b)
21852 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
21857 '``llvm.vp.maxnum.*``' Intrinsics
21858 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21862 This is an overloaded intrinsic.
21866 declare <16 x float> @llvm.vp.maxnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21867 declare <vscale x 4 x float> @llvm.vp.maxnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21868 declare <256 x double> @llvm.vp.maxnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21873 Predicated floating-point IEEE-754 maxNum of two vectors of floating-point values.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
21887 The '``llvm.vp.maxnum``' intrinsic performs floating-point maximum (:ref:`maxnum <i_maxnum>`)
21888 of the first and second vector arguments on each enabled lane. The result on
21889 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
21890 performed in the default floating-point environment.
21895 .. code-block:: llvm
21897 %r = call <4 x float> @llvm.vp.maxnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
21898 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
%t = call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %b)
21901 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
21904 .. _int_vp_minimum:
21906 '``llvm.vp.minimum.*``' Intrinsics
21907 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21911 This is an overloaded intrinsic.
21915 declare <16 x float> @llvm.vp.minimum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21916 declare <vscale x 4 x float> @llvm.vp.minimum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21917 declare <256 x double> @llvm.vp.minimum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21922 Predicated floating-point minimum of two vectors of floating-point values,
21923 propagating NaNs and treating -0.0 as less than +0.0.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
21936 The '``llvm.vp.minimum``' intrinsic performs floating-point minimum (:ref:`minimum <i_minimum>`)
21937 of the first and second vector arguments on each enabled lane, the result being
21938 NaN if either argument is a NaN. -0.0 is considered to be less than +0.0 for this
21939 intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21940 The operation is performed in the default floating-point environment.
21945 .. code-block:: llvm
21947 %r = call <4 x float> @llvm.vp.minimum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
21948 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21950 %t = call <4 x float> @llvm.minimum.v4f32(<4 x float> %a, <4 x float> %b)
21951 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
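
The NaN and signed-zero rules can be checked on a small constant input. In the
sketch below the function name and operands are hypothetical, and
``0x7FF8000000000000`` is used as a quiet NaN constant.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.minimum.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      ; All four lanes are enabled.
      ;   lane 0: minimum(1.0, 2.0)  = 1.0
      ;   lane 1: minimum(NaN, 3.0)  = NaN  (NaN in the first argument propagates)
      ;   lane 2: minimum(-0.0, 0.0) = -0.0 (-0.0 is treated as less than +0.0)
      ;   lane 3: minimum(2.0, NaN)  = NaN  (NaN in the second argument propagates)
      define <4 x float> @example_minimum() {
        %r = call <4 x float> @llvm.vp.minimum.v4f32(<4 x float> <float 1.0, float 0x7FF8000000000000, float -0.0, float 2.0>, <4 x float> <float 2.0, float 3.0, float 0.0, float 0x7FF8000000000000>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x float> %r
      }
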
21954 .. _int_vp_maximum:
21956 '``llvm.vp.maximum.*``' Intrinsics
21957 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21961 This is an overloaded intrinsic.
21965 declare <16 x float> @llvm.vp.maximum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
21966 declare <vscale x 4 x float> @llvm.vp.maximum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21967 declare <256 x double> @llvm.vp.maximum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
21972 Predicated floating-point maximum of two vectors of floating-point values,
21973 propagating NaNs and treating -0.0 as less than +0.0.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
21986 The '``llvm.vp.maximum``' intrinsic performs floating-point maximum (:ref:`maximum <i_maximum>`)
21987 of the first and second vector arguments on each enabled lane, the result being
21988 NaN if either argument is a NaN. -0.0 is considered to be less than +0.0 for this
21989 intrinsic. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
21990 The operation is performed in the default floating-point environment.
21995 .. code-block:: llvm
21997 %r = call <4 x float> @llvm.vp.maximum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
21998 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
%t = call <4 x float> @llvm.maximum.v4f32(<4 x float> %a, <4 x float> %b)
22001 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22006 '``llvm.vp.fadd.*``' Intrinsics
22007 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22011 This is an overloaded intrinsic.
22015 declare <16 x float> @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22016 declare <vscale x 4 x float> @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22017 declare <256 x double> @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22022 Predicated floating-point addition of two vectors of floating-point values.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
22036 The '``llvm.vp.fadd``' intrinsic performs floating-point addition (:ref:`fadd <i_fadd>`)
22037 of the first and second vector arguments on each enabled lane. The result on
22038 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
22039 performed in the default floating-point environment.
22044 .. code-block:: llvm
22046 %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22047 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22049 %t = fadd <4 x float> %a, %b
22050 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22055 '``llvm.vp.fsub.*``' Intrinsics
22056 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22060 This is an overloaded intrinsic.
22064 declare <16 x float> @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22065 declare <vscale x 4 x float> @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22066 declare <256 x double> @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22071 Predicated floating-point subtraction of two vectors of floating-point values.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
22085 The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
22086 of the first and second vector arguments on each enabled lane. The result on
22087 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
22088 performed in the default floating-point environment.
22093 .. code-block:: llvm
22095 %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22096 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22098 %t = fsub <4 x float> %a, %b
22099 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22104 '``llvm.vp.fmul.*``' Intrinsics
22105 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22109 This is an overloaded intrinsic.
22113 declare <16 x float> @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22114 declare <vscale x 4 x float> @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22115 declare <256 x double> @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22120 Predicated floating-point multiplication of two vectors of floating-point values.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
22134 The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
22135 of the first and second vector arguments on each enabled lane. The result on
22136 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
22137 performed in the default floating-point environment.
22142 .. code-block:: llvm
22144 %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22145 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22147 %t = fmul <4 x float> %a, %b
22148 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22153 '``llvm.vp.fdiv.*``' Intrinsics
22154 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22158 This is an overloaded intrinsic.
22162 declare <16 x float> @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22163 declare <vscale x 4 x float> @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22164 declare <256 x double> @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22169 Predicated floating-point division of two vectors of floating-point values.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
22183 The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`)
22184 of the first and second vector arguments on each enabled lane. The result on
22185 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
22186 performed in the default floating-point environment.
22191 .. code-block:: llvm
22193 %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22194 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22196 %t = fdiv <4 x float> %a, %b
22197 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22202 '``llvm.vp.frem.*``' Intrinsics
22203 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22207 This is an overloaded intrinsic.
22211 declare <16 x float> @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22212 declare <vscale x 4 x float> @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22213 declare <256 x double> @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22218 Predicated floating-point remainder of two vectors of floating-point values.
The first two arguments and the result have the same vector of floating-point
type. The third argument is the vector mask and has the same number of elements
as the result vector type. The fourth argument is the explicit vector length of
the operation.
22232 The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`)
22233 of the first and second vector arguments on each enabled lane. The result on
22234 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
22235 performed in the default floating-point environment.
22240 .. code-block:: llvm
22242 %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
22243 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22245 %t = frem <4 x float> %a, %b
22246 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
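
Like the scalar :ref:`frem <i_frem>`, each enabled lane produces a remainder
with the same sign as the dividend. The sketch below illustrates this with
hypothetical constant operands and a hypothetical function name.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.frem.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      ; All four lanes are enabled.
      ;   lane 0: frem( 5.5,  2.0) =  1.5
      ;   lane 1: frem(-5.5,  2.0) = -1.5 (sign follows the dividend)
      ;   lane 2: frem( 5.5, -2.0) =  1.5 (sign of the divisor is ignored)
      ;   lane 3: frem( 1.0,  4.0) =  1.0
      define <4 x float> @example_frem() {
        %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> <float 5.5, float -5.5, float 5.5, float 1.0>, <4 x float> <float 2.0, float 2.0, float -2.0, float 4.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x float> %r
      }
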
22251 '``llvm.vp.fneg.*``' Intrinsics
22252 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22256 This is an overloaded intrinsic.
22260 declare <16 x float> @llvm.vp.fneg.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22261 declare <vscale x 4 x float> @llvm.vp.fneg.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22262 declare <256 x double> @llvm.vp.fneg.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22267 Predicated floating-point negation of a vector of floating-point values.
The first argument and the result have the same vector of floating-point type.
The second argument is the vector mask and has the same number of elements as the
result vector type. The third argument is the explicit vector length of the
operation.
22281 The '``llvm.vp.fneg``' intrinsic performs floating-point negation (:ref:`fneg <i_fneg>`)
22282 of the first vector argument on each enabled lane. The result on disabled lanes
22283 is a :ref:`poison value <poisonvalues>`.
22288 .. code-block:: llvm
22290 %r = call <4 x float> @llvm.vp.fneg.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22291 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22293 %t = fneg <4 x float> %a
22294 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22299 '``llvm.vp.fabs.*``' Intrinsics
22300 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22304 This is an overloaded intrinsic.
22308 declare <16 x float> @llvm.vp.fabs.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22309 declare <vscale x 4 x float> @llvm.vp.fabs.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22310 declare <256 x double> @llvm.vp.fabs.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22315 Predicated floating-point absolute value of a vector of floating-point values.
22321 The first argument and the result have the same vector of floating-point type.
22322 The second argument is the vector mask and has the same number of elements as the
22323 result vector type. The third argument is the explicit vector length of the operation.
22329 The '``llvm.vp.fabs``' intrinsic performs floating-point absolute value
22330 (:ref:`fabs <int_fabs>`) of the first vector argument on each enabled lane. The
22331 result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22336 .. code-block:: llvm
22338 %r = call <4 x float> @llvm.vp.fabs.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22339 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22341 %t = call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
22342 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22347 '``llvm.vp.sqrt.*``' Intrinsics
22348 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22352 This is an overloaded intrinsic.
22356 declare <16 x float> @llvm.vp.sqrt.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22357 declare <vscale x 4 x float> @llvm.vp.sqrt.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22358 declare <256 x double> @llvm.vp.sqrt.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22363 Predicated floating-point square root of a vector of floating-point values.
22369 The first argument and the result have the same vector of floating-point type.
22370 The second argument is the vector mask and has the same number of elements as the
22371 result vector type. The third argument is the explicit vector length of the operation.
22377 The '``llvm.vp.sqrt``' intrinsic performs floating-point square root (:ref:`sqrt <int_sqrt>`) of
22378 the first vector argument on each enabled lane. The result on disabled lanes is
22379 a :ref:`poison value <poisonvalues>`. The operation is performed in the default
22380 floating-point environment.
22385 .. code-block:: llvm
22387 %r = call <4 x float> @llvm.vp.sqrt.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22388 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22390 %t = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %a)
22391 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
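
As a minimal sketch of the fully enabled case, the function below passes an
all-true mask and an explicit vector length equal to the element count, so no
lane of the result is poison because of predication. The function name
``@sqrt_all_lanes`` is hypothetical and only serves the illustration.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.sqrt.v4f32(<4 x float>, <4 x i1>, i32)

      ; Hypothetical helper: square root of every lane of %a.
      define <4 x float> @sqrt_all_lanes(<4 x float> %a) {
        ; All four mask bits are true and the vector length is 4,
        ; so every lane is enabled.
        %r = call <4 x float> @llvm.vp.sqrt.v4f32(<4 x float> %a, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ret <4 x float> %r
      }
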
22396 '``llvm.vp.fma.*``' Intrinsics
22397 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22401 This is an overloaded intrinsic.
22405 declare <16 x float> @llvm.vp.fma.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22406 declare <vscale x 4 x float> @llvm.vp.fma.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22407 declare <256 x double> @llvm.vp.fma.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22412 Predicated floating-point fused multiply-add of three vectors of floating-point values.
22418 The first three arguments and the result have the same vector of floating-point type. The
22419 fourth argument is the vector mask and has the same number of elements as the
22420 result vector type. The fifth argument is the explicit vector length of the operation.
22426 The '``llvm.vp.fma``' intrinsic performs floating-point fused multiply-add (:ref:`llvm.fma <int_fma>`)
22427 of the first, second, and third vector arguments on each enabled lane. The result on
22428 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
22429 performed in the default floating-point environment.
22434 .. code-block:: llvm
22436 %r = call <4 x float> @llvm.vp.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
22437 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22439 %t = call <4 x float> @llvm.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
22440 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
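
The following sketch shows the intrinsic inside a complete function, with the
explicit vector length as the only source of disabled lanes. The function name
``@fma_first_n`` and the parameter ``%n`` are hypothetical, and the sketch
assumes the caller passes an ``%n`` no larger than 4.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.fma.v4f32(<4 x float>, <4 x float>, <4 x float>, <4 x i1>, i32)

      ; Hypothetical helper: computes %a * %b + %c on the first %n lanes.
      define <4 x float> @fma_first_n(<4 x float> %a, <4 x float> %b, <4 x float> %c, i32 %n) {
        ; The mask is all-true, so only the vector length %n disables lanes.
        ; Lanes at or above %n hold poison in %r.
        %r = call <4 x float> @llvm.vp.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %n)
        ret <4 x float> %r
      }
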
22443 .. _int_vp_fmuladd:
22445 '``llvm.vp.fmuladd.*``' Intrinsics
22446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22450 This is an overloaded intrinsic.
22454 declare <16 x float> @llvm.vp.fmuladd.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
22455 declare <vscale x 4 x float> @llvm.vp.fmuladd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22456 declare <256 x double> @llvm.vp.fmuladd.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
22461 Predicated floating-point multiply-add of three vectors of floating-point values
22462 that can be fused if the code generator determines that (a) the target instruction
22463 set has support for a fused operation, and (b) that the fused operation is more
22464 efficient than the equivalent, separate pair of mul and add instructions.
22469 The first three arguments and the result have the same vector of floating-point
22470 type. The fourth argument is the vector mask and has the same number of elements
22471 as the result vector type. The fifth argument is the explicit vector length of
the operation.
22477 The '``llvm.vp.fmuladd``' intrinsic performs floating-point multiply-add (:ref:`llvm.fmuladd <int_fmuladd>`)
22478 of the first, second, and third vector arguments on each enabled lane. The result
22479 on disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
22480 performed in the default floating-point environment.
22485 .. code-block:: llvm
22487 %r = call <4 x float> @llvm.vp.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
22488 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22490 %t = call <4 x float> @llvm.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
22491 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
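
For comparison, the separate multiply and add can be sketched with the
predicated ``llvm.vp.fmul`` and ``llvm.vp.fadd`` intrinsics. This only
illustrates the unfused form; which form a given target actually emits is the
code generator's decision. The function name ``@muladd_unfused`` is
hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.fmul.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.fadd.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      ; Hypothetical helper: unfused %a * %b + %c under the same mask and length.
      define <4 x float> @muladd_unfused(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl) {
        ; The product is rounded to float before the addition takes place.
        %prod = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
        %sum = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %prod, <4 x float> %c, <4 x i1> %mask, i32 %evl)
        ret <4 x float> %sum
      }
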
22494 .. _int_vp_reduce_add:
22496 '``llvm.vp.reduce.add.*``' Intrinsics
22497 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22501 This is an overloaded intrinsic.
22505 declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22506 declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22511 Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
22512 returning the result as a scalar.
22517 The first argument is the start value of the reduction, which must be a scalar
22518 integer type equal to the result type. The second argument is the vector on
22519 which the reduction is performed and must be a vector of integer values whose
22520 element type is the result/start type. The third argument is the vector mask and
22521 is a vector of boolean values with the same number of elements as the vector
22522 argument. The fourth argument is the explicit vector length of the operation.
22527 The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
22528 (:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector argument
22529 ``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
22530 lanes are treated as containing the neutral value ``0`` (i.e. having no effect
22531 on the reduction operation). If the vector length is zero, the result is equal
22532 to ``start_value``.
22534 To ignore the start value, the neutral value can be used.
22539 .. code-block:: llvm
22541 %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22542 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22543 ; are treated as though %mask were false for those lanes.
22545 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
22546 %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
22547 %also.r = add i32 %reduction, %start
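
A worked instance with constant operands may make the interaction of the mask
and the vector length concrete. The values below are chosen only for
illustration.

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.add.v4i32(i32 10, <4 x i32> <i32 1, i32 2, i32 3, i32 4>, <4 x i1> <i1 true, i1 false, i1 true, i1 true>, i32 3)
      ; Lane 1 is disabled by the mask and lane 3 by the vector length,
      ; so only lanes 0 and 2 contribute: %r = 10 + 1 + 3 = 14.
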
22550 .. _int_vp_reduce_fadd:
22552 '``llvm.vp.reduce.fadd.*``' Intrinsics
22553 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22557 This is an overloaded intrinsic.
22561 declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
22562 declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22567 Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
22568 value, returning the result as a scalar.
22573 The first argument is the start value of the reduction, which must be a scalar
22574 floating-point type equal to the result type. The second argument is the vector
22575 on which the reduction is performed and must be a vector of floating-point
22576 values whose element type is the result/start type. The third argument is the
22577 vector mask and is a vector of boolean values with the same number of elements
22578 as the vector argument. The fourth argument is the explicit vector length of the operation.
22584 The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
22585 reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
22586 vector argument ``val`` on each enabled lane, adding it to the scalar
22587 ``start_value``. Disabled lanes are treated as containing the neutral value
22588 ``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
22589 enabled, the resulting value will be equal to ``start_value``.
22591 To ignore the start value, the neutral value can be used.
22593 See the unpredicated version (:ref:`llvm.vector.reduce.fadd
22594 <int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.
22599 .. code-block:: llvm
22601 %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
22602 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22603 ; are treated as though %mask were false for those lanes.
22605 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
22606 %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)
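
To sum only the enabled lanes, the neutral value ``-0.0`` can be passed as the
start value, as sketched below. The function name ``@sum_enabled`` is
hypothetical. Negative zero is used rather than ``+0.0`` because, under the
default rounding mode, ``-0.0 + x`` yields ``x`` for every ``x``, whereas
``+0.0 + -0.0`` yields ``+0.0`` and would change the sign of an all-negative-zero
sum.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fadd.v4f32(float, <4 x float>, <4 x i1>, i32)

      ; Hypothetical helper: sum of the enabled lanes of %a, ignoring the start value.
      define float @sum_enabled(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        %r = call float @llvm.vp.reduce.fadd.v4f32(float -0.0, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }
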
22609 .. _int_vp_reduce_mul:
22611 '``llvm.vp.reduce.mul.*``' Intrinsics
22612 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22616 This is an overloaded intrinsic.
22620 declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22621 declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22626 Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
22627 returning the result as a scalar.
22633 The first argument is the start value of the reduction, which must be a scalar
22634 integer type equal to the result type. The second argument is the vector on
22635 which the reduction is performed and must be a vector of integer values whose
22636 element type is the result/start type. The third argument is the vector mask and
22637 is a vector of boolean values with the same number of elements as the vector
22638 argument. The fourth argument is the explicit vector length of the operation.
22643 The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
22644 (:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector argument ``val``
22645 on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
22646 lanes are treated as containing the neutral value ``1`` (i.e. having no effect
22647 on the reduction operation). If the vector length is zero, the result is the
start value.
22650 To ignore the start value, the neutral value can be used.
22655 .. code-block:: llvm
22657 %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22658 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22659 ; are treated as though %mask were false for those lanes.
22661 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
22662 %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
22663 %also.r = mul i32 %reduction, %start
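
The same pattern applies to scalable vectors. The sketch below uses the
``nxv8i16`` overload from the declarations above and passes the neutral value
``1`` as the start value, so the result is the product of the enabled lanes
alone. The function name ``@product_enabled`` is hypothetical.

.. code-block:: llvm

      declare i16 @llvm.vp.reduce.mul.nxv8i16(i16, <vscale x 8 x i16>, <vscale x 8 x i1>, i32)

      ; Hypothetical helper: product of the enabled lanes of a scalable vector.
      define i16 @product_enabled(<vscale x 8 x i16> %v, <vscale x 8 x i1> %mask, i32 %evl) {
        ; A start value of 1 has no effect on the product.
        %p = call i16 @llvm.vp.reduce.mul.nxv8i16(i16 1, <vscale x 8 x i16> %v, <vscale x 8 x i1> %mask, i32 %evl)
        ret i16 %p
      }
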
22665 .. _int_vp_reduce_fmul:
22667 '``llvm.vp.reduce.fmul.*``' Intrinsics
22668 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22672 This is an overloaded intrinsic.
22676 declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
22677 declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22682 Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
22683 value, returning the result as a scalar.
22689 The first argument is the start value of the reduction, which must be a scalar
22690 floating-point type equal to the result type. The second argument is the vector
22691 on which the reduction is performed and must be a vector of floating-point
22692 values whose element type is the result/start type. The third argument is the
22693 vector mask and is a vector of boolean values with the same number of elements
22694 as the vector argument. The fourth argument is the explicit vector length of the operation.
22700 The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
22701 reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
22702 vector argument ``val`` on each enabled lane, multiplying it by the scalar
22703 ``start_value``. Disabled lanes are treated as containing the neutral value
22704 ``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
22705 enabled, the resulting value will be equal to the starting value.
22707 To ignore the start value, the neutral value can be used.
22709 See the unpredicated version (:ref:`llvm.vector.reduce.fmul
22710 <int_vector_reduce_fmul>`) for more detail on the semantics.
22715 .. code-block:: llvm
22717 %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
22718 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22719 ; are treated as though %mask were false for those lanes.
22721 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
22722 %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)
22725 .. _int_vp_reduce_and:
22727 '``llvm.vp.reduce.and.*``' Intrinsics
22728 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22732 This is an overloaded intrinsic.
22736 declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22737 declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22742 Predicated integer ``AND`` reduction of a vector and a scalar starting value,
22743 returning the result as a scalar.
22749 The first argument is the start value of the reduction, which must be a scalar
22750 integer type equal to the result type. The second argument is the vector on
22751 which the reduction is performed and must be a vector of integer values whose
22752 element type is the result/start type. The third argument is the vector mask and
22753 is a vector of boolean values with the same number of elements as the vector
22754 argument. The fourth argument is the explicit vector length of the operation.
22759 The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
22760 (:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector argument
22761 ``val`` on each enabled lane, performing an '``and``' of that with the
22762 scalar ``start_value``. Disabled lanes are treated as containing the neutral
22763 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
22764 operation). If the vector length is zero, the result is the start value.
22766 To ignore the start value, the neutral value can be used.
22771 .. code-block:: llvm
22773 %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22774 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22775 ; are treated as though %mask were false for those lanes.
22777 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
22778 %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
22779 %also.r = and i32 %reduction, %start
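
A worked instance with constant operands, chosen only for illustration, shows
why a disabled lane must behave as ``-1`` rather than as its stored value.

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.and.v4i32(i32 -1, <4 x i32> <i32 255, i32 15, i32 60, i32 0>, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, i32 4)
      ; Lane 3 holds 0 but is disabled, so it is treated as -1 and does not
      ; clear the result. The start value -1 is neutral as well, so
      ; %r = 255 & 15 & 60 = 12.
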
22782 .. _int_vp_reduce_or:
22784 '``llvm.vp.reduce.or.*``' Intrinsics
22785 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22789 This is an overloaded intrinsic.
22793 declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22794 declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22799 Predicated integer ``OR`` reduction of a vector and a scalar starting value,
22800 returning the result as a scalar.
22806 The first argument is the start value of the reduction, which must be a scalar
22807 integer type equal to the result type. The second argument is the vector on
22808 which the reduction is performed and must be a vector of integer values whose
22809 element type is the result/start type. The third argument is the vector mask and
22810 is a vector of boolean values with the same number of elements as the vector
22811 argument. The fourth argument is the explicit vector length of the operation.
22816 The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
22817 (:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector argument
22818 ``val`` on each enabled lane, performing an '``or``' of that with the scalar
22819 ``start_value``. Disabled lanes are treated as containing the neutral value
22820 ``0`` (i.e. having no effect on the reduction operation). If the vector length
22821 is zero, the result is the start value.
22823 To ignore the start value, the neutral value can be used.
22828 .. code-block:: llvm
22830 %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22831 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22832 ; are treated as though %mask were false for those lanes.
22834 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
22835 %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
22836 %also.r = or i32 %reduction, %start
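
One possible use is an "any enabled lane is set" test over a vector of flags.
The sketch below assumes the overload for ``i1`` elements, mangled as ``v4i1``
in the same way as the other overloads; the function name ``@any_enabled`` is
hypothetical.

.. code-block:: llvm

      declare i1 @llvm.vp.reduce.or.v4i1(i1, <4 x i1>, <4 x i1>, i32)

      ; Hypothetical helper: true if any enabled lane of %flags is true.
      define i1 @any_enabled(<4 x i1> %flags, <4 x i1> %mask, i32 %evl) {
        ; The start value false is the neutral value for OR.
        %r = call i1 @llvm.vp.reduce.or.v4i1(i1 false, <4 x i1> %flags, <4 x i1> %mask, i32 %evl)
        ret i1 %r
      }
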
22838 .. _int_vp_reduce_xor:
22840 '``llvm.vp.reduce.xor.*``' Intrinsics
22841 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22845 This is an overloaded intrinsic.
22849 declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22850 declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22855 Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
22856 returning the result as a scalar.
22862 The first argument is the start value of the reduction, which must be a scalar
22863 integer type equal to the result type. The second argument is the vector on
22864 which the reduction is performed and must be a vector of integer values whose
22865 element type is the result/start type. The third argument is the vector mask and
22866 is a vector of boolean values with the same number of elements as the vector
22867 argument. The fourth argument is the explicit vector length of the operation.
22872 The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
22873 (:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector argument
22874 ``val`` on each enabled lane, performing an '``xor``' of that with the scalar
22875 ``start_value``. Disabled lanes are treated as containing the neutral value
22876 ``0`` (i.e. having no effect on the reduction operation). If the vector length
22877 is zero, the result is the start value.
22879 To ignore the start value, the neutral value can be used.
22884 .. code-block:: llvm
22886 %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
22887 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22888 ; are treated as though %mask were false for those lanes.
22890 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
22891 %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
22892 %also.r = xor i32 %reduction, %start
22895 .. _int_vp_reduce_smax:
22897 '``llvm.vp.reduce.smax.*``' Intrinsics
22898 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22902 This is an overloaded intrinsic.
22906 declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22907 declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22912 Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
22913 value, returning the result as a scalar.
22919 The first argument is the start value of the reduction, which must be a scalar
22920 integer type equal to the result type. The second argument is the vector on
22921 which the reduction is performed and must be a vector of integer values whose
22922 element type is the result/start type. The third argument is the vector mask and
22923 is a vector of boolean values with the same number of elements as the vector
22924 argument. The fourth argument is the explicit vector length of the operation.
22929 The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
22930 reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
22931 vector argument ``val`` on each enabled lane, taking the maximum of that and
22932 the scalar ``start_value``. Disabled lanes are treated as containing the
22933 neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
22934 If the vector length is zero, the result is the start value.
22936 To ignore the start value, the neutral value can be used.
22941 .. code-block:: llvm
22943 %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
22944 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
22945 ; are treated as though %mask were false for those lanes.
22947 %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
22948 %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
22949 %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
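
The neutral value depends on the element width. As a sketch for ``i32``
elements, passing ``-2147483648`` (``INT_MIN`` for 32 bits) as the start value
leaves the result determined by the enabled lanes alone.

.. code-block:: llvm

      ; With INT_MIN as the start value, %r is the signed maximum of the
      ; enabled lanes of %a, or INT_MIN itself if no lane is enabled.
      %r = call i32 @llvm.vp.reduce.smax.v4i32(i32 -2147483648, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
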
22952 .. _int_vp_reduce_smin:
22954 '``llvm.vp.reduce.smin.*``' Intrinsics
22955 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22959 This is an overloaded intrinsic.
22963 declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
22964 declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
22969 Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
22970 value, returning the result as a scalar.
22976 The first argument is the start value of the reduction, which must be a scalar
22977 integer type equal to the result type. The second argument is the vector on
22978 which the reduction is performed and must be a vector of integer values whose
22979 element type is the result/start type. The third argument is the vector mask and
22980 is a vector of boolean values with the same number of elements as the vector
22981 argument. The fourth argument is the explicit vector length of the operation.
22986 The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
22987 reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
22988 vector argument ``val`` on each enabled lane, taking the minimum of that and
22989 the scalar ``start_value``. Disabled lanes are treated as containing the
22990 neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
22991 If the vector length is zero, the result is the start value.
22993 To ignore the start value, the neutral value can be used.
22998 .. code-block:: llvm
23000 %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
23001 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23002 ; are treated as though %mask were false for those lanes.
23004 %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
23005 %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
23006 %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
23009 .. _int_vp_reduce_umax:
23011 '``llvm.vp.reduce.umax.*``' Intrinsics
23012 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23016 This is an overloaded intrinsic.
23020 declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
23021 declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23026 Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
23027 value, returning the result as a scalar.
23033 The first argument is the start value of the reduction, which must be a scalar
23034 integer type equal to the result type. The second argument is the vector on
23035 which the reduction is performed and must be a vector of integer values whose
23036 element type is the result/start type. The third argument is the vector mask and
23037 is a vector of boolean values with the same number of elements as the vector
23038 argument. The fourth argument is the explicit vector length of the operation.
23043 The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
23044 reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
23045 vector argument ``val`` on each enabled lane, taking the maximum of that and
23046 the scalar ``start_value``. Disabled lanes are treated as containing the
23047 neutral value ``0`` (i.e. having no effect on the reduction operation). If the
23048 vector length is zero, the result is the start value.
23050 To ignore the start value, the neutral value can be used.
23055 .. code-block:: llvm
23057 %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
23058 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23059 ; are treated as though %mask were false for those lanes.
23061 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
23062 %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
23063 %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
23066 .. _int_vp_reduce_umin:
23068 '``llvm.vp.reduce.umin.*``' Intrinsics
23069 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23073 This is an overloaded intrinsic.
23077 declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
23078 declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23083 Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
23084 value, returning the result as a scalar.
23090 The first argument is the start value of the reduction, which must be a scalar
23091 integer type equal to the result type. The second argument is the vector on
23092 which the reduction is performed and must be a vector of integer values whose
23093 element type is the result/start type. The third argument is the vector mask and
23094 is a vector of boolean values with the same number of elements as the vector
23095 argument. The fourth argument is the explicit vector length of the operation.
23100 The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
23101 reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
23102 vector argument ``val`` on each enabled lane, taking the minimum of that and the
23103 scalar ``start_value``. Disabled lanes are treated as containing the neutral
23104 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
23105 operation). If the vector length is zero, the result is the start value.
23107 To ignore the start value, the neutral value can be used.
23112 .. code-block:: llvm
23114 %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
23115 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23116 ; are treated as though %mask were false for those lanes.
23118 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
23119 %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
23120 %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
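
A worked instance with constant operands, chosen only for illustration, shows
the start value taking part in the reduction alongside the enabled lanes.

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 100, <4 x i32> <i32 7, i32 3, i32 200, i32 1>, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, i32 4)
      ; Lane 3 holds 1 but is disabled, so it is treated as UINT_MAX.
      ; The result is the unsigned minimum of 100, 7, 3 and 200: %r = 3.
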
23123 .. _int_vp_reduce_fmax:
23125 '``llvm.vp.reduce.fmax.*``' Intrinsics
23126 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23130 This is an overloaded intrinsic.
23134 declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
23135 declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23140 Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
23141 value, returning the result as a scalar.
23147 The first argument is the start value of the reduction, which must be a scalar
23148 floating-point type equal to the result type. The second argument is the vector
23149 on which the reduction is performed and must be a vector of floating-point
23150 values whose element type is the result/start type. The third argument is the
23151 vector mask and is a vector of boolean values with the same number of elements
23152 as the vector argument. The fourth argument is the explicit vector length of the operation.
23158 The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
23159 reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
23160 vector argument ``val`` on each enabled lane, taking the maximum of that and the
23161 scalar ``start_value``. Disabled lanes are treated as containing the neutral
23162 value (i.e. having no effect on the reduction operation). If the vector length
23163 is zero, the result is the start value.
23165 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
23166 flags are set, the neutral value is ``-QNAN``. If ``nnan`` and ``ninf`` are
23167 both set, then the neutral value is the smallest floating-point value for the
23168 result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
23170 This instruction has the same comparison semantics as the
23171 :ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
23172 '``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
23173 unless all elements of the vector and the starting value are ``NaN``. For a
23174 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
23175 ``-0.0`` elements, the sign of the result is unspecified.
23177 To ignore the start value, the neutral value can be used.
23182 .. code-block:: llvm
23184 %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
23185 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23186 ; are treated as though %mask were false for those lanes.
23188 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
23189 %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
23190 %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
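
As a small illustration of the ``NaN`` handling described above, the following
sketch uses constant operands chosen purely for this example (they do not come
from any real program). Because the comparison follows '``llvm.maxnum.*``', the
quiet ``NaN`` in the enabled lane 1 does not affect the result:

.. code-block:: llvm

      ; Illustrative constants only; 0x7FF8000000000000 is a quiet NaN.
      %r = call float @llvm.vp.reduce.fmax.v4f32(float 0.5, <4 x float> <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
      ; %r is 3.0: the NaN lane is ignored because the other operands are numbers.
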
23193 .. _int_vp_reduce_fmin:
23195 '``llvm.vp.reduce.fmin.*``' Intrinsics
23196 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23200 This is an overloaded intrinsic.
23204 declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
23205 declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23210 Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
23211 value, returning the result as a scalar.
23217 The first argument is the start value of the reduction, which must be a scalar
23218 floating-point type equal to the result type. The second argument is the vector
23219 on which the reduction is performed and must be a vector of floating-point
23220 values whose element type is the result/start type. The third argument is the
23221 vector mask and is a vector of boolean values with the same number of elements
as the vector argument. The fourth argument is the explicit vector length of the
operation.
23228 The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
23229 reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
23230 vector argument ``val`` on each enabled lane, taking the minimum of that and the
23231 scalar ``start_value``. Disabled lanes are treated as containing the neutral
23232 value (i.e. having no effect on the reduction operation). If the vector length
23233 is zero, the result is the start value.
23235 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
23236 flags are set, the neutral value is ``+QNAN``. If ``nnan`` and ``ninf`` are
23237 both set, then the neutral value is the largest floating-point value for the
23238 result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
23240 This instruction has the same comparison semantics as the
23241 :ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
23242 '``llvm.minnum.*``' intrinsic). That is, the result will always be a number
23243 unless all elements of the vector and the starting value are ``NaN``. For a
23244 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
23245 ``-0.0`` elements, the sign of the result is unspecified.
23247 To ignore the start value, the neutral value can be used.
23252 .. code-block:: llvm
23254 %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
23255 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23256 ; are treated as though %mask were false for those lanes.
23258 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
23259 %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
23260 %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
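
The remark above about ignoring the start value can be sketched as follows. The
hexadecimal constants are the usual IR spellings of ``+QNAN`` and ``+Infinity``
for ``float``; the value names ``%a``, ``%mask`` and ``%evl`` are placeholders
assumed to be defined elsewhere:

.. code-block:: llvm

      ; No fast-math flags: the neutral value for fmin is +QNAN.
      %r = call float @llvm.vp.reduce.fmin.v4f32(float 0x7FF8000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)

      ; Only nnan set: the neutral value is +Infinity.
      %r.nnan = call nnan float @llvm.vp.reduce.fmin.v4f32(float 0x7FF0000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)

In both calls the result depends only on the enabled lanes of ``%a``.
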
23263 .. _int_vp_reduce_fmaximum:
23265 '``llvm.vp.reduce.fmaximum.*``' Intrinsics
23266 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23270 This is an overloaded intrinsic.
23274 declare float @llvm.vp.reduce.fmaximum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
23275 declare double @llvm.vp.reduce.fmaximum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23280 Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
23281 value, returning the result as a scalar.
23287 The first argument is the start value of the reduction, which must be a scalar
23288 floating-point type equal to the result type. The second argument is the vector
23289 on which the reduction is performed and must be a vector of floating-point
23290 values whose element type is the result/start type. The third argument is the
23291 vector mask and is a vector of boolean values with the same number of elements
as the vector argument. The fourth argument is the explicit vector length of the
operation.
23298 The '``llvm.vp.reduce.fmaximum``' intrinsic performs the floating-point ``MAX``
23299 reduction (:ref:`llvm.vector.reduce.fmaximum <int_vector_reduce_fmaximum>`) of
23300 the vector argument ``val`` on each enabled lane, taking the maximum of that and
23301 the scalar ``start_value``. Disabled lanes are treated as containing the
23302 neutral value (i.e. having no effect on the reduction operation). If the vector
23303 length is zero, the result is the start value.
23305 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set or only ``nnan`` is set, the neutral value is ``-Infinity``.
23307 If ``ninf`` is set, then the neutral value is the smallest floating-point value
23308 for the result type.
23310 This instruction has the same comparison semantics as the
23311 :ref:`llvm.vector.reduce.fmaximum <int_vector_reduce_fmaximum>` intrinsic (and
23312 thus the '``llvm.maximum.*``' intrinsic). That is, the result will always be a
23313 number unless any of the elements in the vector or the starting value is
``NaN``. Namely, this intrinsic propagates ``NaN``. Also, -0.0 is considered
less than +0.0.
23317 To ignore the start value, the neutral value can be used.

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fmaximum.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -infinity, float -infinity, float -infinity, float -infinity>
      %reduction = call float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %masked.a)
      %also.r = call float @llvm.maximum.f32(float %reduction, float %start)

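
The ``NaN`` propagation described above, and its interaction with the mask, can
be illustrated with constant operands. The values below are made up for this
sketch; only the intrinsic itself is taken from this section:

.. code-block:: llvm

      ; All lanes enabled: the quiet NaN in lane 1 propagates, so %r.all is NaN.
      %r.all = call float @llvm.vp.reduce.fmaximum.v4f32(float 0.5, <4 x float> <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)

      ; Lane 1 masked off: the NaN is never observed, so %r.masked is 3.0.
      %r.masked = call float @llvm.vp.reduce.fmaximum.v4f32(float 0.5, <4 x float> <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>, <4 x i1> <i1 true, i1 false, i1 true, i1 true>, i32 4)
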
23333 .. _int_vp_reduce_fminimum:
23335 '``llvm.vp.reduce.fminimum.*``' Intrinsics
23336 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23340 This is an overloaded intrinsic.
23344 declare float @llvm.vp.reduce.fminimum.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
23345 declare double @llvm.vp.reduce.fminimum.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
23350 Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
23351 value, returning the result as a scalar.
23357 The first argument is the start value of the reduction, which must be a scalar
23358 floating-point type equal to the result type. The second argument is the vector
23359 on which the reduction is performed and must be a vector of floating-point
23360 values whose element type is the result/start type. The third argument is the
23361 vector mask and is a vector of boolean values with the same number of elements
as the vector argument. The fourth argument is the explicit vector length of the
operation.
23368 The '``llvm.vp.reduce.fminimum``' intrinsic performs the floating-point ``MIN``
23369 reduction (:ref:`llvm.vector.reduce.fminimum <int_vector_reduce_fminimum>`) of
23370 the vector argument ``val`` on each enabled lane, taking the minimum of that and
23371 the scalar ``start_value``. Disabled lanes are treated as containing the neutral
23372 value (i.e. having no effect on the reduction operation). If the vector length
23373 is zero, the result is the start value.
23375 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
flags are set or only ``nnan`` is set, the neutral value is ``+Infinity``.
23377 If ``ninf`` is set, then the neutral value is the largest floating-point value
23378 for the result type.
23380 This instruction has the same comparison semantics as the
23381 :ref:`llvm.vector.reduce.fminimum <int_vector_reduce_fminimum>` intrinsic (and
23382 thus the '``llvm.minimum.*``' intrinsic). That is, the result will always be a
23383 number unless any of the elements in the vector or the starting value is
``NaN``. Namely, this intrinsic propagates ``NaN``. Also, -0.0 is considered
less than +0.0.
23387 To ignore the start value, the neutral value can be used.
23392 .. code-block:: llvm
23394 %r = call float @llvm.vp.reduce.fminimum.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
23395 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
23396 ; are treated as though %mask were false for those lanes.
23398 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float infinity, float infinity, float infinity, float infinity>
23399 %reduction = call float @llvm.vector.reduce.fminimum.v4f32(<4 x float> %masked.a)
23400 %also.r = call float @llvm.minimum.f32(float %reduction, float %start)
23403 .. _int_get_active_lane_mask:
23405 '``llvm.get.active.lane.mask.*``' Intrinsics
23406 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23410 This is an overloaded intrinsic.
23414 declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
23415 declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
23416 declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
23417 declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
23423 Create a mask representing active and inactive vector lanes.
23429 Both arguments have the same scalar integer type. The result is a vector with
23430 the i1 element type.
The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
to:

::

      %m[i] = icmp ult (%base + i), %n

23442 where ``%m`` is a vector (mask) of active/inactive lanes with its elements
23443 indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
23445 the unsigned less-than comparison operator. Overflow cannot occur in
23446 ``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
23447 numbers and not in machine numbers. If ``%n`` is ``0``, then the result is a
23448 poison value. The above is equivalent to:
23452 %m = @llvm.get.active.lane.mask(%base, %n)
23454 This can, for example, be emitted by the loop vectorizer in which case
23455 ``%base`` is the first element of the vector induction variable (VIV) and
23456 ``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
23457 less than comparison of VIV with the loop tripcount, producing a mask of
23458 true/false values representing active/inactive vector lanes, except if the VIV
23459 overflows in which case they return false in the lanes where the VIV overflows.
23460 The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what the type of the step vector needs to be that enumerates its
23462 lanes without overflow.
23464 This mask ``%m`` can e.g. be used in masked load/store instructions. These
23465 intrinsics provide a hint to the backend. I.e., for a vector loop, the
back-edge taken count of the original scalar loop is explicit as the second
argument.

.. code-block:: llvm

      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> poison)

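
As a worked instance of the formula ``%m[i] = icmp ult (%base + i), %n``, take
``%base = 4`` and ``%n = 6`` with four lanes: the lanes compare 4, 5, 6 and 7
against 6. The function below is a hypothetical hand-written expansion (its name
and value names are not part of LLVM) that widens to ``i64`` so that
``%base + i`` cannot wrap, in the spirit of the "integer numbers" wording above:

.. code-block:: llvm

      %m = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 4, i32 6)
      ; %m is <i1 true, i1 true, i1 false, i1 false>

      ; Hypothetical expansion for <4 x i1> with i32 arguments.
      define <4 x i1> @active_lane_mask_expanded(i32 %base, i32 %n) {
        ; Widen to i64 so that %base + i cannot wrap for four lanes.
        %base.wide = zext i32 %base to i64
        %n.wide = zext i32 %n to i64
        %base.ins = insertelement <4 x i64> poison, i64 %base.wide, i64 0
        %base.splat = shufflevector <4 x i64> %base.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        %n.ins = insertelement <4 x i64> poison, i64 %n.wide, i64 0
        %n.splat = shufflevector <4 x i64> %n.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        ; Lane i holds %base + i.
        %idx = add <4 x i64> %base.splat, <i64 0, i64 1, i64 2, i64 3>
        %m = icmp ult <4 x i64> %idx, %n.splat
        ret <4 x i1> %m
      }

For ``%n = 0`` this expansion yields an all-false mask, which is one permitted
refinement of the poison result stated above.
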
23479 .. _int_experimental_vp_splice:
23481 '``llvm.experimental.vp.splice``' Intrinsic
23482 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23486 This is an overloaded intrinsic.
23490 declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
23491 declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)
23496 The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
23497 predicated version of the '``llvm.vector.splice.*``' intrinsic.
23502 The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
23503 the same type. The third argument ``imm`` is an immediate signed integer that
23504 indicates the offset index. The fourth argument ``mask`` is a vector mask and
23505 has the same number of elements as the result. The last two arguments ``evl1``
23506 and ``evl2`` are unsigned integers indicating the explicit vector lengths of
23507 ``vec1`` and ``vec2`` respectively. ``imm``, ``evl1`` and ``evl2`` should
23508 respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
23509 and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
23510 constraints are not satisfied the intrinsic has undefined behavior.
23515 Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
23516 ``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
23517 window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
23518 the concatenated vector. Elements in the result vector beyond ``evl2`` are
23519 ``undef``. If ``imm`` is negative the starting index is ``evl1 + imm``. The result
23520 vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
23521 negative ``imm``) elements from indices ``[imm..evl1 - 1]``
23522 (``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the
23523 first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
23524 ``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
23525 elements are considered and the remaining are ``undef``. The lanes in the result
23526 vector disabled by ``mask`` are ``poison``.
23531 .. code-block:: text
23533 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3); ==> <B, E, F, poison> index
23534 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2); ==> <B, C, poison, poison> trailing elements
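
The first case above can also be written as a full call. This is only a sketch:
the ``v4i32`` overload, the all-true mask and the value names are assumptions
made for illustration, with ``%vec1 = <A, B, C, D>`` and
``%vec2 = <E, F, G, H>``:

.. code-block:: llvm

      ; imm = 1, evl1 = 2, evl2 = 3, and an all-true mask.
      %r = call <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2, i32 3)
      ; Lanes 0..2 of %r hold <B, E, F>; lane 3 lies beyond evl2.

The concatenation of ``vec1[0..1]`` and ``vec2[0..2]`` is ``<A, B, E, F, G>``,
and the window of size 3 starting at index 1 selects ``<B, E, F>``.
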
23537 .. _int_experimental_vp_splat:
23540 '``llvm.experimental.vp.splat``' Intrinsic
23541 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23545 This is an overloaded intrinsic.
23549 declare <2 x double> @llvm.experimental.vp.splat.v2f64(double %scalar, <2 x i1> %mask, i32 %evl)
23550 declare <vscale x 4 x i32> @llvm.experimental.vp.splat.nxv4i32(i32 %scalar, <vscale x 4 x i1> %mask, i32 %evl)
The '``llvm.experimental.vp.splat.*``' intrinsic creates a predicated splat with
a specific effective vector length.
23561 The result is a vector and it is a splat of the first scalar argument. The
23562 second argument ``mask`` is a vector mask and has the same number of elements as
23563 the result. The third argument is the explicit vector length of the operation.
23568 This intrinsic splats a vector with ``evl`` elements of a scalar argument.
23569 The lanes in the result vector disabled by ``mask`` are ``poison``. The
23570 elements past ``evl`` are poison.

.. code-block:: llvm

      %r = call <4 x float> @llvm.experimental.vp.splat.v4f32(float %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
      %e = insertelement <4 x float> poison, float %a, i32 0
      %s = shufflevector <4 x float> %e, <4 x float> poison, <4 x i32> zeroinitializer
      %also.r = select <4 x i1> %mask, <4 x float> %s, <4 x float> poison

23584 .. _int_experimental_vp_reverse:
23587 '``llvm.experimental.vp.reverse``' Intrinsic
23588 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23592 This is an overloaded intrinsic.
23596 declare <2 x double> @llvm.experimental.vp.reverse.v2f64(<2 x double> %vec, <2 x i1> %mask, i32 %evl)
23597 declare <vscale x 4 x i32> @llvm.experimental.vp.reverse.nxv4i32(<vscale x 4 x i32> %vec, <vscale x 4 x i1> %mask, i32 %evl)
23602 The '``llvm.experimental.vp.reverse.*``' intrinsic is the vector length
23603 predicated version of the '``llvm.vector.reverse.*``' intrinsic.
23608 The result and the first argument ``vec`` are vectors with the same type.
23609 The second argument ``mask`` is a vector mask and has the same number of
elements as the result. The third argument is the explicit vector length of
the operation.
23616 This intrinsic reverses the order of the first ``evl`` elements in a vector.
23617 The lanes in the result vector disabled by ``mask`` are ``poison``. The
23618 elements past ``evl`` are poison.
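
A brief sketch of the semantics, assuming the ``v4i32`` overload, an all-true
mask and an input ``%vec = <A, B, C, D>`` (all chosen for illustration only):

.. code-block:: llvm

      ; evl = 3: only the first three elements take part in the reversal.
      %r = call <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32> %vec, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 3)
      ; %r is <C, B, A, poison>; lane 3 is past %evl and therefore poison.
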
23622 '``llvm.vp.load``' Intrinsic
23623 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23627 This is an overloaded intrinsic.
23631 declare <4 x float> @llvm.vp.load.v4f32.p0(ptr %ptr, <4 x i1> %mask, i32 %evl)
23632 declare <vscale x 2 x i16> @llvm.vp.load.nxv2i16.p0(ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
23633 declare <8 x float> @llvm.vp.load.v8f32.p1(ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
23634 declare <vscale x 1 x i64> @llvm.vp.load.nxv1i64.p6(ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
23639 The '``llvm.vp.load.*``' intrinsic is the vector length predicated version of
23640 the :ref:`llvm.masked.load <int_mload>` intrinsic.
23645 The first argument is the base pointer for the load. The second argument is a
23646 vector of boolean values with the same number of elements as the return type.
23647 The third is the explicit vector length of the operation. The return type and
23648 underlying type of the base pointer are the same vector types.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
argument.
23656 The '``llvm.vp.load``' intrinsic reads a vector from memory in the same way as
23657 the '``llvm.masked.load``' intrinsic, where the mask is taken from the
23658 combination of the '``mask``' and '``evl``' arguments in the usual VP way.
23659 Certain '``llvm.masked.load``' arguments do not have corresponding arguments in
23660 '``llvm.vp.load``': the '``passthru``' argument is implicitly ``poison``; the
23661 '``alignment``' argument is taken as the ``align`` parameter attribute, if
23662 provided. The default alignment is taken as the ABI alignment of the return
23663 type as specified by the :ref:`datalayout string<langref_datalayout>`.
23668 .. code-block:: text
23670 %r = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 2 %ptr, <8 x i1> %mask, i32 %evl)
23671 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23673 %also.r = call <8 x i8> @llvm.masked.load.v8i8.p0(ptr %ptr, i32 2, <8 x i1> %mask, <8 x i8> poison)
23678 '``llvm.vp.store``' Intrinsic
23679 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23683 This is an overloaded intrinsic.
23687 declare void @llvm.vp.store.v4f32.p0(<4 x float> %val, ptr %ptr, <4 x i1> %mask, i32 %evl)
23688 declare void @llvm.vp.store.nxv2i16.p0(<vscale x 2 x i16> %val, ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
23689 declare void @llvm.vp.store.v8f32.p1(<8 x float> %val, ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
23690 declare void @llvm.vp.store.nxv1i64.p6(<vscale x 1 x i64> %val, ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
23695 The '``llvm.vp.store.*``' intrinsic is the vector length predicated version of
23696 the :ref:`llvm.masked.store <int_mstore>` intrinsic.
23701 The first argument is the vector value to be written to memory. The second
23702 argument is the base pointer for the store. It has the same underlying type as
23703 the value argument. The third argument is a vector of boolean values with the
same number of elements as the value argument. The fourth is the explicit vector
23705 length of the operation.
The :ref:`align <attr_align>` parameter attribute can be provided for the
second argument.
The '``llvm.vp.store``' intrinsic writes a vector to memory in the same way as
23714 the '``llvm.masked.store``' intrinsic, where the mask is taken from the
23715 combination of the '``mask``' and '``evl``' arguments in the usual VP way. The
23716 alignment of the operation (corresponding to the '``alignment``' argument of
23717 '``llvm.masked.store``') is specified by the ``align`` parameter attribute (see
23718 above). If it is not provided then the ABI alignment of the type of the
23719 '``value``' argument as specified by the :ref:`datalayout
23720 string<langref_datalayout>` is used instead.
23725 .. code-block:: text
23727 call void @llvm.vp.store.v8i8.p0(<8 x i8> %val, ptr align 4 %ptr, <8 x i1> %mask, i32 %evl)
23728 ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
23730 call void @llvm.masked.store.v8i8.p0(<8 x i8> %val, ptr %ptr, i32 4, <8 x i1> %mask)
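
The two memory intrinsics compose naturally. The following is a minimal sketch
of copying the first ``%evl`` elements of a four-element ``i32`` vector; the
function name ``@vp_copy`` is hypothetical, and the sketch assumes ``%evl`` is
at most 4 and that both pointers are 4-byte aligned:

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.load.v4i32.p0(ptr, <4 x i1>, i32)
      declare void @llvm.vp.store.v4i32.p0(<4 x i32>, ptr, <4 x i1>, i32)

      ; Hypothetical helper: copies lanes [0, %evl) from %src to %dst.
      define void @vp_copy(ptr %dst, ptr %src, i32 %evl) {
        ; Lanes at or beyond %evl are not read; they are poison in %v.
        %v = call <4 x i32> @llvm.vp.load.v4i32.p0(ptr align 4 %src, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
        ; Lanes at or beyond %evl are not written, so the poison is never stored.
        call void @llvm.vp.store.v4i32.p0(<4 x i32> %v, ptr align 4 %dst, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
        ret void
      }
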
23733 .. _int_experimental_vp_strided_load:
23735 '``llvm.experimental.vp.strided.load``' Intrinsic
23736 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23740 This is an overloaded intrinsic.
23744 declare <4 x float> @llvm.experimental.vp.strided.load.v4f32.i64(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
23745 declare <vscale x 2 x i16> @llvm.experimental.vp.strided.load.nxv2i16.i64(ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
23750 The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, scalar values from
23751 memory locations evenly spaced apart by '``stride``' number of bytes, starting from '``ptr``'.
23756 The first argument is the base pointer for the load. The second argument is the stride
23757 value expressed in bytes. The third argument is a vector of boolean values
23758 with the same number of elements as the return type. The fourth is the explicit
23759 vector length of the operation. The base pointer underlying type matches the type of the scalar
23760 elements of the return argument.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
argument.
23768 The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, multiple scalar
23769 values from memory in the same way as the :ref:`llvm.vp.gather <int_vp_gather>` intrinsic,
23770 where the vector of pointers is in the form:
23772 ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
23775 integer and all arithmetic occurring in the pointer type.

.. code-block:: text

      %r = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.i64(i64* %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
      ;; The operation can also be expressed like this:

      %addr = bitcast i64* %ptr to i8*
      ;; Create a vector of pointers %addrs in the form:
      ;; %addrs = <%addr, %addr + %stride, %addr + 2 * %stride, ...>
      %ptrs = bitcast <8 x i8* > %addrs to <8 x i64* >
      %also.r = call <8 x i64> @llvm.vp.gather.v8i64.v8p0i64(<8 x i64* > %ptrs, <8 x i1> %mask, i32 %evl)

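
With opaque pointers the same equivalence needs no pointer casts. The sketch
below is one possible way to write it, not the canonical lowering: the function
name is hypothetical, a four-lane vector is used for brevity, and the byte
offsets are computed in ``i64`` on the assumption that pointers are 64 bits
wide:

.. code-block:: llvm

      declare <4 x i64> @llvm.vp.gather.v4i64.v4p0(<4 x ptr>, <4 x i1>, i32)

      ; Hypothetical helper: a strided load expressed as a gather.
      define <4 x i64> @strided_load_as_gather(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl) {
        %stride.ins = insertelement <4 x i64> poison, i64 %stride, i64 0
        %stride.splat = shufflevector <4 x i64> %stride.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        ; Byte offsets <0, %stride, 2 * %stride, 3 * %stride>.
        %offsets = mul <4 x i64> %stride.splat, <i64 0, i64 1, i64 2, i64 3>
        ; An i8 GEP with a vector index yields the vector of lane addresses.
        %ptrs = getelementptr i8, ptr %ptr, <4 x i64> %offsets
        %r = call <4 x i64> @llvm.vp.gather.v4i64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
        ret <4 x i64> %r
      }
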
23792 .. _int_experimental_vp_strided_store:
23794 '``llvm.experimental.vp.strided.store``' Intrinsic
23795 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23799 This is an overloaded intrinsic.
23803 declare void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float> %val, ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
23804 declare void @llvm.experimental.vp.strided.store.nxv2i16.i64(<vscale x 2 x i16> %val, ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
23809 The '``@llvm.experimental.vp.strided.store``' intrinsic stores the elements of
23810 '``val``' into memory locations evenly spaced apart by '``stride``' number of
23811 bytes, starting from '``ptr``'.
23816 The first argument is the vector value to be written to memory. The second
23817 argument is the base pointer for the store. Its underlying type matches the
23818 scalar element type of the value argument. The third argument is the stride value
23819 expressed in bytes. The fourth argument is a vector of boolean values with the
same number of elements as the value argument. The fifth is the explicit vector
23821 length of the operation.
The :ref:`align <attr_align>` parameter attribute can be provided for the
second argument.
23829 The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
23830 '``val``' in the same way as the :ref:`llvm.vp.scatter <int_vp_scatter>` intrinsic,
23831 where the vector of pointers is in the form:
23833 ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
23836 integer and all arithmetic occurring in the pointer type.
23841 .. code-block:: text
23843 call void @llvm.experimental.vp.strided.store.v8i64.i64(<8 x i64> %val, i64* %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
23844 ;; The operation can also be expressed like this:
23846 %addr = bitcast i64* %ptr to i8*
23847 ;; Create a vector of pointers %addrs in the form:
23848 ;; %addrs = <%addr, %addr + %stride, %addr + 2 * %stride, ...>
23849 %ptrs = bitcast <8 x i8* > %addrs to <8 x i64* >
23850 call void @llvm.vp.scatter.v8i64.v8p0i64(<8 x i64> %val, <8 x i64*> %ptrs, <8 x i1> %mask, i32 %evl)
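
The store direction can be sketched in the same opaque-pointer style. As
before, the function name is hypothetical, four lanes are used for brevity, and
64-bit pointers are assumed so that the byte offsets can be computed in
``i64``:

.. code-block:: llvm

      declare void @llvm.vp.scatter.v4i64.v4p0(<4 x i64>, <4 x ptr>, <4 x i1>, i32)

      ; Hypothetical helper: a strided store expressed as a scatter.
      define void @strided_store_as_scatter(<4 x i64> %val, ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl) {
        %stride.ins = insertelement <4 x i64> poison, i64 %stride, i64 0
        %stride.splat = shufflevector <4 x i64> %stride.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        ; Byte offsets <0, %stride, 2 * %stride, 3 * %stride>.
        %offsets = mul <4 x i64> %stride.splat, <i64 0, i64 1, i64 2, i64 3>
        %ptrs = getelementptr i8, ptr %ptr, <4 x i64> %offsets
        call void @llvm.vp.scatter.v4i64.v4p0(<4 x i64> %val, <4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
        ret void
      }
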
23855 '``llvm.vp.gather``' Intrinsic
23856 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23860 This is an overloaded intrinsic.
23864 declare <4 x double> @llvm.vp.gather.v4f64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
23865 declare <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0(<vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
23866 declare <2 x float> @llvm.vp.gather.v2f32.v2p2(<2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
23867 declare <vscale x 4 x i32> @llvm.vp.gather.nxv4i32.nxv4p4(<vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
23872 The '``llvm.vp.gather.*``' intrinsic is the vector length predicated version of
23873 the :ref:`llvm.masked.gather <int_mgather>` intrinsic.
23878 The first argument is a vector of pointers which holds all memory addresses to
23879 read. The second argument is a vector of boolean values with the same number of
23880 elements as the return type. The third is the explicit vector length of the
23881 operation. The return type and underlying type of the vector of pointers are
23882 the same vector types.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
argument.
23890 The '``llvm.vp.gather``' intrinsic reads multiple scalar values from memory in
23891 the same way as the '``llvm.masked.gather``' intrinsic, where the mask is taken
23892 from the combination of the '``mask``' and '``evl``' arguments in the usual VP
23893 way. Certain '``llvm.masked.gather``' arguments do not have corresponding
23894 arguments in '``llvm.vp.gather``': the '``passthru``' argument is implicitly
``poison``; the '``alignment``' argument is taken as the ``align`` parameter
attribute, if provided. The default alignment is taken as the ABI alignment of
the source
23897 addresses as specified by the :ref:`datalayout string<langref_datalayout>`.
23902 .. code-block:: text
23904 %r = call <8 x i8> @llvm.vp.gather.v8i8.v8p0(<8 x ptr> align 8 %ptrs, <8 x i1> %mask, i32 %evl)
23905 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23907 %also.r = call <8 x i8> @llvm.masked.gather.v8i8.v8p0(<8 x ptr> %ptrs, i32 8, <8 x i1> %mask, <8 x i8> poison)
23910 .. _int_vp_scatter:
23912 '``llvm.vp.scatter``' Intrinsic
23913 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23917 This is an overloaded intrinsic.
23921 declare void @llvm.vp.scatter.v4f64.v4p0(<4 x double> %val, <4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
23922 declare void @llvm.vp.scatter.nxv2i8.nxv2p0(<vscale x 2 x i8> %val, <vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
23923 declare void @llvm.vp.scatter.v2f32.v2p2(<2 x float> %val, <2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
23924 declare void @llvm.vp.scatter.nxv4i32.nxv4p4(<vscale x 4 x i32> %val, <vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
23929 The '``llvm.vp.scatter.*``' intrinsic is the vector length predicated version of
23930 the :ref:`llvm.masked.scatter <int_mscatter>` intrinsic.
23935 The first argument is a vector value to be written to memory. The second argument
23936 is a vector of pointers, pointing to where the value elements should be stored.
23937 The third argument is a vector of boolean values with the same number of
elements as the value argument. The fourth is the explicit vector length of the
operation.
The :ref:`align <attr_align>` parameter attribute can be provided for the
second argument.
23947 The '``llvm.vp.scatter``' intrinsic writes multiple scalar values to memory in
23948 the same way as the '``llvm.masked.scatter``' intrinsic, where the mask is
23949 taken from the combination of the '``mask``' and '``evl``' arguments in the
23950 usual VP way. The '``alignment``' argument of the '``llvm.masked.scatter``' does
23951 not have a corresponding argument in '``llvm.vp.scatter``': it is instead
23952 provided via the optional ``align`` parameter attribute on the
23953 vector-of-pointers argument. Otherwise it is taken as the ABI alignment of the
23954 destination addresses as specified by the :ref:`datalayout
23955 string<langref_datalayout>`.
23960 .. code-block:: text
23962 call void @llvm.vp.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> align 1 %ptrs, <8 x i1> %mask, i32 %evl)
23963 ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
23965 call void @llvm.masked.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> %ptrs, i32 1, <8 x i1> %mask)
23970 '``llvm.vp.trunc.*``' Intrinsics
23971 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23975 This is an overloaded intrinsic.
23979 declare <16 x i16> @llvm.vp.trunc.v16i16.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
23980 declare <vscale x 4 x i16> @llvm.vp.trunc.nxv4i16.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23985 The '``llvm.vp.trunc``' intrinsic truncates its first argument to the return
23986 type. The operation has a mask and an explicit vector length parameter.
23992 The '``llvm.vp.trunc``' intrinsic takes a value to cast as its first argument.
The return type is the type to cast the value to. Both types must be vectors of
23994 :ref:`integer <t_integer>` type. The bit size of the value must be larger than
23995 the bit size of the return type. The second argument is the vector mask. The
23996 return type, the value to cast, and the vector mask have the same number of
23997 elements. The third argument is the explicit vector length of the operation.
The '``llvm.vp.trunc``' intrinsic truncates the high order bits in value and
converts the remaining bits to the return type. Since the source size must be
larger than the destination size, '``llvm.vp.trunc``' cannot be a *no-op cast*.
It will always truncate bits. The conversion is performed on lane positions
below the explicit vector length and where the vector mask is true. Masked-off
lanes are ``poison``.
24012 .. code-block:: llvm
24014 %r = call <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
24015 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24017 %t = trunc <4 x i32> %a to <4 x i16>
24018 %also.r = select <4 x i1> %mask, <4 x i16> %t, <4 x i16> poison
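
A concrete instance with constant operands may help; the values are chosen for
illustration only. ``65537`` is ``0x00010001``, so its low 16 bits are ``1``,
and ``-1`` keeps all of its low bits set:

.. code-block:: llvm

      %r = call <2 x i16> @llvm.vp.trunc.v2i16.v2i32(<2 x i32> <i32 65537, i32 -1>, <2 x i1> <i1 true, i1 true>, i32 2)
      ; %r is <i16 1, i16 -1>
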
24023 '``llvm.vp.zext.*``' Intrinsics
24024 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24028 This is an overloaded intrinsic.
24032 declare <16 x i32> @llvm.vp.zext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
24033 declare <vscale x 4 x i32> @llvm.vp.zext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24038 The '``llvm.vp.zext``' intrinsic zero extends its first argument to the return
24039 type. The operation has a mask and an explicit vector length parameter.
24045 The '``llvm.vp.zext``' intrinsic takes a value to cast as its first argument.
24046 The return type is the type to cast the value to. Both types must be vectors of
24047 :ref:`integer <t_integer>` type. The bit size of the value must be smaller than
24048 the bit size of the return type. The second argument is the vector mask. The
24049 return type, the value to cast, and the vector mask have the same number of
24050 elements. The third argument is the explicit vector length of the operation.
24055 The '``llvm.vp.zext``' intrinsic fills the high order bits of the value with zero
24056 bits until it reaches the size of the return type. When zero extending from i1,
24057 the result will always be either 0 or 1. The conversion is performed on lane
24058 positions below the explicit vector length and where the vector mask is true.
24059 Masked-off lanes are ``poison``.
24064 .. code-block:: llvm
24066 %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
24067 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24069 %t = zext <4 x i16> %a to <4 x i32>
24070 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
24075 '``llvm.vp.sext.*``' Intrinsics
24076 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24080 This is an overloaded intrinsic.
24084 declare <16 x i32> @llvm.vp.sext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
24085 declare <vscale x 4 x i32> @llvm.vp.sext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24090 The '``llvm.vp.sext``' intrinsic sign extends its first argument to the return
24091 type. The operation has a mask and an explicit vector length parameter.
24097 The '``llvm.vp.sext``' intrinsic takes a value to cast as its first argument.
24098 The return type is the type to cast the value to. Both types must be vectors of
24099 :ref:`integer <t_integer>` type. The bit size of the value must be smaller than
24100 the bit size of the return type. The second argument is the vector mask. The
24101 return type, the value to cast, and the vector mask have the same number of
24102 elements. The third argument is the explicit vector length of the operation.
24107 The '``llvm.vp.sext``' intrinsic performs a sign extension by copying the sign
24108 bit (highest order bit) of the value until it reaches the size of the return
24109 type. When sign extending from i1, the result will always be either -1 or 0.
24110 The conversion is performed on lane positions below the explicit vector length
24111 and where the vector mask is true. Masked-off lanes are ``poison``.
24116 .. code-block:: llvm
24118 %r = call <4 x i32> @llvm.vp.sext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
24119 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24121 %t = sext <4 x i16> %a to <4 x i32>
24122 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
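
The difference between zero and sign extension is easiest to see when widening
from ``i1``. The sketch below is a minimal example under assumed inputs; the
function names ``@bools_to_01`` and ``@bools_to_allones`` are hypothetical.

.. code-block:: llvm

      declare <4 x i8> @llvm.vp.zext.v4i8.v4i1(<4 x i1>, <4 x i1>, i32)
      declare <4 x i8> @llvm.vp.sext.v4i8.v4i1(<4 x i1>, <4 x i1>, i32)

      ; Zero extension: every enabled lane becomes 0 or 1.
      define <4 x i8> @bools_to_01(<4 x i1> %flags, <4 x i1> %mask, i32 %evl) {
        %z = call <4 x i8> @llvm.vp.zext.v4i8.v4i1(<4 x i1> %flags, <4 x i1> %mask, i32 %evl)
        ret <4 x i8> %z
      }

      ; Sign extension: every enabled lane becomes 0 or -1 (all bits set).
      define <4 x i8> @bools_to_allones(<4 x i1> %flags, <4 x i1> %mask, i32 %evl) {
        %s = call <4 x i8> @llvm.vp.sext.v4i8.v4i1(<4 x i1> %flags, <4 x i1> %mask, i32 %evl)
        ret <4 x i8> %s
      }

      ; Assumed inputs: %flags = <i1 true, i1 false, i1 true, i1 true>,
      ;                 an all-true %mask, and %evl = 4.
      ;   @bools_to_01      returns <i8 1,  i8 0, i8 1,  i8 1>
      ;   @bools_to_allones returns <i8 -1, i8 0, i8 -1, i8 -1>
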
24125 .. _int_vp_fptrunc:
24127 '``llvm.vp.fptrunc.*``' Intrinsics
24128 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24132 This is an overloaded intrinsic.
24136 declare <16 x float> @llvm.vp.fptrunc.v16f32.v16f64 (<16 x double> <op>, <16 x i1> <mask>, i32 <vector_length>)
24137 declare <vscale x 4 x float> @llvm.vp.fptrunc.nxv4f32.nxv4f64 (<vscale x 4 x double> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24142 The '``llvm.vp.fptrunc``' intrinsic truncates its first argument to the return
24143 type. The operation has a mask and an explicit vector length parameter.
24149 The '``llvm.vp.fptrunc``' intrinsic takes a value to cast as its first argument.
24150 The return type is the type to cast the value to. Both types must be vectors of
24151 :ref:`floating-point <t_floating>` type. The bit size of the value must be
24152 larger than the bit size of the return type. This implies that
24153 '``llvm.vp.fptrunc``' cannot be used to make a *no-op cast*. The second argument
24154 is the vector mask. The return type, the value to cast, and the vector mask have
24155 the same number of elements. The third argument is the explicit vector length of the operation.
24161 The '``llvm.vp.fptrunc``' intrinsic casts a ``value`` from a larger
24162 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
24163 <t_floating>` type.
24164 This intrinsic is assumed to execute in the default :ref:`floating-point
24165 environment <floatenv>`. The conversion is performed on lane positions below the
24166 explicit vector length and where the vector mask is true. Masked-off lanes are ``poison``.
24172 .. code-block:: llvm
24174 %r = call <4 x float> @llvm.vp.fptrunc.v4f32.v4f64(<4 x double> %a, <4 x i1> %mask, i32 %evl)
24175 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24177 %t = fptrunc <4 x double> %a to <4 x float>
24178 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24183 '``llvm.vp.fpext.*``' Intrinsics
24184 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24188 This is an overloaded intrinsic.
24192 declare <16 x double> @llvm.vp.fpext.v16f64.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24193 declare <vscale x 4 x double> @llvm.vp.fpext.nxv4f64.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24198 The '``llvm.vp.fpext``' intrinsic extends its first argument to the return
24199 type. The operation has a mask and an explicit vector length parameter.
24205 The '``llvm.vp.fpext``' intrinsic takes a value to cast as its first argument.
24206 The return type is the type to cast the value to. Both types must be vectors of
24207 :ref:`floating-point <t_floating>` type. The bit size of the value must be
24208 smaller than the bit size of the return type. This implies that
24209 '``llvm.vp.fpext``' cannot be used to make a *no-op cast*. The second argument
24210 is the vector mask. The return type, the value to cast, and the vector mask have
24211 the same number of elements. The third argument is the explicit vector length of the operation.
24217 The '``llvm.vp.fpext``' intrinsic extends the ``value`` from a smaller
24218 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
24219 <t_floating>` type. The '``llvm.vp.fpext``' cannot be used to make a
24220 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
24221 *no-op cast* for a floating-point cast.
24222 The conversion is performed on lane positions below the explicit vector length
24223 and where the vector mask is true. Masked-off lanes are ``poison``.
24228 .. code-block:: llvm
24230 %r = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24231 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24233 %t = fpext <4 x float> %a to <4 x double>
24234 %also.r = select <4 x i1> %mask, <4 x double> %t, <4 x double> poison
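
Extension and truncation are often used as a pair around arithmetic carried out
at higher precision. The following sketch, which assumes the
'``llvm.vp.fadd``' intrinsic described elsewhere in this manual and uses the
hypothetical function name ``@sum3_in_double``, widens three ``float`` vectors,
adds them in ``double``, and narrows the result under the same mask and
explicit vector length.

.. code-block:: llvm

      declare <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float>, <4 x i1>, i32)
      declare <4 x double> @llvm.vp.fadd.v4f64(<4 x double>, <4 x double>, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.fptrunc.v4f32.v4f64(<4 x double>, <4 x i1>, i32)

      define <4 x float> @sum3_in_double(<4 x float> %a, <4 x float> %b, <4 x float> %c,
                                         <4 x i1> %mask, i32 %evl) {
        ; Widen each operand on the enabled lanes.
        %a.w = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
        %b.w = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %b, <4 x i1> %mask, i32 %evl)
        %c.w = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %c, <4 x i1> %mask, i32 %evl)
        ; Accumulate in double precision.
        %ab  = call <4 x double> @llvm.vp.fadd.v4f64(<4 x double> %a.w, <4 x double> %b.w, <4 x i1> %mask, i32 %evl)
        %abc = call <4 x double> @llvm.vp.fadd.v4f64(<4 x double> %ab, <4 x double> %c.w, <4 x i1> %mask, i32 %evl)
        ; Narrow back to float; this is the only step that rounds to float precision.
        %r = call <4 x float> @llvm.vp.fptrunc.v4f32.v4f64(<4 x double> %abc, <4 x i1> %mask, i32 %evl)
        ret <4 x float> %r
      }

Because every call uses the same mask and explicit vector length, the ``poison``
produced on disabled lanes by one step only ever feeds lanes that are disabled
in the next step as well.
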
24239 '``llvm.vp.fptoui.*``' Intrinsics
24240 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24244 This is an overloaded intrinsic.
24248 declare <16 x i32> @llvm.vp.fptoui.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24249 declare <vscale x 4 x i32> @llvm.vp.fptoui.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24250 declare <256 x i64> @llvm.vp.fptoui.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24255 The '``llvm.vp.fptoui``' intrinsic converts the :ref:`floating-point
24256 <t_floating>` argument to the unsigned integer return type.
24257 The operation has a mask and an explicit vector length parameter.
24263 The '``llvm.vp.fptoui``' intrinsic takes a value to cast as its first argument.
24264 The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
24265 The return type is the type to cast the value to. The return type must be a
24266 vector of :ref:`integer <t_integer>` type. The second argument is the vector
24267 mask. The return type, the value to cast, and the vector mask have the same
24268 number of elements. The third argument is the explicit vector length of the operation.
24274 The '``llvm.vp.fptoui``' intrinsic converts its :ref:`floating-point
24275 <t_floating>` argument into the nearest (rounding towards zero) unsigned integer
24276 value where the lane position is below the explicit vector length and the
24277 vector mask is true. Masked-off lanes are ``poison``. On enabled lanes where
24278 conversion takes place and the value cannot fit in the return type, the result
24279 on that lane is a :ref:`poison value <poisonvalues>`.
24284 .. code-block:: llvm
24286 %r = call <4 x i32> @llvm.vp.fptoui.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24287 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24289 %t = fptoui <4 x float> %a to <4 x i32>
24290 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
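
The range rule matters most when the destination type is narrow. The sketch
below converts to ``i8`` and annotates which lanes of an assumed input would be
well defined; the function name ``@to_bytes`` and the input values are
illustrative only.

.. code-block:: llvm

      declare <4 x i8> @llvm.vp.fptoui.v4i8.v4f32(<4 x float>, <4 x i1>, i32)

      define <4 x i8> @to_bytes(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i8> @llvm.vp.fptoui.v4i8.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret <4 x i8> %r
      }

      ; Assumed inputs: %a = <float 2.75, float 255.5, float 256.0, float -1.0>,
      ;                 an all-true %mask, and %evl = 4.
      ;   lane 0: 2.75 rounds towards zero to 2
      ;   lane 1: 255.5 rounds towards zero to 255, which still fits in i8
      ;   lane 2: 256 does not fit in an unsigned 8-bit integer, so the lane is poison
      ;   lane 3: -1 does not fit in an unsigned integer, so the lane is poison
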
24295 '``llvm.vp.fptosi.*``' Intrinsics
24296 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24300 This is an overloaded intrinsic.
24304 declare <16 x i32> @llvm.vp.fptosi.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24305 declare <vscale x 4 x i32> @llvm.vp.fptosi.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24306 declare <256 x i64> @llvm.vp.fptosi.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24311 The '``llvm.vp.fptosi``' intrinsic converts the :ref:`floating-point
24312 <t_floating>` argument to the signed integer return type.
24313 The operation has a mask and an explicit vector length parameter.
24319 The '``llvm.vp.fptosi``' intrinsic takes a value to cast as its first argument.
24320 The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
24321 The return type is the type to cast the value to. The return type must be a
24322 vector of :ref:`integer <t_integer>` type. The second argument is the vector
24323 mask. The return type, the value to cast, and the vector mask have the same
24324 number of elements. The third argument is the explicit vector length of the operation.
24330 The '``llvm.vp.fptosi``' intrinsic converts its :ref:`floating-point
24331 <t_floating>` argument into the nearest (rounding towards zero) signed integer
24332 value where the lane position is below the explicit vector length and the
24333 vector mask is true. Masked-off lanes are ``poison``. On enabled lanes where
24334 conversion takes place and the value cannot fit in the return type, the result
24335 on that lane is a :ref:`poison value <poisonvalues>`.
24340 .. code-block:: llvm
24342 %r = call <4 x i32> @llvm.vp.fptosi.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24343 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24345 %t = fptosi <4 x float> %a to <4 x i32>
24346 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
24351 '``llvm.vp.uitofp.*``' Intrinsics
24352 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24356 This is an overloaded intrinsic.
24360 declare <16 x float> @llvm.vp.uitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
24361 declare <vscale x 4 x float> @llvm.vp.uitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24362 declare <256 x double> @llvm.vp.uitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
24367 The '``llvm.vp.uitofp``' intrinsic converts its unsigned integer argument to the
24368 :ref:`floating-point <t_floating>` return type. The operation has a mask and
24369 an explicit vector length parameter.
24375 The '``llvm.vp.uitofp``' intrinsic takes a value to cast as its first argument.
24376 The value to cast must be a vector of :ref:`integer <t_integer>` type. The
24377 return type is the type to cast the value to. The return type must be a vector
24378 of :ref:`floating-point <t_floating>` type. The second argument is the vector
24379 mask. The return type, the value to cast, and the vector mask have the same
24380 number of elements. The third argument is the explicit vector length of the operation.
24386 The '``llvm.vp.uitofp``' intrinsic interprets its first argument as an unsigned
24387 integer quantity and converts it to the corresponding floating-point value. If
24388 the value cannot be exactly represented, it is rounded using the default
24389 rounding mode. The conversion is performed on lane positions below the
24390 explicit vector length and where the vector mask is true. Masked-off lanes are ``poison``.
24396 .. code-block:: llvm
24398 %r = call <4 x float> @llvm.vp.uitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
24399 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24401 %t = uitofp <4 x i32> %a to <4 x float>
24402 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24407 '``llvm.vp.sitofp.*``' Intrinsics
24408 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24412 This is an overloaded intrinsic.
24416 declare <16 x float> @llvm.vp.sitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
24417 declare <vscale x 4 x float> @llvm.vp.sitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24418 declare <256 x double> @llvm.vp.sitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
24423 The '``llvm.vp.sitofp``' intrinsic converts its signed integer argument to the
24424 :ref:`floating-point <t_floating>` return type. The operation has a mask and
24425 an explicit vector length parameter.
24431 The '``llvm.vp.sitofp``' intrinsic takes a value to cast as its first argument.
24432 The value to cast must be a vector of :ref:`integer <t_integer>` type. The
24433 return type is the type to cast the value to. The return type must be a vector
24434 of :ref:`floating-point <t_floating>` type. The second argument is the vector
24435 mask. The return type, the value to cast, and the vector mask have the same
24436 number of elements. The third argument is the explicit vector length of the operation.
24442 The '``llvm.vp.sitofp``' intrinsic interprets its first argument as a signed
24443 integer quantity and converts it to the corresponding floating-point value. If
24444 the value cannot be exactly represented, it is rounded using the default
24445 rounding mode. The conversion is performed on lane positions below the
24446 explicit vector length and where the vector mask is true. Masked-off lanes are ``poison``.
24452 .. code-block:: llvm
24454 %r = call <4 x float> @llvm.vp.sitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
24455 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24457 %t = sitofp <4 x i32> %a to <4 x float>
24458 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
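
The unsigned and signed conversions differ only in how they interpret the same
bit pattern. A minimal sketch, with hypothetical function names and assumed
``i8`` inputs chosen so that every value is exactly representable as ``float``:

.. code-block:: llvm

      declare <4 x float> @llvm.vp.uitofp.v4f32.v4i8(<4 x i8>, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.sitofp.v4f32.v4i8(<4 x i8>, <4 x i1>, i32)

      define <4 x float> @bytes_as_unsigned(<4 x i8> %a, <4 x i1> %mask, i32 %evl) {
        %u = call <4 x float> @llvm.vp.uitofp.v4f32.v4i8(<4 x i8> %a, <4 x i1> %mask, i32 %evl)
        ret <4 x float> %u
      }

      define <4 x float> @bytes_as_signed(<4 x i8> %a, <4 x i1> %mask, i32 %evl) {
        %s = call <4 x float> @llvm.vp.sitofp.v4f32.v4i8(<4 x i8> %a, <4 x i1> %mask, i32 %evl)
        ret <4 x float> %s
      }

      ; Assumed inputs: %a = <i8 0, i8 127, i8 -128, i8 -1> (bit patterns 0x00, 0x7F, 0x80, 0xFF),
      ;                 an all-true %mask, and %evl = 4.
      ;   @bytes_as_unsigned returns <0.0, 127.0,  128.0, 255.0>
      ;   @bytes_as_signed   returns <0.0, 127.0, -128.0,  -1.0>
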
24461 .. _int_vp_ptrtoint:
24463 '``llvm.vp.ptrtoint.*``' Intrinsics
24464 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24468 This is an overloaded intrinsic.
24472 declare <16 x i8> @llvm.vp.ptrtoint.v16i8.v16p0(<16 x ptr> <op>, <16 x i1> <mask>, i32 <vector_length>)
24473 declare <vscale x 4 x i8> @llvm.vp.ptrtoint.nxv4i8.nxv4p0(<vscale x 4 x ptr> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24474 declare <256 x i64> @llvm.vp.ptrtoint.v256i64.v256p0(<256 x ptr> <op>, <256 x i1> <mask>, i32 <vector_length>)
24479 The '``llvm.vp.ptrtoint``' intrinsic converts its pointer argument to the integer return
24480 type. The operation has a mask and an explicit vector length parameter.
24486 The '``llvm.vp.ptrtoint``' intrinsic takes a value to cast as its first argument,
24487 which must be a vector of pointers. The return type is the type to cast the value to,
24488 which must be a vector of :ref:`integer <t_integer>` type.
24489 The second argument is the vector mask. The return type, the value to cast, and
24490 the vector mask have the same number of elements.
24491 The third argument is the explicit vector length of the operation.
24496 The '``llvm.vp.ptrtoint``' intrinsic converts ``value`` to the return type by
24497 interpreting the pointer value as an integer and either truncating or zero
24498 extending that value to the size of the integer type.
24499 If ``value`` is smaller than the return type, then a zero extension is done. If
24500 ``value`` is larger than the return type, then a truncation is done. If they are
24501 the same size, then nothing is done (*no-op cast*) other than a type change.
24503 The conversion is performed on lane positions below the explicit vector length
24504 and where the vector mask is true. Masked-off lanes are ``poison``.
24509 .. code-block:: llvm
24511 %r = call <4 x i8> @llvm.vp.ptrtoint.v4i8.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
24512 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24514 %t = ptrtoint <4 x ptr> %a to <4 x i8>
24515 %also.r = select <4 x i1> %mask, <4 x i8> %t, <4 x i8> poison
24518 .. _int_vp_inttoptr:
24520 '``llvm.vp.inttoptr.*``' Intrinsics
24521 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24525 This is an overloaded intrinsic.
24529 declare <16 x ptr> @llvm.vp.inttoptr.v16p0.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
24530 declare <vscale x 4 x ptr> @llvm.vp.inttoptr.nxv4p0.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24531 declare <256 x ptr> @llvm.vp.inttoptr.v256p0.v256i32 (<256 x i32> <op>, <256 x i1> <mask>, i32 <vector_length>)
24536 The '``llvm.vp.inttoptr``' intrinsic converts its integer value to the pointer
24537 return type. The operation has a mask and an explicit vector length parameter.
24543 The '``llvm.vp.inttoptr``' intrinsic takes a value to cast as its first argument,
24544 which must be a vector of :ref:`integer <t_integer>` type. The return type is the
24545 type to cast the value to, which must be a vector of pointers.
24546 The second argument is the vector mask. The return type, the value to cast, and
24547 the vector mask have the same number of elements.
24548 The third argument is the explicit vector length of the operation.
24553 The '``llvm.vp.inttoptr``' intrinsic converts ``value`` to return type by
24554 applying either a zero extension or a truncation depending on the size of the
24555 integer ``value``. If ``value`` is larger than the size of a pointer, then a
24556 truncation is done. If ``value`` is smaller than the size of a pointer, then a
24557 zero extension is done. If they are the same size, nothing is done (*no-op cast*).
24558 The conversion is performed on lane positions below the explicit vector length
24559 and where the vector mask is true. Masked-off lanes are ``poison``.
24564 .. code-block:: llvm
24566 %r = call <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
24567 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24569 %t = inttoptr <4 x i32> %a to <4 x ptr>
24570 %also.r = select <4 x i1> %mask, <4 x ptr> %t, <4 x ptr> poison
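
The two pointer casts can be combined to manipulate addresses as integers. The
sketch below rounds each enabled pointer down to a 16-byte boundary. It assumes
a target where pointers in address space 0 are 64 bits wide, so that both casts
are no-op casts, and it uses the '``llvm.vp.and``' intrinsic described elsewhere
in this manual. The function name ``@align_down_16`` is hypothetical.

.. code-block:: llvm

      declare <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr>, <4 x i1>, i32)
      declare <4 x i64> @llvm.vp.and.v4i64(<4 x i64>, <4 x i64>, <4 x i1>, i32)
      declare <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i64(<4 x i64>, <4 x i1>, i32)

      define <4 x ptr> @align_down_16(<4 x ptr> %p, <4 x i1> %mask, i32 %evl) {
        ; Reinterpret each enabled pointer as a 64-bit integer.
        %addr = call <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr> %p, <4 x i1> %mask, i32 %evl)
        ; Clear the low four bits (-16 is 0xFFFFFFFFFFFFFFF0).
        %low = call <4 x i64> @llvm.vp.and.v4i64(<4 x i64> %addr,
                                                 <4 x i64> <i64 -16, i64 -16, i64 -16, i64 -16>,
                                                 <4 x i1> %mask, i32 %evl)
        ; Convert the adjusted integers back to pointers.
        %q = call <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i64(<4 x i64> %low, <4 x i1> %mask, i32 %evl)
        ret <4 x ptr> %q
      }
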
24575 '``llvm.vp.fcmp.*``' Intrinsics
24576 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24580 This is an overloaded intrinsic.
24584 declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float> <left_op>, <16 x float> <right_op>, metadata <condition code>, <16 x i1> <mask>, i32 <vector_length>)
24585 declare <vscale x 4 x i1> @llvm.vp.fcmp.nxv4f32(<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, metadata <condition code>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24586 declare <256 x i1> @llvm.vp.fcmp.v256f64(<256 x double> <left_op>, <256 x double> <right_op>, metadata <condition code>, <256 x i1> <mask>, i32 <vector_length>)
24591 The '``llvm.vp.fcmp``' intrinsic returns a vector of boolean values based on
24592 the comparison of its arguments. The operation has a mask and an explicit vector length parameter.
24599 The '``llvm.vp.fcmp``' intrinsic takes the two values to compare as its first
24600 and second arguments. These two values must be vectors of :ref:`floating-point
24601 <t_floating>` types.
24602 The return type is the result of the comparison. The return type must be a
24603 vector of :ref:`i1 <t_integer>` type. The fourth argument is the vector mask.
24604 The return type, the values to compare, and the vector mask have the same
24605 number of elements. The third argument is the condition code indicating the kind
24606 of comparison to perform. It must be a metadata string with :ref:`one of the
24607 supported floating-point condition code values <fcmp_md_cc>`. The fifth argument
24608 is the explicit vector length of the operation.
24613 The '``llvm.vp.fcmp``' intrinsic compares its first two arguments according to the
24614 condition code given as the third argument. The arguments are compared element by
24615 element on each enabled lane, where the semantics of the comparison are
24616 defined :ref:`according to the condition code <fcmp_md_cc_sem>`. Masked-off
24617 lanes are ``poison``.
24622 .. code-block:: llvm
24624 %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %b, metadata !"oeq", <4 x i1> %mask, i32 %evl)
24625 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24627 %t = fcmp oeq <4 x float> %a, %b
24628 %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison
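
A comparison result is typically consumed by another predicated operation. The
following sketch replaces negative lanes with zero by feeding the result of
'``llvm.vp.fcmp``' into the '``llvm.vp.select``' intrinsic described elsewhere
in this manual. The function name ``@clamp_negative_to_zero`` is hypothetical.

.. code-block:: llvm

      declare <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float>, <4 x float>, metadata, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.select.v4f32(<4 x i1>, <4 x float>, <4 x float>, i32)

      define <4 x float> @clamp_negative_to_zero(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; "olt" is an ordered comparison, so NaN lanes compare false and are kept.
        %is.neg = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> zeroinitializer,
                                                    metadata !"olt", <4 x i1> %mask, i32 %evl)
        ; Pick 0.0 where the comparison was true and the original value otherwise.
        %r = call <4 x float> @llvm.vp.select.v4f32(<4 x i1> %is.neg, <4 x float> zeroinitializer,
                                                    <4 x float> %a, i32 %evl)
        ret <4 x float> %r
      }

Lanes that are masked off in the comparison carry a ``poison`` condition into
the select, so the corresponding result lanes are ``poison`` as well.
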
24633 '``llvm.vp.icmp.*``' Intrinsics
24634 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24638 This is an overloaded intrinsic.
24642 declare <32 x i1> @llvm.vp.icmp.v32i32(<32 x i32> <left_op>, <32 x i32> <right_op>, metadata <condition code>, <32 x i1> <mask>, i32 <vector_length>)
24643 declare <vscale x 2 x i1> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32> <left_op>, <vscale x 2 x i32> <right_op>, metadata <condition code>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
24644 declare <128 x i1> @llvm.vp.icmp.v128i8(<128 x i8> <left_op>, <128 x i8> <right_op>, metadata <condition code>, <128 x i1> <mask>, i32 <vector_length>)
24649 The '``llvm.vp.icmp``' intrinsic returns a vector of boolean values based on
24650 the comparison of its arguments. The operation has a mask and an explicit vector length parameter.
24657 The '``llvm.vp.icmp``' intrinsic takes the two values to compare as its first
24658 and second arguments. These two values must be vectors of :ref:`integer
24659 <t_integer>` types.
24660 The return type is the result of the comparison. The return type must be a
24661 vector of :ref:`i1 <t_integer>` type. The fourth argument is the vector mask.
24662 The return type, the values to compare, and the vector mask have the same
24663 number of elements. The third argument is the condition code indicating the kind
24664 of comparison to perform. It must be a metadata string with :ref:`one of the
24665 supported integer condition code values <icmp_md_cc>`. The fifth argument is the
24666 explicit vector length of the operation.
24671 The '``llvm.vp.icmp``' intrinsic compares its first two arguments according to the
24672 condition code given as the third argument. The arguments are compared element by
24673 element on each enabled lane, where the semantics of the comparison are
24674 defined :ref:`according to the condition code <icmp_md_cc_sem>`. Masked-off
24675 lanes are ``poison``.
24680 .. code-block:: llvm
24682 %r = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"ne", <4 x i1> %mask, i32 %evl)
24683 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24685 %t = icmp ne <4 x i32> %a, %b
24686 %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison
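
The choice of condition code determines whether the operands are treated as
signed or unsigned. The sketch below compares the same assumed ``i8`` inputs
with ``"ult"`` and ``"slt"``; the function names are hypothetical.

.. code-block:: llvm

      declare <4 x i1> @llvm.vp.icmp.v4i8(<4 x i8>, <4 x i8>, metadata, <4 x i1>, i32)

      define <4 x i1> @less_unsigned(<4 x i8> %a, <4 x i8> %b, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i1> @llvm.vp.icmp.v4i8(<4 x i8> %a, <4 x i8> %b, metadata !"ult", <4 x i1> %mask, i32 %evl)
        ret <4 x i1> %r
      }

      define <4 x i1> @less_signed(<4 x i8> %a, <4 x i8> %b, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x i1> @llvm.vp.icmp.v4i8(<4 x i8> %a, <4 x i8> %b, metadata !"slt", <4 x i1> %mask, i32 %evl)
        ret <4 x i1> %r
      }

      ; Assumed inputs: %a = <i8 -1, i8 1,  i8 -128, i8 5>
      ;                 %b = <i8 1,  i8 -1, i8 127,  i8 5>
      ;                 an all-true %mask, and %evl = 4.
      ;   @less_unsigned returns <false, true,  false, false>  ; -1 is 255 and -128 is 128 when unsigned
      ;   @less_signed   returns <true,  false, true,  false>
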
24690 '``llvm.vp.ceil.*``' Intrinsics
24691 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24695 This is an overloaded intrinsic.
24699 declare <16 x float> @llvm.vp.ceil.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24700 declare <vscale x 4 x float> @llvm.vp.ceil.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24701 declare <256 x double> @llvm.vp.ceil.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24706 Predicated floating-point ceiling of a vector of floating-point values.
24712 The first argument and the result have the same floating-point vector type.
24713 The second argument is the vector mask and has the same number of elements as the
24714 result vector type. The third argument is the explicit vector length of the operation.
24720 The '``llvm.vp.ceil``' intrinsic performs floating-point ceiling
24721 (:ref:`ceil <int_ceil>`) of the first vector argument on each enabled lane. The
24722 result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24727 .. code-block:: llvm
24729 %r = call <4 x float> @llvm.vp.ceil.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24730 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24732 %t = call <4 x float> @llvm.ceil.v4f32(<4 x float> %a)
24733 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24737 '``llvm.vp.floor.*``' Intrinsics
24738 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24742 This is an overloaded intrinsic.
24746 declare <16 x float> @llvm.vp.floor.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24747 declare <vscale x 4 x float> @llvm.vp.floor.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24748 declare <256 x double> @llvm.vp.floor.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24753 Predicated floating-point floor of a vector of floating-point values.
24759 The first argument and the result have the same floating-point vector type.
24760 The second argument is the vector mask and has the same number of elements as the
24761 result vector type. The third argument is the explicit vector length of the operation.
24767 The '``llvm.vp.floor``' intrinsic performs floating-point floor
24768 (:ref:`floor <int_floor>`) of the first vector argument on each enabled lane.
24769 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24774 .. code-block:: llvm
24776 %r = call <4 x float> @llvm.vp.floor.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24777 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24779 %t = call <4 x float> @llvm.floor.v4f32(<4 x float> %a)
24780 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24784 '``llvm.vp.rint.*``' Intrinsics
24785 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24789 This is an overloaded intrinsic.
24793 declare <16 x float> @llvm.vp.rint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24794 declare <vscale x 4 x float> @llvm.vp.rint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24795 declare <256 x double> @llvm.vp.rint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24800 Predicated floating-point rint of a vector of floating-point values.
24806 The first argument and the result have the same floating-point vector type.
24807 The second argument is the vector mask and has the same number of elements as the
24808 result vector type. The third argument is the explicit vector length of the operation.
24814 The '``llvm.vp.rint``' intrinsic performs floating-point rint
24815 (:ref:`rint <int_rint>`) of the first vector argument on each enabled lane.
24816 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24821 .. code-block:: llvm
24823 %r = call <4 x float> @llvm.vp.rint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24824 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24826 %t = call <4 x float> @llvm.rint.v4f32(<4 x float> %a)
24827 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24829 .. _int_vp_nearbyint:
24831 '``llvm.vp.nearbyint.*``' Intrinsics
24832 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24836 This is an overloaded intrinsic.
24840 declare <16 x float> @llvm.vp.nearbyint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24841 declare <vscale x 4 x float> @llvm.vp.nearbyint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24842 declare <256 x double> @llvm.vp.nearbyint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24847 Predicated floating-point nearbyint of a vector of floating-point values.
24853 The first argument and the result have the same floating-point vector type.
24854 The second argument is the vector mask and has the same number of elements as the
24855 result vector type. The third argument is the explicit vector length of the operation.
24861 The '``llvm.vp.nearbyint``' intrinsic performs floating-point nearbyint
24862 (:ref:`nearbyint <int_nearbyint>`) of the first vector argument on each enabled lane.
24863 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24868 .. code-block:: llvm
24870 %r = call <4 x float> @llvm.vp.nearbyint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24871 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24873 %t = call <4 x float> @llvm.nearbyint.v4f32(<4 x float> %a)
24874 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24878 '``llvm.vp.round.*``' Intrinsics
24879 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24883 This is an overloaded intrinsic.
24887 declare <16 x float> @llvm.vp.round.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24888 declare <vscale x 4 x float> @llvm.vp.round.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24889 declare <256 x double> @llvm.vp.round.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24894 Predicated floating-point round of a vector of floating-point values.
24900 The first argument and the result have the same floating-point vector type.
24901 The second argument is the vector mask and has the same number of elements as the
24902 result vector type. The third argument is the explicit vector length of the operation.
24908 The '``llvm.vp.round``' intrinsic performs floating-point round
24909 (:ref:`round <int_round>`) of the first vector argument on each enabled lane.
24910 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24915 .. code-block:: llvm
24917 %r = call <4 x float> @llvm.vp.round.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24918 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24920 %t = call <4 x float> @llvm.round.v4f32(<4 x float> %a)
24921 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24923 .. _int_vp_roundeven:
24925 '``llvm.vp.roundeven.*``' Intrinsics
24926 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24930 This is an overloaded intrinsic.
24934 declare <16 x float> @llvm.vp.roundeven.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24935 declare <vscale x 4 x float> @llvm.vp.roundeven.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24936 declare <256 x double> @llvm.vp.roundeven.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24941 Predicated floating-point roundeven of a vector of floating-point values.
24947 The first argument and the result have the same floating-point vector type.
24948 The second argument is the vector mask and has the same number of elements as the
24949 result vector type. The third argument is the explicit vector length of the operation.
24955 The '``llvm.vp.roundeven``' intrinsic performs floating-point roundeven
24956 (:ref:`roundeven <int_roundeven>`) of the first vector argument on each enabled
24957 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
24962 .. code-block:: llvm
24964 %r = call <4 x float> @llvm.vp.roundeven.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
24965 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
24967 %t = call <4 x float> @llvm.roundeven.v4f32(<4 x float> %a)
24968 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
24970 .. _int_vp_roundtozero:
24972 '``llvm.vp.roundtozero.*``' Intrinsics
24973 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24977 This is an overloaded intrinsic.
24981 declare <16 x float> @llvm.vp.roundtozero.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
24982 declare <vscale x 4 x float> @llvm.vp.roundtozero.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
24983 declare <256 x double> @llvm.vp.roundtozero.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
24988 Predicated floating-point round-to-zero of a vector of floating-point values.
24994 The first argument and the result have the same vector of floating-point type.
24995 The second argument is the vector mask and has the same number of elements as the
24996 result vector type. The third argument is the explicit vector length of the
The '``llvm.vp.roundtozero``' intrinsic performs floating-point round-to-zero
(:ref:`llvm.trunc <int_llvm_trunc>`) of the first vector argument on each enabled lane. The
result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25009 .. code-block:: llvm
25011 %r = call <4 x float> @llvm.vp.roundtozero.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25012 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25014 %t = call <4 x float> @llvm.trunc.v4f32(<4 x float> %a)
25015 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
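The lane-enabling rule used by this and the other ``llvm.vp.*`` intrinsics can
be modeled outside of LLVM. The following Python sketch is illustrative only and
is not part of any LLVM API: ``POISON``, ``vp_unary`` and ``vp_roundtozero`` are
hypothetical names, and ``POISON`` is merely a sentinel standing in for a poison
value. It assumes that a lane is enabled when its mask bit is set and its index
is below the explicit vector length.

.. code-block:: python

    import math

    POISON = object()  # hypothetical stand-in for an LLVM poison value

    def trunc_toward_zero(x):
        """Round a float toward zero, keeping NaN, infinities and the sign of zero."""
        if math.isnan(x) or math.isinf(x):
            return x
        return math.copysign(float(math.trunc(x)), x)

    def vp_unary(op, vec, mask, evl):
        """Apply `op` on enabled lanes; every disabled lane yields POISON."""
        return [op(x) if (m and i < evl) else POISON
                for i, (x, m) in enumerate(zip(vec, mask))]

    def vp_roundtozero(vec, mask, evl):
        """Model of the predicated round-to-zero described above."""
        return vp_unary(trunc_toward_zero, vec, mask, evl)

In this model, lane 2 below is disabled by the mask and lane 3 by the vector
length: ``vp_roundtozero([2.7, -2.7, 1.5, 9.9], [True, True, False, True], 3)``.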
25019 '``llvm.vp.lrint.*``' Intrinsics
25020 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25024 This is an overloaded intrinsic.
25028 declare <16 x i32> @llvm.vp.lrint.v16i32.v16f32(<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
25029 declare <vscale x 4 x i32> @llvm.vp.lrint.nxv4i32.nxv4f32(<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25030 declare <256 x i64> @llvm.vp.lrint.v256i64.v256f64(<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
25035 Predicated lrint of a vector of floating-point values.
25041 The result is an integer vector and the first argument is a vector of :ref:`floating-point <t_floating>`
25042 type with the same number of elements as the result vector type. The second
25043 argument is the vector mask and has the same number of elements as the result
25044 vector type. The third argument is the explicit vector length of the operation.
25049 The '``llvm.vp.lrint``' intrinsic performs lrint (:ref:`lrint <int_lrint>`) of
25050 the first vector argument on each enabled lane. The result on disabled lanes is a
25051 :ref:`poison value <poisonvalues>`.
25056 .. code-block:: llvm
25058 %r = call <4 x i32> @llvm.vp.lrint.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25059 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
%t = call <4 x i32> @llvm.lrint.v4i32.v4f32(<4 x float> %a)
25062 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25066 '``llvm.vp.llrint.*``' Intrinsics
25067 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25071 This is an overloaded intrinsic.
25075 declare <16 x i32> @llvm.vp.llrint.v16i32.v16f32(<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
25076 declare <vscale x 4 x i32> @llvm.vp.llrint.nxv4i32.nxv4f32(<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25077 declare <256 x i64> @llvm.vp.llrint.v256i64.v256f64(<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
25082 Predicated llrint of a vector of floating-point values.
25087 The result is an integer vector and the first argument is a vector of :ref:`floating-point <t_floating>`
25088 type with the same number of elements as the result vector type. The second
25089 argument is the vector mask and has the same number of elements as the result
25090 vector type. The third argument is the explicit vector length of the operation.
The '``llvm.vp.llrint``' intrinsic performs llrint (:ref:`llrint <int_llrint>`) of
25096 the first vector argument on each enabled lane. The result on disabled lanes is a
25097 :ref:`poison value <poisonvalues>`.
25102 .. code-block:: llvm
25104 %r = call <4 x i32> @llvm.vp.llrint.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
25105 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
%t = call <4 x i32> @llvm.llrint.v4i32.v4f32(<4 x float> %a)
25108 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25111 .. _int_vp_bitreverse:
25113 '``llvm.vp.bitreverse.*``' Intrinsics
25114 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25118 This is an overloaded intrinsic.
25122 declare <16 x i32> @llvm.vp.bitreverse.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
25123 declare <vscale x 4 x i32> @llvm.vp.bitreverse.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25124 declare <256 x i64> @llvm.vp.bitreverse.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
25129 Predicated bitreverse of a vector of integers.
25135 The first argument and the result have the same vector of integer type. The
25136 second argument is the vector mask and has the same number of elements as the
result vector type. The third argument is the explicit vector length of the operation.
25143 The '``llvm.vp.bitreverse``' intrinsic performs bitreverse (:ref:`bitreverse <int_bitreverse>`) of the first argument on each
25144 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25149 .. code-block:: llvm
25151 %r = call <4 x i32> @llvm.vp.bitreverse.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
25152 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25154 %t = call <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> %a)
25155 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
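As a cross-check of the per-lane behavior, the sketch below models a predicated
bit reversal in Python. It is a hedged illustration, not LLVM code: the names
``POISON``, ``bitreverse`` and ``vp_bitreverse`` are hypothetical, lanes are
assumed to hold unsigned integers of ``bits`` width, and a lane is assumed to be
enabled when its mask bit is set and its index is below the explicit vector
length.

.. code-block:: python

    POISON = object()  # hypothetical stand-in for an LLVM poison value

    def bitreverse(x, bits=32):
        """Reverse the bit order of an unsigned integer that is `bits` wide."""
        result = 0
        for _ in range(bits):
            result = (result << 1) | (x & 1)  # move the lowest bit of x to the output
            x >>= 1
        return result

    def vp_bitreverse(vec, mask, evl, bits=32):
        """Reverse bits on enabled lanes; disabled lanes yield POISON."""
        return [bitreverse(x, bits) if (m and i < evl) else POISON
                for i, (x, m) in enumerate(zip(vec, mask))]

For example, ``bitreverse(0b1011, bits=4)`` evaluates to ``0b1101`` in this model.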
25160 '``llvm.vp.bswap.*``' Intrinsics
25161 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25165 This is an overloaded intrinsic.
25169 declare <16 x i32> @llvm.vp.bswap.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
25170 declare <vscale x 4 x i32> @llvm.vp.bswap.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25171 declare <256 x i64> @llvm.vp.bswap.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
25176 Predicated bswap of a vector of integers.
25182 The first argument and the result have the same vector of integer type. The
25183 second argument is the vector mask and has the same number of elements as the
result vector type. The third argument is the explicit vector length of the operation.
25190 The '``llvm.vp.bswap``' intrinsic performs bswap (:ref:`bswap <int_bswap>`) of the first argument on each
25191 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25196 .. code-block:: llvm
25198 %r = call <4 x i32> @llvm.vp.bswap.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
25199 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25201 %t = call <4 x i32> @llvm.bswap.v4i32(<4 x i32> %a)
25202 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25207 '``llvm.vp.ctpop.*``' Intrinsics
25208 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25212 This is an overloaded intrinsic.
25216 declare <16 x i32> @llvm.vp.ctpop.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
25217 declare <vscale x 4 x i32> @llvm.vp.ctpop.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25218 declare <256 x i64> @llvm.vp.ctpop.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
25223 Predicated ctpop of a vector of integers.
25229 The first argument and the result have the same vector of integer type. The
25230 second argument is the vector mask and has the same number of elements as the
result vector type. The third argument is the explicit vector length of the operation.
25237 The '``llvm.vp.ctpop``' intrinsic performs ctpop (:ref:`ctpop <int_ctpop>`) of the first argument on each
25238 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25243 .. code-block:: llvm
25245 %r = call <4 x i32> @llvm.vp.ctpop.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
25246 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25248 %t = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> %a)
25249 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25254 '``llvm.vp.ctlz.*``' Intrinsics
25255 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25259 This is an overloaded intrinsic.
25263 declare <16 x i32> @llvm.vp.ctlz.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
25264 declare <vscale x 4 x i32> @llvm.vp.ctlz.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
25265 declare <256 x i64> @llvm.vp.ctlz.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
25270 Predicated ctlz of a vector of integers.
25276 The first argument and the result have the same vector of integer type. The
25277 second argument is the vector mask and has the same number of elements as the
25278 result vector type. The third argument is the explicit vector length of the
25279 operation. The fourth argument is a constant flag that indicates whether the
25280 intrinsic returns a valid result if the first argument is zero. If the first
25281 argument is zero and the fourth argument is true, the result is poison.
25286 The '``llvm.vp.ctlz``' intrinsic performs ctlz (:ref:`ctlz <int_ctlz>`) of the first argument on each
25287 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25292 .. code-block:: llvm
25294 %r = call <4 x i32> @llvm.vp.ctlz.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
25295 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25297 %t = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %a, i1 false)
25298 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
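The interaction between the mask, the explicit vector length and the
``is_zero_poison`` flag can be summarized with a small Python model. This is a
sketch under stated assumptions rather than a definitive implementation:
``POISON``, ``ctlz`` and ``vp_ctlz`` are hypothetical names, lanes hold unsigned
integers of ``bits`` width, and a zero input with the flag cleared is assumed to
yield the bit width, as for the non-predicated ``llvm.ctlz``.

.. code-block:: python

    POISON = object()  # hypothetical stand-in for an LLVM poison value

    def ctlz(x, bits=32, is_zero_poison=False):
        """Count leading zero bits of an unsigned integer that is `bits` wide."""
        if x == 0:
            # Zero input: poison if requested, otherwise the full bit width.
            return POISON if is_zero_poison else bits
        return bits - x.bit_length()

    def vp_ctlz(vec, mask, evl, is_zero_poison=False, bits=32):
        """Apply ctlz on enabled lanes; disabled lanes yield POISON."""
        return [ctlz(x, bits, is_zero_poison) if (m and i < evl) else POISON
                for i, (x, m) in enumerate(zip(vec, mask))]

Note that a lane can be poison for two independent reasons in this model: it is
disabled, or it is enabled but holds zero while ``is_zero_poison`` is true.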
25303 '``llvm.vp.cttz.*``' Intrinsics
25304 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25308 This is an overloaded intrinsic.
25312 declare <16 x i32> @llvm.vp.cttz.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
25313 declare <vscale x 4 x i32> @llvm.vp.cttz.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
25314 declare <256 x i64> @llvm.vp.cttz.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
25319 Predicated cttz of a vector of integers.
25325 The first argument and the result have the same vector of integer type. The
25326 second argument is the vector mask and has the same number of elements as the
25327 result vector type. The third argument is the explicit vector length of the
25328 operation. The fourth argument is a constant flag that indicates whether the
25329 intrinsic returns a valid result if the first argument is zero. If the first
25330 argument is zero and the fourth argument is true, the result is poison.
25335 The '``llvm.vp.cttz``' intrinsic performs cttz (:ref:`cttz <int_cttz>`) of the first argument on each
25336 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25341 .. code-block:: llvm
25343 %r = call <4 x i32> @llvm.vp.cttz.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
25344 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25346 %t = call <4 x i32> @llvm.cttz.v4i32(<4 x i32> %a, i1 false)
25347 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25350 .. _int_vp_cttz_elts:
25352 '``llvm.vp.cttz.elts.*``' Intrinsics
25353 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.vp.cttz.elts`` on any
25358 vector of integer elements, both fixed width and scalable.
25362 declare i32 @llvm.vp.cttz.elts.i32.v16i32 (<16 x i32> <op>, i1 <is_zero_poison>, <16 x i1> <mask>, i32 <vector_length>)
25363 declare i64 @llvm.vp.cttz.elts.i64.nxv4i32 (<vscale x 4 x i32> <op>, i1 <is_zero_poison>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25364 declare i64 @llvm.vp.cttz.elts.i64.v256i1 (<256 x i1> <op>, i1 <is_zero_poison>, <256 x i1> <mask>, i32 <vector_length>)
The '``llvm.vp.cttz.elts``' intrinsic counts the number of trailing zero
elements of a vector. It is the vector-predicated version of
'``llvm.experimental.cttz.elts``'.
25376 The first argument is the vector to be counted. This argument must be a vector
25377 with integer element type. The return type must also be an integer type which is
25378 wide enough to hold the maximum number of elements of the source vector. The
25379 behavior of this intrinsic is undefined if the return type is not wide enough
25380 for the number of elements in the input vector.
25382 The second argument is a constant flag that indicates whether the intrinsic
25383 returns a valid result if the first argument is all zero.
25385 The third argument is the vector mask and has the same number of elements as the
input vector type. The fourth argument is the explicit vector length of the operation.
The '``llvm.vp.cttz.elts``' intrinsic counts the trailing (least
significant / lowest-numbered) zero elements in the first argument on each
enabled lane. If the first argument is all zero, the result is poison when the
second argument is true; otherwise, it is the explicit vector length (i.e. the
fourth argument).
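Because this intrinsic returns a scalar, a short Python model may help to make
the counting rule concrete. The sketch is illustrative only: ``POISON`` and
``vp_cttz_elts`` are hypothetical names, and the treatment of masked-off lanes
(they are skipped, as if they held zero) is an assumption of this sketch that
the text above does not spell out.

.. code-block:: python

    POISON = object()  # hypothetical stand-in for an LLVM poison value

    def vp_cttz_elts(vec, is_zero_poison, mask, evl):
        """Return the index of the first enabled non-zero element below `evl`.

        ASSUMPTION: masked-off lanes are skipped as if they were zero.
        """
        for i in range(min(evl, len(vec))):
            if mask[i] and vec[i] != 0:
                return i
        # No enabled non-zero element was found below evl.
        return POISON if is_zero_poison else evl

With all lanes enabled, ``vp_cttz_elts([0, 0, 5, 0], False, [True] * 4, 4)``
counts two trailing zero elements in this model.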
25398 .. _int_vp_sadd_sat:
25400 '``llvm.vp.sadd.sat.*``' Intrinsics
25401 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25405 This is an overloaded intrinsic.
declare <16 x i32> @llvm.vp.sadd.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25410 declare <vscale x 4 x i32> @llvm.vp.sadd.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25411 declare <256 x i64> @llvm.vp.sadd.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25416 Predicated signed saturating addition of two vectors of integers.
25422 The first two arguments and the result have the same vector of integer type. The
25423 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the operation.
25430 The '``llvm.vp.sadd.sat``' intrinsic performs sadd.sat (:ref:`sadd.sat <int_sadd_sat>`)
25431 of the first and second vector arguments on each enabled lane. The result on
25432 disabled lanes is a :ref:`poison value <poisonvalues>`.
25438 .. code-block:: llvm
25440 %r = call <4 x i32> @llvm.vp.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
25441 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25443 %t = call <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
25444 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
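Signed saturation clamps the mathematically exact sum to the representable
range of the element type. The following Python sketch models this per lane; it
is not LLVM code, and ``POISON``, ``sadd_sat`` and ``vp_sadd_sat`` are
hypothetical names. It assumes two's-complement elements of ``bits`` width and
the usual rule that a lane is enabled when its mask bit is set and its index is
below the explicit vector length.

.. code-block:: python

    POISON = object()  # hypothetical stand-in for an LLVM poison value

    def sadd_sat(a, b, bits=32):
        """Signed addition clamped to the range of a `bits`-wide integer."""
        lo = -(1 << (bits - 1))
        hi = (1 << (bits - 1)) - 1
        return max(lo, min(hi, a + b))

    def vp_sadd_sat(lhs, rhs, mask, evl, bits=32):
        """Saturating add on enabled lanes; disabled lanes yield POISON."""
        return [sadd_sat(a, b, bits) if (m and i < evl) else POISON
                for i, (a, b, m) in enumerate(zip(lhs, rhs, mask))]

Python integers do not overflow, so the exact sum is available before clamping;
a fixed-width implementation has to detect the overflow instead.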
25447 .. _int_vp_uadd_sat:
25449 '``llvm.vp.uadd.sat.*``' Intrinsics
25450 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25454 This is an overloaded intrinsic.
declare <16 x i32> @llvm.vp.uadd.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25459 declare <vscale x 4 x i32> @llvm.vp.uadd.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25460 declare <256 x i64> @llvm.vp.uadd.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25465 Predicated unsigned saturating addition of two vectors of integers.
25471 The first two arguments and the result have the same vector of integer type. The
25472 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the operation.
25479 The '``llvm.vp.uadd.sat``' intrinsic performs uadd.sat (:ref:`uadd.sat <int_uadd_sat>`)
25480 of the first and second vector arguments on each enabled lane. The result on
25481 disabled lanes is a :ref:`poison value <poisonvalues>`.
25487 .. code-block:: llvm
25489 %r = call <4 x i32> @llvm.vp.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
25490 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25492 %t = call <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
25493 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25496 .. _int_vp_ssub_sat:
25498 '``llvm.vp.ssub.sat.*``' Intrinsics
25499 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25503 This is an overloaded intrinsic.
declare <16 x i32> @llvm.vp.ssub.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25508 declare <vscale x 4 x i32> @llvm.vp.ssub.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25509 declare <256 x i64> @llvm.vp.ssub.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25514 Predicated signed saturating subtraction of two vectors of integers.
25520 The first two arguments and the result have the same vector of integer type. The
25521 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the operation.
25528 The '``llvm.vp.ssub.sat``' intrinsic performs ssub.sat (:ref:`ssub.sat <int_ssub_sat>`)
25529 of the first and second vector arguments on each enabled lane. The result on
25530 disabled lanes is a :ref:`poison value <poisonvalues>`.
25536 .. code-block:: llvm
25538 %r = call <4 x i32> @llvm.vp.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
25539 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25541 %t = call <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
25542 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25545 .. _int_vp_usub_sat:
25547 '``llvm.vp.usub.sat.*``' Intrinsics
25548 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25552 This is an overloaded intrinsic.
declare <16 x i32> @llvm.vp.usub.sat.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25557 declare <vscale x 4 x i32> @llvm.vp.usub.sat.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25558 declare <256 x i64> @llvm.vp.usub.sat.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25563 Predicated unsigned saturating subtraction of two vectors of integers.
25569 The first two arguments and the result have the same vector of integer type. The
25570 third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the operation.
25577 The '``llvm.vp.usub.sat``' intrinsic performs usub.sat (:ref:`usub.sat <int_usub_sat>`)
25578 of the first and second vector arguments on each enabled lane. The result on
25579 disabled lanes is a :ref:`poison value <poisonvalues>`.
25585 .. code-block:: llvm
25587 %r = call <4 x i32> @llvm.vp.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
25588 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25590 %t = call <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
25591 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
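Unsigned saturating subtraction never wraps around: a difference that would be
negative is clamped to zero. The Python sketch below models this per lane. It is
a hedged illustration with hypothetical names (``POISON``, ``usub_sat``,
``vp_usub_sat``) and assumes that all inputs are non-negative integers that fit
the element type.

.. code-block:: python

    POISON = object()  # hypothetical stand-in for an LLVM poison value

    def usub_sat(a, b):
        """Unsigned subtraction clamped at zero (inputs assumed non-negative)."""
        return max(a - b, 0)

    def vp_usub_sat(lhs, rhs, mask, evl):
        """Saturating subtract on enabled lanes; disabled lanes yield POISON."""
        return [usub_sat(a, b) if (m and i < evl) else POISON
                for i, (a, b, m) in enumerate(zip(lhs, rhs, mask))]
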
25596 '``llvm.vp.fshl.*``' Intrinsics
25597 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25601 This is an overloaded intrinsic.
25605 declare <16 x i32> @llvm.vp.fshl.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25606 declare <vscale x 4 x i32> @llvm.vp.fshl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25607 declare <256 x i64> @llvm.vp.fshl.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25612 Predicated fshl of three vectors of integers.
25618 The first three arguments and the result have the same vector of integer type. The
25619 fourth argument is the vector mask and has the same number of elements as the
result vector type. The fifth argument is the explicit vector length of the operation.
The '``llvm.vp.fshl``' intrinsic performs fshl (:ref:`fshl <int_fshl>`) of the first, second, and third
vector arguments on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25633 .. code-block:: llvm
25635 %r = call <4 x i32> @llvm.vp.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
25636 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25638 %t = call <4 x i32> @llvm.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
25639 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
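A funnel shift left concatenates the first two operands, shifts the
double-width value left by the third operand (taken modulo the element width)
and keeps the most significant half. The Python sketch below models that
behavior per lane. It is illustrative only: ``POISON``, ``fshl`` and ``vp_fshl``
are hypothetical names, and elements are assumed to be unsigned integers of
``bits`` width.

.. code-block:: python

    POISON = object()  # hypothetical stand-in for an LLVM poison value

    def fshl(a, b, c, bits=32):
        """Funnel shift left: high half of ((a:b) << (c mod bits))."""
        width_mask = (1 << bits) - 1
        shift = c % bits
        concat = ((a & width_mask) << bits) | (b & width_mask)
        return ((concat << shift) >> bits) & width_mask

    def vp_fshl(a, b, c, mask, evl, bits=32):
        """Funnel shift on enabled lanes; disabled lanes yield POISON."""
        return [fshl(x, y, z, bits) if (m and i < evl) else POISON
                for i, (x, y, z, m) in enumerate(zip(a, b, c, mask))]

Passing the same value as the first two operands turns the funnel shift into a
rotate left in this model.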
25642 '``llvm.vp.fshr.*``' Intrinsics
25643 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25647 This is an overloaded intrinsic.
25651 declare <16 x i32> @llvm.vp.fshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
25652 declare <vscale x 4 x i32> @llvm.vp.fshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
25653 declare <256 x i64> @llvm.vp.fshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
25658 Predicated fshr of three vectors of integers.
25664 The first three arguments and the result have the same vector of integer type. The
25665 fourth argument is the vector mask and has the same number of elements as the
result vector type. The fifth argument is the explicit vector length of the operation.
The '``llvm.vp.fshr``' intrinsic performs fshr (:ref:`fshr <int_fshr>`) of the first, second, and third
vector arguments on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
25679 .. code-block:: llvm
25681 %r = call <4 x i32> @llvm.vp.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
25682 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
25684 %t = call <4 x i32> @llvm.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
25685 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
25687 '``llvm.vp.is.fpclass.*``' Intrinsics
25688 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25692 This is an overloaded intrinsic.
25696 declare <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f32(<vscale x 2 x float> <op>, i32 <test>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
25697 declare <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> <op>, i32 <test>, <2 x i1> <mask>, i32 <vector_length>)
Predicated version of :ref:`llvm.is.fpclass <llvm.is.fpclass>`.
The first argument is a floating-point vector, and the result type is a vector of
booleans with the same number of elements as the first argument. The second
argument specifies which tests to perform (see :ref:`llvm.is.fpclass <llvm.is.fpclass>`).
The third argument is the vector mask and has the same number of elements as the
result vector type. The fourth argument is the explicit vector length of the
operation.
25717 The '``llvm.vp.is.fpclass``' intrinsic performs llvm.is.fpclass (:ref:`llvm.is.fpclass <llvm.is.fpclass>`).
25723 .. code-block:: llvm
25725 %r = call <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> %x, i32 3, <2 x i1> %m, i32 %evl)
%t = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f16(<vscale x 2 x half> %y, i32 3, <vscale x 2 x i1> %n, i32 %evl)
25728 .. _int_mload_mstore:
25730 Masked Vector Load and Store Intrinsics
25731 ---------------------------------------
25733 LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask argument, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
25737 '``llvm.masked.load.*``' Intrinsics
25738 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25742 This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
25746 declare <16 x float> @llvm.masked.load.v16f32.p0(ptr <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
25747 declare <2 x double> @llvm.masked.load.v2f64.p0(ptr <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
25748 ;; The data is a vector of pointers
25749 declare <8 x ptr> @llvm.masked.load.v8p0.p0(ptr <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)
25754 Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' argument.
25760 The first argument is the base pointer for the load. The second argument is the alignment of the source location. It must be a power of two constant integer value. The third argument, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' argument are the same vector types.
25765 The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
25766 The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask, except that the masked-off lanes are not accessed.
25767 Only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
25768 In particular, using this intrinsic prevents exceptions on memory accesses to masked-off lanes.
25769 Masked-off lanes are also not considered accessed for the purpose of data races or ``noalias`` constraints.
%res = call <16 x float> @llvm.masked.load.v16f32.p0(ptr %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)
;; The result of the following two instructions is identical to %res, aside from potential memory access exceptions
%loadval = load <16 x float>, ptr %ptr, align 4
%also.res = select <16 x i1> %mask, <16 x float> %loadval, <16 x float> %passthru
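The difference from the plain load-and-select sequence is that masked-off lanes
never touch memory. The Python sketch below makes this visible by modeling
memory as a list and the base pointer as an index; both are assumptions of this
illustration, and ``masked_load`` is a hypothetical name, not an LLVM API.

.. code-block:: python

    def masked_load(memory, base, mask, passthru):
        """Read memory[base + i] for masked-on lanes; copy passthru otherwise.

        Masked-off lanes never index `memory`, so they cannot fault.
        """
        return [memory[base + i] if m else p
                for i, (m, p) in enumerate(zip(mask, passthru))]

    memory = [10, 20, 30]
    # Lanes 2 and 3 would read past the end of `memory`, but they are masked off.
    result = masked_load(memory, 1, [True, True, False, False], [0, 0, -1, -2])

An unconditional four-element read starting at index 1 would fail here, which
mirrors the memory access exception that the intrinsic avoids.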
25782 '``llvm.masked.store.*``' Intrinsics
25783 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25787 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
25791 declare void @llvm.masked.store.v8i32.p0 (<8 x i32> <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
25792 declare void @llvm.masked.store.v16f32.p0(<16 x float> <value>, ptr <ptr>, i32 <alignment>, <16 x i1> <mask>)
25793 ;; The data is a vector of pointers
25794 declare void @llvm.masked.store.v8p0.p0 (<8 x ptr> <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
25799 Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
The first argument is the vector value to be written to memory. The second argument is the base pointer for the store; it has the same underlying type as the value argument. The third argument is the alignment of the destination location. It must be a power of two constant integer value. The fourth argument, mask, is a vector of boolean values. The types of the mask and the value argument must have the same number of vector elements.
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
25811 The result of this operation is equivalent to a load-modify-store sequence, except that the masked-off lanes are not accessed.
25812 Only the masked-on lanes of the vector need to be inbounds of an allocation (but all these lanes need to be inbounds of the same allocation).
25813 In particular, using this intrinsic prevents exceptions on memory accesses to masked-off lanes.
25814 Masked-off lanes are also not considered accessed for the purpose of data races or ``noalias`` constraints.
25818 call void @llvm.masked.store.v16f32.p0(<16 x float> %value, ptr %ptr, i32 4, <16 x i1> %mask)
25820 ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
25821 %oldval = load <16 x float>, ptr %ptr, align 4
25822 %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
25823 store <16 x float> %res, ptr %ptr, align 4
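The store counterpart can be modeled in the same way. In the Python sketch
below, memory is a list and the base pointer is an index; both are assumptions
of this illustration, and ``masked_store`` is a hypothetical name. Unlike the
load-modify-store sequence, the model neither reads nor writes the locations of
masked-off lanes.

.. code-block:: python

    def masked_store(memory, base, value, mask):
        """Write value[i] to memory[base + i] for masked-on lanes only."""
        for i, (v, m) in enumerate(zip(value, mask)):
            if m:
                memory[base + i] = v

    memory = [0, 0, 0]
    # Lanes 1 to 3 are masked off; lanes 2 and 3 would fall outside `memory`.
    masked_store(memory, 1, [7, 8, 9, 10], [True, False, False, False])
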
25826 Masked Vector Gather and Scatter Intrinsics
25827 -------------------------------------------
25829 LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask argument, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
25833 '``llvm.masked.gather.*``' Intrinsics
25834 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25838 This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
25842 declare <16 x float> @llvm.masked.gather.v16f32.v16p0(<16 x ptr> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
25843 declare <2 x double> @llvm.masked.gather.v2f64.v2p1(<2 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
25844 declare <8 x ptr> @llvm.masked.gather.v8p0.v8p0(<8 x ptr> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)
25849 Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' argument.
25855 The first argument is a vector of pointers which holds all memory addresses to read. The second argument is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third argument, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' argument are the same vector types.
25860 The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
25866 %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0(<4 x ptr> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> poison)
25868 ;; The gather with all-true mask is equivalent to the following instruction sequence
25869 %ptr0 = extractelement <4 x ptr> %ptrs, i32 0
25870 %ptr1 = extractelement <4 x ptr> %ptrs, i32 1
25871 %ptr2 = extractelement <4 x ptr> %ptrs, i32 2
25872 %ptr3 = extractelement <4 x ptr> %ptrs, i32 3
25874 %val0 = load double, ptr %ptr0, align 8
25875 %val1 = load double, ptr %ptr1, align 8
25876 %val2 = load double, ptr %ptr2, align 8
25877 %val3 = load double, ptr %ptr3, align 8
%vec0 = insertelement <4 x double> poison, double %val0, i32 0
%vec01 = insertelement <4 x double> %vec0, double %val1, i32 1
%vec012 = insertelement <4 x double> %vec01, double %val2, i32 2
%vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
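
As a further illustration, the following sketch shows a gather whose mask is only partially set, assuming the same ``v4f64.v4p0`` overload used above. The function names ``@gather_lanes_0_and_2`` and ``@gather_lanes_0_and_2_scalar`` are hypothetical and exist only for this example. Lanes 1 and 3 are masked off, so no memory is accessed for them and they take their values from ``%passthru``.

.. code-block:: llvm

      declare <4 x double> @llvm.masked.gather.v4f64.v4p0(<4 x ptr>, i32, <4 x i1>, <4 x double>)

      ; Gather lanes 0 and 2 only; lanes 1 and 3 come from %passthru.
      define <4 x double> @gather_lanes_0_and_2(<4 x ptr> %ptrs, <4 x double> %passthru) {
        %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0(<4 x ptr> %ptrs, i32 8, <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %passthru)
        ret <4 x double> %res
      }

      ; One possible scalar expansion of the call above (illustrative only).
      define <4 x double> @gather_lanes_0_and_2_scalar(<4 x ptr> %ptrs, <4 x double> %passthru) {
        %ptr0 = extractelement <4 x ptr> %ptrs, i32 0
        %ptr2 = extractelement <4 x ptr> %ptrs, i32 2
        %val0 = load double, ptr %ptr0, align 8
        %val2 = load double, ptr %ptr2, align 8
        ; Start from %passthru so the masked-off lanes keep their values.
        %vec0 = insertelement <4 x double> %passthru, double %val0, i32 0
        %vec02 = insertelement <4 x double> %vec0, double %val2, i32 2
        ret <4 x double> %vec02
      }
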
25886 '``llvm.masked.scatter.*``' Intrinsics
25887 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25891 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
25895 declare void @llvm.masked.scatter.v8i32.v8p0 (<8 x i32> <value>, <8 x ptr> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
25896 declare void @llvm.masked.scatter.v16f32.v16p1(<16 x float> <value>, <16 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
25897 declare void @llvm.masked.scatter.v4p0.v4p0 (<4 x ptr> <value>, <4 x ptr> <ptrs>, i32 <alignment>, <4 x i1> <mask>)
25902 Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
25907 The first argument is a vector value to be written to memory. The second argument is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value argument. The third argument is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth argument, mask, is a vector of boolean values. The types of the mask and the value argument must have the same number of vector elements.
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
25916 ;; This instruction unconditionally stores data vector in multiple addresses
call void @llvm.masked.scatter.v8i32.v8p0(<8 x i32> %value, <8 x ptr> %ptrs, i32 4, <8 x i1> <true, true, .. true>)
25919 ;; It is equivalent to a list of scalar stores
25920 %val0 = extractelement <8 x i32> %value, i32 0
25921 %val1 = extractelement <8 x i32> %value, i32 1
25923 %val7 = extractelement <8 x i32> %value, i32 7
25924 %ptr0 = extractelement <8 x ptr> %ptrs, i32 0
25925 %ptr1 = extractelement <8 x ptr> %ptrs, i32 1
25927 %ptr7 = extractelement <8 x ptr> %ptrs, i32 7
25928 ;; Note: the order of the following stores is important when they overlap:
25929 store i32 %val0, ptr %ptr0, align 4
25930 store i32 %val1, ptr %ptr1, align 4
25932 store i32 %val7, ptr %ptr7, align 4
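
For comparison, here is a minimal sketch of a scatter with a partially set mask, assuming a four-lane ``v4i32.v4p0`` overload. The function names are hypothetical. Only lanes 0 and 3 are active, so only two stores are performed, still ordered from the lowest lane to the highest.

.. code-block:: llvm

      declare void @llvm.masked.scatter.v4i32.v4p0(<4 x i32>, <4 x ptr>, i32, <4 x i1>)

      ; Store lanes 0 and 3 only; lanes 1 and 2 do not touch memory.
      define void @scatter_lanes_0_and_3(<4 x i32> %value, <4 x ptr> %ptrs) {
        call void @llvm.masked.scatter.v4i32.v4p0(<4 x i32> %value, <4 x ptr> %ptrs, i32 4, <4 x i1> <i1 true, i1 false, i1 false, i1 true>)
        ret void
      }

      ; One possible scalar expansion of the call above (illustrative only).
      define void @scatter_lanes_0_and_3_scalar(<4 x i32> %value, <4 x ptr> %ptrs) {
        %val0 = extractelement <4 x i32> %value, i32 0
        %val3 = extractelement <4 x i32> %value, i32 3
        %ptr0 = extractelement <4 x ptr> %ptrs, i32 0
        %ptr3 = extractelement <4 x ptr> %ptrs, i32 3
        ; Lane 0 is stored before lane 3, which matters if the addresses overlap.
        store i32 %val0, ptr %ptr0, align 4
        store i32 %val3, ptr %ptr3, align 4
        ret void
      }
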
25935 Masked Vector Expanding Load and Compressing Store Intrinsics
25936 -------------------------------------------------------------
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
25940 .. _int_expandload:
25942 '``llvm.masked.expandload.*``' Intrinsics
25943 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25947 This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
25951 declare <16 x float> @llvm.masked.expandload.v16f32 (ptr <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
25952 declare <2 x i64> @llvm.masked.expandload.v2i64 (ptr <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)
Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7, respectively. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' argument.
25963 The first argument is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second argument, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' argument have the same vector type.
25965 The :ref:`align <attr_align>` parameter attribute can be provided for the first
25966 argument. The pointer alignment defaults to 1.
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies like in the following example:
25975 // In this loop we load from B and spread the elements into array A.
double *A, *B; int *C;
25977 for (int i = 0; i < size; ++i) {
25983 .. code-block:: llvm
25985 ; Load several elements from array B and expand them in a vector.
25986 ; The number of loaded elements is equal to the number of '1' elements in the Mask.
25987 %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(ptr %Bptr, <8 x i1> %Mask, <8 x double> poison)
25988 ; Store the result in A
25989 call void @llvm.masked.store.v8f64.p0(<8 x double> %Tmp, ptr %Aptr, i32 8, <8 x i1> %Mask)
25991 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
25992 %MaskI = bitcast <8 x i1> %Mask to i8
25993 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
25994 %MaskI64 = zext i8 %MaskIPopcnt to i64
25995 %BNextInd = add i64 %BInd, %MaskI64
25998 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
25999 If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
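
The '10010001' mask described in the overview can be spelled out as a sketch. It assumes an ``<8 x i32>`` overload and the default pointer alignment of 1; the function names are hypothetical. Three consecutive elements are read from memory and placed in lanes 0, 3 and 7, while the remaining lanes come from ``%passthru``.

.. code-block:: llvm

      declare <8 x i32> @llvm.masked.expandload.v8i32(ptr, <8 x i1>, <8 x i32>)

      ; Mask '10010001': lanes 0, 3 and 7 are active.
      define <8 x i32> @expand_three(ptr %ptr, <8 x i32> %passthru) {
        %res = call <8 x i32> @llvm.masked.expandload.v8i32(ptr %ptr, <8 x i1> <i1 true, i1 false, i1 false, i1 true, i1 false, i1 false, i1 false, i1 true>, <8 x i32> %passthru)
        ret <8 x i32> %res
      }

      ; One possible scalar expansion of the call above (illustrative only).
      define <8 x i32> @expand_three_scalar(ptr %ptr, <8 x i32> %passthru) {
        ; Consecutive source elements: ptr, ptr+1, ptr+2.
        %p1 = getelementptr i32, ptr %ptr, i64 1
        %p2 = getelementptr i32, ptr %ptr, i64 2
        %e0 = load i32, ptr %ptr, align 1
        %e1 = load i32, ptr %p1, align 1
        %e2 = load i32, ptr %p2, align 1
        ; Spread the loaded elements into the active lanes 0, 3 and 7.
        %v0 = insertelement <8 x i32> %passthru, i32 %e0, i32 0
        %v03 = insertelement <8 x i32> %v0, i32 %e1, i32 3
        %v037 = insertelement <8 x i32> %v03, i32 %e2, i32 7
        ret <8 x i32> %v037
      }
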
26001 .. _int_compressstore:
26003 '``llvm.masked.compressstore.*``' Intrinsics
26004 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26008 This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
26012 declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, ptr <ptr>, <8 x i1> <mask>)
26013 declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, ptr <ptr>, <16 x i1> <mask>)
Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
The first argument is the input vector, from which elements are collected and written to memory. The second argument is the base pointer for the store. It has the same underlying type as the element of the input vector argument. The third argument is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
26025 The :ref:`align <attr_align>` parameter attribute can be provided for the second
26026 argument. The pointer alignment defaults to 1.
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependencies like in the following example:
26035 // In this loop we load elements from A and store them consecutively in B
double *A, *B; int *C;
26037 for (int i = 0; i < size; ++i) {
26043 .. code-block:: llvm
26045 ; Load elements from A.
26046 %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0(ptr %Aptr, i32 8, <8 x i1> %Mask, <8 x double> poison)
26047 ; Store all selected elements consecutively in array B
call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, ptr %Bptr, <8 x i1> %Mask)
26050 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
26051 %MaskI = bitcast <8 x i1> %Mask to i8
26052 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
26053 %MaskI64 = zext i8 %MaskIPopcnt to i64
26054 %BNextInd = add i64 %BInd, %MaskI64
26057 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
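
To make the compression concrete, here is a small sketch with a constant mask, assuming a ``<4 x i32>`` overload and the default pointer alignment of 1. The function names are hypothetical. Lanes 0, 2 and 3 are selected and written to three consecutive elements starting at ``%ptr``.

.. code-block:: llvm

      declare void @llvm.masked.compressstore.v4i32(<4 x i32>, ptr, <4 x i1>)

      ; Lanes 0, 2 and 3 are active; lane 1 is skipped.
      define void @compress_three(<4 x i32> %value, ptr %ptr) {
        call void @llvm.masked.compressstore.v4i32(<4 x i32> %value, ptr %ptr, <4 x i1> <i1 true, i1 false, i1 true, i1 true>)
        ret void
      }

      ; One possible scalar expansion of the call above (illustrative only).
      define void @compress_three_scalar(<4 x i32> %value, ptr %ptr) {
        %val0 = extractelement <4 x i32> %value, i32 0
        %val2 = extractelement <4 x i32> %value, i32 2
        %val3 = extractelement <4 x i32> %value, i32 3
        ; Selected elements land in adjacent slots: ptr, ptr+1, ptr+2.
        %p1 = getelementptr i32, ptr %ptr, i64 1
        %p2 = getelementptr i32, ptr %ptr, i64 2
        store i32 %val0, ptr %ptr, align 1
        store i32 %val2, ptr %p1, align 1
        store i32 %val3, ptr %p2, align 1
        ret void
      }
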
26063 This class of intrinsics provides information about the
26064 :ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
26069 '``llvm.lifetime.start``' Intrinsic
26070 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26077 declare void @llvm.lifetime.start(i64 <size>, ptr nocapture <ptr>)
26082 The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
26088 The first argument is a constant integer representing the size of the
26089 object, or -1 if it is variable sized. The second argument is a pointer
26095 If ``ptr`` is a stack-allocated object and it points to the first byte of
26096 the object, the object is initially marked as dead.
26097 ``ptr`` is conservatively considered as a non-stack-allocated object if
26098 the stack coloring algorithm that is used in the optimization pipeline cannot
26099 conclude that ``ptr`` is a stack-allocated object.
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is
marked as alive and has an uninitialized value.
26103 The stack object is marked as dead when either
26104 :ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
26107 After :ref:`llvm.lifetime.end <int_lifeend>` is called,
26108 '``llvm.lifetime.start``' on the stack object can be called again.
26109 The second '``llvm.lifetime.start``' call marks the object as alive, but it
26110 does not change the address of the object.
If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.
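
A minimal sketch of the typical usage pattern follows. It assumes the ``p0`` overload names ``@llvm.lifetime.start.p0`` and ``@llvm.lifetime.end.p0`` and the signature shown above; the function ``@use_slot`` is hypothetical. The 4-byte stack slot is only alive between the two marker calls.

.. code-block:: llvm

      declare void @llvm.lifetime.start.p0(i64, ptr nocapture)
      declare void @llvm.lifetime.end.p0(i64, ptr nocapture)

      define i32 @use_slot(i32 %x) {
      entry:
        %slot = alloca i32, align 4
        ; The object becomes alive here and holds an uninitialized value.
        call void @llvm.lifetime.start.p0(i64 4, ptr %slot)
        store i32 %x, ptr %slot, align 4
        %v = load i32, ptr %slot, align 4
        ; The object is dead again after this call.
        call void @llvm.lifetime.end.p0(i64 4, ptr %slot)
        ret i32 %v
      }
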
26119 '``llvm.lifetime.end``' Intrinsic
26120 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26127 declare void @llvm.lifetime.end(i64 <size>, ptr nocapture <ptr>)
26132 The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
26138 The first argument is a constant integer representing the size of the
26139 object, or -1 if it is variable sized. The second argument is a pointer
26145 If ``ptr`` is a stack-allocated object and it points to the first byte of the
26146 object, the object is dead.
26147 ``ptr`` is conservatively considered as a non-stack-allocated object if
26148 the stack coloring algorithm that is used in the optimization pipeline cannot
26149 conclude that ``ptr`` is a stack-allocated object.
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
26153 If ``ptr`` is a non-stack-allocated object or it does not point to the first
26154 byte of the object, it is equivalent to simply filling all bytes of the object
26158 '``llvm.invariant.start``' Intrinsic
26159 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26163 This is an overloaded intrinsic. The memory object can belong to any address space.
26167 declare ptr @llvm.invariant.start.p0(i64 <size>, ptr nocapture <ptr>)
26172 The '``llvm.invariant.start``' intrinsic specifies that the contents of
26173 a memory object will not change.
26178 The first argument is a constant integer representing the size of the
26179 object, or -1 if it is variable sized. The second argument is a pointer
26185 This intrinsic indicates that until an ``llvm.invariant.end`` that uses
26186 the return value, the referenced memory location is constant and
26189 '``llvm.invariant.end``' Intrinsic
26190 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26194 This is an overloaded intrinsic. The memory object can belong to any address space.
26198 declare void @llvm.invariant.end.p0(ptr <start>, i64 <size>, ptr nocapture <ptr>)
26203 The '``llvm.invariant.end``' intrinsic specifies that the contents of a
26204 memory object are mutable.
26209 The first argument is the matching ``llvm.invariant.start`` intrinsic.
26210 The second argument is a constant integer representing the size of the
26211 object, or -1 if it is variable sized and the third argument is a
26212 pointer to the object.
26217 This intrinsic indicates that the memory is mutable again.
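
The pairing of the two intrinsics can be sketched as follows, assuming the ``p0`` overloads declared above and a hypothetical function ``@read_invariant``. The value returned by ``llvm.invariant.start`` is passed to the matching ``llvm.invariant.end``; between the two calls the 4 bytes at ``%p`` are treated as unchanging, so an optimizer may treat the second load as redundant.

.. code-block:: llvm

      declare ptr @llvm.invariant.start.p0(i64, ptr nocapture)
      declare void @llvm.invariant.end.p0(ptr, i64, ptr nocapture)

      define i32 @read_invariant(ptr %p) {
        ; From here on, the 4 bytes at %p are invariant.
        %inv = call ptr @llvm.invariant.start.p0(i64 4, ptr %p)
        %a = load i32, ptr %p, align 4
        %b = load i32, ptr %p, align 4
        ; The memory is mutable again after this call.
        call void @llvm.invariant.end.p0(ptr %inv, i64 4, ptr %p)
        %sum = add i32 %a, %b
        ret i32 %sum
      }
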
26219 '``llvm.launder.invariant.group``' Intrinsic
26220 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26224 This is an overloaded intrinsic. The memory object can belong to any address
26225 space. The returned pointer must belong to the same address space as the
26230 declare ptr @llvm.launder.invariant.group.p0(ptr <ptr>)
26235 The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
26236 established by ``invariant.group`` metadata no longer holds, to obtain a new
26237 pointer value that carries fresh invariant group information. It is an
26238 experimental intrinsic, which means that its semantics might change in the
26245 The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
26251 Returns another pointer that aliases its argument but which is considered different
26252 for the purposes of ``load``/``store`` ``invariant.group`` metadata.
26253 It does not read any accessible memory and the execution can be speculated.
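
The following sketch illustrates one way the intrinsic might be used. The functions ``@clobber`` and ``@reload_after_change`` are hypothetical, and ``@clobber`` stands for any code that may invalidate the invariant. Loads through ``%p`` that share the same ``invariant.group`` metadata may be assumed to produce the same value, whereas the load through the laundered pointer ``%q`` is not tied to the earlier load.

.. code-block:: llvm

      declare ptr @llvm.launder.invariant.group.p0(ptr)
      declare void @clobber(ptr) ; hypothetical: may change the object at %p

      define i8 @reload_after_change(ptr %p) {
        %a = load i8, ptr %p, align 1, !invariant.group !0
        call void @clobber(ptr %p)
        ; Obtain a pointer that aliases %p but carries fresh invariant group
        ; information.
        %q = call ptr @llvm.launder.invariant.group.p0(ptr %p)
        %b = load i8, ptr %q, align 1, !invariant.group !0
        %sum = add i8 %a, %b
        ret i8 %sum
      }

      !0 = !{}
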
26255 '``llvm.strip.invariant.group``' Intrinsic
26256 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26260 This is an overloaded intrinsic. The memory object can belong to any address
26261 space. The returned pointer must belong to the same address space as the
26266 declare ptr @llvm.strip.invariant.group.p0(ptr <ptr>)
26271 The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
26272 established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
26273 value that does not carry the invariant information. It is an experimental
26274 intrinsic, which means that its semantics might change in the future.
26280 The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
26286 Returns another pointer that aliases its argument but which has no associated
26287 ``invariant.group`` metadata.
26288 It does not read any memory and can be speculated.
26294 Constrained Floating-Point Intrinsics
26295 -------------------------------------
26297 These intrinsics are used to provide special handling of floating-point
26298 operations when specific rounding mode or floating-point exception behavior is
26299 required. By default, LLVM optimization passes assume that the rounding mode is
26300 round-to-nearest and that floating-point exceptions will not be monitored.
26301 Constrained FP intrinsics are used to support non-default rounding modes and
26302 accurately preserve exception behavior without compromising LLVM's ability to
26303 optimize FP code when the default behavior is used.
26305 If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can miscompile a function that mixes constrained and normal
operations. The correct way to mix constrained and less constrained
26309 operations is to use the rounding mode and exception handling metadata to
26310 mark constrained intrinsics as having LLVM's default behavior.
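
As a sketch of that approach, a constrained intrinsic can be given LLVM's default assumptions by passing ``"round.tonearest"`` and ``"fpexcept.ignore"``. The ``f32`` overload name and the function ``@mul_default`` are assumptions made for this example.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.fmul.f32(float, float, metadata, metadata)

      ; This multiplication carries the default assumptions: round-to-nearest
      ; and no monitoring of floating-point exceptions.
      define float @mul_default(float %a, float %b) #0 {
        %r = call float @llvm.experimental.constrained.fmul.f32(float %a, float %b, metadata !"round.tonearest", metadata !"fpexcept.ignore") #0
        ret float %r
      }

      attributes #0 = { strictfp }
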
26312 Each of these intrinsics corresponds to a normal floating-point operation. The
26313 data arguments and the return value are the same as the corresponding FP
26316 The rounding mode argument is a metadata string specifying what
26317 assumptions, if any, the optimizer can make when transforming constant
26318 values. Some constrained FP intrinsics omit this argument. If required
26319 by the intrinsic, this argument must be one of the following strings:
26328 "round.tonearestaway"
26330 If this argument is "round.dynamic" optimization passes must assume that the
26331 rounding mode is unknown and may change at runtime. No transformations that
26332 depend on rounding mode may be performed in this case.
26334 The other possible values for the rounding mode argument correspond to the
26335 similarly named IEEE rounding modes. If the argument is any of these values
26336 optimization passes may perform transformations as long as they are consistent
26337 with the specified rounding mode.
26339 For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
26340 "round.downward" or "round.dynamic" because if the value of 'x' is +0 then
26341 'x-0' should evaluate to '-0' when rounding downward. However, this
26342 transformation is legal for all other rounding modes.
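
In IR, the case above might look like the following sketch, which assumes the ``f64`` overload of ``llvm.experimental.constrained.fsub`` and a hypothetical function ``@sub_zero_downward``. Because the rounding mode is ``"round.downward"``, replacing ``%r`` with ``%x`` would be incorrect when ``%x`` is +0.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fsub.f64(double, double, metadata, metadata)

      define double @sub_zero_downward(double %x) #0 {
        ; Must not be folded to %x: (+0) - (+0) is -0 when rounding downward.
        %r = call double @llvm.experimental.constrained.fsub.f64(double %x, double 0.000000e+00, metadata !"round.downward", metadata !"fpexcept.ignore") #0
        ret double %r
      }

      attributes #0 = { strictfp }
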
26344 For values other than "round.dynamic" optimization passes may assume that the
26345 actual runtime rounding mode (as defined in a target-specific manner) matches
26346 the specified rounding mode, but this is not guaranteed. Using a specific
26347 non-dynamic rounding mode which does not match the actual rounding mode at
26348 runtime results in undefined behavior.
The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This
argument must be one of the following strings:
26360 If this argument is "fpexcept.ignore" optimization passes may assume that the
26361 exception status flags will not be read and that floating-point exceptions will
26362 be masked. This allows transformations to be performed that may change the
26363 exception semantics of the original code. For example, FP operations may be
26364 speculatively executed in this case whereas they must not be for either of the
26365 other possible values of this argument.
26367 If the exception behavior argument is "fpexcept.maytrap" optimization passes
26368 must avoid transformations that may raise exceptions that would not have been
26369 raised by the original code (such as speculatively executing FP operations), but
26370 passes are not required to preserve all exceptions that are implied by the
26371 original code. For example, exceptions may be potentially hidden by constant
26374 If the exception behavior argument is "fpexcept.strict" all transformations must
26375 strictly preserve the floating-point exception semantics of the original code.
26376 Any FP exception that would have been raised by the original code must be raised
26377 by the transformed code, and the transformed code must not raise any FP
26378 exceptions that would not have been raised by the original code. This is the
26379 exception behavior argument that will be used if the code being compiled reads
26380 the FP exception status flags, but this mode can also be used with code that
26381 unmasks FP exceptions.
26383 The number and order of floating-point exceptions is NOT guaranteed. For
26384 example, a series of FP operations that each may raise exceptions may be
26385 vectorized into a single instruction that raises each unique exception a single
26388 Proper :ref:`function attributes <fnattrs>` usage is required for the
26389 constrained intrinsics to function correctly.
26391 All function *calls* done in a function that uses constrained floating
26392 point intrinsics must have the ``strictfp`` attribute either on the
26393 calling instruction or on the declaration or definition of the function
26396 All function *definitions* that use constrained floating point intrinsics
26397 must have the ``strictfp`` attribute.
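
A minimal sketch of these attribute requirements is shown below. The function ``@helper`` and the function ``@add_strict`` are hypothetical, and the ``f64`` overload of ``llvm.experimental.constrained.fadd`` is assumed. The definition carries ``strictfp``, and so does every call inside it, including the call to the ordinary function.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
      declare double @helper(double) ; hypothetical external function

      ; The definition uses constrained intrinsics, so it is marked strictfp.
      define double @add_strict(double %a, double %b) #0 {
        %sum = call double @llvm.experimental.constrained.fadd.f64(double %a, double %b, metadata !"round.dynamic", metadata !"fpexcept.strict") #0
        ; Calls to other functions are marked strictfp at the call site as well.
        %res = call double @helper(double %sum) #0
        ret double %res
      }

      attributes #0 = { strictfp }
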
26399 '``llvm.experimental.constrained.fadd``' Intrinsic
26400 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26408 @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
26409 metadata <rounding mode>,
26410 metadata <exception behavior>)
26415 The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
26422 The first two arguments to the '``llvm.experimental.constrained.fadd``'
26423 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26424 of floating-point values. Both arguments must have identical types.
26426 The third and fourth arguments specify the rounding mode and exception
26427 behavior as described above.
26432 The value produced is the floating-point sum of the two value arguments and has
26433 the same type as the arguments.
26436 '``llvm.experimental.constrained.fsub``' Intrinsic
26437 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26445 @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
26446 metadata <rounding mode>,
26447 metadata <exception behavior>)
26452 The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
26453 of its two arguments.
26459 The first two arguments to the '``llvm.experimental.constrained.fsub``'
26460 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26461 of floating-point values. Both arguments must have identical types.
26463 The third and fourth arguments specify the rounding mode and exception
26464 behavior as described above.
26469 The value produced is the floating-point difference of the two value arguments
26470 and has the same type as the arguments.
26473 '``llvm.experimental.constrained.fmul``' Intrinsic
26474 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26482 @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
26483 metadata <rounding mode>,
26484 metadata <exception behavior>)
26489 The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
26496 The first two arguments to the '``llvm.experimental.constrained.fmul``'
26497 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26498 of floating-point values. Both arguments must have identical types.
26500 The third and fourth arguments specify the rounding mode and exception
26501 behavior as described above.
26506 The value produced is the floating-point product of the two value arguments and
26507 has the same type as the arguments.
26510 '``llvm.experimental.constrained.fdiv``' Intrinsic
26511 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26519 @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
26520 metadata <rounding mode>,
26521 metadata <exception behavior>)
26526 The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
26533 The first two arguments to the '``llvm.experimental.constrained.fdiv``'
26534 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26535 of floating-point values. Both arguments must have identical types.
26537 The third and fourth arguments specify the rounding mode and exception
26538 behavior as described above.
26543 The value produced is the floating-point quotient of the two value arguments and
26544 has the same type as the arguments.
26547 '``llvm.experimental.constrained.frem``' Intrinsic
26548 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26556 @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
26557 metadata <rounding mode>,
26558 metadata <exception behavior>)
26563 The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
26564 from the division of its two arguments.
26570 The first two arguments to the '``llvm.experimental.constrained.frem``'
26571 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26572 of floating-point values. Both arguments must have identical types.
26574 The third and fourth arguments specify the rounding mode and exception
26575 behavior as described above. The rounding mode argument has no effect, since
26576 the result of frem is never rounded, but the argument is included for
26577 consistency with the other constrained floating-point intrinsics.
26582 The value produced is the floating-point remainder from the division of the two
26583 value arguments and has the same type as the arguments. The remainder has the
26584 same sign as the dividend.
26586 '``llvm.experimental.constrained.fma``' Intrinsic
26587 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26595 @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
26596 metadata <rounding mode>,
26597 metadata <exception behavior>)
26602 The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
26603 fused-multiply-add operation on its arguments.
26608 The first three arguments to the '``llvm.experimental.constrained.fma``'
26609 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
26610 <t_vector>` of floating-point values. All arguments must have identical types.
26612 The fourth and fifth arguments specify the rounding mode and exception behavior
26613 as described above.
26618 The result produced is the product of the first two arguments added to the third
26619 argument computed with infinite precision, and then rounded to the target
26622 '``llvm.experimental.constrained.fptoui``' Intrinsic
26623 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26631 @llvm.experimental.constrained.fptoui(<type> <value>,
26632 metadata <exception behavior>)
26637 The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
26638 floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
26643 The first argument to the '``llvm.experimental.constrained.fptoui``'
26644 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
26645 <t_vector>` of floating point values.
26647 The second argument specifies the exception behavior as described above.
26652 The result produced is an unsigned integer converted from the floating
26653 point argument. The value is truncated, so it is rounded towards zero.
26655 '``llvm.experimental.constrained.fptosi``' Intrinsic
26656 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26664 @llvm.experimental.constrained.fptosi(<type> <value>,
26665 metadata <exception behavior>)
26670 The '``llvm.experimental.constrained.fptosi``' intrinsic converts
26671 :ref:`floating-point <t_floating>` ``value`` to type ``ty2``.
26676 The first argument to the '``llvm.experimental.constrained.fptosi``'
26677 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
26678 <t_vector>` of floating point values.
26680 The second argument specifies the exception behavior as described above.
26685 The result produced is a signed integer converted from the floating
26686 point argument. The value is truncated, so it is rounded towards zero.
26688 '``llvm.experimental.constrained.uitofp``' Intrinsic
26689 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26697 @llvm.experimental.constrained.uitofp(<type> <value>,
26698 metadata <rounding mode>,
26699 metadata <exception behavior>)
26704 The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
26705 unsigned integer ``value`` to a floating-point of type ``ty2``.
26710 The first argument to the '``llvm.experimental.constrained.uitofp``'
26711 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
26712 <t_vector>` of integer values.
26714 The second and third arguments specify the rounding mode and exception
26715 behavior as described above.
26720 An inexact floating-point exception will be raised if rounding is required.
26721 Any result produced is a floating point value converted from the input
26724 '``llvm.experimental.constrained.sitofp``' Intrinsic
26725 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26733 @llvm.experimental.constrained.sitofp(<type> <value>,
26734 metadata <rounding mode>,
26735 metadata <exception behavior>)
26740 The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
26741 signed integer ``value`` to a floating-point of type ``ty2``.
26746 The first argument to the '``llvm.experimental.constrained.sitofp``'
26747 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
26748 <t_vector>` of integer values.
26750 The second and third arguments specify the rounding mode and exception
26751 behavior as described above.
26756 An inexact floating-point exception will be raised if rounding is required.
26757 Any result produced is a floating point value converted from the input
26760 '``llvm.experimental.constrained.fptrunc``' Intrinsic
26761 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26769 @llvm.experimental.constrained.fptrunc(<type> <value>,
26770 metadata <rounding mode>,
26771 metadata <exception behavior>)
26776 The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
26782 The first argument to the '``llvm.experimental.constrained.fptrunc``'
26783 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
26784 <t_vector>` of floating point values. This argument must be larger in size
26787 The second and third arguments specify the rounding mode and exception
26788 behavior as described above.
26793 The result produced is a floating point value truncated to be smaller in size
26796 '``llvm.experimental.constrained.fpext``' Intrinsic
26797 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26805 @llvm.experimental.constrained.fpext(<type> <value>,
26806 metadata <exception behavior>)
26811 The '``llvm.experimental.constrained.fpext``' intrinsic extends a
26812 floating-point ``value`` to a larger floating-point value.
26817 The first argument to the '``llvm.experimental.constrained.fpext``'
26818 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
26819 <t_vector>` of floating point values. This argument must be smaller in size
26822 The second argument specifies the exception behavior as described above.
26827 The result produced is a floating point value extended to be larger in size
26828 than the argument. All restrictions that apply to the fpext instruction also
26829 apply to this intrinsic.
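
The conversion intrinsics described above can be combined as in the following sketch. The overload suffixes (result type first, then argument type) and the function ``@round_trip`` are assumptions made for this example. Note that ``fptosi`` and ``fpext`` take only the exception behavior argument, while ``sitofp`` and ``fptrunc`` also take a rounding mode.

.. code-block:: llvm

      declare i32 @llvm.experimental.constrained.fptosi.i32.f64(double, metadata)
      declare double @llvm.experimental.constrained.sitofp.f64.i32(i32, metadata, metadata)
      declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)
      declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)

      define double @round_trip(double %x) #0 {
        ; double -> i32, rounded towards zero.
        %i = call i32 @llvm.experimental.constrained.fptosi.i32.f64(double %x, metadata !"fpexcept.strict") #0
        ; i32 -> double, using the dynamic rounding mode.
        %d = call double @llvm.experimental.constrained.sitofp.f64.i32(i32 %i, metadata !"round.dynamic", metadata !"fpexcept.strict") #0
        ; double -> float, using the dynamic rounding mode.
        %f = call float @llvm.experimental.constrained.fptrunc.f32.f64(double %d, metadata !"round.dynamic", metadata !"fpexcept.strict") #0
        ; float -> double; no rounding mode argument.
        %e = call double @llvm.experimental.constrained.fpext.f64.f32(float %f, metadata !"fpexcept.strict") #0
        ret double %e
      }

      attributes #0 = { strictfp }
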
26831 '``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
26832 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26840 @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
26841 metadata <condition code>,
26842 metadata <exception behavior>)
26844 @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
26845 metadata <condition code>,
26846 metadata <exception behavior>)
26851 The '``llvm.experimental.constrained.fcmp``' and
26852 '``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
26853 value or vector of boolean values based on comparison of its arguments.
26855 If the arguments are floating-point scalars, then the result type is a
26856 boolean (:ref:`i1 <t_integer>`).
If the arguments are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the arguments being
compared.
26862 The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
26863 comparison operation while the '``llvm.experimental.constrained.fcmps``'
26864 intrinsic performs a signaling comparison operation.
26869 The first two arguments to the '``llvm.experimental.constrained.fcmp``'
26870 and '``llvm.experimental.constrained.fcmps``' intrinsics must be
26871 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26872 of floating-point values. Both arguments must have identical types.
26874 The third argument is the condition code indicating the kind of comparison
26875 to perform. It must be a metadata string with one of the following values:
26879 - "``oeq``": ordered and equal
26880 - "``ogt``": ordered and greater than
26881 - "``oge``": ordered and greater than or equal
26882 - "``olt``": ordered and less than
26883 - "``ole``": ordered and less than or equal
26884 - "``one``": ordered and not equal
26885 - "``ord``": ordered (no nans)
26886 - "``ueq``": unordered or equal
26887 - "``ugt``": unordered or greater than
26888 - "``uge``": unordered or greater than or equal
26889 - "``ult``": unordered or less than
26890 - "``ule``": unordered or less than or equal
26891 - "``une``": unordered or not equal
26892 - "``uno``": unordered (either nans)
26894 *Ordered* means that neither argument is a NAN while *unordered* means
26895 that either argument may be a NAN.
26897 The fourth argument specifies the exception behavior as described above.
26902 ``op1`` and ``op2`` are compared according to the condition code given
26903 as the third argument. If the arguments are vectors, then the
26904 vectors are compared element by element. Each comparison performed
26905 always yields an :ref:`i1 <t_integer>` result, as follows:
26907 .. _fcmp_md_cc_sem:
26909 - "``oeq``": yields ``true`` if both arguments are not a NAN and ``op1``
26910 is equal to ``op2``.
26911 - "``ogt``": yields ``true`` if both arguments are not a NAN and ``op1``
26912 is greater than ``op2``.
26913 - "``oge``": yields ``true`` if both arguments are not a NAN and ``op1``
26914 is greater than or equal to ``op2``.
26915 - "``olt``": yields ``true`` if both arguments are not a NAN and ``op1``
26916 is less than ``op2``.
26917 - "``ole``": yields ``true`` if both arguments are not a NAN and ``op1``
26918 is less than or equal to ``op2``.
26919 - "``one``": yields ``true`` if both arguments are not a NAN and ``op1``
26920 is not equal to ``op2``.
26921 - "``ord``": yields ``true`` if both arguments are not a NAN.
- "``ueq``": yields ``true`` if either argument is a NAN or ``op1`` is
  equal to ``op2``.
26924 - "``ugt``": yields ``true`` if either argument is a NAN or ``op1`` is
26925 greater than ``op2``.
26926 - "``uge``": yields ``true`` if either argument is a NAN or ``op1`` is
26927 greater than or equal to ``op2``.
- "``ult``": yields ``true`` if either argument is a NAN or ``op1`` is
  less than ``op2``.
26930 - "``ule``": yields ``true`` if either argument is a NAN or ``op1`` is
26931 less than or equal to ``op2``.
26932 - "``une``": yields ``true`` if either argument is a NAN or ``op1`` is
26933 not equal to ``op2``.
26934 - "``uno``": yields ``true`` if either argument is a NAN.
The quiet comparison operation performed by
'``llvm.experimental.constrained.fcmp``' will only raise an exception
if either argument is a SNAN. The signaling comparison operation
performed by '``llvm.experimental.constrained.fcmps``' will raise an
exception if either argument is a NAN (QNAN or SNAN). Such an exception
does not preclude a result being produced (e.g., the exception might only
set a flag); therefore, the distinction between ordered and unordered
comparisons is also relevant for the
'``llvm.experimental.constrained.fcmps``' intrinsic.
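
The difference between the quiet and signaling forms can be sketched as
follows, assuming the ``f64`` overload and the ``fpexcept.strict`` metadata
value described earlier. The function names are hypothetical.

.. code-block:: llvm

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double, metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)

      ; Quiet comparison: raises an exception only if an argument is a SNAN.
      define i1 @is_equal_quiet(double %a, double %b) strictfp {
        %eq = call i1 @llvm.experimental.constrained.fcmp.f64(
                          double %a, double %b,
                          metadata !"oeq",
                          metadata !"fpexcept.strict")
        ret i1 %eq
      }

      ; Signaling comparison: raises an exception if an argument is any NAN.
      define i1 @is_less_signaling(double %a, double %b) strictfp {
        %lt = call i1 @llvm.experimental.constrained.fcmps.f64(
                          double %a, double %b,
                          metadata !"olt",
                          metadata !"fpexcept.strict")
        ret i1 %lt
      }

Both functions yield ``false`` when either input is a NAN, because ``oeq``
and ``olt`` are ordered predicates.
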
26946 '``llvm.experimental.constrained.fmuladd``' Intrinsic
26947 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26955 @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
26957 metadata <rounding mode>,
26958 metadata <exception behavior>)
26963 The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
26964 multiply-add expressions that can be fused if the code generator determines
26965 that (a) the target instruction set has support for a fused operation,
26966 and (b) that the fused operation is more efficient than the equivalent,
26967 separate pair of mul and add instructions.
26972 The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
26973 intrinsic must be floating-point or vector of floating-point values.
26974 All three arguments must have identical types.
26976 The fourth and fifth arguments specify the rounding mode and exception behavior
26977 as described above.
26986 %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c,
26987 metadata <rounding mode>,
26988 metadata <exception behavior>)
26990 is equivalent to the expression:
26994 %0 = call float @llvm.experimental.constrained.fmul.f32(%a, %b,
26995 metadata <rounding mode>,
26996 metadata <exception behavior>)
26997 %1 = call float @llvm.experimental.constrained.fadd.f32(%0, %c,
26998 metadata <rounding mode>,
26999 metadata <exception behavior>)
27001 except that it is unspecified whether rounding will be performed between the
27002 multiplication and addition steps. Fusion is not guaranteed, even if the target
27003 platform supports it.
If a fused multiply-add is required, the corresponding
:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
used instead.

This intrinsic never sets ``errno``, just like
'``llvm.experimental.constrained.fma.*``'.
27009 Constrained libm-equivalent Intrinsics
27010 --------------------------------------
27012 In addition to the basic floating-point operations for which constrained
27013 intrinsics are described above, there are constrained versions of various
27014 operations which provide equivalent behavior to a corresponding libm function.
27015 These intrinsics allow the precise behavior of these operations with respect to
27016 rounding mode and exception behavior to be controlled.
27018 As with the basic constrained floating-point intrinsics, the rounding mode
27019 and exception behavior arguments only control the behavior of the optimizer.
27020 They do not change the runtime floating-point environment.
27023 '``llvm.experimental.constrained.sqrt``' Intrinsic
27024 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27032 @llvm.experimental.constrained.sqrt(<type> <op1>,
27033 metadata <rounding mode>,
27034 metadata <exception behavior>)
27039 The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
27040 of the specified value, returning the same value as the libm '``sqrt``'
27041 functions would, but without setting ``errno``.
The first argument and the return type are floating-point numbers of the same
type.
27049 The second and third arguments specify the rounding mode and exception
27050 behavior as described above.
27055 This function returns the nonnegative square root of the specified value.
27056 If the value is less than negative zero, a floating-point exception occurs
27057 and the return value is architecture specific.
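
A minimal sketch of a call, assuming the ``f64`` overload and the metadata
values described earlier (the function name ``@strict_sqrt`` is hypothetical):

.. code-block:: llvm

      declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)

      ; Hypothetical helper: square root with the dynamic rounding mode and
      ; strict exception semantics.
      define double @strict_sqrt(double %x) strictfp {
        %r = call double @llvm.experimental.constrained.sqrt.f64(
                             double %x,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ret double %r
      }
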
27060 '``llvm.experimental.constrained.pow``' Intrinsic
27061 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27069 @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
27070 metadata <rounding mode>,
27071 metadata <exception behavior>)
27076 The '``llvm.experimental.constrained.pow``' intrinsic returns the first argument
27077 raised to the (positive or negative) power specified by the second argument.
The first two arguments and the return value are floating-point numbers of the
same type. The second argument specifies the power to which the first argument
should be raised.
27086 The third and fourth arguments specify the rounding mode and exception
27087 behavior as described above.
27092 This function returns the first value raised to the second power,
27093 returning the same values as the libm ``pow`` functions would, and
27094 handles error conditions in the same way.
27097 '``llvm.experimental.constrained.powi``' Intrinsic
27098 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27106 @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
27107 metadata <rounding mode>,
27108 metadata <exception behavior>)
27113 The '``llvm.experimental.constrained.powi``' intrinsic returns the first argument
27114 raised to the (positive or negative) power specified by the second argument. The
27115 order of evaluation of multiplications is not defined. When a vector of
27116 floating-point type is used, the second argument remains a scalar integer value.
27122 The first argument and the return value are floating-point numbers of the same
27123 type. The second argument is a 32-bit signed integer specifying the power to
27124 which the first argument should be raised.
27126 The third and fourth arguments specify the rounding mode and exception
27127 behavior as described above.
27132 This function returns the first value raised to the second power with an
27133 unspecified sequence of rounding operations.
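
The vector case, where the exponent stays a scalar ``i32``, might look like
the sketch below. It assumes the ``v4f32`` overload. The function name
``@cube_all`` is hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.constrained.powi.v4f32(<4 x float>, i32, metadata, metadata)

      ; Hypothetical helper: raises every element of %v to the third power.
      ; The exponent is a single scalar i32 shared by all lanes.
      define <4 x float> @cube_all(<4 x float> %v) strictfp {
        %r = call <4 x float> @llvm.experimental.constrained.powi.v4f32(
                                  <4 x float> %v, i32 3,
                                  metadata !"round.tonearest",
                                  metadata !"fpexcept.ignore")
        ret <4 x float> %r
      }
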
27136 '``llvm.experimental.constrained.ldexp``' Intrinsic
27137 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27145 @llvm.experimental.constrained.ldexp(<type0> <op1>, <type1> <op2>,
27146 metadata <rounding mode>,
27147 metadata <exception behavior>)
The '``llvm.experimental.constrained.ldexp``' intrinsic performs the
``ldexp`` function.

The first argument and the return value are :ref:`floating-point
<t_floating>` or :ref:`vector <t_vector>` of floating-point values of
the same type. The second argument is an integer with the same number
of elements.
27164 The third and fourth arguments specify the rounding mode and exception
27165 behavior as described above.
This function multiplies the first argument by 2 raised to the second
argument's power. If the first argument is NaN or infinite, the same
value is returned. If the result underflows, a zero with the same sign
is returned. If the result overflows, the result is an infinity with
the same sign.
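
As an illustrative sketch, assuming the ``f64``/``i32`` overload, the
following computes ``%x`` multiplied by 2 raised to the power 3. The function
name ``@times_eight`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.ldexp.f64.i32(double, i32, metadata, metadata)

      ; Hypothetical helper: scales %x by 2^3.
      define double @times_eight(double %x) strictfp {
        %r = call double @llvm.experimental.constrained.ldexp.f64.i32(
                             double %x, i32 3,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ret double %r
      }
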
27177 '``llvm.experimental.constrained.sin``' Intrinsic
27178 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27186 @llvm.experimental.constrained.sin(<type> <op1>,
27187 metadata <rounding mode>,
27188 metadata <exception behavior>)
The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
first argument.

The first argument and the return type are floating-point numbers of the same
type.
27202 The second and third arguments specify the rounding mode and exception
27203 behavior as described above.
27208 This function returns the sine of the specified argument, returning the
27209 same values as the libm ``sin`` functions would, and handles error
27210 conditions in the same way.
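
The unary libm-equivalent intrinsics in this section all follow the same
calling pattern. A sketch for ``sin``, assuming the ``f32`` overload (the
function name ``@strict_sinf`` is hypothetical):

.. code-block:: llvm

      declare float @llvm.experimental.constrained.sin.f32(float, metadata, metadata)

      ; Hypothetical helper: sine of %x. The rounding mode argument tells the
      ; optimizer that round-to-nearest may be assumed.
      define float @strict_sinf(float %x) strictfp {
        %r = call float @llvm.experimental.constrained.sin.f32(
                            float %x,
                            metadata !"round.tonearest",
                            metadata !"fpexcept.strict")
        ret float %r
      }
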
27213 '``llvm.experimental.constrained.cos``' Intrinsic
27214 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27222 @llvm.experimental.constrained.cos(<type> <op1>,
27223 metadata <rounding mode>,
27224 metadata <exception behavior>)
The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
first argument.

The first argument and the return type are floating-point numbers of the same
type.
27238 The second and third arguments specify the rounding mode and exception
27239 behavior as described above.
27244 This function returns the cosine of the specified argument, returning the
27245 same values as the libm ``cos`` functions would, and handles error
27246 conditions in the same way.
27249 '``llvm.experimental.constrained.tan``' Intrinsic
27250 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27258 @llvm.experimental.constrained.tan(<type> <op1>,
27259 metadata <rounding mode>,
27260 metadata <exception behavior>)
The '``llvm.experimental.constrained.tan``' intrinsic returns the tangent of the
first argument.

The first argument and the return type are floating-point numbers of the same
type.
27274 The second and third arguments specify the rounding mode and exception
27275 behavior as described above.
27280 This function returns the tangent of the specified argument, returning the
27281 same values as the libm ``tan`` functions would, and handles error
27282 conditions in the same way.
27284 '``llvm.experimental.constrained.asin``' Intrinsic
27285 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27293 @llvm.experimental.constrained.asin(<type> <op1>,
27294 metadata <rounding mode>,
27295 metadata <exception behavior>)
The '``llvm.experimental.constrained.asin``' intrinsic returns the arcsine of the
first argument.

The first argument and the return type are floating-point numbers of the same
type.
27309 The second and third arguments specify the rounding mode and exception
27310 behavior as described above.
27315 This function returns the arcsine of the specified operand, returning the
27316 same values as the libm ``asin`` functions would, and handles error
27317 conditions in the same way.
27320 '``llvm.experimental.constrained.acos``' Intrinsic
27321 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27329 @llvm.experimental.constrained.acos(<type> <op1>,
27330 metadata <rounding mode>,
27331 metadata <exception behavior>)
The '``llvm.experimental.constrained.acos``' intrinsic returns the arccosine of
the first argument.

The first argument and the return type are floating-point numbers of the same
type.
27345 The second and third arguments specify the rounding mode and exception
27346 behavior as described above.
27351 This function returns the arccosine of the specified operand, returning the
27352 same values as the libm ``acos`` functions would, and handles error
27353 conditions in the same way.
27356 '``llvm.experimental.constrained.atan``' Intrinsic
27357 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27365 @llvm.experimental.constrained.atan(<type> <op1>,
27366 metadata <rounding mode>,
27367 metadata <exception behavior>)
The '``llvm.experimental.constrained.atan``' intrinsic returns the arctangent of
the first argument.

The first argument and the return type are floating-point numbers of the same
type.
27381 The second and third arguments specify the rounding mode and exception
27382 behavior as described above.
27387 This function returns the arctangent of the specified operand, returning the
27388 same values as the libm ``atan`` functions would, and handles error
27389 conditions in the same way.
27391 '``llvm.experimental.constrained.atan2``' Intrinsic
27392 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27400 @llvm.experimental.constrained.atan2(<type> <op1>,
27402 metadata <rounding mode>,
27403 metadata <exception behavior>)
27408 The '``llvm.experimental.constrained.atan2``' intrinsic returns the arctangent
27409 of ``<op1>`` divided by ``<op2>`` accounting for the quadrant.
The first two arguments and the return value are floating-point numbers of the
same type.
27417 The third and fourth arguments specify the rounding mode and exception
27418 behavior as described above.
27423 This function returns the quadrant-specific arctangent using the specified
27424 operands, returning the same values as the libm ``atan2`` functions would, and
27425 handles error conditions in the same way.
27427 '``llvm.experimental.constrained.sinh``' Intrinsic
27428 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27436 @llvm.experimental.constrained.sinh(<type> <op1>,
27437 metadata <rounding mode>,
27438 metadata <exception behavior>)
The '``llvm.experimental.constrained.sinh``' intrinsic returns the hyperbolic
sine of the first argument.

The first argument and the return type are floating-point numbers of the same
type.
27452 The second and third arguments specify the rounding mode and exception
27453 behavior as described above.
27458 This function returns the hyperbolic sine of the specified operand, returning the
27459 same values as the libm ``sinh`` functions would, and handles error
27460 conditions in the same way.
27463 '``llvm.experimental.constrained.cosh``' Intrinsic
27464 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27472 @llvm.experimental.constrained.cosh(<type> <op1>,
27473 metadata <rounding mode>,
27474 metadata <exception behavior>)
The '``llvm.experimental.constrained.cosh``' intrinsic returns the hyperbolic
cosine of the first argument.

The first argument and the return type are floating-point numbers of the same
type.
27488 The second and third arguments specify the rounding mode and exception
27489 behavior as described above.
27494 This function returns the hyperbolic cosine of the specified operand, returning the
27495 same values as the libm ``cosh`` functions would, and handles error
27496 conditions in the same way.
27499 '``llvm.experimental.constrained.tanh``' Intrinsic
27500 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27508 @llvm.experimental.constrained.tanh(<type> <op1>,
27509 metadata <rounding mode>,
27510 metadata <exception behavior>)
The '``llvm.experimental.constrained.tanh``' intrinsic returns the hyperbolic
tangent of the first argument.

The first argument and the return type are floating-point numbers of the same
type.
27524 The second and third arguments specify the rounding mode and exception
27525 behavior as described above.
27530 This function returns the hyperbolic tangent of the specified operand, returning the
27531 same values as the libm ``tanh`` functions would, and handles error
27532 conditions in the same way.
27534 '``llvm.experimental.constrained.exp``' Intrinsic
27535 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27543 @llvm.experimental.constrained.exp(<type> <op1>,
27544 metadata <rounding mode>,
27545 metadata <exception behavior>)
27550 The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
27551 exponential of the specified value.
The first argument and the return value are floating-point numbers of the same
type.
27559 The second and third arguments specify the rounding mode and exception
27560 behavior as described above.
27565 This function returns the same values as the libm ``exp`` functions
27566 would, and handles error conditions in the same way.
27569 '``llvm.experimental.constrained.exp2``' Intrinsic
27570 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27578 @llvm.experimental.constrained.exp2(<type> <op1>,
27579 metadata <rounding mode>,
27580 metadata <exception behavior>)
27585 The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
27586 exponential of the specified value.
The first argument and the return value are floating-point numbers of the same
type.
27595 The second and third arguments specify the rounding mode and exception
27596 behavior as described above.
27601 This function returns the same values as the libm ``exp2`` functions
27602 would, and handles error conditions in the same way.
27605 '``llvm.experimental.constrained.log``' Intrinsic
27606 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27614 @llvm.experimental.constrained.log(<type> <op1>,
27615 metadata <rounding mode>,
27616 metadata <exception behavior>)
27621 The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
27622 logarithm of the specified value.
The first argument and the return value are floating-point numbers of the same
type.
27630 The second and third arguments specify the rounding mode and exception
27631 behavior as described above.
27637 This function returns the same values as the libm ``log`` functions
27638 would, and handles error conditions in the same way.
27641 '``llvm.experimental.constrained.log10``' Intrinsic
27642 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27650 @llvm.experimental.constrained.log10(<type> <op1>,
27651 metadata <rounding mode>,
27652 metadata <exception behavior>)
27657 The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
27658 logarithm of the specified value.
The first argument and the return value are floating-point numbers of the same
type.
27666 The second and third arguments specify the rounding mode and exception
27667 behavior as described above.
27672 This function returns the same values as the libm ``log10`` functions
27673 would, and handles error conditions in the same way.
27676 '``llvm.experimental.constrained.log2``' Intrinsic
27677 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27685 @llvm.experimental.constrained.log2(<type> <op1>,
27686 metadata <rounding mode>,
27687 metadata <exception behavior>)
27692 The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
27693 logarithm of the specified value.
The first argument and the return value are floating-point numbers of the same
type.
27701 The second and third arguments specify the rounding mode and exception
27702 behavior as described above.
27707 This function returns the same values as the libm ``log2`` functions
27708 would, and handles error conditions in the same way.
27711 '``llvm.experimental.constrained.rint``' Intrinsic
27712 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27720 @llvm.experimental.constrained.rint(<type> <op1>,
27721 metadata <rounding mode>,
27722 metadata <exception behavior>)
27727 The '``llvm.experimental.constrained.rint``' intrinsic returns the first
27728 argument rounded to the nearest integer. It may raise an inexact floating-point
27729 exception if the argument is not an integer.
The first argument and the return value are floating-point numbers of the same
type.
27737 The second and third arguments specify the rounding mode and exception
27738 behavior as described above.
27743 This function returns the same values as the libm ``rint`` functions
27744 would, and handles error conditions in the same way. The rounding mode is
27745 described, not determined, by the rounding mode argument. The actual rounding
27746 mode is determined by the runtime floating-point environment. The rounding
27747 mode argument is only intended as information to the compiler.
27750 '``llvm.experimental.constrained.lrint``' Intrinsic
27751 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27759 @llvm.experimental.constrained.lrint(<fptype> <op1>,
27760 metadata <rounding mode>,
27761 metadata <exception behavior>)
27766 The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
27767 argument rounded to the nearest integer. An inexact floating-point exception
27768 will be raised if the argument is not an integer. An invalid exception is
27769 raised if the result is too large to fit into a supported integer type,
27770 and in this case the result is undefined.
The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
libm functions.
27780 The second and third arguments specify the rounding mode and exception
27781 behavior as described above.
27786 This function returns the same values as the libm ``lrint`` functions
27787 would, and handles error conditions in the same way.
The rounding mode is described, not determined, by the rounding mode
argument. The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.lrint`` intrinsic.
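
A sketch of a call that returns a 64-bit integer, assuming the ``i64``/``f64``
overload is supported on the target (the function name ``@to_long`` is
hypothetical). The ``round.dynamic`` argument tells the compiler that it may
not assume any particular rounding mode:

.. code-block:: llvm

      declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

      ; Hypothetical helper: rounds %x to an integer using whatever rounding
      ; mode is active in the runtime floating-point environment.
      define i64 @to_long(double %x) strictfp {
        %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                          double %x,
                          metadata !"round.dynamic",
                          metadata !"fpexcept.strict")
        ret i64 %r
      }
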
27798 '``llvm.experimental.constrained.llrint``' Intrinsic
27799 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27807 @llvm.experimental.constrained.llrint(<fptype> <op1>,
27808 metadata <rounding mode>,
27809 metadata <exception behavior>)
27814 The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
27815 argument rounded to the nearest integer. An inexact floating-point exception
27816 will be raised if the argument is not an integer. An invalid exception is
27817 raised if the result is too large to fit into a supported integer type,
27818 and in this case the result is undefined.
The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
libm functions.
27828 The second and third arguments specify the rounding mode and exception
27829 behavior as described above.
27834 This function returns the same values as the libm ``llrint`` functions
27835 would, and handles error conditions in the same way.
The rounding mode is described, not determined, by the rounding mode
argument. The actual rounding mode is determined by the runtime floating-point
environment. The rounding mode argument is only intended as information
to the compiler.

If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.llrint`` intrinsic.
27846 '``llvm.experimental.constrained.nearbyint``' Intrinsic
27847 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27855 @llvm.experimental.constrained.nearbyint(<type> <op1>,
27856 metadata <rounding mode>,
27857 metadata <exception behavior>)
27862 The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
27863 argument rounded to the nearest integer. It will not raise an inexact
27864 floating-point exception if the argument is not an integer.
The first argument and the return value are floating-point numbers of the same
type.
27873 The second and third arguments specify the rounding mode and exception
27874 behavior as described above.
27879 This function returns the same values as the libm ``nearbyint`` functions
27880 would, and handles error conditions in the same way. The rounding mode is
27881 described, not determined, by the rounding mode argument. The actual rounding
27882 mode is determined by the runtime floating-point environment. The rounding
27883 mode argument is only intended as information to the compiler.
27886 '``llvm.experimental.constrained.maxnum``' Intrinsic
27887 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
27901 The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
27902 of the two arguments.
The first two arguments and the return value are floating-point numbers
of the same type.
27910 The third argument specifies the exception behavior as described above.
27915 This function follows the IEEE-754 semantics for maxNum.
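
The min/max family takes only an exception behavior argument, with no
rounding mode, as the following sketch suggests. It assumes the ``f32``
overload. The function name ``@strict_fmaxf`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)

      ; Hypothetical helper: IEEE-754 maxNum of %a and %b.
      define float @strict_fmaxf(float %a, float %b) strictfp {
        %r = call float @llvm.experimental.constrained.maxnum.f32(
                            float %a, float %b,
                            metadata !"fpexcept.strict")
        ret float %r
      }
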
27918 '``llvm.experimental.constrained.minnum``' Intrinsic
27919 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
27933 The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
27934 of the two arguments.
The first two arguments and the return value are floating-point numbers
of the same type.
27942 The third argument specifies the exception behavior as described above.
27947 This function follows the IEEE-754 semantics for minNum.
27950 '``llvm.experimental.constrained.maximum``' Intrinsic
27951 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
27965 The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
27966 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
The first two arguments and the return value are floating-point numbers
of the same type.
27974 The third argument specifies the exception behavior as described above.
27979 This function follows semantics specified in the draft of IEEE 754-2019.
27982 '``llvm.experimental.constrained.minimum``' Intrinsic
27983 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
27997 The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
27998 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
The first two arguments and the return value are floating-point numbers
of the same type.
28006 The third argument specifies the exception behavior as described above.
28011 This function follows semantics specified in the draft of IEEE 754-2019.
28014 '``llvm.experimental.constrained.ceil``' Intrinsic
28015 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28023 @llvm.experimental.constrained.ceil(<type> <op1>,
28024 metadata <exception behavior>)
The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
first argument.

The first argument and the return value are floating-point numbers of the same
type.
28038 The second argument specifies the exception behavior as described above.
28043 This function returns the same values as the libm ``ceil`` functions
28044 would and handles error conditions in the same way.
28047 '``llvm.experimental.constrained.floor``' Intrinsic
28048 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28056 @llvm.experimental.constrained.floor(<type> <op1>,
28057 metadata <exception behavior>)
The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
first argument.

The first argument and the return value are floating-point numbers of the same
type.
28071 The second argument specifies the exception behavior as described above.
28076 This function returns the same values as the libm ``floor`` functions
28077 would and handles error conditions in the same way.
28080 '``llvm.experimental.constrained.round``' Intrinsic
28081 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28089 @llvm.experimental.constrained.round(<type> <op1>,
28090 metadata <exception behavior>)
28095 The '``llvm.experimental.constrained.round``' intrinsic returns the first
28096 argument rounded to the nearest integer.
The first argument and the return value are floating-point numbers of the same
type.
28104 The second argument specifies the exception behavior as described above.
28109 This function returns the same values as the libm ``round`` functions
28110 would and handles error conditions in the same way.
28113 '``llvm.experimental.constrained.roundeven``' Intrinsic
28114 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28122 @llvm.experimental.constrained.roundeven(<type> <op1>,
28123 metadata <exception behavior>)
28128 The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
28129 argument rounded to the nearest integer in floating-point format, rounding
28130 halfway cases to even (that is, to the nearest value that is an even integer),
28131 regardless of the current rounding direction.
The first argument and the return value are floating-point numbers of the same
type.
28139 The second argument specifies the exception behavior as described above.
28144 This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
28145 also behaves in the same way as C standard function ``roundeven`` and can signal
28146 the invalid operation exception for a SNAN argument.
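
A short sketch, assuming the ``f64`` overload (the function name
``@ties_to_even`` is hypothetical). Under the semantics above, an input of
``2.5`` rounds to ``2.0`` and an input of ``3.5`` rounds to ``4.0``,
whatever the current rounding direction is:

.. code-block:: llvm

      declare double @llvm.experimental.constrained.roundeven.f64(double, metadata)

      ; Hypothetical helper: rounds %x to the nearest integral value, with
      ; halfway cases going to the even neighbor.
      define double @ties_to_even(double %x) strictfp {
        %r = call double @llvm.experimental.constrained.roundeven.f64(
                             double %x,
                             metadata !"fpexcept.strict")
        ret double %r
      }
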
28149 '``llvm.experimental.constrained.lround``' Intrinsic
28150 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28158 @llvm.experimental.constrained.lround(<fptype> <op1>,
28159 metadata <exception behavior>)
28164 The '``llvm.experimental.constrained.lround``' intrinsic returns the first
28165 argument rounded to the nearest integer with ties away from zero. It will
28166 raise an inexact floating-point exception if the argument is not an integer.
28167 An invalid exception is raised if the result is too large to fit into a
28168 supported integer type, and in this case the result is undefined.
The first argument is a floating-point number. The return value is an
integer type. Not all types are supported on all targets. The supported
types are the same as the ``llvm.lround`` intrinsic and the ``lround``
libm functions.
28178 The second argument specifies the exception behavior as described above.
28183 This function returns the same values as the libm ``lround`` functions
28184 would and handles error conditions in the same way.
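
A short sketch of a call, assuming an ``i64`` result and a ``double`` argument
(the overload suffix ``.i64.f64`` and the function name ``@round_to_i64`` are
chosen here for illustration only):

.. code-block:: llvm

   declare i64 @llvm.experimental.constrained.lround.i64.f64(double, metadata)

   ; Hypothetical helper: rounds %x to the nearest integer, ties away from zero.
   define i64 @round_to_i64(double %x) #0 {
     %r = call i64 @llvm.experimental.constrained.lround.i64.f64(
                     double %x, metadata !"fpexcept.strict") #0
     ret i64 %r
   }

   attributes #0 = { strictfp }
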
28187 '``llvm.experimental.constrained.llround``' Intrinsic
28188 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28196 @llvm.experimental.constrained.llround(<fptype> <op1>,
28197 metadata <exception behavior>)
28202 The '``llvm.experimental.constrained.llround``' intrinsic returns the first
28203 argument rounded to the nearest integer with ties away from zero. It will
28204 raise an inexact floating-point exception if the argument is not an integer.
28205 An invalid exception is raised if the result is too large to fit into a
28206 supported integer type, and in this case the result is undefined.
28211 The first argument is a floating-point number. The return value is an
28212 integer type. Not all types are supported on all targets. The supported
28213 types are the same as the ``llvm.llround`` intrinsic and the ``llround``
28216 The second argument specifies the exception behavior as described above.
28221 This function returns the same values as the libm ``llround`` functions
28222 would and handles error conditions in the same way.
28225 '``llvm.experimental.constrained.trunc``' Intrinsic
28226 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28234 @llvm.experimental.constrained.trunc(<type> <op1>,
28235 metadata <exception behavior>)
28240 The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
28241 argument rounded to the nearest integer not larger in magnitude than the
28247 The first argument and the return value are floating-point numbers of the same
28250 The second argument specifies the exception behavior as described above.
28255 This function returns the same values as the libm ``trunc`` functions
28256 would and handles error conditions in the same way.
28258 .. _int_experimental_noalias_scope_decl:
28260 '``llvm.experimental.noalias.scope.decl``' Intrinsic
28261 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28269 declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
28274 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
28275 noalias scope is declared. When the intrinsic is duplicated, a decision must
28276 also be made about the scope: depending on the reason of the duplication,
28277 the scope might need to be duplicated as well.
28283 The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
28284 metadata references. The format is identical to that required for ``noalias``
28285 metadata. This list must have exactly one element.
28290 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
28291 noalias scope is declared. When the intrinsic is duplicated, a decision must
28292 also be made about the scope: depending on the reason of the duplication,
28293 the scope might need to be duplicated as well.
28295 For example, when the intrinsic is used inside a loop body, and that loop is
28296 unrolled, the associated noalias scope must also be duplicated. Otherwise, the
28297 noalias property it signifies would spill across loop iterations, whereas it
28298 was only valid within a single iteration.
28300 .. code-block:: llvm
   ; This example shows two possible positions for noalias.decl and how they impact the semantics:
28303 ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
28304 ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
   define void @decl_in_loop(ptr %a.base, ptr %b.base) {
28307 ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
28311 %a = phi ptr [ %a.base, %entry ], [ %a.inc, %loop ]
28312 %b = phi ptr [ %b.base, %entry ], [ %b.inc, %loop ]
28313 ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
28314 %val = load i8, ptr %a, !alias.scope !2
28315 store i8 %val, ptr %b, !noalias !2
28316 %a.inc = getelementptr inbounds i8, ptr %a, i64 1
28317 %b.inc = getelementptr inbounds i8, ptr %b, i64 1
28318 %cond = call i1 @cond()
28319 br i1 %cond, label %loop, label %exit
28325 !0 = !{!0} ; domain
28326 !1 = !{!1, !0} ; scope
28327 !2 = !{!1} ; scope list
Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
28330 are possible, but one should never dominate another. Violations are pointed out
28331 by the verifier as they indicate a problem in either a transformation pass or
28335 Floating Point Environment Manipulation intrinsics
28336 --------------------------------------------------
These functions read or write the floating-point environment, such as the rounding
mode or the state of floating-point exceptions. Altering the floating-point
28340 environment requires special care. See :ref:`Floating Point Environment <floatenv>`.
28342 .. _int_get_rounding:
28344 '``llvm.get.rounding``' Intrinsic
28345 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28352 declare i32 @llvm.get.rounding()
28357 The '``llvm.get.rounding``' intrinsic reads the current rounding mode.
28362 The '``llvm.get.rounding``' intrinsic returns the current rounding mode.
The encoding of the returned values is the same as the result of ``FLT_ROUNDS``,
as specified by the C standard:
    0  - toward zero
    1  - to nearest, ties to even
28370 2 - toward positive infinity
28371 3 - toward negative infinity
28372 4 - to nearest, ties away from zero
Other values may be used to represent additional rounding modes supported by a
28375 target. These values are target-specific.
28377 .. _int_set_rounding:
28379 '``llvm.set.rounding``' Intrinsic
28380 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28387 declare void @llvm.set.rounding(i32 <val>)
The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
The argument is the required rounding mode. The encoding of the rounding mode is
the same as that used by '``llvm.get.rounding``'.
The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function ``fesetround``; however, this intrinsic does not
return any value and uses a platform-independent representation of IEEE rounding
modes.
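
The following sketch shows one way the two rounding-mode intrinsics might be
combined: the current mode is saved, rounding toward zero is selected (value 0
in the ``FLT_ROUNDS`` encoding), one addition is performed, and the saved mode
is restored. The function name ``@add_toward_zero`` is hypothetical, and the
use of a constrained ``fadd`` with ``round.dynamic`` is an assumption about how
the surrounding code would be written.

.. code-block:: llvm

   declare i32 @llvm.get.rounding()
   declare void @llvm.set.rounding(i32)
   declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

   define double @add_toward_zero(double %a, double %b) #0 {
     ; Remember the rounding mode that is active on entry.
     %saved = call i32 @llvm.get.rounding() #0
     ; 0 selects rounding toward zero.
     call void @llvm.set.rounding(i32 0) #0
     %sum = call double @llvm.experimental.constrained.fadd.f64(
                          double %a, double %b,
                          metadata !"round.dynamic", metadata !"fpexcept.strict") #0
     ; Restore the caller's rounding mode before returning.
     call void @llvm.set.rounding(i32 %saved) #0
     ret double %sum
   }

   attributes #0 = { strictfp }
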
28410 '``llvm.get.fpenv``' Intrinsic
28411 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28418 declare <integer_type> @llvm.get.fpenv()
28423 The '``llvm.get.fpenv``' intrinsic returns bits of the current floating-point
28424 environment. The return value type is platform-specific.
28429 The '``llvm.get.fpenv``' intrinsic reads the current floating-point environment
28430 and returns it as an integer value.
28434 '``llvm.set.fpenv``' Intrinsic
28435 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28442 declare void @llvm.set.fpenv(<integer_type> <val>)
28447 The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment.
28452 The argument is an integer representing the new floating-point environment. The
28453 integer type is platform-specific.
28458 The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment
28459 to the state specified by the argument. The state may be previously obtained by a
28460 call to '``llvm.get.fpenv``' or synthesized in a platform-dependent way.
28463 '``llvm.reset.fpenv``' Intrinsic
28464 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28471 declare void @llvm.reset.fpenv()
28476 The '``llvm.reset.fpenv``' intrinsic sets the default floating-point environment.
The '``llvm.reset.fpenv``' intrinsic sets the current floating-point environment
to the default state. It is similar to the call ``fesetenv(FE_DFL_ENV)``, except it
does not return any value.
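
As a hedged illustration, the three environment intrinsics could be used
together to run a callee under the default environment and then restore the
caller's state. The ``i64`` environment type (and hence the ``.i64`` name
suffix) is an assumption, since the actual integer type is platform-specific;
``@callee`` and ``@call_with_default_env`` are hypothetical names.

.. code-block:: llvm

   declare i64 @llvm.get.fpenv.i64()
   declare void @llvm.set.fpenv.i64(i64)
   declare void @llvm.reset.fpenv()
   declare void @callee()

   define void @call_with_default_env() #0 {
     ; Save the current environment as an opaque integer.
     %env = call i64 @llvm.get.fpenv.i64() #0
     ; Switch to the default environment for the duration of the call.
     call void @llvm.reset.fpenv() #0
     call void @callee() #0
     ; Restore the environment that was active on entry.
     call void @llvm.set.fpenv.i64(i64 %env) #0
     ret void
   }

   attributes #0 = { strictfp }
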
28485 .. _int_get_fpmode:
28487 '``llvm.get.fpmode``' Intrinsic
28488 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28493 The '``llvm.get.fpmode``' intrinsic returns bits of the current floating-point
28494 control modes. The return value type is platform-specific.
28498 declare <integer_type> @llvm.get.fpmode()
The '``llvm.get.fpmode``' intrinsic reads the current dynamic floating-point
control modes and returns them as an integer value.
28514 The '``llvm.get.fpmode``' intrinsic reads the current dynamic floating-point
28515 control modes, such as rounding direction, precision, treatment of denormals and
28516 so on. It is similar to the C library function 'fegetmode', however this
28517 function does not store the set of control modes into memory but returns it as
28518 an integer value. Interpretation of the bits in this value is target-dependent.
28520 '``llvm.set.fpmode``' Intrinsic
28521 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28526 The '``llvm.set.fpmode``' intrinsic sets the current floating-point control modes.
28530 declare void @llvm.set.fpmode(<integer_type> <val>)
28535 The '``llvm.set.fpmode``' intrinsic sets the current dynamic floating-point
28541 The argument is a set of floating-point control modes, represented as an integer
28542 value in a target-dependent way.
28547 The '``llvm.set.fpmode``' intrinsic sets the current dynamic floating-point
28548 control modes to the state specified by the argument, which must be obtained by
28549 a call to '``llvm.get.fpmode``' or constructed in a target-specific way. It is
28550 similar to the C library function 'fesetmode', however this function does not
read the set of control modes from memory but receives it as an integer value.
28553 '``llvm.reset.fpmode``' Intrinsic
28554 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28561 declare void @llvm.reset.fpmode()
28566 The '``llvm.reset.fpmode``' intrinsic sets the default dynamic floating-point
The '``llvm.reset.fpmode``' intrinsic sets the current dynamic floating-point
environment to the default state. It is similar to the C library function call
``fesetmode(FE_DFL_MODE)``; however, this function does not return any value.
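
A small sketch of saving, resetting, and restoring the control modes follows.
The ``i32`` mode type (and the ``.i32`` name suffix) is assumed for
illustration, because the real type is target-dependent; ``@work`` and
``@run_with_default_modes`` are hypothetical names.

.. code-block:: llvm

   declare i32 @llvm.get.fpmode.i32()
   declare void @llvm.set.fpmode.i32(i32)
   declare void @llvm.reset.fpmode()
   declare void @work()

   define void @run_with_default_modes() #0 {
     ; Capture rounding direction, denormal handling, and similar modes.
     %modes = call i32 @llvm.get.fpmode.i32() #0
     call void @llvm.reset.fpmode() #0
     call void @work() #0
     ; Put the captured control modes back.
     call void @llvm.set.fpmode.i32(i32 %modes) #0
     ret void
   }

   attributes #0 = { strictfp }
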
28582 Floating-Point Test Intrinsics
28583 ------------------------------
28585 These functions get properties of floating-point values.
28588 .. _llvm.is.fpclass:
28590 '``llvm.is.fpclass``' Intrinsic
28591 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28598 declare i1 @llvm.is.fpclass(<fptype> <op>, i32 <test>)
28599 declare <N x i1> @llvm.is.fpclass(<vector-fptype> <op>, i32 <test>)
28604 The '``llvm.is.fpclass``' intrinsic returns a boolean value or vector of boolean
28605 values depending on whether the first argument satisfies the test specified by
28606 the second argument.
28608 If the first argument is a floating-point scalar, then the result type is a
28609 boolean (:ref:`i1 <t_integer>`).
28611 If the first argument is a floating-point vector, then the result type is a
28612 vector of boolean with the same number of elements as the first argument.
28617 The first argument to the '``llvm.is.fpclass``' intrinsic must be
28618 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
28619 of floating-point values.
The second argument specifies which tests to perform. It must be a compile-time
integer constant in which each bit specifies a floating-point class:
+-------+----------------------+
| Bit # | floating-point class |
+=======+======================+
| 0     | Signaling NaN        |
+-------+----------------------+
| 1     | Quiet NaN            |
+-------+----------------------+
| 2     | Negative infinity    |
+-------+----------------------+
| 3     | Negative normal      |
+-------+----------------------+
| 4     | Negative subnormal   |
+-------+----------------------+
| 5     | Negative zero        |
+-------+----------------------+
| 6     | Positive zero        |
+-------+----------------------+
| 7     | Positive subnormal   |
+-------+----------------------+
| 8     | Positive normal      |
+-------+----------------------+
| 9     | Positive infinity    |
+-------+----------------------+
The function checks if ``op`` belongs to any of the floating-point classes
specified by ``test``. If ``op`` is a vector, then the check is made element by
element. Each check yields an :ref:`i1 <t_integer>` result, which is ``true``
if the element value satisfies the specified test. The argument ``test`` is a
bit mask in which each bit specifies a floating-point class to test. For example,
the value 0x108 tests for a normal value: bits 3 and 8 are set, which
means that the function returns ``true`` if ``op`` is a positive or negative
normal value. The function never raises floating-point exceptions. The
28659 function does not canonicalize its input value and does not depend
28660 on the floating-point environment. If the floating-point environment
28661 has a zeroing treatment of subnormal input values (such as indicated
28662 by the ``"denormal-fp-math"`` attribute), a subnormal value will be
28663 observed (will not be implicitly treated as zero).
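
The mask arithmetic can be sketched as follows. The first function uses the
mask 0x108 (decimal 264, bits 3 and 8) to test for a normal value; the second
uses the mask 3 (bit 0 for signaling NaN plus bit 1 for quiet NaN) to test for
any NaN. The function names and the ``f32`` overload are illustrative
assumptions.

.. code-block:: llvm

   declare i1 @llvm.is.fpclass.f32(float, i32 immarg)

   ; True if %x is a positive or negative normal value (mask 0x108 = 264).
   define i1 @is_normal(float %x) {
     %r = call i1 @llvm.is.fpclass.f32(float %x, i32 264)
     ret i1 %r
   }

   ; True if %x is a signaling or quiet NaN (mask 0x3).
   define i1 @is_nan(float %x) {
     %r = call i1 @llvm.is.fpclass.f32(float %x, i32 3)
     ret i1 %r
   }
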
28669 This class of intrinsics is designed to be generic and has no specific
28672 '``llvm.var.annotation``' Intrinsic
28673 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28680 declare void @llvm.var.annotation(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
28685 The '``llvm.var.annotation``' intrinsic.
28690 The first argument is a pointer to a value, the second is a pointer to a
28691 global string, the third is a pointer to a global string which is the
28692 source file name, and the last argument is the line number.
28697 This intrinsic allows annotation of local variables with arbitrary
28698 strings. This can be useful for special purpose optimizations that want
28699 to look for these annotations. These have no other defined use; they are
28700 ignored by code generation and optimization.
28702 '``llvm.ptr.annotation.*``' Intrinsic
28703 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28708 This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
28709 pointer to an integer of any width. *NOTE* you must specify an address space for
28710 the pointer. The identifier for the default address space is the integer
28715 declare ptr @llvm.ptr.annotation.p0(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
28716 declare ptr @llvm.ptr.annotation.p1(ptr addrspace(1) <val>, ptr <str>, ptr <str>, i32 <int>)
28721 The '``llvm.ptr.annotation``' intrinsic.
28726 The first argument is a pointer to an integer value of arbitrary bitwidth
28727 (result of some expression), the second is a pointer to a global string, the
28728 third is a pointer to a global string which is the source file name, and the
28729 last argument is the line number. It returns the value of the first argument.
28734 This intrinsic allows annotation of a pointer to an integer with arbitrary
28735 strings. This can be useful for special purpose optimizations that want to look
28736 for these annotations. These have no other defined use; transformations preserve
28737 annotations on a best-effort basis but are allowed to replace the intrinsic with
28738 its first argument without breaking semantics and the intrinsic is completely
28739 dropped during instruction selection.
28741 '``llvm.annotation.*``' Intrinsic
28742 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28747 This is an overloaded intrinsic. You can use '``llvm.annotation``' on
28748 any integer bit width.
28752 declare i8 @llvm.annotation.i8(i8 <val>, ptr <str>, ptr <str>, i32 <int>)
28753 declare i16 @llvm.annotation.i16(i16 <val>, ptr <str>, ptr <str>, i32 <int>)
28754 declare i32 @llvm.annotation.i32(i32 <val>, ptr <str>, ptr <str>, i32 <int>)
28755 declare i64 @llvm.annotation.i64(i64 <val>, ptr <str>, ptr <str>, i32 <int>)
28756 declare i256 @llvm.annotation.i256(i256 <val>, ptr <str>, ptr <str>, i32 <int>)
28761 The '``llvm.annotation``' intrinsic.
28766 The first argument is an integer value (result of some expression), the
28767 second is a pointer to a global string, the third is a pointer to a
28768 global string which is the source file name, and the last argument is
28769 the line number. It returns the value of the first argument.
28774 This intrinsic allows annotations to be put on arbitrary expressions with
28775 arbitrary strings. This can be useful for special purpose optimizations that
28776 want to look for these annotations. These have no other defined use;
28777 transformations preserve annotations on a best-effort basis but are allowed to
28778 replace the intrinsic with its first argument without breaking semantics and the
28779 intrinsic is completely dropped during instruction selection.
28781 '``llvm.codeview.annotation``' Intrinsic
28782 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28787 This annotation emits a label at its program point and an associated
28788 ``S_ANNOTATION`` codeview record with some additional string metadata. This is
28789 used to implement MSVC's ``__annotation`` intrinsic. It is marked
28790 ``noduplicate``, so calls to this intrinsic prevent inlining and should be
28791 considered expensive.
28795 declare void @llvm.codeview.annotation(metadata)
28800 The argument should be an MDTuple containing any number of MDStrings.
28804 '``llvm.trap``' Intrinsic
28805 ^^^^^^^^^^^^^^^^^^^^^^^^^
28812 declare void @llvm.trap() cold noreturn nounwind
28817 The '``llvm.trap``' intrinsic.
28827 This intrinsic is lowered to the target dependent trap instruction. If
28828 the target does not have a trap instruction, this intrinsic will be
28829 lowered to a call of the ``abort()`` function.
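
A typical use, sketched here under the assumption of a hypothetical
``@checked_udiv`` helper, is to terminate execution on a path that must never
be taken:

.. code-block:: llvm

   declare void @llvm.trap()

   define i32 @checked_udiv(i32 %a, i32 %b) {
   entry:
     %is.zero = icmp eq i32 %b, 0
     br i1 %is.zero, label %fail, label %ok

   fail:
     ; The intrinsic is noreturn, so the block ends with unreachable.
     call void @llvm.trap()
     unreachable

   ok:
     %q = udiv i32 %a, %b
     ret i32 %q
   }
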
28831 .. _llvm.debugtrap:
28833 '``llvm.debugtrap``' Intrinsic
28834 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28841 declare void @llvm.debugtrap() nounwind
28846 The '``llvm.debugtrap``' intrinsic.
28856 This intrinsic is lowered to code which is intended to cause an
28857 execution trap with the intention of requesting the attention of a
28860 .. _llvm.ubsantrap:
28862 '``llvm.ubsantrap``' Intrinsic
28863 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28870 declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
28875 The '``llvm.ubsantrap``' intrinsic.
28880 An integer describing the kind of failure detected.
28885 This intrinsic is lowered to code which is intended to cause an execution trap,
embedding the argument into the encoding of that trap, where possible, so that
crashes can be told apart.
28889 Equivalent to ``@llvm.trap`` for targets that do not support this behavior.
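
The sketch below shows how a sanitizer-style overflow check might end in this
intrinsic. The kind value ``0`` and the function name ``@checked_add`` are
placeholders chosen for illustration; this reference does not define the
meaning of individual kind values.

.. code-block:: llvm

   declare void @llvm.ubsantrap(i8 immarg)
   declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

   define i32 @checked_add(i32 %a, i32 %b) {
   entry:
     %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
     %ovf = extractvalue { i32, i1 } %res, 1
     br i1 %ovf, label %trap, label %cont

   trap:
     ; The immediate identifies the kind of failure (placeholder value).
     call void @llvm.ubsantrap(i8 0)
     unreachable

   cont:
     %sum = extractvalue { i32, i1 } %res, 0
     ret i32 %sum
   }
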
28891 '``llvm.stackprotector``' Intrinsic
28892 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28899 declare void @llvm.stackprotector(ptr <guard>, ptr <slot>)
28904 The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
28905 onto the stack at ``slot``. The stack slot is adjusted to ensure that it
28906 is placed on the stack before local variables.
28911 The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
28912 The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
28914 enough space to hold the value of the guard.
28919 This intrinsic causes the prologue/epilogue inserter to force the position of
28920 the ``AllocaInst`` stack slot to be before local variables on the stack. This is
28921 to ensure that if a local variable on the stack is overwritten, it will destroy
28922 the value of the guard. When the function exits, the guard on the stack is
28923 checked against the original guard by ``llvm.stackprotectorcheck``. If they are
28924 different, then ``llvm.stackprotectorcheck`` causes the program to abort by
28925 calling the ``__stack_chk_fail()`` function.
28927 '``llvm.stackguard``' Intrinsic
28928 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28935 declare ptr @llvm.stackguard()
28940 The ``llvm.stackguard`` intrinsic returns the system stack guard value.
28942 It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because the IR form of the stack protector is still
supported in FastISel.
28954 On some platforms, the value returned by this intrinsic remains unchanged
28955 between loads in the same thread. On other platforms, it returns the same
28956 global variable value, if any, e.g. ``@__stack_chk_guard``.
28958 Currently some platforms have IR-level customized stack guard loading (e.g.
28959 X86 Linux) that is not handled by ``llvm.stackguard()``, while they should be
28962 '``llvm.objectsize``' Intrinsic
28963 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
28970 declare i32 @llvm.objectsize.i32(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
28971 declare i64 @llvm.objectsize.i64(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
28976 The ``llvm.objectsize`` intrinsic is designed to provide information to the
28977 optimizer to determine whether a) an operation (like memcpy) will overflow a
28978 buffer that corresponds to an object, or b) that a runtime check for overflow
28979 isn't necessary. An object in this context means an allocation of a specific
28980 class, structure, array, or other object.
28985 The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
28986 pointer to or into the ``object``. The second argument determines whether
28987 ``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
28988 unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
28989 in address space 0 is used as its pointer argument. If it's ``false``,
28990 ``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
28991 the ``null`` is in a non-zero address space or if ``true`` is given for the
28992 third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
28993 argument to ``llvm.objectsize`` determines if the value should be evaluated at
28996 The second, third, and fourth arguments only accept constants.
29001 The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
29002 the object concerned. If the size cannot be determined, ``llvm.objectsize``
29003 returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
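
As a minimal sketch, the following queries the size of a 16-byte stack
allocation. The name ``@llvm.objectsize.i64.p0`` (with an address-space suffix
for the pointer argument) is an assumption about how the overload is mangled,
and ``@size_of_local`` is a hypothetical function. With these arguments the
call would be expected to fold to 16 once the intrinsic is lowered.

.. code-block:: llvm

   declare i64 @llvm.objectsize.i64.p0(ptr, i1 immarg, i1 immarg, i1 immarg)

   define i64 @size_of_local() {
     %buf = alloca [16 x i8]
     ; min = false:         report -1 if the size is unknown
     ; nullunknown = true:  treat a null pointer as having unknown size
     ; dynamic = false:     no dynamic evaluation is requested
     %size = call i64 @llvm.objectsize.i64.p0(ptr %buf, i1 false, i1 true, i1 false)
     ret i64 %size
   }
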
29005 '``llvm.expect``' Intrinsic
29006 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
29011 This is an overloaded intrinsic. You can use ``llvm.expect`` on any
29016 declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
29017 declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
29018 declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.
29029 The ``llvm.expect`` intrinsic takes two arguments. The first argument is
29030 a value. The second argument is an expected value.
29035 This intrinsic is lowered to the ``val``.
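
For instance, a frontend might mark an error check as unlikely, as in this
sketch (the function and block names are hypothetical):

.. code-block:: llvm

   declare i1 @llvm.expect.i1(i1, i1)

   define i32 @process(i32 %x) {
   entry:
     %is.error = icmp slt i32 %x, 0
     ; Tell the optimizer that %is.error is expected to be false.
     %hint = call i1 @llvm.expect.i1(i1 %is.error, i1 false)
     br i1 %hint, label %error, label %ok

   error:
     ret i32 -1

   ok:
     %r = add i32 %x, 1
     ret i32 %r
   }
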
29037 '``llvm.expect.with.probability``' Intrinsic
29038 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29043 This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
29044 You can use ``llvm.expect.with.probability`` on any integer bit width.
29048 declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
29049 declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
29050 declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
29055 The ``llvm.expect.with.probability`` intrinsic provides information about
expected value of ``val`` with probability (or confidence) ``prob``, which can
29057 be used by optimizers.
29062 The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
29063 argument is a value. The second argument is an expected value. The third
29064 argument is a probability.
29069 This intrinsic is lowered to the ``val``.
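
A sketch of a call that states ``%x`` is expected to be zero with probability
0.75 follows; the probability value and all names other than the intrinsic are
illustrative assumptions.

.. code-block:: llvm

   declare i32 @llvm.expect.with.probability.i32(i32, i32, double immarg)

   define i32 @classify(i32 %x) {
   entry:
     ; %x is expected to equal 0 with probability 0.75.
     %hint = call i32 @llvm.expect.with.probability.i32(i32 %x, i32 0, double 7.500000e-01)
     %is.zero = icmp eq i32 %hint, 0
     br i1 %is.zero, label %likely, label %unlikely

   likely:
     ret i32 1

   unlikely:
     ret i32 2
   }
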
29073 '``llvm.assume``' Intrinsic
29074 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29081 declare void @llvm.assume(i1 %cond)
The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
29087 condition is true. This information can then be used in simplifying other parts
29090 More complex assumptions can be encoded as
29091 :ref:`assume operand bundles <assume_opbundles>`.
29096 The argument of the call is the condition which the optimizer may assume is
29102 The intrinsic allows the optimizer to assume that the provided condition is
29103 always true whenever the control flow reaches the intrinsic call. No code is
29104 generated for this intrinsic, and instructions that contribute only to the
29105 provided condition are not used for code generation. If the condition is
29106 violated during execution, the behavior is undefined.
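
As an illustration, the sketch below tells the optimizer that a pointer is
16-byte aligned by asserting that its low four bits are zero. The function name
``@load_aligned`` is hypothetical. If the pointer is not aligned at run time,
the behavior is undefined.

.. code-block:: llvm

   declare void @llvm.assume(i1)

   define i32 @load_aligned(ptr %p) {
     %addr = ptrtoint ptr %p to i64
     %low = and i64 %addr, 15
     %aligned = icmp eq i64 %low, 0
     ; From here on, the optimizer may assume that %aligned is true.
     call void @llvm.assume(i1 %aligned)
     %v = load i32, ptr %p
     ret i32 %v
   }
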
29108 Note that the optimizer might limit the transformations performed on values
29109 used by the ``llvm.assume`` intrinsic in order to preserve the instructions
29110 only used to form the intrinsic's input argument. This might prove undesirable
29111 if the extra information provided by the ``llvm.assume`` intrinsic does not cause
29112 sufficient overall improvement in code quality. For this reason,
29113 ``llvm.assume`` should not be used to document basic mathematical invariants
29114 that the optimizer can otherwise deduce or facts that are of little use to the
29119 '``llvm.ssa.copy``' Intrinsic
29120 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29127 declare type @llvm.ssa.copy(type returned %operand) memory(none)
29132 The first argument is an operand which is used as the returned value.
29137 The ``llvm.ssa.copy`` intrinsic can be used to attach information to
29138 operations by copying them and giving them new names. For example,
29139 the PredicateInfo utility uses it to build Extended SSA form, and
29140 attach various forms of information to operands that dominate specific
29141 uses. It is not meant for general use, only for building temporary
29142 renaming forms that require value splits at certain points.
29146 '``llvm.type.test``' Intrinsic
29147 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29154 declare i1 @llvm.type.test(ptr %ptr, metadata %type) nounwind memory(none)
29160 The first argument is a pointer to be tested. The second argument is a
29161 metadata object representing a :doc:`type identifier <TypeMetadata>`.
29166 The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
29167 with the given type identifier.
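
A minimal sketch of a guarded indirect call follows. The type identifier
``!"typeid1"`` and the names ``@checked_indirect_call``, ``%do.call``, and
``%fail`` are hypothetical; a real module would also need matching type
metadata on the candidate targets.

.. code-block:: llvm

   declare i1 @llvm.type.test(ptr, metadata)
   declare void @llvm.trap()

   define void @checked_indirect_call(ptr %fn) {
   entry:
     ; Test whether %fn is associated with the type identifier "typeid1".
     %ok = call i1 @llvm.type.test(ptr %fn, metadata !"typeid1")
     br i1 %ok, label %do.call, label %fail

   do.call:
     call void %fn()
     ret void

   fail:
     call void @llvm.trap()
     unreachable
   }
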
29169 .. _type.checked.load:
29171 '``llvm.type.checked.load``' Intrinsic
29172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29179 declare {ptr, i1} @llvm.type.checked.load(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)
29185 The first argument is a pointer from which to load a function pointer. The
29186 second argument is the byte offset from which to load the function pointer. The
29187 third argument is a metadata object representing a :doc:`type identifier
29193 The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
29194 virtual table pointer using type metadata. This intrinsic is used to implement
29195 control flow integrity in conjunction with virtual call optimization. The
29196 virtual call optimization pass will optimize away ``llvm.type.checked.load``
29197 intrinsics associated with devirtualized calls, thereby removing the type
29198 check in cases where it is not needed to enforce the control flow integrity
29201 If the given pointer is associated with a type metadata identifier, this
29202 function returns true as the second element of its return value. (Note that
29203 the function may also return true if the given pointer is not associated
29204 with a type metadata identifier.) If the function's return value's second
29205 element is true, the following rules apply to the first element:
29207 - If the given pointer is associated with the given type metadata identifier,
29208 it is the function pointer loaded from the given byte offset from the given
29211 - If the given pointer is not associated with the given type metadata
29212 identifier, it is one of the following (the choice of which is unspecified):
29214 1. The function pointer that would have been loaded from an arbitrarily chosen
29215 (through an unspecified mechanism) pointer associated with the type
29218 2. If the function has a non-void return type, a pointer to a function that
29219 returns an unspecified value without causing side effects.
29221 If the function's return value's second element is false, the value of the
29222 first element is undefined.
29224 .. _type.checked.load.relative:
29226 '``llvm.type.checked.load.relative``' Intrinsic
29227 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29234 declare {ptr, i1} @llvm.type.checked.load.relative(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)
29239 The ``llvm.type.checked.load.relative`` intrinsic loads a relative pointer to a
function from a virtual table pointer using metadata. Otherwise, its semantics are
29241 identical to the ``llvm.type.checked.load`` intrinsic.
A relative pointer is a pointer to an offset to the pointed-to value. The
29244 address of the underlying pointer of the relative pointer is obtained by adding
29245 the offset to the address of the offset value.
29247 '``llvm.arithmetic.fence``' Intrinsic
29248 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29256 @llvm.arithmetic.fence(<type> <op>)
29261 The purpose of the ``llvm.arithmetic.fence`` intrinsic
29262 is to prevent the optimizer from performing fast-math optimizations,
29263 particularly reassociation,
29264 between the argument and the expression that contains the argument.
29265 It can be used to preserve the parentheses in the source language.
29270 The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
29271 The argument and the return value are floating-point numbers,
29272 or vector floating-point numbers, of the same type.
29277 This intrinsic returns the value of its operand. The optimizer can optimize
29278 the argument, but the optimizer cannot hoist any component of the operand
29279 to the containing context, and the optimizer cannot move the calculation of
29280 any expression in the containing context into the operand.
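
For example, a source expression such as ``(a + b) + c`` compiled with
reassociation enabled might be represented as in this sketch, where the fence
keeps ``a + b`` grouped. The function name ``@keep_parens`` and the ``f32``
overload are illustrative assumptions.

.. code-block:: llvm

   declare float @llvm.arithmetic.fence.f32(float)

   define float @keep_parens(float %a, float %b, float %c) {
     %sum = fadd reassoc float %a, %b
     ; Without the fence, reassoc would permit rewriting to %a + (%b + %c).
     %fenced = call float @llvm.arithmetic.fence.f32(float %sum)
     %r = fadd reassoc float %fenced, %c
     ret float %r
   }
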
29283 '``llvm.donothing``' Intrinsic
29284 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29291 declare void @llvm.donothing() nounwind memory(none)
29296 The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
29297 three intrinsics (besides ``llvm.experimental.patchpoint`` and
29298 ``llvm.experimental.gc.statepoint``) that can be called with an invoke
29309 This intrinsic does nothing, and it's removed by optimizers and ignored
29312 '``llvm.experimental.deoptimize``' Intrinsic
29313 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29320 declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
29325 This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
29327 frame-local state from the currently executing (typically more specialized,
29328 hence faster) version of a function into another (typically more generic, hence
29331 In languages with a fully integrated managed runtime like Java and JavaScript
29332 this intrinsic can be used to implement "uncommon trap" or "side exit" like
29333 functionality. In unmanaged languages like C and C++, this intrinsic can be
29334 used to represent the slow paths of specialized functions.
29340 The intrinsic takes an arbitrary number of arguments, whose meaning is
29341 decided by the :ref:`lowering strategy<deoptimize_lowering>`.
29346 The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
29347 deoptimization continuation (denoted using a :ref:`deoptimization
29348 operand bundle <deopt_opbundles>`) and returns the value returned by
29349 the deoptimization continuation. Defining the semantic properties of
29350 the continuation itself is out of scope of the language reference --
29351 as far as LLVM is concerned, the deoptimization continuation can
29352 invoke arbitrary side effects, including reading from and writing to
29355 Deoptimization continuations expressed using ``"deopt"`` operand bundles always
29356 continue execution to the end of the physical frame containing them, so all
29357 calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
29359 - ``@llvm.experimental.deoptimize`` cannot be invoked.
29360 - The call must immediately precede a :ref:`ret <i_ret>` instruction.
29361 - The ``ret`` instruction must return the value produced by the
29362 ``@llvm.experimental.deoptimize`` call if there is one, or void.
29364 Note that the above restrictions imply that the return type for a call to
29365 ``@llvm.experimental.deoptimize`` will match the return type of its immediate
29368 The inliner composes the ``"deopt"`` continuations of the caller into the
29369 ``"deopt"`` continuations present in the inlinee, and also updates calls to this
29370 intrinsic to return directly from the frame of the function it inlined into.
29372 All declarations of ``@llvm.experimental.deoptimize`` must share the
29373 same calling convention.
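
The tail-position rules above can be illustrated with a minimal sketch. The
function ``@specialized``, its ``icmp`` condition, and the contents of the
``"deopt"`` bundle are hypothetical; the ``.i32`` suffix is assumed to be the
overload for an ``i32`` return type.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      define i32 @specialized(i32 %x) {
      entry:
        ; Hypothetical specialization assumption: %x is positive.
        %ok = icmp sgt i32 %x, 0
        br i1 %ok, label %fast, label %slow

      fast:
        ret i32 %x

      slow:
        ; A plain call (not an invoke), immediately followed by a ret that
        ; returns the value produced by the call.
        %r = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
        ret i32 %r
      }
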
29375 .. _deoptimize_lowering:
29380 Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
29381 symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
29382 ensure that this symbol is defined). The call arguments to
29383 ``@llvm.experimental.deoptimize`` are lowered as if they were formal
29384 arguments of the specified types, and not as varargs.
29387 '``llvm.experimental.guard``' Intrinsic
29388 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29395 declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
29400 This intrinsic, together with :ref:`deoptimization operand bundles
29401 <deopt_opbundles>`, allows frontends to express guards or checks on
29402 optimistic assumptions made during compilation. The semantics of
29403 ``@llvm.experimental.guard`` is defined in terms of
29404 ``@llvm.experimental.deoptimize`` -- its body is defined to be
29407 .. code-block:: text
29409 define void @llvm.experimental.guard(i1 %pred, <args...>) {
29410 %realPred = and i1 %pred, undef
29411 br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
29414 call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
29422 with the optional ``[, !make.implicit !{}]`` present if and only if it
29423 is present on the call site. For more details on ``!make.implicit``,
29424 see :doc:`FaultMaps`.
29426 In words, ``@llvm.experimental.guard`` executes the attached
29427 ``"deopt"`` continuation if (but **not** only if) its first argument
29428 is ``false``. Since the optimizer is allowed to replace the ``undef``
29429 with an arbitrary value, it can optimize guard to fail "spuriously",
29430 i.e. without the original condition being false (hence the "not only
29431 if"); and this allows for "check widening" type optimizations.
29433 ``@llvm.experimental.guard`` cannot be invoked.
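
As an illustration, a frontend might guard a bounds check as sketched below.
The function ``@checked_load`` and the state recorded in the ``"deopt"``
bundle are hypothetical.

.. code-block:: llvm

      declare void @llvm.experimental.guard(i1, ...)

      define i32 @checked_load(ptr %p, i32 %idx, i32 %len) {
        %in_bounds = icmp ult i32 %idx, %len
        ; Deoptimizes if %in_bounds is false, and possibly spuriously otherwise.
        call void (i1, ...) @llvm.experimental.guard(i1 %in_bounds) [ "deopt"(i32 %idx) ]
        ; Code after the guard may assume %in_bounds held.
        %addr = getelementptr i32, ptr %p, i32 %idx
        %v = load i32, ptr %addr
        ret i32 %v
      }
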
29435 After ``@llvm.experimental.guard`` was first added, a more general
29436 formulation was found in ``@llvm.experimental.widenable.condition``.
29437 Support for ``@llvm.experimental.guard`` is slowly being rephrased in
29438 terms of this alternate.
29440 '``llvm.experimental.widenable.condition``' Intrinsic
29441 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29448 declare i1 @llvm.experimental.widenable.condition()
29453 This intrinsic represents a "widenable condition", which is a
29454 boolean expression with the following property: whether this
29455 expression is `true` or `false`, the program is correct and
29458 Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
29459 ``@llvm.experimental.widenable.condition`` allows frontends to
29460 express guards or checks on optimistic assumptions made during
29461 compilation and represent them as branch instructions on special
29464 While this may appear similar in semantics to `undef`, it is very
29465 different in that an invocation produces a particular, singular
29466 value. It is also intended to be lowered late, and remain available
29467 for specific optimizations and transforms that can benefit from its
29468 special properties.
29478 The intrinsic ``@llvm.experimental.widenable.condition()``
29479 returns either `true` or `false`. For each evaluation of a call
29480 to this intrinsic, the program must be valid and correct both if
29481 it returns `true` and if it returns `false`. This allows
29482 transformation passes to replace evaluations of this intrinsic
29483 with either value whenever one is beneficial.
29485 When used in a branch condition, it allows us to choose between
29486 two alternative correct solutions for the same problem, like
29489 .. code-block:: text
29491 %cond = call i1 @llvm.experimental.widenable.condition()
29492 br i1 %cond, label %fast_path, label %slow_path
29495 ; Apply memory-consuming but fast solution for a task.
29498 ; Cheap in memory but slow solution.
29500 Whether the result of the intrinsic call is `true` or `false`,
29501 it should be correct to pick either solution. We can switch
29502 between them by replacing the result of
29503 ``@llvm.experimental.widenable.condition`` with different
29506 This is how it can be used to represent guards as widenable branches:
29508 .. code-block:: text
29511 ; Unguarded instructions
29512 call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
29513 ; Guarded instructions
29515 Can be expressed in an alternative equivalent form of explicit branch using
29516 ``@llvm.experimental.widenable.condition``:
29518 .. code-block:: text
29521 ; Unguarded instructions
29522 %widenable_condition = call i1 @llvm.experimental.widenable.condition()
29523 %guard_condition = and i1 %cond, %widenable_condition
29524 br i1 %guard_condition, label %guarded, label %deopt
29527 ; Guarded instructions
29530 call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
29532 So the block `guarded` is only reachable when `%cond` is `true`,
29533 and it should be valid to go to the block `deopt` whenever `%cond`
29534 is `true` or `false`.
29536 ``@llvm.experimental.widenable.condition`` will never throw, thus
29537 it cannot be invoked.
29542 When ``@llvm.experimental.widenable.condition()`` is used in the
29543 condition of a guard represented as an explicit branch, it is
29544 legal to widen the guard's condition with any additional
29547 Guard widening looks like replacement of
29549 .. code-block:: text
29551 %widenable_cond = call i1 @llvm.experimental.widenable.condition()
29552 %guard_cond = and i1 %cond, %widenable_cond
29553 br i1 %guard_cond, label %guarded, label %deopt
29557 .. code-block:: text
29559 %widenable_cond = call i1 @llvm.experimental.widenable.condition()
29560 %new_cond = and i1 %any_other_cond, %widenable_cond
29561 %new_guard_cond = and i1 %cond, %new_cond
29562 br i1 %new_guard_cond, label %guarded, label %deopt
29564 for this branch. Here `%any_other_cond` is an arbitrarily chosen
29565 well-defined `i1` value. By widening the guard, we may
29566 impose stricter conditions on the `guarded` block and bail out to the
29567 `deopt` block when the new condition is not met.
29572 The default lowering strategy replaces the result of a
29573 call to ``@llvm.experimental.widenable.condition`` with the
29574 constant `true`. However, it is always correct to replace
29575 it with any other `i1` value. Any pass can
29576 freely do so if it can benefit from non-default lowering.
29578 '``llvm.allow.ubsan.check``' Intrinsic
29579 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29586 declare i1 @llvm.allow.ubsan.check(i8 immarg %kind)
29591 This intrinsic returns ``true`` if and only if the compiler opted to enable the
29592 ubsan check in the current basic block.
29594 Rules to allow ubsan checks are not part of the intrinsic declaration and
29595 are controlled by compiler options.
29597 This intrinsic is the ubsan specific version of ``@llvm.allow.runtime.check()``.
29602 An integer describing the kind of ubsan check guarded by the intrinsic.
29607 The intrinsic ``@llvm.allow.ubsan.check()`` returns either ``true`` or
29608 ``false``, depending on compiler options.
29610 For each evaluation of a call to this intrinsic, the program must be valid and
29611 correct both if it returns ``true`` and if it returns ``false``.
29613 When used in a branch condition, it selects one of the two paths:
29615 * ``true``: Executes the UBSan check and reports any failures.
29617 * ``false``: Bypasses the check, assuming it always succeeds.
29621 .. code-block:: text
29623 %allow = call i1 @llvm.allow.ubsan.check(i8 5)
29624 %not.allow = xor i1 %allow, true
29625 %cond = or i1 %ubcheck, %not.allow
29626 br i1 %cond, label %cont, label %trap
29632 call void @llvm.ubsantrap(i8 5)
29636 '``llvm.allow.runtime.check``' Intrinsic
29637 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29644 declare i1 @llvm.allow.runtime.check(metadata %kind)
29649 This intrinsic returns ``true`` if and only if the compiler opted to enable
29650 runtime checks in the current basic block.
29652 Rules to allow runtime checks are not part of the intrinsic declaration and
29653 are controlled by compiler options.
29655 This intrinsic is the non-ubsan-specific version of ``@llvm.allow.ubsan.check()``.
29660 A string identifying the kind of runtime check guarded by the intrinsic. The
29661 string can be used to control rules to allow checks.
29666 The intrinsic ``@llvm.allow.runtime.check()`` returns either ``true`` or
29667 ``false``, depending on compiler options.
29669 For each evaluation of a call to this intrinsic, the program must be valid and
29670 correct both if it returns ``true`` and if it returns ``false``.
29672 When used in a branch condition, it allows us to choose between
29673 two alternative correct solutions for the same problem.
29675 If the intrinsic is evaluated as ``true``, the program should execute a guarded
29676 check. If the intrinsic is evaluated as ``false``, the program should avoid any
29677 unnecessary checks.
29681 .. code-block:: text
29683 %allow = call i1 @llvm.allow.runtime.check(metadata !"my_check")
29684 br i1 %allow, label %fast_path, label %slow_path
29687 ; Omit diagnostics.
29690 ; Additional diagnostics.
29693 '``llvm.load.relative``' Intrinsic
29694 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29701 declare ptr @llvm.load.relative.iN(ptr %ptr, iN %offset) nounwind memory(argmem: read)
29706 This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
29707 adds ``%ptr`` to that value and returns it. The constant folder specifically
29708 recognizes the form of this intrinsic and the constant initializers it may
29709 load from; if a loaded constant initializer is known to have the form
29710 ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
29712 LLVM provides that the calculation of such a constant initializer will
29713 not overflow at link time under the medium code model if ``x`` is an
29714 ``unnamed_addr`` function. However, it does not provide this guarantee for
29715 a constant initializer folded into a function body. This intrinsic can be
29716 used to avoid the possibility of overflows when loading from such a constant.
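
A minimal usage sketch follows. The function ``@lookup`` and the byte offset
``4`` are hypothetical; ``%table`` is assumed to point at a table of 32-bit
offsets that are relative to ``%table`` itself.

.. code-block:: llvm

      declare ptr @llvm.load.relative.i32(ptr, i32)

      define ptr @lookup(ptr %table) {
        ; Loads the 32-bit value stored at %table + 4 and adds %table to it.
        %entry = call ptr @llvm.load.relative.i32(ptr %table, i32 4)
        ret ptr %entry
      }
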
29718 .. _llvm_sideeffect:
29720 '``llvm.sideeffect``' Intrinsic
29721 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29728 declare void @llvm.sideeffect() inaccessiblememonly nounwind willreturn
29733 The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
29734 treat it as having side effects, so it can be inserted into a loop to
29735 indicate that the loop shouldn't be assumed to terminate (which could
29736 potentially lead to the loop being optimized away entirely), even if it's
29737 an infinite loop with no other side effects.
29747 This intrinsic actually does nothing, but optimizers must assume that it
29748 has externally observable side effects.
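
For example, a frontend for a language in which infinite loops are well
defined might emit a loop like the following sketch (the function name
``@spin`` is hypothetical):

.. code-block:: llvm

      declare void @llvm.sideeffect()

      define void @spin() {
      entry:
        br label %loop

      loop:
        ; Without this call the loop has no side effects of its own.
        call void @llvm.sideeffect()
        br label %loop
      }
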
29750 '``llvm.is.constant.*``' Intrinsic
29751 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29756 This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.
29760 declare i1 @llvm.is.constant.i32(i32 %operand) nounwind memory(none)
29761 declare i1 @llvm.is.constant.f32(float %operand) nounwind memory(none)
29762 declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind memory(none)
29767 The '``llvm.is.constant``' intrinsic will return true if the argument
29768 is known to be a manifest compile-time constant. It is guaranteed to
29769 fold to either true or false before generating machine code.
29774 This intrinsic generates no code. If its argument is known to be a
29775 manifest compile-time constant value, then the intrinsic will be
29776 converted to a constant true value. Otherwise, it will be converted to
29777 a constant false value.
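
The two outcomes can be sketched as follows. The function names are
hypothetical. The first call has a literal operand and is expected to fold to
``true``; the second is expected to fold to ``false`` unless earlier
optimization replaces ``%x`` with a constant.

.. code-block:: llvm

      declare i1 @llvm.is.constant.i32(i32)

      define i1 @literal_operand() {
        ; The operand is a manifest constant.
        %c = call i1 @llvm.is.constant.i32(i32 42)
        ret i1 %c
      }

      define i1 @parameter_operand(i32 %x) {
        ; The operand is a function parameter.
        %c = call i1 @llvm.is.constant.i32(i32 %x)
        ret i1 %c
      }
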
29779 In particular, note that if the argument is a constant expression
29780 which refers to a global (the address of which *is* a constant, but
29781 not manifest during the compile), then the intrinsic evaluates to
29784 The result also intentionally depends on the result of optimization
29785 passes -- e.g., the result can change depending on whether a
29786 function gets inlined or not. A function's parameters are
29787 obviously not constant. However, a call like
29788 ``llvm.is.constant.i32(i32 %param)`` *can* return true after the
29789 function is inlined, if the value passed to the function parameter was
29794 '``llvm.ptrmask``' Intrinsic
29795 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29802 declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) speculatable memory(none)
29807 The first argument is a pointer or vector of pointers. The second argument is
29808 an integer or vector of integers with the same bit width as the index type
29809 size of the first argument.
29814 The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
29815 This allows stripping data from tagged pointers without converting them to an
29816 integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
29817 to facilitate alias analysis and underlying-object detection.
29822 The result of ``ptrmask(%ptr, %mask)`` is equivalent to the following expansion,
29823 where ``iPtrIdx`` is the index type size of the pointer::
29825 %intptr = ptrtoint ptr %ptr to iPtrIdx ; this may truncate
29826 %masked = and iPtrIdx %intptr, %mask
29827 %diff = sub iPtrIdx %masked, %intptr
29828 %result = getelementptr i8, ptr %ptr, iPtrIdx %diff
29830 If the pointer index type size is smaller than the pointer type size, this
29831 implies that pointer bits beyond the index size are not affected by this
29832 intrinsic. For integral pointers, it behaves as if the mask were extended with
29833 1 bits to the pointer type size.
29835 Both the returned pointer(s) and the first argument are based on the same
29836 underlying object (for more information on the *based on* terminology see
29837 :ref:`the pointer aliasing rules <pointeraliasing>`).
29839 The intrinsic only captures the pointer argument through the return value.
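
As a sketch of the tagged-pointer use case, the following strips a tag
stored in the low three bits of a pointer. It assumes a 64-bit index type for
address space 0; the function name ``@strip_tag`` and the tag layout are
hypothetical.

.. code-block:: llvm

      declare ptr @llvm.ptrmask.p0.i64(ptr, i64)

      define ptr @strip_tag(ptr %tagged) {
        ; -8 has all bits set except the low three, so only the tag is cleared.
        %clean = call ptr @llvm.ptrmask.p0.i64(ptr %tagged, i64 -8)
        ret ptr %clean
      }
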
29841 .. _int_threadlocal_address:
29843 '``llvm.threadlocal.address``' Intrinsic
29844 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29851 declare ptr @llvm.threadlocal.address(ptr) nounwind willreturn memory(none)
29856 The ``llvm.threadlocal.address`` intrinsic requires a global value argument (a
29857 :ref:`global variable <globalvars>` or alias) that is thread local.
29862 The address of a thread local global is not a constant, since it depends on
29863 the calling thread. The ``llvm.threadlocal.address`` intrinsic returns the
29864 address of the given thread local global in the calling thread.
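
A minimal sketch of reading a thread-local variable through the intrinsic is
shown below. The global ``@tls_counter`` and the function ``@read_counter``
are hypothetical, and the ``.p0`` suffix is assumed to be the overload for
pointers in address space 0.

.. code-block:: llvm

      @tls_counter = thread_local global i32 0

      declare ptr @llvm.threadlocal.address.p0(ptr)

      define i32 @read_counter() {
        ; Resolve the address of @tls_counter for the calling thread.
        %addr = call ptr @llvm.threadlocal.address.p0(ptr @tls_counter)
        %val = load i32, ptr %addr
        ret i32 %val
      }
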
29868 '``llvm.vscale``' Intrinsic
29869 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
29876 declare i32 @llvm.vscale.i32()
29877 declare i64 @llvm.vscale.i64()
29882 The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
29883 vectors such as ``<vscale x 16 x i8>``.
29888 ``vscale`` is a positive value that is constant throughout program
29889 execution, but is unknown at compile time.
29890 If the result value does not fit in the result type, then the result is
29891 a :ref:`poison value <poisonvalues>`.
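
For instance, the size in bytes of a ``<vscale x 16 x i8>`` vector could be
computed at run time as in this sketch (the function name
``@bytes_per_vector`` is hypothetical):

.. code-block:: llvm

      declare i64 @llvm.vscale.i64()

      define i64 @bytes_per_vector() {
        %vscale = call i64 @llvm.vscale.i64()
        ; <vscale x 16 x i8> holds vscale * 16 one-byte elements.
        %bytes = mul i64 %vscale, 16
        ret i64 %bytes
      }
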
29895 '``llvm.fake.use``' Intrinsic
29896 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29903 declare void @llvm.fake.use(...)
29908 The ``llvm.fake.use`` intrinsic is a no-op. It takes a single
29909 value as an operand and is treated as a use of that operand, to force the
29910 optimizer to preserve that value prior to the fake use. This is used for
29911 extending the lifetimes of variables, where this intrinsic placed at the end of
29912 a variable's scope helps prevent that variable from being optimized out.
29917 The ``llvm.fake.use`` intrinsic takes one argument, which may be any
29918 function-local SSA value. Note that the signature is variadic so that the
29919 intrinsic can take any type of argument, but passing more than one argument will
29920 result in an error.
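
A minimal sketch of a call is shown below. The function ``@compute`` is
hypothetical; the fake use is placed just before the return to model the end
of the scope of the source variable corresponding to ``%y``.

.. code-block:: llvm

      declare void @llvm.fake.use(...)

      define i32 @compute(i32 %x, i32 %y) {
        %res = add i32 %x, 1
        ; %y is otherwise unused here; the fake use asks the optimizer to
        ; keep it available up to this point.
        call void (...) @llvm.fake.use(i32 %y)
        ret i32 %res
      }
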
29925 This intrinsic does nothing, but optimizers must consider it a use of its single
29926 operand and should try to preserve the intrinsic and its position in the
29930 Stack Map Intrinsics
29931 --------------------
29933 LLVM provides experimental intrinsics to support runtime patching
29934 mechanisms commonly desired in dynamic language JITs. These intrinsics
29935 are described in :doc:`StackMaps`.
29937 Element Wise Atomic Memory Intrinsics
29938 -------------------------------------
29940 These intrinsics are similar to the standard library memory intrinsics except
29941 that they perform memory transfer as a sequence of atomic memory accesses.
29943 .. _int_memcpy_element_unordered_atomic:
29945 '``llvm.memcpy.element.unordered.atomic``' Intrinsic
29946 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
29951 This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
29952 any integer bit width and for different address spaces. Not all targets
29953 support all bit widths however.
29957 declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i32(ptr <dest>,
29960 i32 <element_size>)
29961 declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr <dest>,
29964 i32 <element_size>)
29969 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
29970 '``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
29971 as arrays with elements that are exactly ``element_size`` bytes, and the copy between
29972 buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
29973 that are a positive integer multiple of the ``element_size`` in size.
29978 The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
29979 intrinsic, with the added constraint that ``len`` is required to be a positive integer
29980 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
29981 ``element_size``, then the behavior of the intrinsic is undefined.
29983 ``element_size`` must be a compile-time constant positive power of two no greater than
29984 a target-specific atomic access size limit.
29986 For each of the input pointers, the ``align`` parameter attribute must be specified. It
29987 must be a power of two no less than the ``element_size``. The caller guarantees that
29988 both the source and destination pointers are aligned to that boundary.
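
The argument constraints can be illustrated with the following sketch, which
copies four 4-byte elements. The function name ``@copy_four_words`` is
hypothetical, and a 4-byte atomic access is assumed to be within the target's
limit.

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr, ptr, i64, i32 immarg)

      define void @copy_four_words(ptr %dst, ptr %src) {
        ; len = 16 is a positive multiple of element_size = 4, and both
        ; pointers carry an align attribute no less than element_size.
        call void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(
            ptr align 4 %dst, ptr align 4 %src, i64 16, i32 4)
        ret void
      }
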
29993 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
29994 memory from the source location to the destination location. These locations are not
29995 allowed to overlap. The memory copy is performed as a sequence of load/store operations
29996 where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
29997 aligned at an ``element_size`` boundary.
29999 The order of the copy is unspecified. The same value may be read from the source
30000 buffer many times, but only one write is issued to the destination buffer per
30001 element. It is well defined to have concurrent reads and writes to both source and
30002 destination provided those reads and writes are unordered atomic when specified.
30004 This intrinsic does not provide any additional ordering guarantees over those
30005 provided by a set of unordered loads from the source location and stores to the
30011 In the most general case, a call to '``llvm.memcpy.element.unordered.atomic.*``' is
30012 lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
30013 is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
30014 lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
30017 The optimizer is allowed to inline the memory copy when it's profitable to do so.
30019 '``llvm.memmove.element.unordered.atomic``' Intrinsic
30020 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30025 This is an overloaded intrinsic. You can use
30026 ``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
30027 different address spaces. Not all targets support all bit widths however.
30031 declare void @llvm.memmove.element.unordered.atomic.p0.p0.i32(ptr <dest>,
30034 i32 <element_size>)
30035 declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr <dest>,
30038 i32 <element_size>)
30043 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
30044 of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
30045 ``src`` are treated as arrays with elements that are exactly ``element_size``
30046 bytes, and the copy between buffers uses a sequence of
30047 :ref:`unordered atomic <ordering>` load/store operations that are a positive
30048 integer multiple of the ``element_size`` in size.
30053 The first three arguments are the same as they are in the
30054 :ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
30055 ``len`` is required to be a positive integer multiple of the ``element_size``.
30056 If ``len`` is not a positive integer multiple of ``element_size``, then the
30057 behavior of the intrinsic is undefined.
30059 ``element_size`` must be a compile-time constant positive power of two no
30060 greater than a target-specific atomic access size limit.
30062 For each of the input pointers the ``align`` parameter attribute must be
30063 specified. It must be a power of two no less than the ``element_size``. The caller
30064 guarantees that both the source and destination pointers are aligned to that
30070 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
30071 of memory from the source location to the destination location. These locations
30072 are allowed to overlap. The memory copy is performed as a sequence of load/store
30073 operations where each access is guaranteed to be a multiple of ``element_size``
30074 bytes wide and aligned at an ``element_size`` boundary.
30076 The order of the copy is unspecified. The same value may be read from the source
30077 buffer many times, but only one write is issued to the destination buffer per
30078 element. It is well defined to have concurrent reads and writes to both source
30079 and destination provided those reads and writes are unordered atomic when
30082 This intrinsic does not provide any additional ordering guarantees over those
30083 provided by a set of unordered loads from the source location and stores to the
30089 In the most general case, a call to
30090 '``llvm.memmove.element.unordered.atomic.*``' is lowered to a call to the symbol
30091 ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced with the
30092 actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
30093 <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
30096 The optimizer is allowed to inline the memory copy when it's profitable to do so.
30098 .. _int_memset_element_unordered_atomic:
30100 '``llvm.memset.element.unordered.atomic``' Intrinsic
30101 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30106 This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
30107 any integer bit width and for different address spaces. Not all targets
30108 support all bit widths however.
30112 declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr <dest>,
30115 i32 <element_size>)
30116 declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr <dest>,
30119 i32 <element_size>)
30124 The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
30125 '``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
30126 with elements that are exactly ``element_size`` bytes, and the assignment to that array
30127 uses a sequence of :ref:`unordered atomic <ordering>` store operations
30128 that are a positive integer multiple of the ``element_size`` in size.
30133 The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
30134 intrinsic, with the added constraint that ``len`` is required to be a positive integer
30135 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
30136 ``element_size``, then the behavior of the intrinsic is undefined.
30138 ``element_size`` must be a compile-time constant positive power of two no greater than
30139 a target-specific atomic access size limit.
30141 The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
30142 must be a power of two no less than the ``element_size``. The caller guarantees that
30143 the destination pointer is aligned to that boundary.
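
As a sketch under the constraints above, the following zeroes a 32-byte
buffer using 8-byte elements. The function name ``@zero_buffer`` is
hypothetical, and an 8-byte atomic access is assumed to be within the target's
limit.

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr, i8, i64, i32 immarg)

      define void @zero_buffer(ptr %dst) {
        ; len = 32 is a positive multiple of element_size = 8, and %dst
        ; carries an align attribute no less than element_size.
        call void @llvm.memset.element.unordered.atomic.p0.i64(
            ptr align 8 %dst, i8 0, i64 32, i32 8)
        ret void
      }
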
30148 The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
30149 memory starting at the destination location to the given ``value``. The memory is
30150 set with a sequence of store operations where each access is guaranteed to be a
30151 multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
30153 The order of the assignment is unspecified. Only one write is issued to the
30154 destination buffer per element. It is well defined to have concurrent reads and
30155 writes to the destination provided those reads and writes are unordered atomic
30158 This intrinsic does not provide any additional ordering guarantees over those
30159 provided by a set of unordered stores to the destination.
30164 In the most general case, a call to '``llvm.memset.element.unordered.atomic.*``' is
30165 lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*'
30166 is replaced with the actual element size.
30168 The optimizer is allowed to inline the memory assignment when it's profitable to do so.
30170 Objective-C ARC Runtime Intrinsics
30171 ----------------------------------
30173 LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
30174 LLVM is aware of the semantics of these functions, and optimizes based on that
30175 knowledge. You can read more about the details of Objective-C ARC `here
30176 <https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
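
As a minimal sketch of how these intrinsics appear in IR, the following pairs
a retain with a release using the declarations documented below. The function
``@use_object`` is hypothetical, and real frontend output is typically more
involved.

.. code-block:: llvm

      declare ptr @llvm.objc.retain(ptr)
      declare void @llvm.objc.release(ptr)

      define void @use_object(ptr %obj) {
        %retained = call ptr @llvm.objc.retain(ptr %obj)
        ; ... uses of %retained would go here ...
        call void @llvm.objc.release(ptr %retained)
        ret void
      }
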
30178 '``llvm.objc.autorelease``' Intrinsic
30179 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30185 declare ptr @llvm.objc.autorelease(ptr)
30190 Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
30192 '``llvm.objc.autoreleasePoolPop``' Intrinsic
30193 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30199 declare void @llvm.objc.autoreleasePoolPop(ptr)
30204 Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
30206 '``llvm.objc.autoreleasePoolPush``' Intrinsic
30207 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30213 declare ptr @llvm.objc.autoreleasePoolPush()
30218 Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
30220 '``llvm.objc.autoreleaseReturnValue``' Intrinsic
30221 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30227 declare ptr @llvm.objc.autoreleaseReturnValue(ptr)
30232 Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
30234 '``llvm.objc.copyWeak``' Intrinsic
30235 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30241 declare void @llvm.objc.copyWeak(ptr, ptr)
30246 Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
30248 '``llvm.objc.destroyWeak``' Intrinsic
30249 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30255 declare void @llvm.objc.destroyWeak(ptr)
30260 Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
30262 '``llvm.objc.initWeak``' Intrinsic
30263 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30269 declare ptr @llvm.objc.initWeak(ptr, ptr)
30274 Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
30276 '``llvm.objc.loadWeak``' Intrinsic
30277 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30283 declare ptr @llvm.objc.loadWeak(ptr)
30288 Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
30290 '``llvm.objc.loadWeakRetained``' Intrinsic
30291 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30297 declare ptr @llvm.objc.loadWeakRetained(ptr)
30302 Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
30304 '``llvm.objc.moveWeak``' Intrinsic
30305 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30311 declare void @llvm.objc.moveWeak(ptr, ptr)
30316 Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
30318 '``llvm.objc.release``' Intrinsic
30319 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30325 declare void @llvm.objc.release(ptr)
30330 Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
30332 '``llvm.objc.retain``' Intrinsic
30333 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30339 declare ptr @llvm.objc.retain(ptr)
30344 Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
30346 '``llvm.objc.retainAutorelease``' Intrinsic
30347 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30353 declare ptr @llvm.objc.retainAutorelease(ptr)
30358 Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
30360 '``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
30361 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30367 declare ptr @llvm.objc.retainAutoreleaseReturnValue(ptr)
30372 Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
30374 '``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
30375 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30381 declare ptr @llvm.objc.retainAutoreleasedReturnValue(ptr)
30386 Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
30388 '``llvm.objc.retainBlock``' Intrinsic
30389 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30395 declare ptr @llvm.objc.retainBlock(ptr)
30400 Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
30402 '``llvm.objc.storeStrong``' Intrinsic
30403 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30409 declare void @llvm.objc.storeStrong(ptr, ptr)
30414 Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
30416 '``llvm.objc.storeWeak``' Intrinsic
30417 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30423 declare ptr @llvm.objc.storeWeak(ptr, ptr)
30428 Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
30430 Preserving Debug Information Intrinsics
30431 ---------------------------------------
These intrinsics carry certain debuginfo together with
IR-level operations. For example, it may be desirable to
know the structure/union name and the original user-level field
indices. This information is lost in the IR GetElementPtr instruction
because IR types differ from debuginfo types and unions
are converted to structs in IR.
30440 '``llvm.preserve.array.access.index``' Intrinsic
30441 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30448 @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
based on the array base ``base``, the array dimension ``dim``, and the last access index
``index`` into the array. The return type ``ret_type`` is a pointer type to the array element.
The array ``dim`` and ``index`` are preserved, which is more robust than a
getelementptr instruction, which may be subject to compiler transformation.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the array or pointer debuginfo type.
The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
debuginfo version of ``type``.
30468 The ``base`` is the array base address. The ``dim`` is the array dimension.
30469 The ``base`` is a pointer if ``dim`` equals 0.
30470 The ``index`` is the last access index into the array or pointer.
30472 The ``base`` argument must be annotated with an :ref:`elementtype
30473 <attr_elementtype>` attribute at the call-site. This attribute specifies the
30474 getelementptr element type.
The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``,
that is, ``dim`` zero indices followed by ``index``.
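
The equivalence can be sketched as follows. This is an illustrative
fragment, not output from any particular frontend: the function name
``@element_3``, the array type ``[10 x i32]``, and the ``.p0.p0`` mangling
suffix (which assumes opaque pointers in address space 0) are assumptions.
The ``!llvm.preserve.access.index`` metadata attachment is omitted for
brevity, although a real use would carry it.

.. code-block:: llvm

      declare ptr @llvm.preserve.array.access.index.p0.p0(ptr, i32 immarg, i32 immarg)

      define ptr @element_3(ptr %arr) {
        ; dim = 1: one leading zero index, then the last access index 3.
        %p = call ptr @llvm.preserve.array.access.index.p0.p0(ptr elementtype([10 x i32]) %arr, i32 1, i32 3)
        ; The getelementptr that computes the same address.
        %q = getelementptr [10 x i32], ptr %arr, i32 0, i32 3
        ret ptr %p
      }

With ``dim`` equal to 0, ``base`` is treated as a plain pointer and the
corresponding getelementptr has the single operand ``index``.
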
30482 '``llvm.preserve.union.access.index``' Intrinsic
30483 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30490 @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
``di_index`` and returns the ``base`` address.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the union debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``type`` is the same as the ``base`` type.
30506 The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
30511 The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
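
A minimal sketch of a call, assuming opaque pointers in address space 0
(hence the ``.p0.p0`` suffix) and a hypothetical function ``@union_field``.
The ``!llvm.preserve.access.index`` metadata attachment is omitted for
brevity.

.. code-block:: llvm

      declare ptr @llvm.preserve.union.access.index.p0.p0(ptr, i32 immarg)

      define ptr @union_field(ptr %u) {
        ; di_index = 1 records which union member is accessed.
        ; The returned pointer is the same address as %u.
        %p = call ptr @llvm.preserve.union.access.index.p0.p0(ptr %u, i32 1)
        ret ptr %p
      }
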
30513 '``llvm.preserve.struct.access.index``' Intrinsic
30514 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30521 @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
based on the struct base ``base`` and the IR struct member index ``gep_index``.
The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
to provide the struct debuginfo type.
The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
The return type ``ret_type`` is a pointer type to the structure member.
30538 The ``base`` is the structure base address. The ``gep_index`` is the struct member index
30539 based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
30541 The ``base`` argument must be annotated with an :ref:`elementtype
30542 <attr_elementtype>` attribute at the call-site. This attribute specifies the
30543 getelementptr element type.
30548 The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
30549 as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
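
The equivalence can be sketched as follows. The struct type ``%struct.s``,
the function name ``@second_field``, and the ``.p0.p0`` mangling suffix
(which assumes opaque pointers in address space 0) are assumptions made for
illustration. The ``!llvm.preserve.access.index`` metadata attachment is
omitted for brevity.

.. code-block:: llvm

      %struct.s = type { i32, i32 }

      declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32 immarg, i32 immarg)

      define ptr @second_field(ptr %s) {
        ; gep_index = 1 (IR member index), di_index = 1 (debuginfo member index).
        %p = call ptr @llvm.preserve.struct.access.index.p0.p0(ptr elementtype(%struct.s) %s, i32 1, i32 1)
        ; The getelementptr that computes the same address.
        %q = getelementptr %struct.s, ptr %s, i32 0, i32 1
        ret ptr %p
      }

In this example ``gep_index`` and ``di_index`` happen to coincide. They can
differ when the IR struct layout does not match the debuginfo member
numbering.
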
30551 '``llvm.fptrunc.round``' Intrinsic
30552 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30560 @llvm.fptrunc.round(<type> <value>, metadata <rounding mode>)
30565 The '``llvm.fptrunc.round``' intrinsic truncates
30566 :ref:`floating-point <t_floating>` ``value`` to type ``ty2``
30567 with a specified rounding mode.
The '``llvm.fptrunc.round``' intrinsic takes a :ref:`floating-point
<t_floating>` value to cast and a :ref:`floating-point <t_floating>` type
to cast it to. The type of the value must be larger in size than the result type.
30576 The second argument specifies the rounding mode as described in the constrained
30577 intrinsics section.
30578 For this intrinsic, the "round.dynamic" mode is not supported.
30583 The '``llvm.fptrunc.round``' intrinsic casts a ``value`` from a larger
30584 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
30585 <t_floating>` type.
30586 This intrinsic is assumed to execute in the default :ref:`floating-point
30587 environment <floatenv>` *except* for the rounding mode.
30588 This intrinsic is not supported on all targets. Some targets may not support
30589 all rounding modes.
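
A minimal sketch of a call that truncates ``float`` to ``half`` while
rounding upward. The function name ``@trunc_upward`` is hypothetical, and
whether this particular type pair and rounding mode can be lowered depends
on the target.

.. code-block:: llvm

      declare half @llvm.fptrunc.round.f16.f32(float, metadata)

      define half @trunc_upward(float %x) {
        ; The rounding mode is passed as a metadata string.
        %r = call half @llvm.fptrunc.round.f16.f32(float %x, metadata !"round.upward")
        ret half %r
      }
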