1 ==============================
2 LLVM Language Reference Manual
3 ==============================
This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

22 The LLVM code representation is designed to be used in three different
23 forms: as an in-memory compiler IR, as an on-disk bitcode representation
24 (suitable for fast loading by a Just-In-Time compiler), and as a human
25 readable assembly language representation. This allows LLVM to provide a
26 powerful intermediate representation for efficient compiler
27 transformations and analysis, while providing a natural means to debug
28 and visualize the transformations. The three different forms of LLVM are
29 all equivalent. This document describes the human readable
30 representation and notation.
32 The LLVM representation aims to be light-weight and low-level while
33 being expressive, typed, and extensible at the same time. It aims to be
34 a "universal IR" of sorts, by being at a low enough level that
35 high-level ideas may be cleanly mapped to it (similar to how
36 microprocessors are "universal IR's", allowing many source languages to
37 be mapped to them). By providing type information, LLVM can be used as
38 the target of optimizations: for example, through pointer analysis, it
39 can be proven that a C automatic variable is never accessed outside of
40 the current function, allowing it to be promoted to a simple SSA value
41 instead of a memory location.
It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, an instruction such as
``%x = add i32 1, %x`` is syntactically okay but not well formed, because
the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

69 LLVM identifiers come in two basic types: global and local. Global
70 identifiers (functions, global variables) begin with the ``'@'``
71 character. Local identifiers (register names, types) begin with the
72 ``'%'`` character. Additionally, there are three different formats for
73 identifiers, for different purposes:
75 #. Named values are represented as a string of characters with their
76 prefix. For example, ``%foo``, ``@DivisionByZero``,
77 ``%a.really.long.identifier``. The actual regular expression used is
78 '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
79 characters in their names can be surrounded with quotes. Special
80 characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
81 code for the character in hexadecimal. In this way, any character can
82 be used in a name value, even quotes themselves. The ``"\01"`` prefix
83 can be used on global values to suppress mangling.
84 #. Unnamed values are represented as an unsigned numeric value with
85 their prefix. For example, ``%12``, ``@2``, ``%44``.
86 #. Constants, which are described in the section Constants_ below.
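
As a minimal sketch of the identifier formats listed above (the names
``@"name with spaces"``, ``@"quote\22inside"`` and ``@add_one`` are
hypothetical and chosen only for illustration):

.. code-block:: llvm

    @DivisionByZero = global i32 0        ; named global value
    @"name with spaces" = global i32 1    ; quoted name with other characters
    @"quote\22inside" = global i32 2      ; \22 is the hex escape for '"'

    define i32 @add_one(i32 %a.really.long.identifier) {
      ; %1 is an unnamed value; the unlabeled entry block already took %0.
      %1 = add i32 %a.really.long.identifier, 1
      ret i32 %1
    }
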
LLVM requires that values start with a prefix for two reasons: compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

95 Reserved words in LLVM are very similar to reserved words in other
96 languages. There are keywords for different opcodes ('``add``',
97 '``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
98 '``i32``', etc...), and others. These reserved words cannot conflict
99 with variable names, because none of them start with a prefix character
100 (``'%'`` or ``'@'``).
Here is an example of LLVM code to multiply the integer variable
``%X`` by 8.

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

125 This last way of multiplying ``%X`` by 8 illustrates several important
126 lexical features of LLVM:
128 #. Comments are delimited with a '``;``' and go until the end of line.
129 #. Unnamed temporaries are created when the result of a computation is
130 not assigned to a named value.
131 #. By default, unnamed temporaries are numbered sequentially (using a
132 per-function incrementing counter, starting with 0). However, when explicitly
133 specifying temporary numbers, it is allowed to skip over numbers.
135 Note that basic blocks and unnamed function parameters are included in this
136 numbering. For example, if the entry basic block is not given a label name
137 and all function parameters are named, then it will get number 0.
139 It also shows a convention that we follow in this document. When
140 demonstrating instructions, we will follow an instruction with a comment
141 that defines the type and name of value produced.
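
The numbering rule for basic blocks and unnamed parameters can be
illustrated with a small sketch. The function ``@sum`` is hypothetical; it
assumes two unnamed parameters and an unlabeled entry block:

.. code-block:: llvm

    define i32 @sum(i32, i32) {
      ; The two unnamed parameters are %0 and %1, and the unlabeled entry
      ; block takes %2, so the first unnamed instruction result is %3.
      %3 = add i32 %0, %1     ; yields i32:%3
      %4 = mul i32 %3, 2      ; yields i32:%4
      ret i32 %4
    }
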
149 LLVM programs are composed of ``Module``'s, each of which is a
150 translation unit of the input programs. Each module consists of
151 functions, global variables, and symbol table entries. Modules may be
152 combined together with the LLVM linker, which merges function (and
153 global variable) definitions, resolves forward declarations, and merges
154 symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(ptr nocapture) nounwind

    ; Definition of main function
    define i32 @main() {
      ; Call puts function to write out the string to stdout.
      call i32 @puts(ptr @.str)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

175 This example is made up of a :ref:`global variable <globalvars>` named
176 "``.str``", an external declaration of the "``puts``" function, a
177 :ref:`function definition <functionstructure>` for "``main``" and
178 :ref:`named metadata <namedmetadatastructure>` "``foo``".
180 In general, a module is made up of a list of global values (where both
181 functions and global variables are global values). Global values are
182 represented by a pointer to a memory location (in this case, a pointer
183 to an array of char, and a pointer to a function), and have one of the
184 following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.
``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule", or "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.

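
The following sketch shows how several of these linkage keywords are
spelled on globals and functions. All names are hypothetical, and the
comments only restate the rules given above:

.. code-block:: llvm

    @hidden_state = private global i32 0       ; no symbol table entry
    @file_static = internal global i32 0       ; like C 'static'
    @tentative = common global i32 0           ; like C 'int tentative;'
    @maybe_present = extern_weak global i32    ; declaration; null if never linked

    ; May be merged with equivalent definitions from other modules and
    ; discarded if unreferenced.
    define linkonce_odr i32 @inline_helper() {
      ret i32 1
    }

    ; No linkage keyword: external linkage.
    define i32 @exported() {
      %v = call i32 @inline_helper()
      ret i32 %v
    }
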

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

289 "``ccc``" - The C calling convention
290 This calling convention (the default if no other calling convention
291 is specified) matches the target C calling conventions. This calling
292 convention supports varargs function calls and tolerates some
293 mismatch in the declared prototype and implemented declaration of
294 the function (as does normal C).
295 "``fastcc``" - The fast calling convention
296 This calling convention attempts to make calls as fast as possible
297 (e.g. by passing things in registers). This calling convention
298 allows the target to use whatever tricks it wants to produce fast
299 code for the target, without having to conform to an externally
300 specified ABI (Application Binary Interface). `Tail calls can only
301 be optimized when this, the tailcc, the GHC or the HiPE convention is
302 used. <CodeGenerator.html#tail-call-optimization>`_ This calling
303 convention does not support varargs and requires the prototype of all
304 callees to exactly match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore, the inliner doesn't consider such
    function calls for inlining.
314 "``ghccc``" - GHC convention
315 This calling convention has been implemented specifically for use by
316 the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
317 It passes everything in registers, going to extremes to achieve this
318 by disabling callee save registers. This calling convention should
319 not be used lightly but only for specific situations such as an
320 alternative to the *register pinning* performance technique often
321 used when implementing functional programming languages. At the
322 moment only X86, AArch64, and RISCV support this convention. The
323 following limitations exist:

    - On *X86-32* only up to 4 bit type parameters are supported. No
      floating-point types are supported.
    - On *X86-64* only up to 10 bit type parameters and 6
      floating-point parameters are supported.
    - On *AArch64* only up to 4 32-bit floating-point parameters,
      4 64-bit floating-point parameters, and 10 bit type parameters
      are supported.
    - *RISCV64* only supports up to 11 bit type parameters, 4
      32-bit floating-point parameters, and 4 64-bit floating-point
      parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
354 "``anyregcc``" - Dynamic calling convention for code patching
355 This is a special convention that supports patching an arbitrary code
356 sequence in place of a call site. This convention forces the call
357 arguments into registers but allows them to be dynamically
358 allocated. This can currently only be used with calls to
359 llvm.experimental.patchpoint because only this intrinsic records
360 the location of its arguments in a side table. See :doc:`StackMaps`.
361 "``preserve_mostcc``" - The `PreserveMost` calling convention
362 This calling convention attempts to make the code in the caller as
363 unintrusive as possible. This convention behaves identically to the `C`
364 calling convention on how arguments and return values are passed, but it
365 uses a different set of caller/callee-saved registers. This alleviates the
366 burden of saving and recovering a large register set before and after the
367 call in the caller. If the arguments are passed in callee-saved registers,
368 then they will be preserved by the callee across the call. This doesn't
369 apply for values returned in callee-saved registers.
371 - On X86-64 the callee preserves all general purpose registers, except for
372 R11 and return registers, if any. R11 can be used as a scratch register.
373 The treatment of floating-point registers (XMMs/YMMs) matches the OS's C
374 calling convention: on most platforms, they are not preserved and need to
375 be saved by the caller, but on Windows, xmm6-xmm15 are preserved.
377 - On AArch64 the callee preserve all general purpose registers, except X0-X8
380 The idea behind this convention is to support calls to runtime functions
381 that have a hot path and a cold path. The hot path is usually a small piece
382 of code that doesn't use many registers. The cold path might need to call out to
383 another function and therefore only needs to preserve the caller-saved
384 registers, which haven't already been saved by the caller. The
385 `PreserveMost` calling convention is very similar to the `cold` calling
386 convention in terms of caller/callee-saved registers, but they are used for
387 different types of function calls. `coldcc` is for function calls that are
388 rarely executed, whereas `preserve_mostcc` function calls are intended to be
389 on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
390 doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore, it also preserves
      all floating-point registers (XMMs/YMMs).

    - On AArch64 the callee preserves all general purpose registers, except X0-X8
      and X16-X18. Furthermore, it also preserves the lower 128 bits of the
      V8-V31 SIMD floating-point registers.

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Because each platform has its own lowering sequence, and hence its own set
    of preserved registers, the existing `PreserveMost` convention cannot be
    used.

442 - On X86-64 the callee preserves all general purpose registers, except for
444 "``tailcc``" - Tail callable calling convention
445 This calling convention ensures that calls in tail position will always be
446 tail call optimized. This calling convention is equivalent to fastcc,
447 except for an additional guarantee that tail calls will be produced
448 whenever possible. `Tail calls can only be optimized when this, the fastcc,
449 the GHC or the HiPE convention is used. <CodeGenerator.html#tail-call-optimization>`_
450 This calling convention does not support varargs and requires the prototype of
451 all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, the AAPCS-VFP calling convention is used.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
460 "``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
461 This calling convention is used for the Control Flow Guard check function,
462 calls to which can be inserted before indirect calls to check that the call
463 target is a valid function address. The check function has no return value,
464 but it will trigger an OS-level error if the address is not a valid target.
465 The set of registers preserved by the check function, and the register
466 containing the target address are architecture-specific.
468 - On X86 the target address is passed in ECX.
469 - On ARM the target address is passed in R0.
470 - On AArch64 the target address is passed in X15.
471 "``cc <n>``" - Numbered convention
472 Any calling convention may be specified by number, allowing
473 target-specific calling conventions to be used. Target specific
474 calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.

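
As a rough sketch of how a convention is attached to declarations,
definitions and call sites (all function names here are hypothetical):

.. code-block:: llvm

    declare fastcc i32 @fast_callee(i32)
    declare coldcc void @report_error()
    declare cc 11 void @hipe_style()      ; numbered spelling of a convention

    define fastcc i32 @fast_caller(i32 %x) {
      ; The call site names the same convention as the callee.
      %r = tail call fastcc i32 @fast_callee(i32 %x)
      ret i32 %r
    }

    define void @c_caller() {
      ; No keyword on the definition means ccc, the default C convention.
      call coldcc void @report_error()
      ret void
    }

Because the convention of a dynamic caller/callee pair must match, the
keyword is repeated on each ``call`` rather than inferred from the callee.
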
.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

488 "``default``" - Default style
489 On targets that use the ELF object file format, default visibility
490 means that the declaration is visible to other modules and, in
491 shared libraries, means that the declared entity may be overridden.
492 On Darwin, default visibility means that the declaration is visible
493 to other modules. On XCOFF, default visibility means no explicit
494 visibility bit will be set and whether the symbol is visible
495 (i.e "exported") to other modules depends primarily on export lists
496 provided to the linker. Default visibility corresponds to "external
497 linkage" in the language.
498 "``hidden``" - Hidden style
499 Two declarations of an object with hidden visibility refer to the
500 same object if they are in the same shared object. Usually, hidden
501 visibility indicates that the symbol will not be placed into the
502 dynamic symbol table, so no other module (executable or shared
503 library) can reference it directly.
504 "``protected``" - Protected style
505 On ELF, protected visibility indicates that the symbol will be
506 placed in the dynamic symbol table, but that references within the
507 defining module will bind to the local symbol. That is, the symbol
508 cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.

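
A small sketch of the visibility keywords in use, with hypothetical names:

.. code-block:: llvm

    @impl_detail = hidden global i32 0

    define protected i32 @stable_entry() {
      %v = load i32, ptr @impl_detail
      ret i32 %v
    }

    ; No keyword: default visibility.
    define i32 @overridable() {
      ret i32 0
    }
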

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    On Microsoft Windows targets, "``dllexport``" causes the compiler to provide
    a global pointer to a pointer in a DLL, so that it can be referenced with the
    ``dllimport`` attribute. The pointer name is formed by combining ``__imp_``
    and the function or variable name. On XCOFF targets, ``dllexport`` indicates
    that the symbol will be made visible to other modules using "exported"
    visibility and thus placed by the linker in the loader section symbol table.
    Since this storage class exists for defining a DLL interface, the compiler,
    assembler and linker know it is externally referenced and must refrain from
    deleting the symbol.

A symbol with ``internal`` or ``private`` linkage cannot have a DLL storage
class.

542 Thread Local Storage Models
543 ---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

559 The models correspond to the ELF TLS models; see `ELF Handling For
560 Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
561 more information on under which circumstances the different models may
562 be used. The target may choose a different TLS model if the specified
563 model is not supported, or if a better choice of model can be made.
565 A model can also be specified in an alias, but then it only governs how
566 the alias is accessed. It will not have any effect in the aliasee.
568 For platforms without linker support of ELF TLS model, the -femulated-tls
569 flag can be used to generate GCC compatible emulated TLS code.
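
For illustration, the models might be written on global variables as
follows. The variable names are hypothetical, and the target remains free to
pick a different model as noted above:

.. code-block:: llvm

    @per_thread = thread_local global i32 0                  ; general dynamic
    @lib_local = thread_local(localdynamic) global i32 0
    @startup_only = thread_local(initialexec) global i32 0
    @exe_only = thread_local(localexec) global i32 0
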
571 .. _runtime_preemption_model:
573 Runtime Preemption Specifiers
574 -----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.
``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.

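
A brief sketch of both specifiers, using hypothetical symbol names:

.. code-block:: llvm

    @unit_counter = dso_local global i32 0
    declare dso_local void @unit_helper()

    ; Spelled out explicitly here; this is also the default when omitted.
    define dso_preemptable void @replaceable_hook() {
      call void @unit_helper()
      ret void
    }
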
594 LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
595 types <t_struct>`. Literal types are uniqued structurally, but identified types
596 are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
597 to forward declare a type that is not yet available.
599 An example of an identified structure specification is:
603 %mytype = type { %mytype*, i32 }
605 Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
606 literal types are uniqued in recent versions of LLVM.
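
As a sketch of the three kinds of structure type mentioned above, written
with the ``ptr`` type used elsewhere in this document (the names ``%node``,
``%handle``, ``@head`` and ``@pair`` are hypothetical):

.. code-block:: llvm

    %node = type { ptr, i32 }        ; identified type, never uniqued
    %handle = type opaque            ; opaque type, body not yet available

    @head = global %node { ptr null, i32 0 }
    @pair = global { i32, i32 } { i32 1, i32 2 }   ; literal type, uniqued structurally
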
610 Non-Integral Pointer Type
611 -------------------------
613 Note: non-integral pointer types are a work in progress, and they should be
614 considered experimental at this time.
616 LLVM IR optionally allows the frontend to denote pointers in certain address
617 spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
618 Non-integral pointer types represent pointers that have an *unspecified* bitwise
619 representation; that is, the integral representation may be target dependent or
620 unstable (not backed by a fixed integer).

``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
integral (i.e. normal) pointers in that they convert integers to and from
corresponding pointer types, but there are additional implications to be
aware of. Because the bit-representation of a non-integral pointer may
not be stable, two identical casts of the same operand may or may not
return the same value. Said differently, the conversion to or from the
non-integral type depends on environmental state in an implementation
defined manner.

If the frontend wishes to observe a *particular* value following a cast, the
generated IR must fence with the underlying environment in an implementation
defined manner. (In practice, this tends to require ``noinline`` routines for
such operations.)

636 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
637 non-integral types are analogous to ones on integral types with one
638 key exception: the optimizer may not, in general, insert new dynamic
639 occurrences of such casts. If a new cast is inserted, the optimizer would
640 need to either ensure that a) all possible values are valid, or b)
641 appropriate fencing is inserted. Since the appropriate fencing is
642 implementation defined, the optimizer can't do the latter. The former is
643 challenging as many commonly expected properties, such as
644 ``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
645 Similar restrictions apply to intrinsics that might examine the pointer bits,
646 such as :ref:`llvm.ptrmask<int_ptrmask>`.
648 The alignment information provided by the frontend for a non-integral pointer
649 (typically using attributes or metadata) must be valid for every possible
650 representation of the pointer.
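
The following sketch assumes, purely for illustration, that address space 1
is declared non-integral through an ``ni`` entry in the datalayout string.
It shows why an identity such as ``ptrtoint(v)-ptrtoint(v) == 0`` cannot be
relied on:

.. code-block:: llvm

    ; Assumption: address space 1 is the non-integral one in this sketch.
    target datalayout = "e-ni:1"

    define i64 @observe(ptr addrspace(1) %p) {
      ; Two identical casts of %p are not guaranteed to produce equal integers.
      %a = ptrtoint ptr addrspace(1) %p to i64
      %b = ptrtoint ptr addrspace(1) %p to i64
      %d = sub i64 %a, %b      ; not known to be 0
      ret i64 %d
    }
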

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

665 Global variables can optionally specify a :ref:`linkage type <linkage>`.
667 Either global variable definitions or declarations may have an explicit section
668 to be placed in and may have an optional explicit alignment specified. If there
669 is a mismatch between the explicit or inferred section information for the
670 variable declaration and its definition the resulting behavior is undefined.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

679 LLVM explicitly allows *declarations* of global variables to be marked
680 constant, even if the final definition of the global is not. This
681 capability can be used to enable slightly better optimization of the
682 program, but requires the language definition to guarantee that
683 optimizations based on the 'constantness' are valid for the translation
684 units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

692 Global variables can be marked with ``unnamed_addr`` which indicates
693 that the address is not significant, only the content. Constants marked
694 like this can be merged with other constants if they have the same
695 initializer. Note that a constant with significant address *can* be
696 merged with a ``unnamed_addr`` constant, the result being a constant
697 whose address is significant.
699 If the ``local_unnamed_addr`` attribute is given, the address is known to
700 not be significant within the module.
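As a rough illustration of the two address-significance markers, consider the sketch below. The global names (``@.str.a``, ``@.str.b``, ``@table``) are hypothetical and chosen only for this example.

.. code-block:: llvm

    ; Both constants carry the same initializer and neither address is
    ; significant, so an optimizer is permitted (not required) to merge them.
    @.str.a = private unnamed_addr constant [3 x i8] c"hi\00", align 1
    @.str.b = private unnamed_addr constant [3 x i8] c"hi\00", align 1

    ; The address is only known to be insignificant within this module.
    @table = local_unnamed_addr constant [2 x i32] [i32 1, i32 2], align 4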
702 A global variable may be declared to reside in a target-specific
703 numbered address space. For targets that support them, address spaces
704 may affect how optimizations are performed and/or what target
705 instructions are used to access the variable. The default address space
706 is zero. The address space qualifier must precede any other attributes.
708 LLVM allows an explicit section to be specified for globals. If the
709 target supports it, it will emit globals to the section specified.
710 Additionally, the global can be placed in a comdat if the target has the necessary support.
713 External declarations may have an explicit section specified. Section
714 information is retained in LLVM IR for targets that make use of this
715 information. Attaching section information to an external declaration is an
716 assertion that its definition is located in the specified section. If the
717 definition is located in a different section, the behavior is undefined.
719 LLVM allows an explicit code model to be specified for globals. If the
720 target supports it, it will emit globals in the code model specified,
721 overriding the code model used to compile the translation unit.
722 The allowed values are "tiny", "small", "kernel", "medium", "large".
723 This may be extended in the future to specify global data layout that
724 doesn't cleanly fit into a specific code model.
726 By default, global initializers are optimized by assuming that global
727 variables defined within the module are not modified from their
728 initial values before the start of the global initializer. This is
729 true even for variables potentially accessible from outside the
730 module, including those with external linkage or appearing in
731 ``@llvm.used`` or dllexported variables. This assumption may be suppressed
732 by marking the variable with ``externally_initialized``.
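A minimal sketch of the marker in use follows. The name ``@device_flags`` is hypothetical; the example assumes some mechanism outside the module may write the variable before the module's own initializers run.

.. code-block:: llvm

    ; Optimizers may not assume that this variable still holds its
    ; initializer (0) when global initializers start executing.
    @device_flags = externally_initialized global i32 0, align 4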
734 An explicit alignment may be specified for a global, which must be a
735 power of 2. If not present, or if the alignment is set to zero, the
736 alignment of the global is set by the target to whatever it feels
737 convenient. If an explicit alignment is specified, the global is forced
738 to have exactly that alignment. Targets and optimizers are not allowed
739 to over-align the global if the global has an assigned section. In this
740 case, the extra alignment could be observable: for example, code could
741 assume that the globals are densely packed in their section and try to
742 iterate over them as an array; alignment padding would break this
743 iteration. For TLS variables, the module flag ``MaxTLSAlign``, if present,
744 limits the alignment to the given value. Optimizers are not allowed to
745 impose a stronger alignment on these variables. The maximum alignment
748 For global variable declarations, as well as definitions that may be
749 replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
750 linkage types), the allocation size and alignment of the definition it resolves
751 to must be greater than or equal to that of the declaration or replaceable
752 definition, otherwise the behavior is undefined.
754 Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
755 an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
756 optional :ref:`global attributes <glattrs>` and
757 an optional list of attached :ref:`metadata <metadata>`.
759 Variables and aliases can have a
760 :ref:`Thread Local Storage Model <tls_model>`.
762 Globals cannot be or contain :ref:`Scalable vectors <t_vector>` because their
763 size is unknown at compile time. They are allowed in structs to facilitate
764 intrinsics returning multiple values. Generally, structs containing scalable
765 vectors are not considered "sized" and cannot be used in loads, stores, allocas,
766 or GEPs. The only exception to this rule is for structs that contain scalable
767 vectors of the same type (e.g. ``{<vscale x 2 x i32>, <vscale x 2 x i32>}``
768 contains the same type while ``{<vscale x 2 x i32>, <vscale x 2 x i64>}``
769 doesn't). These kinds of structs (we may call them homogeneous scalable vector
770 structs) are considered sized and can be used in loads, stores, allocas, but not GEPs.
775 @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
776 [DLLStorageClass] [ThreadLocal]
777 [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
778 [ExternallyInitialized]
779 <global | constant> <Type> [<InitializerConstant>]
780 [, section "name"] [, partition "name"]
781 [, comdat [($name)]] [, align <Alignment>]
782 [, code_model "model"]
783 [, no_sanitize_address] [, no_sanitize_hwaddress]
784 [, sanitize_address_dyninit] [, sanitize_memtag]
787 For example, the following defines a global in a numbered address space
788 with an initializer, section, and alignment:
792 @G = addrspace(5) constant float 1.0, section "foo", align 4
794 The following example just declares a global variable:
798 @G = external global i32
800 The following example defines a global variable with the
801 ``large`` code model:
805 @G = internal global i32 0, code_model "large"
807 The following example defines a thread-local global with the
808 ``initialexec`` TLS model:
812 @G = thread_local(initialexec) global i32 0, align 4
814 .. _functionstructure:
819 LLVM function definitions consist of the "``define``" keyword, an
820 optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
821 specifier <runtime_preemption_model>`, an optional :ref:`visibility
822 style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
823 an optional :ref:`calling convention <callingconv>`,
824 an optional ``unnamed_addr`` attribute, a return type, an optional
825 :ref:`parameter attribute <paramattrs>` for the return type, a function
826 name, a (possibly empty) argument list (each with optional :ref:`parameter
827 attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
828 an optional address space, an optional section, an optional partition,
829 an optional alignment, an optional :ref:`comdat <langref_comdats>`,
830 an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
831 an optional :ref:`prologue <prologuedata>`,
832 an optional :ref:`personality <personalityfn>`,
833 an optional list of attached :ref:`metadata <metadata>`,
834 an opening curly brace, a list of basic blocks, and a closing curly brace.
838 define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
840 <ResultType> @<FunctionName> ([argument list])
841 [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
842 [section "name"] [partition "name"] [comdat [($name)]] [align N]
843 [gc] [prefix Constant] [prologue Constant] [personality Constant]
846 The argument list is a comma separated sequence of arguments where each
847 argument is of the following form:
851 <type> [parameter Attrs] [name]
853 LLVM function declarations consist of the "``declare``" keyword, an
854 optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
855 <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
856 optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
857 or ``local_unnamed_addr`` attribute, an optional address space, a return type,
858 an optional :ref:`parameter attribute <paramattrs>` for the return type, a function name, a possibly
859 empty list of arguments, an optional alignment, an optional :ref:`garbage
860 collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
861 :ref:`prologue <prologuedata>`.
865 declare [linkage] [visibility] [DLLStorageClass]
867 <ResultType> @<FunctionName> ([argument list])
868 [(unnamed_addr|local_unnamed_addr)] [align N] [gc]
869 [prefix Constant] [prologue Constant]
871 A function definition contains a list of basic blocks, forming the CFG (Control
872 Flow Graph) for the function. Each basic block may optionally start with a label
873 (giving the basic block a symbol table entry), contains a list of instructions,
874 and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
875 function return). If an explicit label name is not provided, a block is assigned
876 an implicit numbered label, using the next value from the same counter as used
877 for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
878 function entry block does not have an explicit label, it will be assigned label
879 "%0", then the first unnamed temporary in that block will be "%1", etc. If a
880 numeric label is explicitly specified, it must match the numeric label that
881 would be used implicitly.
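The implicit numbering rule can be seen in a small sketch. The function name ``@implicit`` is hypothetical; because the argument is named, the unlabeled entry block takes the first number from the counter.

.. code-block:: llvm

    define i32 @implicit(i32 %x) {
      ; The entry block has no explicit label, so it is implicitly %0.
      %1 = add i32 %x, 1   ; the first unnamed temporary is therefore %1
      %2 = mul i32 %1, 2
      ret i32 %2
    }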
883 The first basic block in a function is special in two ways: it is
884 immediately executed on entrance to the function, and it is not allowed
885 to have predecessor basic blocks (i.e. there can not be any branches to
886 the entry block of a function). Because the block can have no
887 predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
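One consequence is that a loop cannot branch back to the entry block. The sketch below (with the hypothetical function ``@count``) uses a separate ``loop`` block to carry the PHI node.

.. code-block:: llvm

    define i32 @count(i32 %n) {
    entry:
      ; No PHI nodes are allowed here: the entry block has no predecessors.
      br label %loop

    loop:
      ; %loop has two predecessors (%entry and itself), so a PHI is valid.
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp eq i32 %next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %next
    }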
889 LLVM allows an explicit section to be specified for functions. If the
890 target supports it, it will emit functions to the section specified.
891 Additionally, the function can be placed in a COMDAT.
893 An explicit alignment may be specified for a function. If not present,
894 or if the alignment is set to zero, the alignment of the function is set
895 by the target to whatever it feels convenient. If an explicit alignment
896 is specified, the function is forced to have at least that much
897 alignment. All alignments must be a power of 2.
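For instance, a function might request both a section and a minimum alignment as sketched below. The function name and the section name ``.text.isr`` are assumptions made for illustration; whether the section is honored depends on the target.

.. code-block:: llvm

    ; Placed in a custom section with at least 16-byte alignment.
    define void @isr_handler() section ".text.isr" align 16 {
      ret void
    }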
899 If the ``unnamed_addr`` attribute is given, the address is known to not
900 be significant and two identical functions can be merged.
902 If the ``local_unnamed_addr`` attribute is given, the address is known to
903 not be significant within the module.
905 If an explicit address space is not given, it will default to the program
906 address space from the :ref:`datalayout string<langref_datalayout>`.
913 Aliases, unlike functions or variables, don't create any new data. They
914 are just a new symbol and metadata for an existing position.
916 Aliases have a name and an aliasee that is either a global value or a constant expression.
919 Aliases may have an optional :ref:`linkage type <linkage>`, an optional
920 :ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
921 :ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
922 <dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
926 @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
929 The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
930 ``linkonce_odr``, ``weak_odr``, ``external``, ``available_externally``. Note
931 that some system linkers might not correctly handle dropping a weak symbol that is aliased.
934 Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
935 the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point to the same content.
938 If the ``local_unnamed_addr`` attribute is given, the address is known to
939 not be significant within the module.
941 Since aliases are only a second name, some restrictions apply, of which
942 some can only be checked when producing an object file:
944 * The expression defining the aliasee must be computable at assembly
945 time. Since it is just a name, no relocations can be used.
947 * No alias in the expression can be weak as the possibility of the
948 intermediate alias being overridden cannot be represented in an object file.
951 * If the alias has the ``available_externally`` linkage, the aliasee must be an
952 ``available_externally`` global value; otherwise the aliasee can be an
953 expression but no global value in the expression can be a declaration, since
954 that would require a relocation, which is not possible.
956 * If either the alias or the aliasee may be replaced by a symbol outside the
957 module at link time or runtime, no optimization can replace the alias with
958 the aliasee, since the behavior may be different. The alias may be used as a
959 name guaranteed to point to the content in the current module.
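A short sketch of alias definitions follows. All names are hypothetical, and the example assumes opaque pointers (``ptr``), as used by the other examples in this document, rather than the typed-pointer spelling shown in the syntax line.

.. code-block:: llvm

    @counter = global i32 0

    define i32 @impl(i32 %x) {
      ret i32 %x
    }

    ; A second name for @counter; it has the same address as the aliasee.
    @counter_alias = alias i32, ptr @counter

    ; A weak alias for a function defined in this module.
    @impl_alias = weak alias i32 (i32), ptr @impl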
966 IFuncs, like aliases, don't create any new data or functions. They are just a new
967 symbol that is resolved at runtime by calling a resolver function.
969 On ELF platforms, IFuncs are resolved by the dynamic linker at load time. On
970 Mach-O platforms, they are lowered in terms of ``.symbol_resolver`` functions,
971 which lazily resolve the callee the first time they are called.
973 An IFunc may have an optional :ref:`linkage type <linkage>` and an optional
974 :ref:`visibility style <visibility>`.
978 @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
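The following is a minimal sketch of an IFunc and its resolver, assuming opaque pointers. The names ``@square``, ``@square_generic`` and ``@square_resolver`` are hypothetical; a real resolver would typically inspect CPU features before choosing an implementation.

.. code-block:: llvm

    define internal i32 @square_generic(i32 %x) {
      %r = mul i32 %x, %x
      ret i32 %r
    }

    ; The resolver is called at resolution time and returns the address of
    ; the implementation that calls to @square should reach.
    define internal ptr @square_resolver() {
      ret ptr @square_generic
    }

    @square = ifunc i32 (i32), ptr @square_resolver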
987 Comdat IR provides access to object file COMDAT/section group functionality
988 which represents interrelated sections.
990 Comdats have a name which represents the COMDAT key and a selection kind to
991 provide input on how the linker deduplicates comdats with the same key in two
992 different object files. A comdat must be included or omitted as a unit.
993 Discarding the whole comdat is allowed but discarding a subset is not.
995 A global object may be a member of at most one comdat. Aliases are placed in the
996 same COMDAT that their aliasee computes to, if any.
1000 $<Name> = comdat SelectionKind
1002 For selection kinds other than ``nodeduplicate``, only one of the duplicate
1003 comdats may be retained by the linker and the members of the remaining comdats
1004 must be discarded. The following selection kinds are supported:
1007 The linker may choose any COMDAT key; the choice is arbitrary.
1009 The linker may choose any COMDAT key but the sections must contain the same data.
1012 The linker will choose the section containing the largest COMDAT key.
1014 No deduplication is performed.
1016 The linker may choose any COMDAT key but the sections must contain the
1017 same amount of data.
1019 - XCOFF and Mach-O don't support COMDATs.
1020 - COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
1021 a non-local linkage COMDAT symbol.
1022 - ELF supports ``any`` and ``nodeduplicate``.
1023 - WebAssembly only supports ``any``.
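Since ``any`` is the one selection kind that deduplicates on every format with COMDAT support, a portable sketch might look as follows. The function name ``@get_value`` is hypothetical.

.. code-block:: llvm

    $get_value = comdat any

    ; The bare comdat keyword refers to the comdat named after the function.
    ; If several object files define this group, the linker keeps one copy.
    define linkonce_odr i32 @get_value() comdat {
      ret i32 7
    }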
1025 Here is an example of a COFF COMDAT where a function will only be selected if
1026 the COMDAT key's section is the largest:
1028 .. code-block:: text
1030 $foo = comdat largest
1031 @foo = global i32 2, comdat($foo)
1033 define void @bar() comdat($foo) {
1034   ret void
1035 }
1037 In a COFF object file, this will create a COMDAT section with selection kind
1038 ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
1039 and another COMDAT section with selection kind
1040 ``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
1041 section and contains the contents of the ``@bar`` symbol.
1043 As syntactic sugar, the ``$name`` can be omitted if the name is the same as the global name:
1046 .. code-block:: llvm
1048 $foo = comdat any
1049 @foo = global i32 2, comdat
1050 @bar = global i32 3, comdat($foo)
1052 There are some restrictions on the properties of the global object.
1053 It, or an alias to it, must have the same name as the COMDAT group when targeting COFF.
1055 The contents and size of this object may be used during link-time to determine
1056 which COMDAT groups get selected depending on the selection kind.
1057 Because the name of the object must match the name of the COMDAT group, the
1058 linkage of the global object must not be local; local symbols can get renamed
1059 if a collision occurs in the symbol table.
1061 The combined use of COMDATs and section attributes may yield surprising results.
1064 .. code-block:: llvm
1066 $foo = comdat any
1067 $bar = comdat any
1068 @g1 = global i32 42, section "sec", comdat($foo)
1069 @g2 = global i32 42, section "sec", comdat($bar)
1071 From the object file perspective, this requires the creation of two sections
1072 with the same name. This is necessary because both globals belong to different
1073 COMDAT groups and COMDATs, at the object file level, are represented by sections.
1076 Note that certain IR constructs like global variables and functions may
1077 create COMDATs in the object file in addition to any which are specified using
1078 COMDAT IR. This arises when the code generator is configured to emit globals
1079 in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
1080 is supplied to ``llc``).
1082 .. _namedmetadatastructure:
1087 Named metadata is a collection of metadata. :ref:`Metadata
1088 nodes <metadata>` (but not metadata strings) are the only valid
1089 operands for a named metadata.
1091 #. Named metadata are represented as a string of characters with the
1092 metadata prefix. The rules for metadata names are the same as for
1093 identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1094 are still valid, which allows any character to be part of a name.
1098 ; Some unnamed metadata nodes, which are referenced by the named metadata.
1103 !name = !{!0, !1, !2}
1107 Parameter Attributes
1108 --------------------
1110 The return type and each parameter of a function type may have a set of
1111 *parameter attributes* associated with them. Parameter attributes are
1112 used to communicate additional information about the result or
1113 parameters of a function. Parameter attributes are considered to be part
1114 of the function, not of the function type, so functions with different
1115 parameter attributes can have the same function type.
1117 Parameter attributes are simple keywords that follow the type specified.
1118 If multiple parameter attributes are needed, they are space separated.
1121 .. code-block:: llvm
1123 declare i32 @printf(ptr noalias nocapture, ...)
1124 declare i32 @atoi(i8 zeroext)
1125 declare signext i8 @returns_signed_char()
1127 Note that any attributes for the function result (``nonnull``,
1128 ``signext``) come before the result type.
1130 Currently, only the following parameter attributes are defined:
1133 This indicates to the code generator that the parameter or return
1134 value should be zero-extended to the extent required by the target's
1135 ABI by the caller (for a parameter) or the callee (for a return value).
1137 This indicates to the code generator that the parameter or return
1138 value should be sign-extended to the extent required by the target's
1139 ABI (which is usually 32 bits) by the caller (for a parameter) or
1140 the callee (for a return value).
1142 This indicates that this parameter or return value should be treated
1143 in a special target-dependent fashion while emitting code for
1144 a function call or return (usually, by putting it in a register as
1145 opposed to memory, though some targets use it to distinguish between
1146 two different kinds of registers). Use of this attribute is target-specific.
1149 This indicates that the pointer parameter should really be passed by
1150 value to the function. The attribute implies that a hidden copy of
1151 the pointee is made between the caller and the callee, so the callee
1152 is unable to modify the value in the caller. This attribute is only
1153 valid on LLVM pointer arguments. It is generally used to pass
1154 structs and arrays by value, but is also valid on pointers to
1155 scalars. The copy is considered to belong to the caller not the
1156 callee (for example, ``readonly`` functions should not write to
1157 ``byval`` parameters). This is not a valid attribute for return values.
1160 The byval type argument indicates the in-memory value type, and
1161 must be the same as the pointee type of the argument.
1163 The byval attribute also supports specifying an alignment with the
1164 align attribute. It indicates the alignment of the stack slot to
1165 form and the known alignment of the pointer specified to the call
1166 site. If the alignment is not specified, then the code generator
1167 makes a target-specific assumption.
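A sketch of a ``byval`` call is shown below. The struct type ``%struct.Pair`` and the functions ``@consume`` and ``@caller`` are hypothetical names introduced for the example.

.. code-block:: llvm

    %struct.Pair = type { i32, i32 }

    declare void @consume(ptr byval(%struct.Pair) align 4)

    define void @caller(ptr %p) {
      ; The callee operates on a hidden copy of the pointee, so any writes
      ; made by @consume are not visible through %p.
      call void @consume(ptr byval(%struct.Pair) align 4 %p)
      ret void
    }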
1173 The ``byref`` argument attribute allows specifying the pointee
1174 memory type of an argument. This is similar to ``byval``, but does
1175 not imply a copy is made anywhere, or that the argument is passed
1176 on the stack. This implies the pointer is dereferenceable up to
1177 the storage size of the type.
1179 It is not generally permissible to introduce a write to a
1180 ``byref`` pointer. The pointer may have any address space and may be read only.
1183 This is not a valid attribute for return values.
1185 The alignment for a ``byref`` parameter can be explicitly
1186 specified by combining it with the ``align`` attribute, similar to
1187 ``byval``. If the alignment is not specified, then the code generator
1188 makes a target-specific assumption.
1190 This is intended for representing ABI constraints, and is not
1191 intended to be inferred for optimization use.
1193 .. _attr_preallocated:
1195 ``preallocated(<ty>)``
1196 This indicates that the pointer parameter should really be passed by
1197 value to the function, and that the pointer parameter's pointee has
1198 already been initialized before the call instruction. This attribute
1199 is only valid on LLVM pointer arguments. The argument must be the value
1200 returned by the appropriate
1201 :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1202 ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1203 calls, although it is ignored during codegen.
1205 A non-``musttail`` function call with a ``preallocated`` attribute in
1206 any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1207 function call cannot have a ``"preallocated"`` operand bundle.
1209 The preallocated attribute requires a type argument, which must be
1210 the same as the pointee type of the argument.
1212 The preallocated attribute also supports specifying an alignment with the
1213 align attribute. It indicates the alignment of the stack slot to
1214 form and the known alignment of the pointer specified to the call
1215 site. If the alignment is not specified, then the code generator
1216 makes a target-specific assumption.
1222 The ``inalloca`` argument attribute allows the caller to take the
1223 address of outgoing stack arguments. An ``inalloca`` argument must
1224 be a pointer to stack memory produced by an ``alloca`` instruction.
1225 The alloca, or argument allocation, must also be tagged with the
1226 inalloca keyword. Only the last argument may have the ``inalloca``
1227 attribute, and that argument is guaranteed to be passed in memory.
1229 An argument allocation may be used by a call at most once because
1230 the call may deallocate it. The ``inalloca`` attribute cannot be
1231 used in conjunction with other attributes that affect argument
1232 storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1233 ``inalloca`` attribute also disables LLVM's implicit lowering of
1234 large aggregate return values, which means that frontend authors
1235 must lower them with ``sret`` pointers.
1237 When the call site is reached, the argument allocation must have
1238 been the most recent stack allocation that is still live, or the
1239 behavior is undefined. It is possible to allocate additional stack
1240 space after an argument allocation and before its call site, but it
1241 must be cleared off with :ref:`llvm.stackrestore
1242 <int_stackrestore>`.
1244 The inalloca attribute requires a type argument, which must be the
1245 same as the pointee type of the argument.
1247 See :doc:`InAlloca` for more information on how to use this attribute.
1251 This indicates that the pointer parameter specifies the address of a
1252 structure that is the return value of the function in the source
1253 program. This pointer must be guaranteed by the caller to be valid:
1254 loads and stores to the structure may be assumed by the callee not
1255 to trap and to be properly aligned. This is not a valid attribute for return values.
1258 The sret type argument specifies the in memory type, which must be
1259 the same as the pointee type of the argument.
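As an illustrative sketch, a function returning a struct through an ``sret`` pointer could be written as follows. The type ``%struct.Result`` and the function ``@make_result`` are hypothetical.

.. code-block:: llvm

    %struct.Result = type { i64, i64 }

    define void @make_result(ptr sret(%struct.Result) align 8 %out) {
      ; The caller guarantees that %out is valid and suitably aligned, so
      ; the callee may store both fields without any checks.
      %second = getelementptr inbounds %struct.Result, ptr %out, i32 0, i32 1
      store i64 1, ptr %out, align 8
      store i64 2, ptr %second, align 8
      ret void
    }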
1261 .. _attr_elementtype:
1263 ``elementtype(<ty>)``
1265 The ``elementtype`` argument attribute can be used to specify a pointer
1266 element type in a way that is compatible with `opaque pointers
1267 <OpaquePointers.html>`__.
1269 The ``elementtype`` attribute by itself does not carry any specific
1270 semantics. However, certain intrinsics may require this attribute to be
1271 present and assign it particular semantics. This will be documented on
1272 individual intrinsics.
1274 The attribute may only be applied to pointer typed arguments of intrinsic
1275 calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1276 to parameters on function declarations. For non-opaque pointers, the type
1277 passed to ``elementtype`` must match the pointer element type.
1281 ``align <n>`` or ``align(<n>)``
1282 This indicates that the pointer value or vector of pointers has the
1283 specified alignment. If applied to a vector of pointers, *all* pointers
1284 (elements) have the specified alignment. If the pointer value does not have
1285 the specified alignment, :ref:`poison value <poisonvalues>` is returned or
1286 passed instead. The ``align`` attribute should be combined with the
1287 ``noundef`` attribute to ensure a pointer is aligned, or otherwise the
1288 behavior is undefined. Note that ``align 1`` has no effect on non-byval,
1289 non-preallocated arguments.
1291 Note that this attribute has additional semantics when combined with the
1292 ``byval`` or ``preallocated`` attribute, which are documented there.
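A small sketch of ``align`` combined with ``noundef`` follows; the function name ``@load_aligned`` is hypothetical.

.. code-block:: llvm

    ; Because the parameter is also noundef, passing a pointer that is not
    ; 16-byte aligned is undefined behavior rather than merely poison.
    define i32 @load_aligned(ptr noundef align 16 %p) {
      %v = load i32, ptr %p, align 16
      ret i32 %v
    }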
1297 This indicates that memory locations accessed via pointer values
1298 :ref:`based <pointeraliasing>` on the argument or return value are not also
1299 accessed, during the execution of the function, via pointer values not
1300 *based* on the argument or return value. This guarantee only holds for
1301 memory locations that are *modified*, by any means, during the execution of
1302 the function. The attribute on a return value also has additional semantics
1303 described below. The caller shares the responsibility with the callee for
1304 ensuring that these requirements are met. For further details, please see
1305 the discussion of the NoAlias response in :ref:`alias analysis <Must, May, or No>`.
1308 Note that this definition of ``noalias`` is intentionally similar
1309 to the definition of ``restrict`` in C99 for function arguments.
1311 For function return values, C99's ``restrict`` is not meaningful,
1312 while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1313 attribute on return values are stronger than the semantics of the attribute
1314 when used on function arguments. On function return values, the ``noalias``
1315 attribute indicates that the function acts like a system memory allocation
1316 function, returning a pointer to allocated storage disjoint from the
1317 storage for any other object accessible to the caller.
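Both uses of ``noalias`` can be sketched as follows. The names ``@my_alloc`` and ``@add_into`` are hypothetical and stand in for an allocator-like function and an ordinary callee.

.. code-block:: llvm

    ; On a return value: the result behaves like freshly allocated storage.
    declare noalias ptr @my_alloc(i64)

    ; On arguments: the memory written through %dst must not also be
    ; accessed through %src (or any pointer not based on %dst) during the
    ; execution of the function.
    define void @add_into(ptr noalias %dst, ptr noalias %src) {
      %a = load i32, ptr %src, align 4
      %b = load i32, ptr %dst, align 4
      %sum = add i32 %a, %b
      store i32 %sum, ptr %dst, align 4
      ret void
    }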
1322 This indicates that the callee does not :ref:`capture <pointercapture>` the
1323 pointer. This is not a valid attribute for return values.
1324 This attribute applies only to the particular copy of the pointer passed in
1325 this argument. A caller could pass two copies of the same pointer with one
1326 being annotated nocapture and the other not, and the callee could validly
1327 capture through the non-annotated parameter.
1329 .. code-block:: llvm
1331 define void @f(ptr nocapture %a, ptr %b) {
1332   ; (capture %b)
1333 }
1335 call void @f(ptr @glb, ptr @glb) ; well-defined
1338 This indicates that the callee does not free the pointer argument. This is not
1339 a valid attribute for return values.
1344 This indicates that the pointer parameter can be excised using the
1345 :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1346 attribute for return values and can only be applied to one parameter.
1349 This indicates that the function always returns the argument as its return
1350 value. This is a hint to the optimizer and code generator used when
1351 generating the caller, allowing value propagation, tail call optimization,
1352 and omission of register saves and restores in some cases; it is not
1353 checked or enforced when generating the callee. The parameter and the
1354 function return type must be valid operands for the
1355 :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1356 return values and can only be applied to one parameter.
1359 This indicates that the parameter or return pointer is not null. This
1360 attribute may only be applied to pointer typed parameters. This is not
1361 checked or enforced by LLVM; if the parameter or return pointer is null,
1362 :ref:`poison value <poisonvalues>` is returned or passed instead.
1363 The ``nonnull`` attribute should be combined with the ``noundef`` attribute
1364 to ensure a pointer is not null or otherwise the behavior is undefined.
1366 ``dereferenceable(<n>)``
1367 This indicates that the parameter or return pointer is dereferenceable. This
1368 attribute may only be applied to pointer typed parameters. A pointer that
1369 is dereferenceable can be loaded from speculatively without a risk of
1370 trapping. The number of bytes known to be dereferenceable must be provided
1371 in parentheses. It is legal for the number of bytes to be less than the
1372 size of the pointee type. The ``nonnull`` attribute does not imply
1373 dereferenceability (consider a pointer to one element past the end of an
1374 array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1375 ``addrspace(0)`` (which is the default address space), except if the
1376 ``null_pointer_is_valid`` function attribute is present.
1377 ``n`` should be a positive number. The pointer should be well defined,
1378 otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
1379 implies ``noundef``.
1381 ``dereferenceable_or_null(<n>)``
1382 This indicates that the parameter or return value isn't both
1383 non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1384 time. All non-null pointers tagged with
1385 ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1386 For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1387 a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1388 and in other address spaces ``dereferenceable_or_null(<n>)``
1389 implies that a pointer is at least one of ``dereferenceable(<n>)``
1390 or ``null`` (i.e. it may be both ``null`` and
1391 ``dereferenceable(<n>)``). This attribute may only be applied to
1392 pointer typed parameters.
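The pointer-validity attributes above might be combined as in the sketch below. The functions ``@read`` and ``@lookup`` are hypothetical.

.. code-block:: llvm

    ; %p is known to be non-null, well defined, 4-byte aligned, and to
    ; point to at least 4 readable bytes, so the load cannot trap.
    define i32 @read(ptr nonnull noundef align 4 dereferenceable(4) %p) {
      %v = load i32, ptr %p, align 4
      ret i32 %v
    }

    ; The result is either null or points to at least 8 dereferenceable
    ; bytes, so callers still need a null check before loading.
    declare dereferenceable_or_null(8) ptr @lookup(i32)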
1395 This indicates that the parameter is the self/context parameter. This is not
1396 a valid attribute for return values and can only be applied to one parameter.
1402 This indicates that the parameter is the asynchronous context parameter and
1403 triggers the creation of a target-specific extended frame record to store
1404 this pointer. This is not a valid attribute for return values and can only
1405 be applied to one parameter.
1408 This attribute is intended to model and optimize Swift error handling. It
1409 can be applied to a parameter with pointer to pointer type or a
1410 pointer-sized alloca. At the call site, the actual argument that corresponds
1411 to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
1412 the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
1413 the parameter or the alloca) can only be loaded from and stored to, or used as
1414 a ``swifterror`` argument. This is not a valid attribute for return values
1415 and can only be applied to one parameter.
1417 These constraints allow the calling convention to optimize access to
1418 ``swifterror`` variables by associating them with a specific register at
1419 call boundaries rather than placing them in memory. Since this does change
1420 the calling convention, a function which uses the ``swifterror`` attribute
1421 on a parameter is not ABI-compatible with one which does not.
1423 These constraints also allow LLVM to assume that a ``swifterror`` argument
1424 does not alias any other memory visible within a function and that a
1425 ``swifterror`` alloca passed as an argument does not escape.
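A minimal sketch of the usage pattern these constraints permit. The callee ``@may_fail`` and the convention that a non-null stored pointer signals failure are assumptions made for illustration only:

.. code-block:: llvm

   ; Hypothetical callee that reports an error through its swifterror parameter.
   declare swiftcc void @may_fail(ptr swifterror)

   define swiftcc i1 @caller() {
   entry:
     ; The argument must come from a swifterror alloca (or from the
     ; caller's own swifterror parameter).
     %err = alloca swifterror ptr
     store ptr null, ptr %err
     call swiftcc void @may_fail(ptr swifterror %err)
     ; The swifterror value is only loaded, stored, or passed on.
     %e = load ptr, ptr %err
     %failed = icmp ne ptr %e, null
     ret i1 %failed
   }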
1428 This indicates the parameter is required to be an immediate
1429 value. This must be a trivial immediate integer or floating-point
1430 constant. Undef or constant expressions are not valid. This is
1431 only valid on intrinsic declarations and cannot be applied to a
1432 call site or arbitrary function.
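For example, the ``isvolatile`` operand of the ``llvm.memcpy`` intrinsic is declared ``immarg``, so every call must pass a literal constant in that position. The wrapper ``@copy16`` below is a hypothetical name used only for this sketch:

.. code-block:: llvm

   declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1 immarg)

   define void @copy16(ptr %dst, ptr %src) {
     ; The last operand is an immediate "i1 false"; passing an SSA value,
     ; undef, or a constant expression here would be rejected.
     call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 16, i1 false)
     ret void
   }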
1435 This attribute applies to parameters and return values. If the value
1436 representation contains any undefined or poison bits, the behavior is
1437 undefined. Note that this does not refer to padding introduced by the
1438 type's storage representation.
1442 ``nofpclass(<test mask>)``
1443 This attribute applies to parameters and return values with
1444 floating-point and vector of floating-point types, as well as
1445 arrays of such types. The test mask has the same format as the
1446 second argument to the :ref:`llvm.is.fpclass <llvm.is.fpclass>` intrinsic,
1447 and indicates which classes of floating-point values are not
1448 permitted for the value. For example a bitmask of 3 indicates
1449 the parameter may not be a NaN.
1451 If the value is a floating-point class indicated by the
1452 ``nofpclass`` test mask, a :ref:`poison value <poisonvalues>` is
1453 passed or returned instead.
1455 .. code-block:: text
1456 :caption: The following invariants hold
1458 @llvm.is.fpclass(nofpclass(test_mask) %x, test_mask) => false
1459 @llvm.is.fpclass(nofpclass(test_mask) %x, ~test_mask) => true
1460 nofpclass(all) => poison
1463 In textual IR, various string names are supported for readability
1464 and can be combined. For example ``nofpclass(nan pinf nzero)``
1465 evaluates to a mask of 547.
1467 This does not depend on the floating-point environment. For
1468 example, a function parameter marked ``nofpclass(zero)`` indicates
1469 no zero inputs. If this is applied to an argument in a function
1470 marked with :ref:`\"denormal-fp-math\" <denormal_fp_math>`
1471 indicating zero treatment of input denormals, it does not imply the
1472 value cannot be a denormal value which would compare equal to 0.
1474 .. table:: Recognized test mask names
1476 +-------+----------------------+---------------+
1477 | Name | floating-point class | Bitmask value |
1478 +=======+======================+===============+
1479 | nan | Any NaN | 3 |
1480 +-------+----------------------+---------------+
1481 | inf | +/- infinity | 516 |
1482 +-------+----------------------+---------------+
1483 | norm | +/- normal | 264 |
1484 +-------+----------------------+---------------+
1485 | sub | +/- subnormal | 144 |
1486 +-------+----------------------+---------------+
1487 | zero | +/- 0 | 96 |
1488 +-------+----------------------+---------------+
1489 | all | All values | 1023 |
1490 +-------+----------------------+---------------+
1491 | snan | Signaling NaN | 1 |
1492 +-------+----------------------+---------------+
1493 | qnan | Quiet NaN | 2 |
1494 +-------+----------------------+---------------+
1495 | ninf | Negative infinity | 4 |
1496 +-------+----------------------+---------------+
1497 | nnorm | Negative normal | 8 |
1498 +-------+----------------------+---------------+
1499 | nsub | Negative subnormal | 16 |
1500 +-------+----------------------+---------------+
1501 | nzero | Negative zero | 32 |
1502 +-------+----------------------+---------------+
1503 | pzero | Positive zero | 64 |
1504 +-------+----------------------+---------------+
1505 | psub | Positive subnormal | 128 |
1506 +-------+----------------------+---------------+
1507 | pnorm | Positive normal | 256 |
1508 +-------+----------------------+---------------+
1509 | pinf | Positive infinity | 512 |
1510 +-------+----------------------+---------------+
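As a small sketch of how the names combine (the function ``@scale`` is hypothetical), ``nan`` (3) together with ``inf`` (516) yields a mask of 519:

.. code-block:: llvm

   ; %x is assumed to be neither NaN nor infinite (mask 3 | 516 = 519).
   ; Doubling such a value may overflow to infinity but cannot produce
   ; a NaN, so only "nan" is excluded on the return value.
   define nofpclass(nan) float @scale(float nofpclass(nan inf) %x) {
     %r = fmul float %x, 2.0
     ret float %r
   }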
1514 This indicates the alignment that should be considered by the backend when
1515 assigning this parameter to a stack slot during calling convention
1516 lowering. The enforcement of the specified alignment is target-dependent,
1517 as target-specific calling convention rules may override this value. This
1518 attribute serves the purpose of carrying language specific alignment
1519 information that is not mapped to base types in the backend (for example,
1520 over-alignment specification through language attributes).
1523 The function parameter marked with this attribute is the alignment in bytes of the
1524 newly allocated block returned by this function. The returned value must either have
1525 the specified alignment or be the null pointer. The return value MAY be more aligned
1526 than the requested alignment, but not less aligned. Invalid (e.g. non-power-of-2)
1527 alignments are permitted for the allocalign parameter, so long as the returned pointer
1528 is null. This attribute may only be applied to integer parameters.
1531 The function parameter marked with this attribute is the pointer
1532 that will be manipulated by the allocator. For a realloc-like
1533 function the pointer will be invalidated upon success (but the
1534 same address may be returned), for a free-like function the
1535 pointer will always be invalidated.
1538 This attribute indicates that the function does not dereference that
1539 pointer argument, even though it may read or write the memory that the
1540 pointer points to if accessed through other pointers.
1542 If a function reads from or writes to a readnone pointer argument, the
1543 behavior is undefined.
1546 This attribute indicates that the function does not write through this
1547 pointer argument, even though it may write to the memory that the pointer
1550 If a function writes to a readonly pointer argument, the behavior is
1554 This attribute indicates that the function may write to, but does not read
1555 through this pointer argument (even though it may read from the memory that
1556 the pointer points to).
1558 If a function reads from a writeonly pointer argument, the behavior is
1562 This attribute is only meaningful in conjunction with ``dereferenceable(N)``
1563 or another attribute that implies the first ``N`` bytes of the pointer
1564 argument are dereferenceable.
1566 In that case, the attribute indicates that the first ``N`` bytes will be
1567 (non-atomically) loaded and stored back on entry to the function.
1569 This implies that it's possible to introduce spurious stores on entry to
1570 the function without introducing traps or data races. This does not
1571 necessarily hold throughout the whole function, as the pointer may escape
1572 to a different thread during the execution of the function. See also the
1573 :ref:`atomic optimization guide <Optimization outside atomic>`.
1575 The "other attributes" that imply dereferenceability are
1576 ``dereferenceable_or_null`` (if the pointer is non-null) and the
1577 ``sret``, ``byval``, ``byref``, ``inalloca``, ``preallocated`` family of
1578 attributes. Note that not all of these combinations are useful, e.g.
1579 ``byval`` arguments are known to be writable even without this attribute.
1581 The ``writable`` attribute cannot be combined with ``readnone``,
1582 ``readonly`` or a ``memory`` attribute that does not contain
1586 At a high level, this attribute indicates that the pointer argument is dead
1587 if the call unwinds, in the sense that the caller will not depend on the
1588 contents of the memory. Stores that would only be visible on the unwind
1591 More precisely, the behavior is as-if any memory written through the
1592 pointer during the execution of the function is overwritten with a poison
1593 value on unwind. This includes memory written by the implicit write implied
1594 by the ``writable`` attribute. The caller is allowed to access the affected
1595 memory, but all loads that are not preceded by a store will return poison.
1597 This attribute cannot be applied to return values.
1601 Garbage Collector Strategy Names
1602 --------------------------------
1604 Each function may specify a garbage collector strategy name, which is simply a
1607 .. code-block:: llvm
1609 define void @f() gc "name" { ... }
1611 The supported values of *name* include those :ref:`built in to LLVM
1612 <builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
1613 strategy will cause the compiler to alter its output in order to support the
1614 named garbage collection algorithm. Note that LLVM itself does not contain a
1615 garbage collector; this functionality is restricted to generating machine code
1616 which can interoperate with a collector provided externally.
1623 Prefix data is data associated with a function which the code
1624 generator will emit immediately before the function's entrypoint.
1625 The purpose of this feature is to allow frontends to associate
1626 language-specific runtime metadata with specific functions and make it
1627 available through the function pointer while still allowing the
1628 function pointer to be called.
1630 To access the data for a given function, a program may bitcast the
1631 function pointer to a pointer to the constant's type and dereference
1632 index -1. This implies that the IR symbol points just past the end of
1633 the prefix data. For instance, take the example of a function annotated
1634 with a single ``i32``,
1636 .. code-block:: llvm
1638 define void @f() prefix i32 123 { ... }
1640 The prefix data can be referenced as,
1642 .. code-block:: llvm
1644 %a = getelementptr inbounds i32, ptr @f, i32 -1
1645 %b = load i32, ptr %a
1647 Prefix data is laid out as if it were an initializer for a global variable
1648 of the prefix data's type. The function will be placed such that the
1649 beginning of the prefix data is aligned. This means that if the size
1650 of the prefix data is not a multiple of the alignment size, the
1651 function's entrypoint will not be aligned. If alignment of the
1652 function's entrypoint is desired, padding must be added to the prefix
1655 A function may have prefix data but no body. This has similar semantics
1656 to the ``available_externally`` linkage in that the data may be used by the
1657 optimizers but will not be emitted in the object file.
1664 The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1665 be inserted prior to the function body. This can be used for enabling
1666 function hot-patching and instrumentation.
1668 To maintain the semantics of ordinary function calls, the prologue data must
1669 have a particular format. Specifically, it must begin with a sequence of
1670 bytes which decode to a sequence of machine instructions, valid for the
1671 module's target, which transfer control to the point immediately succeeding
1672 the prologue data, without performing any other visible action. This allows
1673 the inliner and other passes to reason about the semantics of the function
1674 definition without needing to reason about the prologue data. Obviously this
1675 makes the format of the prologue data highly target dependent.
1677 A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1678 which encodes the ``nop`` instruction:
1680 .. code-block:: text
1682 define void @f() prologue i8 144 { ... }
1684 Generally prologue data can be formed by encoding a relative branch instruction
1685 which skips the metadata, as in this example of valid prologue data for the
1686 x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1688 .. code-block:: text
1690 %0 = type <{ i8, i8, ptr }>
1692 define void @f() prologue %0 <{ i8 235, i8 8, ptr @md}> { ... }
1694 A function may have prologue data but no body. This has similar semantics
1695 to the ``available_externally`` linkage in that the data may be used by the
1696 optimizers but will not be emitted in the object file.
1700 Personality Function
1701 --------------------
1703 The ``personality`` attribute permits functions to specify what function
1704 to use for exception handling.
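A minimal sketch, assuming the Itanium C++ ABI personality routine ``__gxx_personality_v0`` is available at link time; ``@may_throw`` and ``@f`` are hypothetical names:

.. code-block:: llvm

   declare i32 @__gxx_personality_v0(...)
   declare void @may_throw()

   define void @f() personality ptr @__gxx_personality_v0 {
   entry:
     invoke void @may_throw()
             to label %cont unwind label %lpad

   cont:
     ret void

   lpad:
     ; Run cleanups (none here), then continue unwinding.
     %lp = landingpad { ptr, i32 }
             cleanup
     resume { ptr, i32 } %lp
   }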
1711 Attribute groups are groups of attributes that are referenced by objects within
1712 the IR. They are important for keeping ``.ll`` files readable, because a lot of
1713 functions will use the same set of attributes. In the degenerate case of a
1714 ``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1715 group will capture the important command line flags used to build that file.
1717 An attribute group is a module-level object. To use an attribute group, an
1718 object references the attribute group's ID (e.g. ``#37``). An object may refer
1719 to more than one attribute group. In that situation, the attributes from the
1720 different groups are merged.
1722 Here is an example of attribute groups for a function that should always be
1723 inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1725 .. code-block:: llvm
1727 ; Target-independent attributes:
1728 attributes #0 = { alwaysinline alignstack=4 }
1730 ; Target-dependent attributes:
1731 attributes #1 = { "no-sse" }
1733 ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1734 define void @f() #0 #1 { ... }
1741 Function attributes are set to communicate additional information about
1742 a function. Function attributes are considered to be part of the
1743 function, not of the function type, so functions with different function
1744 attributes can have the same function type.
1746 Function attributes are simple keywords that follow the type specified.
1747 If multiple attributes are needed, they are space separated. For
1750 .. code-block:: llvm
1752 define void @f() noinline { ... }
1753 define void @f() alwaysinline { ... }
1754 define void @f() alwaysinline optsize { ... }
1755 define void @f() optsize { ... }
1758 This attribute indicates that, when emitting the prologue and
1759 epilogue, the backend should forcibly align the stack pointer.
1760 Specify the desired alignment, which must be a power of two, in
1762 ``"alloc-family"="FAMILY"``
1763 This indicates which "family" an allocator function is part of. To avoid
1764 collisions, the family name should match the mangled name of the primary
1765 allocator function, that is "malloc" for malloc/calloc/realloc/free,
1766 "_Znwm" for ``::operator new`` and ``::operator delete``, and
1767 "_ZnwmSt11align_val_t" for aligned ``::operator new`` and
1768 ``::operator delete``. Matching malloc/realloc/free calls within a family
1769 can be optimized, but mismatched ones will be left alone.
1770 ``allockind("KIND")``
1771 Describes the behavior of an allocation function. The KIND string contains comma
1772 separated entries from the following options:
1774 * "alloc": the function returns a new block of memory or null.
1775 * "realloc": the function returns a new block of memory or null. If the
1776 result is non-null the memory contents from the start of the block up to
1777 the smaller of the original allocation size and the new allocation size
1778 will match that of the ``allocptr`` argument and the ``allocptr``
1779 argument is invalidated, even if the function returns the same address.
1780 * "free": the function frees the block of memory specified by ``allocptr``.
1781 Functions marked as "free" ``allockind`` must return void.
1782 * "uninitialized": Any newly-allocated memory (either a new block from
1783 an "alloc" function or the enlarged capacity from a "realloc" function)
1784 will be uninitialized.
1785 * "zeroed": Any newly-allocated memory (either a new block from an "alloc"
1786 function or the enlarged capacity from a "realloc" function) will be
1788 * "aligned": the function returns memory aligned according to the
1789 ``allocalign`` parameter.
1791 The first three options are mutually exclusive, and the remaining options
1792 describe more details of how the function behaves. The remaining options
1793 are invalid for "free"-type functions.
1794 ``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1795 This attribute indicates that the annotated function will always return at
1796 least a given number of bytes (or null). Its arguments are zero-indexed
1797 parameter numbers; if one argument is provided, then it's assumed that at
1798 least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1799 returned pointer. If two are provided, then it's assumed that
1800 ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1801 available. The referenced parameters must be integer types. No assumptions
1802 are made about the contents of the returned block of memory.
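The allocator attributes are typically used together. The declarations below sketch a hypothetical allocator family (none of these ``@my_*`` functions exist in LLVM) whose family name follows the primary allocation function:

.. code-block:: llvm

   ; Plain allocation: returns at least %size bytes of uninitialized memory, or null.
   declare noalias ptr @my_alloc(i64) allockind("alloc,uninitialized") allocsize(0) "alloc-family"="my_alloc"

   ; calloc-like: size is the product of parameter 0 and parameter 1; memory is zeroed.
   declare noalias ptr @my_calloc(i64, i64) allockind("alloc,zeroed") allocsize(0, 1) "alloc-family"="my_alloc"

   ; Aligned allocation: parameter 0 is the alignment, parameter 1 the size.
   declare noalias ptr @my_aligned_alloc(i64 allocalign, i64) allockind("alloc,uninitialized,aligned") allocsize(1) "alloc-family"="my_alloc"

   ; realloc-like: the allocptr argument is invalidated on success.
   declare ptr @my_realloc(ptr allocptr, i64) allockind("realloc") allocsize(1) "alloc-family"="my_alloc"

   ; free-like: must return void and takes no "uninitialized"/"zeroed"/"aligned" modifiers.
   declare void @my_free(ptr allocptr) allockind("free") "alloc-family"="my_alloc"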
1804 This attribute indicates that the inliner should attempt to inline
1805 this function into callers whenever possible, ignoring any active
1806 inlining size threshold for this caller.
1808 This indicates that the callee function at a call site should be
1809 recognized as a built-in function, even though the function's declaration
1810 uses the ``nobuiltin`` attribute. This is only valid at call sites for
1811 direct calls to functions that are declared with the ``nobuiltin``
1814 This attribute indicates that this function is rarely called. When
1815 computing edge weights, basic blocks post-dominated by a cold
1816 function call are also considered to be cold; and, thus, given low
1819 .. _attr_convergent:
1822 This attribute indicates that this function is convergent.
1823 When it appears on a call/invoke, the convergent attribute
1824 indicates that we should treat the call as though we’re calling a
1825 convergent function. This is particularly useful on indirect
1826 calls; without this we may treat such calls as though the target
1829 See :doc:`ConvergentOperations` for further details.
1831 It is an error to call :ref:`llvm.experimental.convergence.entry
1832 <llvm.experimental.convergence.entry>` from a function that
1833 does not have this attribute.
1834 ``disable_sanitizer_instrumentation``
1835 When instrumenting code with sanitizers, it can be important to skip certain
1836 functions to ensure no instrumentation is applied to them.
1838 This attribute is not always similar to absent ``sanitize_<name>``
1839 attributes: depending on the specific sanitizer, code can be inserted into
1840 functions regardless of the ``sanitize_<name>`` attribute to prevent false
1843 ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
1844 taking precedence over the ``sanitize_<name>`` attributes and other compiler
1846 ``"dontcall-error"``
1847 This attribute denotes that an error diagnostic should be emitted when a
1848 call of a function with this attribute is not eliminated via optimization.
1849 Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1850 such callees to attach information about where in the source language such a
1851 call came from. A string value can be provided as a note.
1853 This attribute denotes that a warning diagnostic should be emitted when a
1854 call of a function with this attribute is not eliminated via optimization.
1855 Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1856 such callees to attach information about where in the source language such a
1857 call came from. A string value can be provided as a note.
1858 ``fn_ret_thunk_extern``
1859 This attribute tells the code generator that returns from functions should
1860 be replaced with jumps to externally-defined architecture-specific symbols.
1861 For X86, this symbol's identifier is ``__x86_return_thunk``.
1863 This attribute tells the code generator whether the function
1864 should keep the frame pointer. The code generator may emit the frame pointer
1865 even if this attribute says the frame pointer can be eliminated.
1866 The allowed string values are:
1868 * ``"none"`` (default) - the frame pointer can be eliminated.
1869 * ``"non-leaf"`` - the frame pointer should be kept if the function calls
1871 * ``"all"`` - the frame pointer should be kept.
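As a brief sketch, the value is supplied through the ``"frame-pointer"`` string attribute; the function names here are hypothetical:

.. code-block:: llvm

   ; Always keep the frame pointer in this function.
   define void @keep_fp() "frame-pointer"="all" {
     ret void
   }

   ; The frame pointer may be eliminated (same as omitting the attribute).
   define void @may_drop_fp() "frame-pointer"="none" {
     ret void
   }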
1873 This attribute indicates that this function is a hot spot of the program
1874 execution. The function will be optimized more aggressively and will be
1875 placed into a special subsection of the text section to improve locality.
1877 When profile feedback is enabled, this attribute takes precedence over
1878 the profile information. By marking a function ``hot``, users can work
1879 around the cases where the training input does not have good coverage
1880 on all the hot functions.
1882 This attribute indicates that the source code contained a hint that
1883 inlining this function is desirable (such as the "inline" keyword in
1884 C/C++). It is just a hint; it imposes no requirements on the
1887 This attribute indicates that the function should be added to a
1888 jump-instruction table at code-generation time, and that all address-taken
1889 references to this function should be replaced with a reference to the
1890 appropriate jump-instruction-table function pointer. Note that this creates
1891 a new pointer for the original function, which means that code that depends
1892 on function-pointer identity can break. So, any function annotated with
1893 ``jumptable`` must also be ``unnamed_addr``.
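A minimal sketch of a definition that satisfies this requirement (``@handler`` is a hypothetical name):

.. code-block:: llvm

   ; unnamed_addr states that the address is not significant, which is
   ; what makes redirecting it through a jump table acceptable.
   define void @handler() unnamed_addr jumptable {
     ret void
   }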
1895 This attribute specifies the possible memory effects of the call-site or
1896 function. It allows specifying the possible access kinds (``none``,
1897 ``read``, ``write``, or ``readwrite``) for the possible memory location
1898 kinds (``argmem``, ``inaccessiblemem``, as well as a default). It is best
1899 understood by example:
1901 - ``memory(none)``: Does not access any memory.
1902 - ``memory(read)``: May read (but not write) any memory.
1903 - ``memory(write)``: May write (but not read) any memory.
1904 - ``memory(readwrite)``: May read or write any memory.
1905 - ``memory(argmem: read)``: May only read argument memory.
1906 - ``memory(argmem: read, inaccessiblemem: write)``: May only read argument
1907 memory and only write inaccessible memory.
1908 - ``memory(read, argmem: readwrite)``: May read any memory (default mode)
1909 and additionally write argument memory.
1910 - ``memory(readwrite, argmem: none)``: May access any memory apart from
1913 The supported memory location kinds are:
1915 - ``argmem``: This refers to accesses that are based on pointer arguments
1917 - ``inaccessiblemem``: This refers to accesses to memory which is not
1918 accessible by the current module (before return from the function -- an
1919 allocator function may return newly accessible memory while only
1920 accessing inaccessible memory itself). Inaccessible memory is often used
1921 to model control dependencies of intrinsics.
1922 - The default access kind (specified without a location prefix) applies to
1923 all locations that haven't been specified explicitly, including those that
1924 don't currently have a dedicated location kind (e.g. accesses to globals
1925 or captured pointers).
1927 If the ``memory`` attribute is not specified, then ``memory(readwrite)``
1928 is implied (all memory effects are possible).
1930 The memory effects of a call can be computed as
1931 ``CallSiteEffects & (FunctionEffects | OperandBundleEffects)``. Thus, the
1932 call-site annotation takes precedence over the potential effects described
1933 by either the function annotation or the operand bundles.
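The intersection can be illustrated with a hypothetical function ``@update`` whose declaration allows reading and writing argument memory, called from a site that narrows this to reads only:

.. code-block:: llvm

   ; Function effects: may read and write memory reachable through its
   ; pointer argument, and nothing else.
   declare void @update(ptr) memory(argmem: readwrite)

   define void @caller(ptr %p) {
     ; Call-site effects: argmem read only. The effective result is
     ; (argmem: read) & (argmem: readwrite) = argmem: read.
     call void @update(ptr %p) memory(argmem: read)
     ret void
   }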
1935 This attribute suggests that optimization passes and code generator
1936 passes make choices that keep the code size of this function as small
1937 as possible and perform optimizations that may sacrifice runtime
1938 performance in order to minimize the size of the generated code.
1939 This attribute is incompatible with the ``optdebug`` and ``optnone``
1942 This attribute disables prologue / epilogue emission for the
1943 function. This can have very system-specific consequences.
1944 ``"no-inline-line-tables"``
1945 When this attribute is set to true, the inliner discards source locations
1946 when inlining code and instead uses the source location of the call site.
1947 Breakpoints set on code that was inlined into the current function will
1948 not fire during the execution of the inlined call sites. If the debugger
1949 stops inside an inlined call site, it will appear to be stopped at the
1950 outermost inlined call site.
1952 When this attribute is set to true, the jump tables and lookup tables that
1953 can be generated from a switch case lowering are disabled.
1955 This indicates that the callee function at a call site is not recognized as
1956 a built-in function. LLVM will retain the original call and not replace it
1957 with equivalent code based on the semantics of the built-in function, unless
1958 the call site uses the ``builtin`` attribute. This is valid at call sites
1959 and on function declarations and definitions.
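A short sketch of how ``nobuiltin`` on a declaration interacts with ``builtin`` at a call site. The caller names are hypothetical, and whether any optimization actually fires is up to the optimizer:

.. code-block:: llvm

   declare ptr @malloc(i64) nobuiltin

   define ptr @plain_call() {
     ; Not recognized as the built-in allocator; the call is kept as written.
     %p = call ptr @malloc(i64 16)
     ret ptr %p
   }

   define ptr @builtin_call() {
     ; The call site opts back in, so built-in semantics may be assumed here.
     %p = call ptr @malloc(i64 16) builtin
     ret ptr %p
   }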
1961 This attribute indicates that the function is only allowed to jump back into
1962 caller's module by a return or an exception, and is not allowed to jump back
1963 by invoking a callback function, a direct, possibly transitive, external
1964 function call, use of ``longjmp``, or other means. It is a compiler hint that
1965 is used at module level to improve dataflow analysis, dropped during linking,
1966 and has no effect on functions defined in the current module.
1968 This attribute indicates that calls to the function cannot be
1969 duplicated. A call to a ``noduplicate`` function may be moved
1970 within its parent function, but may not be duplicated within
1971 its parent function.
1973 A function containing a ``noduplicate`` call may still
1974 be an inlining candidate, provided that the call is not
1975 duplicated by inlining. That implies that the function has
1976 internal linkage and only has one call site, so the original
1977 call is dead after inlining.
1979 This function attribute indicates that the function does not, directly or
1980 transitively, call a memory-deallocation function (``free``, for example)
1981 on a memory allocation which existed before the call.
1983 As a result, uncaptured pointers that are known to be dereferenceable
1984 prior to a call to a function with the ``nofree`` attribute are still
1985 known to be dereferenceable after the call. The capturing condition is
1986 necessary in environments where the function might communicate the
1987 pointer to another thread which then deallocates the memory. Alternatively,
1988 ``nosync`` would ensure such communication cannot happen and even captured
1989 pointers cannot be freed by the function.
1991 A ``nofree`` function is explicitly allowed to free memory which it
1992 allocated or (if not ``nosync``) arrange for another thread to free
1993 memory on its behalf. As a result, perhaps surprisingly, a ``nofree``
1994 function can return a pointer to a previously deallocated memory object.
1996 Disallows implicit floating-point code. This inhibits optimizations that
1997 use floating-point code and floating-point registers for operations that are
1998 not nominally floating-point. LLVM instructions that perform floating-point
1999 operations or require access to floating-point registers may still cause
2000 floating-point code to be generated.
2002 Also inhibits optimizations that create SIMD/vector code and registers from
2003 scalar code such as vectorization or memcpy/memset optimization. This
2004 includes integer vectors. Vector instructions present in IR may still cause
2005 vector code to be generated.
2007 This attribute indicates that the inliner should never inline this
2008 function in any situation. This attribute may not be used together
2009 with the ``alwaysinline`` attribute.
2011 This attribute indicates that calls to this function should never be merged
2012 during optimization. For example, it will prevent tail merging otherwise
2013 identical code sequences that raise an exception or terminate the program.
2014 Tail merging normally reduces the precision of source location information,
2015 making stack traces less useful for debugging. This attribute gives the
2016 user control over the tradeoff between code size and debug information
2019 This attribute suppresses lazy symbol binding for the function. This
2020 may make calls to the function faster, at the cost of extra program
2021 startup time if the function is not called during program startup.
2023 This function attribute prevents instrumentation based profiling, used for
2024 coverage or profile based optimization, from being added to a function. It
2025 also blocks inlining if the caller and callee have different values of this
2028 This function attribute prevents instrumentation based profiling, used for
2029 coverage or profile based optimization, from being added to a function. This
2030 attribute does not restrict inlining, so instrumented instructions could end
2031 up in this function.
2033 This attribute indicates that the code generator should not use a
2034 red zone, even if the target-specific ABI normally permits it.
2035 ``indirect-tls-seg-refs``
2036 This attribute indicates that the code generator should not use
2037 direct TLS access through segment registers, even if the
2038 target-specific ABI normally permits it.
2040 This function attribute indicates that the function never returns
2041 normally, that is, through a return instruction. This produces undefined
2042 behavior at runtime if the function ever does dynamically return. Annotated
2043 functions may still raise an exception, i.e., ``nounwind`` is not implied.
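As an illustrative sketch of the ``noreturn`` attribute (``@fatal`` and ``@checked_div`` are hypothetical), control flow after the call can be marked ``unreachable``:

.. code-block:: llvm

   ; Hypothetical error routine that never returns to its caller.
   declare void @fatal(ptr) noreturn

   define i32 @checked_div(i32 %a, i32 %b) {
   entry:
     %iszero = icmp eq i32 %b, 0
     br i1 %iszero, label %fail, label %ok

   fail:
     call void @fatal(ptr null)
     ; Nothing after a noreturn call executes via a normal return.
     unreachable

   ok:
     %q = udiv i32 %a, %b
     ret i32 %q
   }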
2045 This function attribute indicates that the function does not call itself
2046 either directly or indirectly down any possible call path. This produces
2047 undefined behavior at runtime if the function ever does recurse.
2049 .. _langref_willreturn:
2052 This function attribute indicates that a call of this function will
2053 either exhibit undefined behavior or come back and continue execution
2054 at a point in the existing call stack that includes the current invocation.
2055 Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
2056 If an invocation of an annotated function does not return control back
2057 to a point in the call stack, the behavior is undefined.
2059 This function attribute indicates that the function does not communicate
2060 (synchronize) with another thread through memory or other well-defined means.
2061 Synchronization is considered possible in the presence of ``atomic`` accesses
2062 that enforce an order, thus not "unordered" and "monotonic", ``volatile`` accesses,
2063 as well as ``convergent`` function calls.
2065 Note that ``convergent`` operations can involve communication that is
2066 considered to be not through memory and does not necessarily imply an
2067 ordering between threads for the purposes of the memory model. Therefore,
2068 an operation can be both ``convergent`` and ``nosync``.
2070 If a ``nosync`` function does ever synchronize with another thread,
2071 the behavior is undefined.
2073 This function attribute indicates that the function never raises an
2074 exception. If the function does raise an exception, its runtime
2075 behavior is undefined. However, functions marked nounwind may still
2076 trap or generate asynchronous exceptions. Exception handling schemes
2077 that are recognized by LLVM to handle asynchronous exceptions, such
2078 as SEH, will still provide their implementation defined semantics.
2079 ``nosanitize_bounds``
2080 This attribute indicates that bounds checking sanitizer instrumentation
2081 is disabled for this function.
2082 ``nosanitize_coverage``
2083 This attribute indicates that SanitizerCoverage instrumentation is disabled
2085 ``null_pointer_is_valid``
2086 If ``null_pointer_is_valid`` is set, then the ``null`` address
2087 in address-space 0 is considered to be a valid address for memory loads and
2088 stores. Any analysis or optimization should not treat dereferencing a
2089 pointer to ``null`` as undefined behavior in this function.
2090 Note: Comparing the address of a global variable to ``null`` may still
2091 evaluate to false because of a limitation in querying this attribute inside
2092 constant expressions.
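
The following sketch illustrates the attribute with a hypothetical function ``@read_zero_page`` that loads from the ``null`` address in address space 0. With ``null_pointer_is_valid`` set, the load should not be treated as undefined behavior merely because the pointer is ``null``.

.. code-block:: llvm

    ; Hypothetical example: the load from null must be preserved, because
    ; address 0 is considered a valid address in this function.
    define i32 @read_zero_page() null_pointer_is_valid {
      %v = load i32, ptr null, align 4
      ret i32 %v
    }
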
2094 This attribute suggests that optimization passes and code generator passes
2095 should make choices that try to preserve debug info without significantly
2096 degrading runtime performance.
2097 This attribute is incompatible with the ``minsize``, ``optsize``, and
2098 ``optnone`` attributes.
2100 This attribute indicates that this function should be optimized
2101 for maximum fuzzing signal.
2103 This function attribute indicates that most optimization passes will skip
2104 this function, with the exception of interprocedural optimization passes.
2105 Code generation defaults to the "fast" instruction selector.
2106 This attribute cannot be used together with the ``alwaysinline``
2107 attribute; this attribute is also incompatible
2108 with the ``minsize``, ``optsize``, and ``optdebug`` attributes.
2110 This attribute requires the ``noinline`` attribute to be specified on
2111 the function as well, so the function is never inlined into any caller.
2112 Only functions with the ``alwaysinline`` attribute are valid
2113 candidates for inlining into the body of this function.
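
A minimal sketch of a function that satisfies the pairing requirement described above (the function name is hypothetical):

.. code-block:: llvm

    ; optnone must be accompanied by noinline.
    define void @hypothetical_debug_me() optnone noinline {
      ret void
    }
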
2115 This attribute suggests that optimization passes and code generator
2116 passes make choices that keep the code size of this function low,
2117 and otherwise do optimizations specifically to reduce code size as
2118 long as they do not significantly impact runtime performance.
2119 This attribute is incompatible with the ``optdebug`` and ``optnone`` attributes.
2121 ``"patchable-function"``
2122 This attribute tells the code generator that the code
2123 generated for this function needs to follow certain conventions that
2124 make it possible for a runtime function to patch over it later.
2125 The exact effect of this attribute depends on its string value,
2126 for which there currently is one legal possibility:
2128 * ``"prologue-short-redirect"`` - This style of patchable
2129 function is intended to support patching a function prologue to
2130 redirect control away from the function in a thread safe
2131 manner. It guarantees that the first instruction of the
2132 function will be large enough to accommodate a short jump
2133 instruction, and will be sufficiently aligned to allow being
2134 fully changed via an atomic compare-and-swap instruction.
2135 While the first requirement can be satisfied by inserting a large
2136 enough NOP, LLVM can and will try to re-purpose an existing
2137 instruction (i.e. one that would have to be emitted anyway) that is
2138 larger than a short jump as the patchable instruction.
2140 ``"prologue-short-redirect"`` is currently only supported on
2143 This attribute by itself does not imply restrictions on
2144 inter-procedural optimizations. Any semantic effects the
2145 patching may have must be conveyed separately via the linkage type.
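
As an illustration, the attribute is written as a string key/value pair on the function. The function name below is hypothetical.

.. code-block:: llvm

    ; Hypothetical example: request a patchable prologue for this function.
    define void @hypothetical_hot_path() "patchable-function"="prologue-short-redirect" {
      ret void
    }
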
2147 This attribute indicates that the function will trigger a guard region
2148 at the end of the stack. It ensures that accesses to the stack are
2149 no further apart than the size of the guard region from a previous
2150 access of the stack. It takes one required string value, the name of
2151 the stack probing function that will be called.
2153 If a function that has a ``"probe-stack"`` attribute is inlined into
2154 a function with another ``"probe-stack"`` attribute, the resulting
2155 function has the ``"probe-stack"`` attribute of the caller. If a
2156 function that has a ``"probe-stack"`` attribute is inlined into a
2157 function that has no ``"probe-stack"`` attribute at all, the resulting
2158 function has the ``"probe-stack"`` attribute of the callee.
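
The inlining rule can be sketched as follows. The probe function names ``__hypothetical_probe_a`` and ``__hypothetical_probe_b`` are assumptions made for illustration and do not refer to real runtime symbols.

.. code-block:: llvm

    ; Callee names one probing function ...
    define void @callee() "probe-stack"="__hypothetical_probe_b" {
      ret void
    }

    ; ... and the caller names another. If @callee is inlined here, the
    ; resulting function keeps the caller's "__hypothetical_probe_a".
    define void @caller() "probe-stack"="__hypothetical_probe_a" {
      call void @callee()
      ret void
    }
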
2159 ``"stack-probe-size"``
2160 This attribute controls the behavior of stack probes: either
2161 the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
2162 It defines the size of the guard region. It ensures that if the function
2163 may use more stack space than the size of the guard region, a stack probing
2164 sequence will be emitted. It takes one required integer value, which
2167 If a function that has a ``"stack-probe-size"`` attribute is inlined into
2168 a function with another ``"stack-probe-size"`` attribute, the resulting
2169 function has the ``"stack-probe-size"`` attribute that has the lower
2170 numeric value. If a function that has a ``"stack-probe-size"`` attribute is
2171 inlined into a function that has no ``"stack-probe-size"`` attribute
2172 at all, the resulting function has the ``"stack-probe-size"`` attribute
2174 ``"no-stack-arg-probe"``
2175 This attribute disables ABI-required stack probes, if any.
2177 This attribute indicates that this function can return twice. The C
2178 ``setjmp`` is an example of such a function. The compiler disables
2179 some optimizations (like tail calls) in the caller of these
2182 This attribute indicates that
2183 `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
2184 protection is enabled for this function.
2186 If a function that has a ``safestack`` attribute is inlined into a
2187 function that doesn't have a ``safestack`` attribute or which has an
2188 ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
2189 function will have a ``safestack`` attribute.
2190 ``sanitize_address``
2191 This attribute indicates that AddressSanitizer checks
2192 (dynamic address safety analysis) are enabled for this function.
2194 This attribute indicates that MemorySanitizer checks (dynamic detection
2195 of accesses to uninitialized memory) are enabled for this function.
2197 This attribute indicates that ThreadSanitizer checks
2198 (dynamic thread safety analysis) are enabled for this function.
2199 ``sanitize_hwaddress``
2200 This attribute indicates that HWAddressSanitizer checks
2201 (dynamic address safety analysis based on tagged pointers) are enabled for
2204 This attribute indicates that MemTagSanitizer checks
2205 (dynamic address safety analysis based on Armv8 MTE) are enabled for
2207 ``speculative_load_hardening``
2208 This attribute indicates that
2209 `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
2210 should be enabled for the function body.
2212 Speculative Load Hardening is a best-effort mitigation against
2213 information leak attacks that make use of control flow
2214 misspeculation, specifically misspeculation of whether a branch
2215 is taken or not. Typically vulnerabilities enabling such attacks are
2216 classified as "Spectre variant #1". Notably, this does not attempt to
2217 mitigate against misspeculation of branch target, classified as
2218 "Spectre variant #2" vulnerabilities.
2220 When inlining, the attribute is sticky. Inlining a function that carries
2221 this attribute will cause the caller to gain the attribute. This is intended
2222 to provide a maximally conservative model where the code in a function
2223 annotated with this attribute will always (even after inlining) end up
2226 This function attribute indicates that the function does not have any
2227 effects besides calculating its result and does not have undefined behavior.
2228 Note that ``speculatable`` is not enough to conclude that along any
2229 particular execution path the number of calls to this function will not be
2230 externally observable. This attribute is only valid on functions
2231 and declarations, not on individual call sites. If a function is
2232 incorrectly marked as speculatable and really does exhibit
2233 undefined behavior, the undefined behavior may be observed even
2234 if the call site is dead code.
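
The sketch below shows why the attribute matters. The declaration ``@hypothetical_square`` is an assumption made for illustration. Because it is marked ``speculatable``, an optimizer may execute the call even on paths where ``%c`` is false, for example by hoisting it into ``entry``.

.. code-block:: llvm

    ; Hypothetical declaration: no effects besides computing its result,
    ; and no undefined behavior for any input.
    declare i32 @hypothetical_square(i32) speculatable

    define i32 @select_square(i1 %c, i32 %x) {
    entry:
      br i1 %c, label %then, label %exit

    then:
      ; This call may be speculated, i.e. executed even when %c is false.
      %sq = call i32 @hypothetical_square(i32 %x)
      br label %exit

    exit:
      %r = phi i32 [ %sq, %then ], [ 0, %entry ]
      ret i32 %r
    }
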
2237 This attribute indicates that the function should emit a stack
2238 smashing protector. It is in the form of a "canary" --- a random value
2239 placed on the stack before the local variables that's checked upon
2240 return from the function to see if it has been overwritten. A
2241 heuristic is used to determine if a function needs stack protectors
2242 or not. The heuristic used will enable protectors for functions with:
2244 - Character arrays larger than ``ssp-buffer-size`` (default 8).
2245 - Aggregates containing character arrays larger than ``ssp-buffer-size``.
2246 - Calls to alloca() with variable sizes or constant sizes greater than
2247 ``ssp-buffer-size``.
2249 Variables that are identified as requiring a protector will be arranged
2250 on the stack such that they are adjacent to the stack protector guard.
2252 If a function with an ``ssp`` attribute is inlined into a calling function,
2253 the attribute is not carried over to the calling function.
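
As a sketch of the heuristic, the hypothetical function below allocates a 16-byte character array, which is larger than the default ``ssp-buffer-size`` of 8, so a protector would be expected for it. The helper ``@hypothetical_fill`` is an assumption made for illustration.

.. code-block:: llvm

    ; Hypothetical external function that writes into the buffer.
    declare void @hypothetical_fill(ptr)

    define void @use_buffer() ssp {
      ; 16 x i8 exceeds the default ssp-buffer-size of 8.
      %buf = alloca [16 x i8], align 1
      call void @hypothetical_fill(ptr %buf)
      ret void
    }
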
2256 This attribute indicates that the function should emit a stack smashing
2257 protector. This attribute causes a strong heuristic to be used when
2258 determining if a function needs stack protectors. The strong heuristic
2259 will enable protectors for functions with:
2261 - Arrays of any size and type.
2262 - Aggregates containing an array of any size and type.
2263 - Calls to alloca().
2264 - Local variables that have had their address taken.
2266 Variables that are identified as requiring a protector will be arranged
2267 on the stack such that they are adjacent to the stack protector guard.
2268 The specific layout rules are:
2270 #. Large arrays and structures containing large arrays
2271 (``>= ssp-buffer-size``) are closest to the stack protector.
2272 #. Small arrays and structures containing small arrays
2273 (``< ssp-buffer-size``) are 2nd closest to the protector.
2274 #. Variables that have had their address taken are 3rd closest to the
2277 This overrides the ``ssp`` function attribute.
2279 If a function with an ``sspstrong`` attribute is inlined into a calling
2280 function which has an ``ssp`` attribute, the calling function's attribute
2281 will be upgraded to ``sspstrong``.
2284 This attribute indicates that the function should *always* emit a stack
2285 smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2288 Variables that are identified as requiring a protector will be arranged
2289 on the stack such that they are adjacent to the stack protector guard.
2290 The specific layout rules are:
2292 #. Large arrays and structures containing large arrays
2293 (``>= ssp-buffer-size``) are closest to the stack protector.
2294 #. Small arrays and structures containing small arrays
2295 (``< ssp-buffer-size``) are 2nd closest to the protector.
2296 #. Variables that have had their address taken are 3rd closest to the
2299 If a function with an ``sspreq`` attribute is inlined into a calling
2300 function which has an ``ssp`` or ``sspstrong`` attribute, the calling
2301 function's attribute will be upgraded to ``sspreq``.
2304 This attribute indicates that the function was called from a scope that
2305 requires strict floating-point semantics. LLVM will not attempt any
2306 optimizations that require assumptions about the floating-point rounding
2307 mode or that might alter the state of floating-point status flags that
2308 might otherwise be set or cleared by calling this function. LLVM will
2309 not introduce any new floating-point instructions that may trap.
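
A minimal sketch of a ``strictfp`` function follows. It assumes the constrained floating-point intrinsics described elsewhere in this manual. The function name ``@add_strict`` is hypothetical.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    ; Hypothetical example: the addition uses the dynamic rounding mode and
    ; strict exception semantics, and the call site is also marked strictfp.
    define double @add_strict(double %a, double %b) strictfp {
      %r = call double @llvm.experimental.constrained.fadd.f64(
                         double %a, double %b,
                         metadata !"round.dynamic",
                         metadata !"fpexcept.strict") strictfp
      ret double %r
    }
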
2311 .. _denormal_fp_math:
2313 ``"denormal-fp-math"``
2314 This indicates the denormal (subnormal) handling that may be
2315 assumed for the default floating-point environment. This is a
2316 comma separated pair. The elements may be one of ``"ieee"``,
2317 ``"preserve-sign"``, ``"positive-zero"``, or ``"dynamic"``. The
2318 first entry indicates the flushing mode for the result of floating
2319 point operations. The second indicates the handling of denormal inputs
2320 to floating point instructions. For compatibility with older
2321 bitcode, if the second value is omitted, both input and output
2322 modes will assume the same mode.
2324 If this attribute is not specified, the default is ``"ieee,ieee"``.
2326 If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2327 denormal outputs may be flushed to zero by standard floating-point
2328 operations. It is not mandated that flushing to zero occurs, but if
2329 a denormal output is flushed to zero, it must respect the sign
2330 mode. Not all targets support all modes.
2332 If the mode is ``"dynamic"``, the behavior is derived from the
2333 dynamic state of the floating-point environment. Transformations
2334 which depend on the behavior of denormal values should not be
2337 While this indicates the expected floating point mode the function
2338 will be executed with, this does not make any attempt to ensure
2339 the mode is consistent. User or platform code is expected to set
2340 the floating point mode appropriately before function entry.
2342 If the input mode is ``"preserve-sign"``, or ``"positive-zero"``,
2343 a floating-point operation must treat any input denormal value as
2344 zero. In some situations, if an instruction does not respect this
2345 mode, the input may need to be converted to 0 as if by
2346 ``@llvm.canonicalize`` during lowering for correctness.
2348 ``"denormal-fp-math-f32"``
2349 Same as ``"denormal-fp-math"``, but only controls the behavior of
2350 the 32-bit float type (or vectors of 32-bit floats). If both
2351 are present, this overrides ``"denormal-fp-math"``. Not all targets
2352 support separately setting the denormal mode per type, and no
2353 attempt is made to diagnose unsupported uses. Currently this
2354 attribute is respected by the AMDGPU and NVPTX backends.
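
The two attributes can be combined as in the following sketch (function names are hypothetical). The first function assumes denormal outputs and inputs may be flushed with the sign preserved. The second keeps that setting for other types but overrides it to IEEE behavior for 32-bit floats.

.. code-block:: llvm

    ; Output mode first, input mode second.
    define double @scale_f64(double %x) "denormal-fp-math"="preserve-sign,preserve-sign" {
      %r = fmul double %x, 5.000000e-01
      ret double %r
    }

    ; "denormal-fp-math-f32" overrides "denormal-fp-math" for float only.
    define float @scale_f32(float %x) "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" {
      %r = fmul float %x, 5.000000e-01
      ret float %r
    }
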
2357 This attribute indicates that the function will delegate to some other
2358 function with a tail call. The prototype of a thunk should not be used for
2359 optimization purposes. The caller is expected to cast the thunk prototype to
2360 match the thunk target prototype.
2362 ``"tls-load-hoist"``
2363 This attribute indicates that the function will try to reduce redundant
2364 TLS address calculations by hoisting TLS variables.
2366 ``uwtable[(sync|async)]``
2367 This attribute indicates that the ABI being targeted requires that
2368 an unwind table entry be produced for this function even if we can
2369 show that no exceptions pass through it. This is normally the case for
2370 the ELF x86-64 ABI, but it can be disabled for some compilation
2371 units. The optional parameter describes what kind of unwind tables
2372 to generate: ``sync`` for normal unwind tables, ``async`` for asynchronous
2373 (instruction precise) unwind tables. Without the parameter, the attribute
2374 ``uwtable`` is equivalent to ``uwtable(async)``.
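
For illustration, the three spellings look as follows (function names are hypothetical):

.. code-block:: llvm

    ; Synchronous unwind tables only.
    define void @f_sync() uwtable(sync) {
      ret void
    }

    ; Asynchronous (instruction precise) unwind tables, spelled explicitly.
    define void @f_async() uwtable(async) {
      ret void
    }

    ; No parameter: equivalent to uwtable(async).
    define void @f_default() uwtable {
      ret void
    }
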
2376 This attribute indicates that no control-flow check will be performed on
2377 the attributed entity. It disables -fcf-protection=<> for a specific
2378 entity, giving fine-grained control over the hardware control-flow protection mechanism. The flag
2379 is target independent and currently appertains to a function or function
2382 This attribute indicates that the ShadowCallStack checks are enabled for
2383 the function. The instrumentation checks that the return address for the
2384 function has not changed between the function prolog and epilog. It is
2385 currently x86_64-specific.
2387 .. _langref_mustprogress:
2390 This attribute indicates that the function is required to return, unwind,
2391 or interact with the environment in an observable way, e.g., via a volatile
2392 memory access, I/O, or other synchronization. The ``mustprogress``
2393 attribute is intended to model the requirements of the first section of
2394 [intro.progress] of the C++ Standard. As a consequence, a loop in a
2395 function with the `mustprogress` attribute can be assumed to terminate if
2396 it does not interact with the environment in an observable way, and
2397 terminating loops without side-effects can be removed. If a `mustprogress`
2398 function does not satisfy this contract, the behavior is undefined. This
2399 attribute does not apply transitively to callees, but does apply to call
2400 sites within the function. Note that `willreturn` implies `mustprogress`.
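
The consequence for loops can be sketched with the hypothetical function below. The loop has no side effects and does not interact with the environment, so in a ``mustprogress`` function it may be assumed to terminate, and an optimizer may delete it even though it would not terminate for some inputs if ``%n`` were never reached without wrapping.

.. code-block:: llvm

    ; Hypothetical example: a side-effect-free counting loop.
    define void @spin(i32 %n) mustprogress {
    entry:
      br label %loop

    loop:
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp eq i32 %next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret void
    }
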
2401 ``"warn-stack-size"="<threshold>"``
2402 This attribute sets a threshold to emit diagnostics once the frame size is
2403 known should the frame size exceed the specified value. It takes one
2404 required integer value, which should be a non-negative integer, and less
2405 than `UINT_MAX`. It's unspecified which threshold will be used when
2406 duplicate definitions are linked together with differing values.
2407 ``vscale_range(<min>[, <max>])``
2408 This function attribute indicates `vscale` is a power-of-two within a
2409 specified range. `min` must be a power-of-two that is greater than 0. When
2410 specified, `max` must be a power-of-two greater-than-or-equal to `min` or 0
2411 to signify an unbounded maximum. The syntax `vscale_range(<val>)` can be
2412 used to set both `min` and `max` to the same value. Functions that don't
2413 include this attribute make no assumptions about the value of `vscale`.
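
A short sketch of the attribute in use follows. The function name ``@lanes`` is hypothetical, and ``@llvm.vscale.i64`` is the intrinsic that returns the runtime value of `vscale`.

.. code-block:: llvm

    declare i64 @llvm.vscale.i64()

    ; vscale is a power of two between 1 and 16 in this function, so the
    ; result below can be assumed to lie between 4 and 64.
    define i64 @lanes() vscale_range(1,16) {
      %vs = call i64 @llvm.vscale.i64()
      %n = mul i64 %vs, 4
      ret i64 %n
    }
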
2415 This attribute indicates that outlining passes should not modify the
2418 Call Site Attributes
2419 ----------------------
2421 In addition to function attributes, the following call-site-only
2422 attributes are supported:
2424 ``vector-function-abi-variant``
2425 This attribute can be attached to a :ref:`call <i_call>` to list
2426 the vector functions associated with the function. Notice that the
2427 attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
2428 :ref:`callbr <i_callbr>` instruction. The attribute consists of a
2429 comma separated list of mangled names. The order of the list does
2430 not imply preference (it is logically a set). The compiler is free
2431 to pick any listed vector function of its choosing.
2433 The syntax for the mangled names is as follows::
2435 _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2437 When present, the attribute informs the compiler that the function
2438 ``<scalar_name>`` has a corresponding vector variant that can be
2439 used to perform the concurrent invocation of ``<scalar_name>`` on
2440 vectors. The shape of the vector function is described by the
2441 tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2442 token. The standard name of the vector function is
2443 ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2444 the optional token ``(<vector_redirection>)`` informs the compiler
2445 that a custom name is provided in addition to the standard one
2446 (custom names can be provided for example via the use of ``declare
2447 variant`` in OpenMP 5.0). The declaration of the variant must be
2448 present in the IR Module. The signature of the vector variant is
2449 determined by the rules of the Vector Function ABI (VFABI)
2450 specifications of the target. For Arm and X86, the VFABI can be
2451 found at https://github.com/ARM-software/abi-aa and
2452 https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2455 For X86 and Arm targets, the values of the tokens in the standard
2456 name are those that are defined in the VFABI. LLVM has an internal
2457 ``<isa>`` token that can be used to create scalar-to-vector
2458 mappings for functions that are not directly associated with any of
2459 the target ISAs (for example, some of the mappings stored in the
2460 TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2462 <isa>:= b | c | d | e -> X86 SSE, AVX, AVX2, AVX512
2463 | n | s -> Armv8 Advanced SIMD, SVE
2464 | __LLVM__ -> Internal LLVM Vector ISA
2466 For all targets currently supported (x86, Arm and Internal LLVM),
2467 the remaining tokens can have the following values::
2469 <mask>:= M | N -> mask | no mask
2471 <vlen>:= number -> number of lanes
2472 | x -> VLA (Vector Length Agnostic)
2474 <parameters>:= v -> vector
2475 | l | l <number> -> linear
2476 | R | R <number> -> linear with ref modifier
2477 | L | L <number> -> linear with val modifier
2478 | U | U <number> -> linear with uval modifier
2479 | ls <pos> -> runtime linear
2480 | Rs <pos> -> runtime linear with ref modifier
2481 | Ls <pos> -> runtime linear with val modifier
2482 | Us <pos> -> runtime linear with uval modifier
2485 <scalar_name>:= name of the scalar function
2487 <vector_redirection>:= optional, custom name of the vector function
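
Putting the grammar together, the following sketch attaches one mangled name to a scalar call. The name ``_ZGVbN2v_sin`` decodes as ``b`` (X86 SSE), ``N`` (no mask), ``2`` (two lanes), ``v`` (one vector parameter), and the scalar name ``sin``. The redirection target ``hypothetical_vector_sin`` is an assumption made for illustration. Its declaration is included because the variant must be present in the module.

.. code-block:: llvm

    declare double @sin(double)

    ; Hypothetical custom-named vector variant operating on two lanes.
    declare <2 x double> @hypothetical_vector_sin(<2 x double>)

    define double @scalar_user(double %x) {
      %r = call double @sin(double %x) #0
      ret double %r
    }

    attributes #0 = { "vector-function-abi-variant"="_ZGVbN2v_sin(hypothetical_vector_sin)" }
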
2489 ``preallocated(<ty>)``
2490 This attribute is required on calls to ``llvm.call.preallocated.arg``
2491 and cannot be used on any other call. See
2492 :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2500 Attributes may be set to communicate additional information about a global variable.
2501 Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2502 are grouped into a single :ref:`attribute group <attrgrp>`.
2504 ``no_sanitize_address``
2505 This attribute indicates that the global variable should not have
2506 AddressSanitizer instrumentation applied to it, because it was annotated
2507 with `__attribute__((no_sanitize("address")))`,
2508 `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2509 `-fsanitize-ignorelist` file.
2510 ``no_sanitize_hwaddress``
2511 This attribute indicates that the global variable should not have
2512 HWAddressSanitizer instrumentation applied to it, because it was annotated
2513 with `__attribute__((no_sanitize("hwaddress")))`,
2514 `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2515 `-fsanitize-ignorelist` file.
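
As a sketch, a global variable attribute is written after the initializer. The variable names are hypothetical.

.. code-block:: llvm

    ; Excluded from AddressSanitizer instrumentation.
    @hypothetical_table = global [4 x i32] zeroinitializer, no_sanitize_address

    ; Excluded from HWAddressSanitizer instrumentation.
    @hypothetical_flags = global i32 0, no_sanitize_hwaddress
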
2517 This attribute indicates that the global variable should have AArch64 memory
2518 tags (MTE) instrumentation applied to it. This attribute causes the
2519 suppression of certain optimisations, like GlobalMerge, as well as ensuring
2520 extra directives are emitted in the assembly and extra bits of metadata are
2521 placed in the object file so that the linker can ensure the accesses are
2522 protected by MTE. This attribute is added by clang when
2523 `-fsanitize=memtag-globals` is provided, as long as the global is not marked
2524 with `__attribute__((no_sanitize("memtag")))`,
2525 `__attribute__((disable_sanitizer_instrumentation))`, or included in the
2526 `-fsanitize-ignorelist` file. The AArch64 Globals Tagging pass may remove
2527 this attribute when it's not possible to tag the global (e.g. it's a TLS
2529 ``sanitize_address_dyninit``
2530 This attribute indicates that the global variable, when instrumented with
2531 AddressSanitizer, should be checked for ODR violations. This attribute is
2532 applied to global variables that are dynamically initialized according to
2540 Operand bundles are tagged sets of SSA values that can be associated
2541 with certain LLVM instructions (currently only ``call`` s and
2542 ``invoke`` s). In a way they are like metadata, but dropping them is
2543 incorrect and will change program semantics.
2547 operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2548 operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2549 bundle operand ::= SSA value
2550 tag ::= string constant
2552 Operand bundles are **not** part of a function's signature, and a
2553 given function may be called from multiple places with different kinds
2554 of operand bundles. This reflects the fact that the operand bundles
2555 are conceptually a part of the ``call`` (or ``invoke``), not the
2556 callee being dispatched to.
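
The point above can be sketched as follows: the same callee is called three times, each time with a different set of operand bundles. The tag ``"hypothetical-tag"`` is an assumption used to stand in for a bundle unknown to LLVM.

.. code-block:: llvm

    declare void @callee()

    define void @caller(i32 %state) {
      ; No operand bundles.
      call void @callee()
      ; One bundle with a single operand.
      call void @callee() [ "deopt"(i32 %state) ]
      ; Two bundles on the same call site.
      call void @callee() [ "deopt"(i32 %state), "hypothetical-tag"(ptr null) ]
      ret void
    }
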
2558 Operand bundles are a generic mechanism intended to support
2559 runtime-introspection-like functionality for managed languages. While
2560 the exact semantics of an operand bundle depend on the bundle tag,
2561 there are certain limitations to how much the presence of an operand
2562 bundle can influence the semantics of a program. These restrictions
2563 are described as the semantics of an "unknown" operand bundle. As
2564 long as the behavior of an operand bundle is describable within these
2565 restrictions, LLVM does not need to have special knowledge of the
2566 operand bundle to not miscompile programs containing it.
2568 - The bundle operands for an unknown operand bundle escape in unknown
2569 ways before control is transferred to the callee or invokee.
2570 - Calls and invokes with operand bundles have unknown read / write
2571 effect on the heap on entry and exit (even if the call target specifies
2572 a ``memory`` attribute), unless they're overridden with
2573 callsite specific attributes.
2574 - An operand bundle at a call site cannot change the implementation
2575 of the called function. Inter-procedural optimizations work as
2576 usual as long as they take into account the first two properties.
2578 More specific types of operand bundles are described below.
2580 .. _deopt_opbundles:
2582 Deoptimization Operand Bundles
2583 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2585 Deoptimization operand bundles are characterized by the ``"deopt"``
2586 operand bundle tag. These operand bundles represent an alternate
2587 "safe" continuation for the call site they're attached to, and can be
2588 used by a suitable runtime to deoptimize the compiled frame at the
2589 specified call site. There can be at most one ``"deopt"`` operand
2590 bundle attached to a call site. Exact details of deoptimization are
2591 out of scope for the language reference, but it usually involves
2592 rewriting a compiled frame into a set of interpreted frames.
2594 From the compiler's perspective, deoptimization operand bundles make
2595 the call sites they're attached to at least ``readonly``. They read
2596 through all of their pointer typed operands (even if they're not
2597 otherwise escaped) and the entire visible heap. Deoptimization
2598 operand bundles do not capture their operands except during
2599 deoptimization, in which case control will not be returned to the
2602 The inliner knows how to inline through calls that have deoptimization
2603 operand bundles. Just like inlining through a normal call site
2604 involves composing the normal and exceptional continuations, inlining
2605 through a call site with a deoptimization operand bundle needs to
2606 appropriately compose the "safe" deoptimization continuation. The
2607 inliner does this by prepending the parent's deoptimization
2608 continuation to every deoptimization continuation in the inlined body.
2609 E.g. inlining ``@f`` into ``@g`` in the following example
2611 .. code-block:: llvm
2614 call void @x() ;; no deopt state
2615 call void @y() [ "deopt"(i32 10) ]
2616 call void @y() [ "deopt"(i32 10), "unknown"(ptr null) ]
2621 call void @f() [ "deopt"(i32 20) ]
2627 .. code-block:: llvm
2630 call void @x() ;; still no deopt state
2631 call void @y() [ "deopt"(i32 20, i32 10) ]
2632 call void @y() [ "deopt"(i32 20, i32 10), "unknown"(ptr null) ]
2636 It is the frontend's responsibility to structure or encode the
2637 deoptimization state in a way that syntactically prepending the
2638 caller's deoptimization state to the callee's deoptimization state is
2639 semantically equivalent to composing the caller's deoptimization
2640 continuation after the callee's deoptimization continuation.
2644 Funclet Operand Bundles
2645 ^^^^^^^^^^^^^^^^^^^^^^^
2647 Funclet operand bundles are characterized by the ``"funclet"``
2648 operand bundle tag. These operand bundles indicate that a call site
2649 is within a particular funclet. There can be at most one
2650 ``"funclet"`` operand bundle attached to a call site and it must have
2651 exactly one bundle operand.
2653 If any funclet EH pads have been "entered" but not "exited" (per the
2654 `description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2655 it is undefined behavior to execute a ``call`` or ``invoke`` which:
2657 * does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2659 * has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2660 not-yet-exited funclet EH pad.
2662 Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2663 executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
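
The rules above can be sketched with a catch handler in the style used by MSVC C++ exception handling. The functions ``@may_throw`` and ``@log_error`` are hypothetical. The call inside the handler names the ``catchpad`` it executes in, while the ``invoke`` in ``entry`` carries no ``"funclet"`` bundle because no funclet EH pad has been entered at that point.

.. code-block:: llvm

    declare i32 @__CxxFrameHandler3(...)
    declare void @may_throw()
    declare void @log_error()

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      %cp = catchpad within %cs [ptr null, i32 64, ptr null]
      ; The operand is the most-recently-entered, not-yet-exited funclet pad.
      call void @log_error() [ "funclet"(token %cp) ]
      catchret from %cp to label %done

    done:
      ret void
    }
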
2665 GC Transition Operand Bundles
2666 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2668 GC transition operand bundles are characterized by the
2669 ``"gc-transition"`` operand bundle tag. These operand bundles mark a
2670 call as a transition between a function with one GC strategy to a
2671 function with a different GC strategy. If coordinating the transition
2672 between GC strategies requires additional code generation at the call
2673 site, these bundles may contain any values that are needed by the
2674 generated code. For more details, see :ref:`GC Transitions
2675 <gc_transition_args>`.
2677 The bundle contains an arbitrary list of Values which need to be passed
2678 to GC transition code. They will be lowered and passed as operands to
2679 the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2680 that these arguments must be available before and after (but not
2681 necessarily during) the execution of the callee.
2683 .. _assume_opbundles:
2685 Assume Operand Bundles
2686 ^^^^^^^^^^^^^^^^^^^^^^
2688 Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2689 assumptions, such as that a :ref:`parameter attribute <paramattrs>` or a
2690 :ref:`function attribute <fnattrs>` holds for a certain value at a certain
2691 location. Operand bundles enable assumptions that are either hard or impossible
2692 to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2694 An assume operand bundle has the form:
2698 "<tag>"([ <arguments> ])
2700 In the case of function or parameter attributes, the operand bundle has the
2705 "<tag>"([ <holds for value> [, <attribute argument>] ])
2707 * The tag of the operand bundle is usually the name of the attribute that can be
2708 assumed to hold. It can also be `ignore`; this tag doesn't contain any
2709 information and should be ignored.
2710 * The first argument, if present, is the value for which the attribute holds.
2711 * The second argument, if present, is an argument of the attribute.
2713 If there are no arguments the attribute is a property of the call location.
2717 .. code-block:: llvm
2719 call void @llvm.assume(i1 true) ["align"(ptr %val, i32 8)]
2721 allows the optimizer to assume that at the location of the call to
2722 :ref:`llvm.assume <int_assume>` ``%val`` has an alignment of at least 8.
2724 .. code-block:: llvm
2726 call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(ptr %val)]
2728 allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
2729 call location is cold and that ``%val`` is not null.
2731 Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2732 provided guarantees are violated at runtime the behavior is undefined.
2734 While attributes expect constant arguments, assume operand bundles may be
2735 provided a dynamic value, for example:
2737 .. code-block:: llvm
2739 call void @llvm.assume(i1 true) ["align"(ptr %val, i32 %align)]
2741 If the operand bundle value violates any requirements on the attribute value,
2742 the behavior is undefined, unless one of the following exceptions applies:
2744 * ``"align"`` operand bundles may specify a non-power-of-two alignment
2745 (including a zero alignment). If this is the case, then the pointer value
2746 must be a null pointer, otherwise the behavior is undefined.
2748 In addition to allowing operand bundles encoding function and parameter
2749 attributes, an assume operand bundle may also encode a ``separate_storage``
2750 operand bundle. This has the form:
2752 .. code-block:: llvm
2754 separate_storage(<val1>, <val2>)
2756 This indicates that no pointer :ref:`based <pointeraliasing>` on one of its
2757 arguments can alias any pointer based on the other.
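
A minimal sketch of the bundle in context follows (the function name is hypothetical). After the assumption, the store through ``%a`` can be treated as not affecting the value loaded through ``%b``.

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @store_then_load(ptr %a, ptr %b) {
      ; No pointer based on %a aliases any pointer based on %b.
      call void @llvm.assume(i1 true) ["separate_storage"(ptr %a, ptr %b)]
      store i32 1, ptr %a, align 4
      %v = load i32, ptr %b, align 4
      ret i32 %v
    }
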
Even if the assumed property can be encoded as a boolean value, like
``nonnull``, using operand bundles to express the property can still have
benefits:
2763 * Attributes that can be expressed via operand bundles are directly the
2764 property that the optimizer uses and cares about. Encoding attributes as
2765 operand bundles removes the need for an instruction sequence that represents
the property (e.g., ``icmp ne ptr %p, null`` for ``nonnull``) and for the
2767 optimizer to deduce the property from that instruction sequence.
* Expressing the property using operand bundles makes it easy to identify the
use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
simplifies and improves heuristics, e.g., for "use-sensitive" optimizations.
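The two encodings can be compared side by side. This is a minimal sketch with
hypothetical function names; both functions state the same ``nonnull``
property about ``%p``.

.. code-block:: llvm

    declare void @llvm.assume(i1)

    ; Boolean encoding: the optimizer has to recognize the icmp pattern,
    ; and %p gains an extra non-assume use (the comparison).
    define void @assume_with_icmp(ptr %p) {
      %not_null = icmp ne ptr %p, null
      call void @llvm.assume(i1 %not_null)
      ret void
    }

    ; Operand bundle encoding: the property is stated directly, and the
    ; only use of %p is the assume itself.
    define void @assume_with_bundle(ptr %p) {
      call void @llvm.assume(i1 true) ["nonnull"(ptr %p)]
      ret void
    }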
2773 .. _ob_preallocated:
2775 Preallocated Operand Bundles
2776 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2778 Preallocated operand bundles are characterized by the ``"preallocated"``
2779 operand bundle tag. These operand bundles allow separation of the allocation
2780 of the call argument memory from the call site. This is necessary to pass
2781 non-trivially copyable objects by value in a way that is compatible with MSVC
2782 on some targets. There can be at most one ``"preallocated"`` operand bundle
2783 attached to a call site and it must have exactly one bundle operand, which is
2784 a token generated by ``@llvm.call.preallocated.setup``. A call with this
2785 operand bundle should not adjust the stack before entering the function, as
2786 that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2788 .. code-block:: llvm
2790 %foo = type { i64, i32 }
2794 %t = call token @llvm.call.preallocated.setup(i32 1)
2795 %a = call ptr @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
2797 call void @bar(i32 42, ptr preallocated(%foo) %a) ["preallocated"(token %t)]
2801 GC Live Operand Bundles
2802 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2804 A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2805 intrinsic. The operand bundle must contain every pointer to a garbage collected
2806 object which potentially needs to be updated by the garbage collector.
2808 When lowered, any relocated value will be recorded in the corresponding
2809 :ref:`stackmap entry <statepoint-stackmap-format>`. See the intrinsic description
2810 for further details.
2812 ObjC ARC Attached Call Operand Bundles
2813 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2815 A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2816 implicitly followed by a marker instruction and a call to an ObjC runtime
2817 function that uses the result of the call. The operand bundle takes a mandatory
2818 pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
2819 ``@objc_unsafeClaimAutoreleasedReturnValue``).
2820 The return value of a call with this bundle is used by a call to
2821 ``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
2822 void, in which case the operand bundle is ignored.
2824 .. code-block:: llvm
2826 ; The marker instruction and a runtime function call are inserted after the call
2828 call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_retainAutoreleasedReturnValue) ]
2829 call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_unsafeClaimAutoreleasedReturnValue) ]
2831 The operand bundle is needed to ensure the call is immediately followed by the
2832 marker instruction and the ObjC runtime call in the final output.
2836 Pointer Authentication Operand Bundles
2837 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2839 Pointer Authentication operand bundles are characterized by the
2840 ``"ptrauth"`` operand bundle tag. They are described in the
2841 `Pointer Authentication <PointerAuth.html#operand-bundle>`__ document.
2845 KCFI Operand Bundles
2846 ^^^^^^^^^^^^^^^^^^^^
2848 A ``"kcfi"`` operand bundle on an indirect call indicates that the call will
2849 be preceded by a runtime type check, which validates that the call target is
2850 prefixed with a :ref:`type identifier<md_kcfi_type>` that matches the operand
2851 bundle attribute. For example:
2853 .. code-block:: llvm
2855 call void %0() ["kcfi"(i32 1234)]
2857 Clang emits KCFI operand bundles and the necessary metadata with
2858 ``-fsanitize=kcfi``.
2860 .. _convergencectrl:
2862 Convergence Control Operand Bundles
2863 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2865 A "convergencectrl" operand bundle is only valid on a ``convergent`` operation.
2866 When present, the operand bundle must contain exactly one value of token type.
2867 See the :doc:`ConvergentOperations` document for details.
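A minimal sketch of the bundle in use, assuming the
``llvm.experimental.convergence.entry`` intrinsic described in the
:doc:`ConvergentOperations` document as the token producer. The names
``@barrier`` and ``@kernel`` are hypothetical.

.. code-block:: llvm

    declare token @llvm.experimental.convergence.entry()
    declare void @barrier() convergent

    define void @kernel() convergent {
    entry:
      ; Obtain a convergence token for this function invocation.
      %tok = call token @llvm.experimental.convergence.entry()
      ; The bundle carries exactly one token-typed value.
      call void @barrier() [ "convergencectrl"(token %tok) ]
      ret void
    }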
2871 Module-Level Inline Assembly
2872 ----------------------------
Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
2876 concatenated by LLVM and treated as a single unit, but may be separated
2877 in the ``.ll`` file if desired. The syntax is very simple:
2879 .. code-block:: llvm
2881 module asm "inline asm code goes here"
2882 module asm "more can go here"
2884 The strings can contain any character by escaping non-printable
2885 characters. The escape sequence used is simply "\\xx" where "xx" is the
2886 two digit hex code for the number.
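The escape rule can be seen in a short sketch. It assumes a target whose
assembler accepts GNU-style directives such as ``.globl`` and ``.asciz``; the
symbol name ``greeting`` is hypothetical.

.. code-block:: llvm

    module asm ".globl greeting"
    module asm "greeting:"
    ; \09 is a tab and \22 is a double quote, each written as a two-digit
    ; hex code, so the assembler sees:  <tab>.asciz "hello"
    module asm "\09.asciz \22hello\22"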
2888 Note that the assembly string *must* be parseable by LLVM's integrated assembler
2889 (unless it is disabled), even when emitting a ``.s`` file.
2891 .. _langref_datalayout:
A module may specify a target-specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:
2900 .. code-block:: llvm
2902 target datalayout = "layout specification"
2904 The *layout specification* consists of a list of specifications
2905 separated by the minus sign character ('-'). Each specification starts
2906 with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
Specifies that the target lays out data in big-endian form. That is,
the bits with the most significance have the lowest address
location.
``e``
Specifies that the target lays out data in little-endian form. That
is, the bits with the least significance have the lowest address
location.
``S<size>``
Specifies the natural alignment of the stack in bits. Alignment
2920 promotion of stack variables is limited to the natural stack
2921 alignment to avoid dynamic stack realignment. The stack alignment
2922 must be a multiple of 8-bits. If omitted, the natural stack
2923 alignment defaults to "unspecified", which does not prevent any
2924 alignment promotions.
2925 ``P<address space>``
2926 Specifies the address space that corresponds to program memory.
2927 Harvard architectures can use this to specify what space LLVM
2928 should place things such as functions into. If omitted, the
2929 program memory space defaults to the default address space of 0,
2930 which corresponds to a Von Neumann architecture that has code
2931 and data in the same space.
2932 ``G<address space>``
Specifies the address space to be used by default when creating global
variables. If omitted, the globals address space defaults to the default
address space 0.
Note: variable declarations without an address space are always created in
address space 0; this property only affects the default value to be used
when creating globals without additional contextual information (e.g. in
LLVM passes).
2941 .. _alloca_addrspace:
2943 ``A<address space>``
2944 Specifies the address space of objects created by '``alloca``'.
2945 Defaults to the default address space of 0.
2946 ``p[n]:<size>:<abi>[:<pref>][:<idx>]``
2947 This specifies the *size* of a pointer and its ``<abi>`` and
2948 ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
2949 and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
index that is used for address calculation, which must be less than or equal
2951 to the pointer size. If not
2952 specified, the default index size is equal to the pointer size. All sizes
2953 are in bits. The address space, ``n``, is optional, and if not specified,
2954 denotes the default address space 0. The value of ``n`` must be
2955 in the range [1,2^24).
2956 ``i<size>:<abi>[:<pref>]``
2957 This specifies the alignment for an integer type of a given bit
2958 ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
2959 ``<pref>`` is optional and defaults to ``<abi>``.
2960 For ``i8``, the ``<abi>`` value must equal 8,
2961 that is, ``i8`` must be naturally aligned.
2962 ``v<size>:<abi>[:<pref>]``
2963 This specifies the alignment for a vector type of a given bit
2964 ``<size>``. The value of ``<size>`` must be in the range [1,2^24).
2965 ``<pref>`` is optional and defaults to ``<abi>``.
2966 ``f<size>:<abi>[:<pref>]``
2967 This specifies the alignment for a floating-point type of a given bit
2968 ``<size>``. Only values of ``<size>`` that are supported by the target
2969 will work. 32 (float) and 64 (double) are supported on all targets; 80
2970 or 128 (different flavors of long double) are also supported on some
2971 targets. The value of ``<size>`` must be in the range [1,2^24).
2972 ``<pref>`` is optional and defaults to ``<abi>``.
2973 ``a:<abi>[:<pref>]``
2974 This specifies the alignment for an object of aggregate type.
2975 ``<pref>`` is optional and defaults to ``<abi>``.
``F<type><abi>``
This specifies the alignment for function pointers.
2978 The options for ``<type>`` are:
2980 * ``i``: The alignment of function pointers is independent of the alignment
2981 of functions, and is a multiple of ``<abi>``.
2982 * ``n``: The alignment of function pointers is a multiple of the explicit
2983 alignment specified on the function, and is a multiple of ``<abi>``.
``m:<mangling>``
If present, specifies that LLVM names are mangled in the output. Symbols
prefixed with the mangling escape character ``\01`` are passed through
directly to the assembler without the escape character. The mangling style
options are:
2990 * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
2991 * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
2992 * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
* ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
2994 symbols get a ``_`` prefix.
2995 * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
2996 Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
2997 ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
2998 ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
2999 starting with ``?`` are not mangled in any way.
3000 * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
3001 symbols do not receive a ``_`` prefix.
* ``a``: XCOFF mangling: Private symbols get an ``L..`` prefix.
3003 ``n<size1>:<size2>:<size3>...``
3004 This specifies a set of native integer widths for the target CPU in
3005 bits. For example, it might contain ``n32`` for 32-bit PowerPC,
3006 ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
this set are considered to support most general arithmetic operations
efficiently.
3009 ``ni:<address space0>:<address space1>:<address space2>...``
3010 This specifies pointer types with the specified address spaces
3011 as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
3012 address space cannot be specified as non-integral.
3014 On every specification that takes a ``<abi>:<pref>``, specifying the
3015 ``<pref>`` alignment is optional. If omitted, the preceding ``:``
3016 should be omitted too and ``<pref>`` will be equal to ``<abi>``.
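To make the notation concrete, the sketch below decomposes one layout string
field by field. The string is illustrative only and is not claimed to match
any particular target; the global names are hypothetical.

.. code-block:: llvm

    ; Fields, in order:
    ;   e         little-endian
    ;   m:o       Mach-O name mangling
    ;   p:64:64   64-bit pointers in address space 0 with 64-bit ABI
    ;             alignment (<pref> omitted, so it equals <abi>)
    ;   i64:64    i64 has 64-bit ABI alignment
    ;   n32:64    native integer widths are 32 and 64 bits
    ;   S128      natural stack alignment is 128 bits
    target datalayout = "e-m:o-p:64:64-i64:64-n32:64-S128"

    @counter = global i32 0            ; Mach-O mangling: emitted as _counter
    @"\01raw_counter" = global i32 0   ; \01 escape: emitted as raw_counter

Any specification left out of the string, such as the floating-point and
vector alignments here, falls back to the defaults described in this section.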
3018 When constructing the data layout for a given target, LLVM starts with a
3019 default set of specifications which are then (possibly) overridden by
3020 the specifications in the ``datalayout`` keyword. The default
3021 specifications are given in this list:
3023 - ``e`` - little endian
3024 - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
3025 - ``p[n]:64:64:64`` - Other address spaces are assumed to be the
3026 same as the default address space.
3027 - ``S0`` - natural stack alignment is unspecified
3028 - ``i1:8:8`` - i1 is 8-bit (byte) aligned
3029 - ``i8:8:8`` - i8 is 8-bit (byte) aligned as mandated
3030 - ``i16:16:16`` - i16 is 16-bit aligned
3031 - ``i32:32:32`` - i32 is 32-bit aligned
3032 - ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
3033 alignment of 64-bits
3034 - ``f16:16:16`` - half is 16-bit aligned
3035 - ``f32:32:32`` - float is 32-bit aligned
3036 - ``f64:64:64`` - double is 64-bit aligned
3037 - ``f128:128:128`` - quad is 128-bit aligned
3038 - ``v64:64:64`` - 64-bit vector is 64-bit aligned
3039 - ``v128:128:128`` - 128-bit vector is 128-bit aligned
3040 - ``a:0:64`` - aggregates are 64-bit aligned
When LLVM is determining the alignment for a given type, it uses the
following rules:
3045 #. If the type sought is an exact match for one of the specifications,
3046 that specification is used.
3047 #. If no match is found, and the type sought is an integer type, then
3048 the smallest integer type that is larger than the bitwidth of the
3049 sought type is used. If none of the specifications are larger than
3050 the bitwidth then the largest integer type is used. For example,
3051 given the default specifications above, the i7 type will use the
3052 alignment of i8 (next largest) while both i65 and i256 will use the
3053 alignment of i64 (largest specified).
3055 The function of the data layout string may not be what you expect.
3056 Notably, this is not a specification from the frontend of what alignment
3057 the code generator should use.
3059 Instead, if specified, the target data layout is required to match what
3060 the ultimate *code generator* expects. This string is used by the
3061 mid-level optimizers to improve code, and this only works if it matches
3062 what the ultimate code generator uses. There is no way to generate IR
3063 that does not embed this target-specific detail into the IR. If you
3064 don't specify the string, the default specifications will be used to
3065 generate a Data Layout and the optimization phases will operate
3066 accordingly and introduce target specificity into the IR with respect to
3067 these default specifications.
3074 A module may specify a target triple string that describes the target
3075 host. The syntax for the target triple is simply:
3077 .. code-block:: llvm
3079 target triple = "x86_64-apple-macosx10.7.0"
3081 The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-'). The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
3089 This information is passed along to the backend so that it generates
3090 code for the proper architecture. It's possible to override this on the
3091 command line with the ``-mtriple`` command line option.
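As a small illustration of the four-component form, the sketch below labels
each field of a commonly seen triple. The specific triple is just an example.

.. code-block:: llvm

    ; ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
    ;   x86_64       unknown linux            gnu
    target triple = "x86_64-unknown-linux-gnu"

A module carries at most one such line; a tool that accepts ``-mtriple`` would
use the command-line value instead of the one embedded in the module.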
Object Lifetime
----------------------
3098 A memory object, or simply object, is a region of a memory space that is
3099 reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
3100 allocation calls, and global variable definitions.
3101 Once it is allocated, the bytes stored in the region can only be read or written
through a pointer that is :ref:`based on <pointeraliasing>` the allocation
value.
3104 If a pointer that is not based on the object tries to read or write to the
3105 object, it is undefined behavior.
The lifetime of a memory object is a property that decides its accessibility.
Unless stated otherwise, a memory object is alive from its allocation until
its deallocation, and dead afterwards.
3110 It is undefined behavior to access a memory object that isn't alive, but
3111 operations that don't dereference it such as
3112 :ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
3113 :ref:`icmp <i_icmp>` return a valid result.
3114 This explains code motion of these instructions across operations that
3115 impact the object's lifetime.
3116 A stack object's lifetime can be explicitly specified using
3117 :ref:`llvm.lifetime.start <int_lifestart>` and
3118 :ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
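The distinction between accessing a dead object and merely computing with its
address can be sketched as follows. The example assumes the C library's
``malloc`` and ``free`` as the heap allocation calls; the function name is
hypothetical.

.. code-block:: llvm

    declare ptr @malloc(i64)
    declare void @free(ptr)

    define i1 @inspect_after_free() {
      %p = call ptr @malloc(i64 4)
      call void @free(ptr %p)                    ; the object is dead from here on
      ; None of these dereference %p, so each still returns a valid result.
      %next = getelementptr i8, ptr %p, i64 1
      %addr = ptrtoint ptr %p to i64
      %is_null = icmp eq ptr %p, null
      ; By contrast, "load i32, ptr %p" at this point would be undefined behavior.
      ret i1 %is_null
    }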
3120 .. _pointeraliasing:
3122 Pointer Aliasing Rules
3123 ----------------------
3125 Any memory access must be done through a pointer value associated with
3126 an address range of the memory access, otherwise the behavior is
3127 undefined. Pointer values are associated with address ranges according
3128 to the following rules:
3130 - A pointer value is associated with the addresses associated with any
3131 value it is *based* on.
3132 - An address of a global variable is associated with the address range
3133 of the variable's storage.
3134 - The result value of an allocation instruction is associated with the
3135 address range of the allocated storage.
- A null pointer in the default address-space is associated with no
address.
3138 - An :ref:`undef value <undefvalues>` in *any* address-space is
3139 associated with no address.
3140 - An integer constant other than zero or a pointer value returned from
3141 a function not defined within LLVM may be associated with address
3142 ranges allocated through mechanisms other than those provided by
3143 LLVM. Such ranges shall not overlap with any ranges of addresses
3144 allocated by mechanisms provided by LLVM.
A pointer value is *based* on another pointer value according to the
following rules:
3149 - A pointer value formed from a scalar ``getelementptr`` operation is *based* on
3150 the pointer-typed operand of the ``getelementptr``.
3151 - The pointer in lane *l* of the result of a vector ``getelementptr`` operation
3152 is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
3153 of the ``getelementptr``.
- The result value of a ``bitcast`` is *based* on the operand of the
``bitcast``.
3156 - A pointer value formed by an ``inttoptr`` is *based* on all pointer
3157 values that contribute (directly or indirectly) to the computation of
3158 the pointer's value.
3159 - The "*based* on" relationship is transitive.
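The rules above can be traced through a short, hypothetical function. The
comments state which rule makes each pointer *based* on ``%base``.

.. code-block:: llvm

    define i8 @load_via_derived(ptr %base, i64 %off) {
      ; Scalar getelementptr: %gep is based on its pointer operand %base.
      %gep = getelementptr i8, ptr %base, i64 %off
      ; %gep contributes to the integer computation below ...
      %int = ptrtoint ptr %gep to i64
      %adj = add i64 %int, 1
      ; ... so the inttoptr result is based on %gep and, by transitivity,
      ; on %base.
      %rt = inttoptr i64 %adj to ptr
      %v = load i8, ptr %rt
      ret i8 %v
    }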
3161 Note that this definition of *"based"* is intentionally similar to the
3162 definition of *"based"* in C99, though it is slightly weaker.
3164 LLVM IR does not associate types with memory. The result type of a
3165 ``load`` merely indicates the size and alignment of the memory from
3166 which to load, as well as the interpretation of the value. The first
3167 operand type of a ``store`` similarly only indicates the size and
3168 alignment of the store.
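A minimal sketch of what this means in practice: the same four bytes are
written as a ``float`` and read back as an ``i32``. The function name is
hypothetical.

.. code-block:: llvm

    define i32 @float_bits(float %f) {
      %slot = alloca float
      store float %f, ptr %slot
      ; The memory has no type of its own; this load simply reinterprets
      ; the four stored bytes as an i32.
      %bits = load i32, ptr %slot
      ret i32 %bits
    }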
3170 Consequently, type-based alias analysis, aka TBAA, aka
3171 ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
3172 :ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.
3181 Given a function call and a pointer that is passed as an argument or stored in
3182 the memory before the call, a pointer is *captured* by the call if it makes a
3183 copy of any part of the pointer that outlives the call.
To be precise, a pointer is captured if one or more of the following conditions
hold:
3187 1. The call stores any bit of the pointer carrying information into a place,
and the stored bits can be read from the place by the caller after this call
exits.
.. code-block:: llvm

    @glb  = global ptr null
    @glb2 = global ptr null
    @glb3 = global ptr null
    @glbi = global i32 0

    declare void @g()

    define ptr @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) {
      store ptr %a, ptr @glb  ; %a is captured by this call

      store ptr %b, ptr @glb2 ; %b isn't captured because the stored value is overwritten by the store below
      store ptr null, ptr @glb2

      store ptr %c, ptr @glb3
      call void @g()          ; If @g makes a copy of %c that outlives this call (@f), %c is captured
      store ptr null, ptr @glb3

      %i = ptrtoint ptr %d to i64
      %j = trunc i64 %i to i32
      store i32 %j, ptr @glbi ; %d is captured

      ret ptr %e              ; %e is captured
    }
3215 2. The call stores any bit of the pointer carrying information into a place,
and the stored bits can be safely read from the place by another thread via
synchronization.
.. code-block:: llvm

    @glb  = global ptr null
    @lock = global i8 1

    define void @f(ptr %a) {
      store ptr %a, ptr @glb
      store atomic i8 0, ptr @lock release, align 1 ; %a is captured because another thread can safely read @glb
      store ptr null, ptr @glb
      ret void
    }
3230 3. The call's behavior depends on any bit of the pointer carrying information.
3232 .. code-block:: llvm
3236 define void @f(ptr %a) {
3237 %c = icmp eq ptr %a, @glb
3238 br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; escapes %a
3246 4. The pointer is used in a volatile access as its address.
3251 Volatile Memory Accesses
3252 ------------------------
3254 Certain memory accesses, such as :ref:`load <i_load>`'s,
3255 :ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
3256 marked ``volatile``. The optimizers must not change the number of
3257 volatile operations or change their order of execution relative to other
3258 volatile operations. The optimizers *may* change the order of volatile
3259 operations relative to non-volatile operations. This is not Java's
3260 "volatile" and has no cross-thread synchronization behavior.
3262 A volatile load or store may have additional target-specific semantics.
3263 Any volatile operation can have side effects, and any volatile operation
3264 can read and/or modify state which is not accessible via a regular load
3265 or store in this module. Volatile operations may use addresses which do
3266 not point to memory (like MMIO registers). This means the compiler may
3267 not use a volatile operation to prove a non-volatile access to that
3268 address has defined behavior.
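As a sketch of why volatile accesses are kept intact, consider polling a
device register. The function name is hypothetical, and ``%reg`` is assumed to
hold the address of an MMIO register.

.. code-block:: llvm

    define i32 @status_changed(ptr %reg) {
      ; Each volatile load may observe different device state, so the
      ; second load is not redundant and both must be emitted, in order.
      %first = load volatile i32, ptr %reg
      %second = load volatile i32, ptr %reg
      %diff = xor i32 %first, %second
      ret i32 %diff
    }

Without ``volatile``, an optimizer would be free to replace ``%second`` with
``%first`` and fold the result to zero.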
3270 The allowed side-effects for volatile accesses are limited. If a
3271 non-volatile store to a given address would be legal, a volatile
3272 operation may modify the memory at that address. A volatile operation
3273 may not modify any other memory accessible by the module being compiled.
3274 A volatile operation may not call any code in the current module.
3276 In general (without target specific context), the address space of a
3277 volatile operation may not be changed. Different address spaces may
have different trapping behavior when dereferencing an invalid
pointer.
3281 The compiler may assume execution will continue after a volatile operation,
3282 so operations which modify memory or may have undefined behavior can be
3283 hoisted past a volatile operation.
3285 As an exception to the preceding rule, the compiler may not assume execution
3286 will continue after a volatile store operation. This restriction is necessary
3287 to support the somewhat common pattern in C of intentionally storing to an
3288 invalid pointer to crash the program. In the future, it might make sense to
3289 allow frontends to control this behavior.
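The crash pattern mentioned above might look like the following sketch. The
function name is hypothetical, and what the store does at run time depends on
the target.

.. code-block:: llvm

    define void @abort_then_write(ptr %out) {
      ; Intentional store to an invalid address, meant to stop the program.
      store volatile i32 0, ptr null
      ; Because execution may not continue past the volatile store, this
      ; store must not be hoisted above it.
      store i32 1, ptr %out
      ret void
    }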
3291 IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
3292 or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
3293 Likewise, the backend should never split or merge target-legal volatile
3294 load/store instructions. Similarly, IR-level volatile loads and stores cannot
3295 change from integer to floating-point or vice versa.
3297 .. admonition:: Rationale
3299 Platforms may rely on volatile loads and stores of natively supported
data width to be executed as a single instruction. For example, in C
3301 this holds for an l-value of volatile primitive type with native
3302 hardware support, but not necessarily for aggregate types. The
3303 frontend upholds these expectations, which are intentionally
3304 unspecified in the IR. The rules above ensure that IR transformations
3305 do not violate the frontend's contract with the language.
3309 Memory Model for Concurrent Operations
3310 --------------------------------------
3312 The LLVM IR does not define any way to start parallel threads of
3313 execution or to register signal handlers. Nonetheless, there are
3314 platform-specific ways to create them, and we define LLVM IR's behavior
3315 in their presence. This model is inspired by the C++0x memory model.
3317 For a more informal introduction to this model, see the :doc:`Atomics`.
We define a *happens-before* partial order as the least partial order
that:
3322 - Is a superset of single-thread program order, and
3323 - When a *synchronizes-with* ``b``, includes an edge from ``a`` to
3324 ``b``. *Synchronizes-with* pairs are introduced by platform-specific
3325 techniques, like pthread locks, thread creation, thread joining,
3326 etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
3327 Constraints <ordering>`).
3329 Note that program order does not introduce *happens-before* edges
3330 between a thread and signals executing inside that thread.
3332 Every (defined) read operation (load instructions, memcpy, atomic
3333 loads/read-modify-writes, etc.) R reads a series of bytes written by
3334 (defined) write operations (store instructions, atomic
3335 stores/read-modify-writes, memcpy, etc.). For the purposes of this
3336 section, initialized globals are considered to have a write of the
3337 initializer which is atomic and happens before any other read or write
3338 of the memory in question. For each byte of a read R, R\ :sub:`byte`
3339 may see any write to the same byte, except:
3341 - If write\ :sub:`1` happens before write\ :sub:`2`, and
3342 write\ :sub:`2` happens before R\ :sub:`byte`, then
3343 R\ :sub:`byte` does not see write\ :sub:`1`.
3344 - If R\ :sub:`byte` happens before write\ :sub:`3`, then
3345 R\ :sub:`byte` does not see write\ :sub:`3`.
3347 Given that definition, R\ :sub:`byte` is defined as follows:
3349 - If R is volatile, the result is target-dependent. (Volatile is
3350 supposed to give guarantees which can support ``sig_atomic_t`` in
3351 C/C++, and may be used for accesses to addresses that do not behave
like normal memory. It does not generally provide cross-thread
synchronization.)
3354 - Otherwise, if there is no write to the same byte that happens before
3355 R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
3356 - Otherwise, if R\ :sub:`byte` may see exactly one write,
3357 R\ :sub:`byte` returns the value written by that write.
3358 - Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
3359 see are atomic, it chooses one of the values written. See the :ref:`Atomic
3360 Memory Ordering Constraints <ordering>` section for additional
3361 constraints on how the choice is made.
3362 - Otherwise R\ :sub:`byte` returns ``undef``.
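The rules can be walked through on a small, hypothetical example in which
``@writer`` and ``@reader`` are assumed to run on different threads with no
synchronization between them.

.. code-block:: llvm

    @x = global i32 0

    define void @writer() {
      store i32 1, ptr @x     ; non-atomic write
      ret void
    }

    define i32 @reader() {
      ; The initializer write happens before this load, but the store in
      ; @writer is unordered with respect to it. The load may therefore see
      ; two writes; since it is not atomic, the last rule applies and the
      ; result is undef.
      %v = load i32, ptr @x
      ret i32 %v
    }

If the store happened before the load (for example, through thread creation or
a lock), the load could see exactly one write and would return 1.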
3364 R returns the value composed of the series of bytes it read. This
3365 implies that some bytes within the value may be ``undef`` **without**
3366 the entire value being ``undef``. Note that this only defines the
3367 semantics of the operation; it doesn't mean that targets will emit more
3368 than one instruction to read the series of bytes.
3370 Note that in cases where none of the atomic intrinsics are used, this
3371 model places only one restriction on IR transformations on top of what
3372 is required for single-threaded execution: introducing a store to a byte
3373 which might not otherwise be stored is not allowed in general.
3374 (Specifically, in the case where another thread might write to and read
3375 from an address, introducing a store can change a load that may see
3376 exactly one write into a load that may see multiple writes.)
3380 Atomic Memory Ordering Constraints
3381 ----------------------------------
3383 Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
3384 :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
3385 :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
3386 ordering parameters that determine which other atomic instructions on
3387 the same address they *synchronize with*. These semantics are borrowed
3388 from Java and C++0x, but are somewhat more colloquial. If these
3389 descriptions aren't precise enough, check those specs (see spec
3390 references in the :doc:`atomics guide <Atomics>`).
3391 :ref:`fence <i_fence>` instructions treat these orderings somewhat
3392 differently since they don't take an address. See that instruction's
3393 documentation for details.
For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
3399 The set of values that can be read is governed by the happens-before
3400 partial order. A value cannot be read unless some operation wrote
3401 it. This is intended to provide a guarantee strong enough to model
3402 Java's non-volatile shared variables. This ordering cannot be
3403 specified for read-modify-write operations; it is not strong enough
3404 to make them atomic in any interesting way.
``monotonic``
In addition to the guarantees of ``unordered``, there is a single
3407 total order for modifications by ``monotonic`` operations on each
3408 address. All modification orders must be compatible with the
3409 happens-before order. There is no guarantee that the modification
3410 orders can be combined to a global total order for the whole program
3411 (and this often will not be possible). The read in an atomic
3412 read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
3413 :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
3414 order immediately before the value it writes. If one atomic read
3415 happens before another atomic read of the same address, the later
3416 read must see the same value or a later value in the address's
3417 modification order. This disallows reordering of ``monotonic`` (or
3418 stronger) operations on the same address. If an address is written
3419 ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
3420 read that address repeatedly, the other threads must eventually see
3421 the write. This corresponds to the C++0x/C1x
3422 ``memory_order_relaxed``.
``acquire``
In addition to the guarantees of ``monotonic``, a
3425 *synchronizes-with* edge may be formed with a ``release`` operation.
3426 This is intended to model C++'s ``memory_order_acquire``.
``release``
In addition to the guarantees of ``monotonic``, if this operation
3429 writes a value which is subsequently read by an ``acquire``
3430 operation, it *synchronizes-with* that operation. (This isn't a
3431 complete description; see the C++0x definition of a release
3432 sequence.) This corresponds to the C++0x/C1x
3433 ``memory_order_release``.
3434 ``acq_rel`` (acquire+release)
3435 Acts as both an ``acquire`` and ``release`` operation on its
3436 address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
3437 ``seq_cst`` (sequentially consistent)
3438 In addition to the guarantees of ``acq_rel`` (``acquire`` for an
3439 operation that only reads, ``release`` for an operation that only
3440 writes), there is a global total order on all
3441 sequentially-consistent operations on all addresses, which is
3442 consistent with the *happens-before* partial order and with the
3443 modification orders of all the affected addresses. Each
3444 sequentially-consistent read sees the last preceding write to the
3445 same address in this global order. This corresponds to the C++0x/C1x
3446 ``memory_order_seq_cst`` and Java volatile.
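A common use of the ``release`` and ``acquire`` orderings is publishing data
through a flag. The sketch below is hypothetical: ``@producer`` and
``@consumer`` are assumed to run on different threads, with ``@producer``
called once.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      store i32 42, ptr @data                          ; plain write
      store atomic i32 1, ptr @flag release, align 4   ; publish
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      %f = load atomic i32, ptr @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      ; The acquire load that read 1 synchronizes with the release store,
      ; so the write to @data happens before this load.
      %v = load i32, ptr @data
      ret i32 %v
    }

With ``monotonic`` in place of ``release`` and ``acquire``, no
*synchronizes-with* edge would be formed and the final load would race with
the store to ``@data``.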
3450 If an atomic operation is marked ``syncscope("singlethread")``, it only
3451 *synchronizes with* and only participates in the seq\_cst total orderings of
3452 other operations running in the same thread (for example, in signal handlers).
3454 If an atomic operation is marked ``syncscope("<target-scope>")``, where
3455 ``<target-scope>`` is a target specific synchronization scope, then it is target
3456 dependent if it *synchronizes with* and participates in the seq\_cst total
3457 orderings of other operations.
3459 Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3460 or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3461 seq\_cst total orderings of other operations that are not marked
3462 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
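A minimal sketch of the single-thread scope, assuming a counter shared only
between a thread and a signal handler running in that same thread. The names
are hypothetical.

.. code-block:: llvm

    @counter = global i32 0

    define void @bump_from_handler() {
      ; Ordered only with respect to other operations in the same thread.
      %old = atomicrmw add ptr @counter, i32 1 syncscope("singlethread") seq_cst
      fence syncscope("singlethread") seq_cst
      ret void
    }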
3466 Floating-Point Environment
3467 --------------------------
3469 The default LLVM floating-point environment assumes that traps are disabled and
3470 status flags are not observable. Therefore, floating-point math operations do
3471 not have side effects and may be speculated freely. Results assume the
3472 round-to-nearest rounding mode, and subnormals are assumed to be preserved.
3474 Running LLVM code in an environment where these assumptions are not met can lead
3475 to undefined behavior. The ``strictfp`` and ``denormal-fp-math`` attributes as
3476 well as :ref:`Constrained Floating-Point Intrinsics <constrainedfp>` can be used
3477 to weaken LLVM's assumptions and ensure defined behavior in non-default
3478 floating-point environments; see their respective documentation for details.
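A minimal sketch of code that does not rely on the default environment,
assuming the rounding mode may have been changed at run time and that
floating-point exceptions must be honored. The function name ``@add_dynamic``
is hypothetical; the intrinsic and its metadata arguments are described under
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

      define double @add_dynamic(double %a, double %b) strictfp {
        ; The rounding mode is read from the environment at run time, and
        ; the operation is treated as possibly raising exceptions.
        %r = call double @llvm.experimental.constrained.fadd.f64(
                 double %a, double %b,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") strictfp
        ret double %r
      }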
3482 Behavior of Floating-Point NaN values
3483 -------------------------------------
3485 A floating-point NaN value consists of a sign bit, a quiet/signaling bit, and a
3486 payload (which makes up the rest of the mantissa except for the quiet/signaling
3487 bit). LLVM assumes that the quiet/signaling bit being set to ``1`` indicates a
3488 quiet NaN (QNaN), and a value of ``0`` indicates a signaling NaN (SNaN). In the
3489 following we will hence just call it the "quiet bit".
3491 The representation bits of a floating-point value do not mutate arbitrarily; in
3492 particular, if there is no floating-point operation being performed, NaN signs,
3493 quiet bits, and payloads are preserved.
3495 For the purpose of this section, ``bitcast`` as well as the following operations
3496 are not "floating-point math operations": ``fneg``, ``llvm.fabs``, and
3497 ``llvm.copysign``. These operations act directly on the underlying bit
3498 representation and never change anything except possibly for the sign bit.
3500 For floating-point math operations, unless specified otherwise, the following
3501 rules apply when a NaN value is returned: the result has a non-deterministic
3502 sign; the quiet bit and payload are non-deterministically chosen from the
3503 following set of options:
3505 - The quiet bit is set and the payload is all-zero. ("Preferred NaN" case)
3506 - The quiet bit is set and the payload is copied from any input operand that is
3507 a NaN. ("Quieting NaN propagation" case)
3508 - The quiet bit and payload are copied from any input operand that is a NaN.
3509 ("Unchanged NaN propagation" case)
3510 - The quiet bit is set and the payload is picked from a target-specific set of
3511 "extra" possible NaN payloads. The set can depend on the input operand values.
3512 This set is empty on x86 and ARM, but can be non-empty on other architectures.
3513 (For instance, on wasm, if any input NaN does not have the preferred all-zero
3514 payload or any input NaN is an SNaN, then this set contains all possible
3515 payloads; otherwise, it is empty. On SPARC, this set consists of the all-one
3518 In particular, if all input NaNs are quiet (or if there are no input NaNs), then
3519 the output NaN is definitely quiet. Signaling NaN outputs can only occur if they
3520 are provided as an input value. For example, "fmul SNaN, 1.0" may be simplified
3521 to SNaN rather than QNaN. Similarly, if all input NaNs are preferred (or if
3522 there are no input NaNs) and the target does not have any "extra" NaN payloads,
3523 then the output NaN is guaranteed to be preferred.
3525 Floating-point math operations are allowed to treat all NaNs as if they were
3526 quiet NaNs. For example, "pow(1.0, SNaN)" may be simplified to 1.0.
3528 Code that requires different behavior than this should use the
3529 :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
3530 In particular, constrained intrinsics rule out the "Unchanged NaN propagation"
3531 case; they are guaranteed to return a QNaN.
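The distinction between operations that only touch the sign bit and
floating-point math operations can be sketched as follows. The function names
are hypothetical and the comments restate the rules above rather than add new
guarantees.

.. code-block:: llvm

      declare double @llvm.fabs.f64(double)

      define double @sign_only(double %x) {
        ; Neither operation is a floating-point math operation: only the
        ; sign bit may change, so a NaN input keeps its quiet bit and payload.
        %n = fneg double %x
        %a = call double @llvm.fabs.f64(double %n)
        ret double %a
      }

      define double @math_op(double %x) {
        ; A floating-point math operation: if %x is a NaN, the result is a
        ; NaN with a non-deterministic sign whose quiet bit and payload follow
        ; one of the cases listed above. An optimizer may also fold this to
        ; %x, in which case an SNaN input stays an SNaN.
        %m = fmul double %x, 1.0
        ret double %m
      }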
3533 Unfortunately, due to hard-or-impossible-to-fix issues, LLVM violates its own
3534 specification on some architectures:
3536 - x86-32 without SSE2 enabled may convert floating-point values to x86_fp80 and
3537 back when performing floating-point math operations; this can lead to results
3538 with different precision than expected and it can alter NaN values. Since
3539 optimizations can make contradicting assumptions, this can lead to arbitrary
3540 miscompilations. See `issue #44218
3541 <https://github.com/llvm/llvm-project/issues/44218>`_.
3542 - x86-32 (even with SSE2 enabled) may implicitly perform such a conversion on
3543 values returned from a function for some calling conventions. See `issue
3544 #66803 <https://github.com/llvm/llvm-project/issues/66803>`_.
3545 - Older MIPS versions use the opposite polarity for the quiet/signaling bit, and
3546 LLVM does not correctly represent this. See `issue #60796
3547 <https://github.com/llvm/llvm-project/issues/60796>`_.
3554 LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3555 :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3556 :ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
3557 :ref:`select <i_select>` and :ref:`call <i_call>`
3558 may use the following flags to enable otherwise unsafe
3559 floating-point transformations.
3562 No NaNs - Allow optimizations to assume the arguments and result are not
3563 NaN. If an argument is a NaN, or the result would be a NaN, it produces
3564 a :ref:`poison value <poisonvalues>` instead.
3567 No Infs - Allow optimizations to assume the arguments and result are not
3568 +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
3569 produces a :ref:`poison value <poisonvalues>` instead.
3572 No Signed Zeros - Allow optimizations to treat the sign of a zero
3573 argument or zero result as insignificant. This does not imply that -0.0
3574 is poison and/or guaranteed to not exist in the operation.
3577 Allow Reciprocal - Allow optimizations to use the reciprocal of an
3578 argument rather than perform division.
3581 Allow floating-point contraction (e.g. fusing a multiply followed by an
3582 addition into a fused multiply-and-add). This does not enable reassociating
3583 to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
3584 be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.
3589 Approximate functions - Allow substitution of approximate calculations for
3590 functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
3591 for places where this can apply to LLVM's intrinsic math functions.
3594 Allow reassociation transformations for floating-point instructions.
3595 This may dramatically change results in floating-point.
3598 This flag implies all of the others.
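The following sketch shows how the flags are written on individual
instructions, using the keywords ``nnan``, ``ninf``, ``arcp``, ``contract``
and ``fast``. The function names are hypothetical, and whether any of the
permitted rewrites actually happens is up to the optimizer and target.

.. code-block:: llvm

      define float @fused(float %a, float %b, float %c) {
        ; With contract on both operations, the pair may be fused into a
        ; single fused multiply-and-add.
        %mul = fmul contract float %a, %b
        %add = fadd contract float %mul, %c
        ret float %add
      }

      define float @finite_only(float %a, float %b) {
        ; Poison if an argument or the result is a NaN or +/-Inf.
        %r = fadd nnan ninf float %a, %b
        ret float %r
      }

      define float @reciprocal(float %x, float %y) {
        ; May be rewritten as %x multiplied by the reciprocal of %y.
        %q = fdiv arcp float %x, %y
        ret float %q
      }

      define float @everything(float %x) {
        ; fast implies all of the other flags.
        %sq = fmul fast float %x, %x
        ret float %sq
      }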
3602 Use-list Order Directives
3603 -------------------------
3605 Use-list directives encode the in-memory order of each use-list, allowing the
3606 order to be recreated. ``<order-indexes>`` is a comma-separated list of
3607 indexes that are assigned to the referenced value's uses. The referenced
3608 value's use-list is immediately sorted by these indexes.
3610 Use-list directives may appear at function scope or global scope. They are not
3611 instructions, and have no effect on the semantics of the IR. When they're at
3612 function scope, they must appear after the terminator of the final basic block.
3614 If basic blocks have their address taken via ``blockaddress()`` expressions,
3615 ``uselistorder_bb`` can be used to reorder their use-lists from outside their
3622 uselistorder <ty> <value>, { <order-indexes> }
3623 uselistorder_bb @function, %block, { <order-indexes> }
3629 define void @foo(i32 %arg1, i32 %arg2) {
3631 ; ... instructions ...
3633 ; ... instructions ...
3635 ; At function scope.
3636 uselistorder i32 %arg1, { 1, 0, 2 }
3637 uselistorder label %bb, { 1, 0 }
3641 uselistorder ptr @global, { 1, 2, 0 }
3642 uselistorder i32 7, { 1, 0 }
3643 uselistorder i32 (i32) @bar, { 1, 0 }
3644 uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3646 .. _source_filename:
3651 The *source filename* string is set to the original module identifier,
3652 which will be the name of the compiled source file when compiling from
3653 source through the clang front end, for example. It is then preserved through
3656 This is currently necessary to generate a consistent unique global
3657 identifier for local functions used in profile data, which prepends the
3658 source file name to the local function name.
3660 The syntax for the source file name is simply:
3662 .. code-block:: text
3664 source_filename = "/path/to/source.c"
3671 The LLVM type system is one of the most important features of the
3672 intermediate representation. Being typed enables a number of
3673 optimizations to be performed on the intermediate representation
3674 directly, without having to do extra analyses on the side before the
3675 transformation. A strong type system makes it easier to read the
3676 generated code and enables novel analyses and transformations that are
3677 not feasible to perform on normal three address code representations.
3687 The void type does not represent any value and has no size.
3705 The function type can be thought of as a function signature. It consists of a
3706 return type and a list of formal parameter types. The return type of a function
3707 type is a void type or first class type --- except for :ref:`label <t_label>`
3708 and :ref:`metadata <t_metadata>` types.
3714 <returntype> (<parameter list>)
3716 ...where '``<parameter list>``' is a comma-separated list of type
3717 specifiers. Optionally, the parameter list may include a type ``...``, which
3718 indicates that the function takes a variable number of arguments. Variable
3719 argument functions can access their arguments with the :ref:`variable argument
3720 handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
3721 except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
3725 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3726 | ``i32 (i32)`` | function taking an ``i32``, returning an ``i32`` |
3727 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3728 | ``i32 (ptr, ...)`` | A vararg function that takes at least one :ref:`pointer <t_pointer>` argument and returns an integer. This is the signature for ``printf`` in LLVM. |
3729 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3730 | ``{i32, i32} (i32)`` | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values |
3731 +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
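To show where these function types appear in practice, here is a small sketch
that declares, defines and calls functions of the types in the table. The
names ``@print_value``, ``@twice`` and ``@apply`` are made up for the example.

.. code-block:: llvm

      declare i32 @printf(ptr, ...)

      define i32 @print_value(ptr %fmt, i32 %x) {
        ; A call to a vararg function spells out the full function type.
        %n = call i32 (ptr, ...) @printf(ptr %fmt, i32 %x)
        ret i32 %n
      }

      define { i32, i32 } @twice(i32 %x) {
        ; Function type {i32, i32} (i32): build the returned structure.
        %p0 = insertvalue { i32, i32 } poison, i32 %x, 0
        %p1 = insertvalue { i32, i32 } %p0, i32 %x, 1
        ret { i32, i32 } %p1
      }

      define i32 @apply(ptr %fn, i32 %x) {
        ; Indirect call through a pointer, using function type i32 (i32).
        %r = call i32 %fn(i32 %x)
        ret i32 %r
      }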
3738 The :ref:`first class <t_firstclass>` types are perhaps the most important.
3739 Values of these types are the only ones which can be produced by
3747 These are the types that are valid in registers from CodeGen's perspective.
3756 The integer type is a very simple type that specifies an
3757 arbitrary bit width for the desired integer. Any bit width from 1
3758 bit to 2\ :sup:`23`\ (about 8 million) can be specified.
3766 The number of bits the integer will occupy is specified by the ``N``
3772 +----------------+------------------------------------------------+
3773 | ``i1`` | a single-bit integer. |
3774 +----------------+------------------------------------------------+
3775 | ``i32`` | a 32-bit integer. |
3776 +----------------+------------------------------------------------+
3777 | ``i1942652`` | a really big integer of over 1 million bits. |
3778 +----------------+------------------------------------------------+
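A short sketch of integer types with different bit widths in use. The function
names are hypothetical; the point is only that any width is a legal operand
type for ordinary integer instructions.

.. code-block:: llvm

      define i1 @is_odd(i32 %x) {
        ; trunc keeps the low bit of %x.
        %bit = trunc i32 %x to i1
        ret i1 %bit
      }

      define i7 @add7(i7 %a, i7 %b) {
        ; Arithmetic on i7 wraps modulo 2^7.
        %s = add i7 %a, %b
        ret i7 %s
      }

      define i128 @widen(i64 %x) {
        %w = zext i64 %x to i128
        ret i128 %w
      }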
3782 Floating-Point Types
3783 """"""""""""""""""""
3792 - 16-bit floating-point value
3795 - 16-bit "brain" floating-point value (7-bit significand). Provides the
3796 same number of exponent bits as ``float``, so that it matches its dynamic
3797 range, but with greatly reduced precision. Used in Intel's AVX-512 BF16
3798 extensions and Arm's ARMv8.6-A extensions, among others.
3801 - 32-bit floating-point value
3804 - 64-bit floating-point value
3807 - 128-bit floating-point value (113-bit significand)
3810 - 80-bit floating-point value (X87)
3813 - 128-bit floating-point value (two 64-bits)
3815 The binary formats of half, float, double, and fp128 correspond to the
3816 IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
3824 The x86_amx type represents a value held in an AMX tile register on an x86
3825 machine. The operations allowed on it are quite limited. Only a few intrinsics
3826 are allowed: stride load and store, zero and dot product. No instruction is
3827 allowed for this type. There are no arguments, arrays, pointers, vectors
3828 or constants of this type.
3842 The x86_mmx type represents a value held in an MMX register on an x86
3843 machine. The operations allowed on it are quite limited: parameters and
3844 return values, load and store, and bitcast. User-specified MMX
3845 instructions are represented as intrinsic or asm calls with arguments
3846 and/or results of this type. There are no arrays, vectors or constants
3863 The pointer type ``ptr`` is used to specify memory locations. Pointers are
3864 commonly used to reference objects in memory.
3866 Pointer types may have an optional address space attribute defining
3867 the numbered address space where the pointed-to object resides. For
3868 example, ``ptr addrspace(5)`` is a pointer to address space 5.
3869 In addition to integer constants, ``addrspace`` can also reference one of the
3870 address spaces defined in the :ref:`datalayout string<langref_datalayout>`.
3871 ``addrspace("A")`` will use the alloca address space, ``addrspace("G")``
3872 the default globals address space and ``addrspace("P")`` the program address
3875 The default address space is number zero.
3877 The semantics of non-zero address spaces are target-specific. Memory
3878 access through a non-dereferenceable pointer is undefined behavior in
3879 any address space. Pointers with the bit-value 0 are only assumed to
3880 be non-dereferenceable in address space 0, unless the function is
3881 marked with the ``null_pointer_is_valid`` attribute.
3883 If an object can be proven accessible through a pointer with a
3884 different address space, the access may be modified to use that
3885 address space. Exceptions apply if the operation is ``volatile``.
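The sketch below uses pointers in the default address space and in address
spaces 1 and 5. The numbers are arbitrary choices for illustration: what a
non-zero address space means, and whether the cast shown is meaningful,
depends on the target.

.. code-block:: llvm

      @counter = addrspace(1) global i32 0

      define i32 @load_default(ptr %p) {
        ; ptr without an addrspace qualifier is in address space 0.
        %v = load i32, ptr %p, align 4
        ret i32 %v
      }

      define i32 @load_as5(ptr addrspace(5) %p) {
        %v = load i32, ptr addrspace(5) %p, align 4
        ret i32 %v
      }

      define ptr addrspace(1) @counter_addr() {
        ; The address of a global carries the global's address space.
        ret ptr addrspace(1) @counter
      }

      define ptr @to_default(ptr addrspace(1) %p) {
        ; Conversions between address spaces are explicit.
        %q = addrspacecast ptr addrspace(1) %p to ptr
        ret ptr %q
      }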
3887 Prior to LLVM 15, pointer types also specified a pointee type, such as
3888 ``i8*``, ``[4 x i32]*`` or ``i32 (i32*)*``. In LLVM 15, such "typed
3889 pointers" are still supported under non-default options. See the
3890 `opaque pointers document <OpaquePointers.html>`__ for more information.
3894 Target Extension Type
3895 """""""""""""""""""""
3899 Target extension types represent types that must be preserved through
3900 optimization, but are otherwise generally opaque to the compiler. They may be
3901 used as function parameters or arguments, and in :ref:`phi <i_phi>` or
3902 :ref:`select <i_select>` instructions. Some types may be also used in
3903 :ref:`alloca <i_alloca>` instructions or as global values, and correspondingly
3904 it is legal to use :ref:`load <i_load>` and :ref:`store <i_store>` instructions
3905 on them. Full semantics for these types are defined by the target.
3907 The only constants that target extension types may have are ``zeroinitializer``,
3908 ``undef``, and ``poison``. Other possible values for target extension types may
3909 arise from target-specific intrinsics and functions.
3911 These types cannot be converted to other types. As such, it is not legal to use
3912 them in :ref:`bitcast <i_bitcast>` instructions (as a source or target type),
3913 nor is it legal to use them in :ref:`ptrtoint <i_ptrtoint>` or
3914 :ref:`inttoptr <i_inttoptr>` instructions. Similarly, they are not legal to use
3915 in an :ref:`icmp <i_icmp>` instruction.
3917 Target extension types have a name and optional type or integer parameters. The
3918 meanings of name and parameters are defined by the target. When being defined in
3919 LLVM IR, all of the type parameters must precede all of the integer parameters.
3921 Specific target extension types are registered with LLVM as having specific
3922 properties. These properties can be used to restrict the type from appearing in
3923 certain contexts, such as being the type of a global variable or having a
3924 ``zeroinitializer`` constant be valid. A complete list of type properties may be
3925 found in the documentation for ``llvm::TargetExtType::Property`` (`doxygen
3926 <https://llvm.org/doxygen/classllvm_1_1TargetExtType.html>`_).
3930 .. code-block:: llvm
3933 target("label", void)
3934 target("label", void, i32)
3935 target("label", 0, 1, 2)
3936 target("label", void, i32, 0, 1, 2)
3946 A vector type is a simple derived type that represents a vector of
3947 elements. Vector types are used when multiple primitive data are
3948 operated in parallel using a single instruction (SIMD). A vector type
3949 requires a size (number of elements), an underlying primitive data type,
3950 and a scalable property to represent vectors where the exact hardware
3951 vector length is unknown at compile time. Vector types are considered
3952 :ref:`first class <t_firstclass>`.
3956 In general vector elements are laid out in memory in the same way as
3957 :ref:`array types <t_array>`. Such an analogy works fine as long as the vector
3958 elements are byte sized. However, when the elements of the vector aren't byte
3959 sized it gets a bit more complicated. One way to describe the layout is by
3960 describing what happens when a vector such as ``<N x iM>`` is bitcast to an
3961 integer type with N*M bits, and then following the rules for storing such an
3964 A bitcast from a vector type to a scalar integer type will see the elements
3965 being packed together (without padding). The order in which elements are
3966 inserted in the integer depends on endianness. For little endian element zero
3967 is put in the least significant bits of the integer, and for big endian
3968 element zero is put in the most significant bits.
3970 Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
3971 with the analogy that we can replace a vector store by a bitcast followed by
3972 an integer store, we get this for big endian:
3974 .. code-block:: llvm
3976 %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3978 ; Bitcasting from a vector to an integral type can be seen as
3979 ; concatenating the values:
3980 ; %val now has the hexadecimal value 0x1235.
3982 store i16 %val, ptr %ptr
3984 ; In memory the content will be (8-bit addressing):
3986 ; [%ptr + 0]: 00010010 (0x12)
3987 ; [%ptr + 1]: 00110101 (0x35)
3989 The same example for little endian:
3991 .. code-block:: llvm
3993 %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16
3995 ; Bitcasting from a vector to an integral type can be seen as
3996 ; concatenating the values:
3997 ; %val now has the hexadecimal value 0x5321.
3999 store i16 %val, ptr %ptr
4001 ; In memory the content will be (8-bit addressing):
4003 ; [%ptr + 0]: 00100001 (0x21)
4004 ; [%ptr + 1]: 01010011 (0x53)
4006 When ``<N*M>`` isn't evenly divisible by the byte size the exact memory layout
4007 is unspecified (just like it is for an integral type of the same size). This
4008 is because different targets could put the padding at different positions when
4009 the type size is smaller than the type's store size.
4015 < <# elements> x <elementtype> > ; Fixed-length vector
4016 < vscale x <# elements> x <elementtype> > ; Scalable vector
4018 The number of elements is a constant integer value larger than 0;
4019 elementtype may be any integer, floating-point or pointer type. Vectors
4020 of size zero are not allowed. For scalable vectors, the total number of
4021 elements is a constant multiple (called vscale) of the specified number
4022 of elements; vscale is a positive integer that is unknown at compile time
4023 and the same hardware-dependent constant for all scalable vectors at run
4024 time. The size of a specific scalable vector type is thus constant within
4025 IR, even if the exact size in bytes cannot be determined until run time.
4029 +------------------------+----------------------------------------------------+
4030 | ``<4 x i32>`` | Vector of 4 32-bit integer values. |
4031 +------------------------+----------------------------------------------------+
4032 | ``<8 x float>`` | Vector of 8 32-bit floating-point values. |
4033 +------------------------+----------------------------------------------------+
4034 | ``<2 x i64>`` | Vector of 2 64-bit integer values. |
4035 +------------------------+----------------------------------------------------+
4036 | ``<4 x ptr>`` | Vector of 4 pointers |
4037 +------------------------+----------------------------------------------------+
4038 | ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
4039 +------------------------+----------------------------------------------------+
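A minimal sketch of fixed-length and scalable vector types as instruction
operands. The function names are hypothetical.

.. code-block:: llvm

      define <4 x i32> @vadd(<4 x i32> %a, <4 x i32> %b) {
        ; One add instruction operates on all four lanes.
        %s = add <4 x i32> %a, %b
        ret <4 x i32> %s
      }

      define i32 @lane0(<4 x i32> %v) {
        %e = extractelement <4 x i32> %v, i32 0
        ret i32 %e
      }

      define <vscale x 4 x i32> @svadd(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
        ; The element count is vscale * 4, where vscale is fixed by the
        ; hardware at run time.
        %s = add <vscale x 4 x i32> %a, %b
        ret <vscale x 4 x i32> %s
      }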
4048 The label type represents code labels.
4063 The token type is used when a value is associated with an instruction
4064 but all uses of the value must not attempt to introspect or obscure it.
4065 As such, it is not appropriate to have a :ref:`phi <i_phi>` or
4066 :ref:`select <i_select>` of type token.
4083 The metadata type represents embedded metadata. No derived types may be
4084 created from metadata except for :ref:`function <t_function>` arguments.
4097 Aggregate Types are a subset of derived types that can contain multiple
4098 member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
4099 aggregate types. :ref:`Vectors <t_vector>` are not considered to be
4109 The array type is a very simple derived type that arranges elements
4110 sequentially in memory. The array type requires a size (number of
4111 elements) and an underlying data type.
4117 [<# elements> x <elementtype>]
4119 The number of elements is a constant integer value; ``elementtype`` may
4120 be any type with a size.
4124 +------------------+--------------------------------------+
4125 | ``[40 x i32]`` | Array of 40 32-bit integer values. |
4126 +------------------+--------------------------------------+
4127 | ``[41 x i32]`` | Array of 41 32-bit integer values. |
4128 +------------------+--------------------------------------+
4129 | ``[4 x i8]`` | Array of 4 8-bit integer values. |
4130 +------------------+--------------------------------------+
4132 Here are some examples of multidimensional arrays:
4134 +-----------------------------+----------------------------------------------------------+
4135 | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. |
4136 +-----------------------------+----------------------------------------------------------+
4137 | ``[12 x [10 x float]]`` | 12x10 array of single precision floating-point values. |
4138 +-----------------------------+----------------------------------------------------------+
4139 | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. |
4140 +-----------------------------+----------------------------------------------------------+
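As an illustration of the multidimensional case, the sketch below indexes a
``[3 x [4 x i32]]`` global in memory and reads an element of an array held in
a register. The names ``@grid``, ``@grid_at`` and ``@second`` are made up, and
the ``inbounds`` keyword assumes the caller passes indices within the array.

.. code-block:: llvm

      @grid = global [3 x [4 x i32]] zeroinitializer

      define i32 @grid_at(i64 %row, i64 %col) {
        ; The first index steps over the global itself; the next two select
        ; the row and the column.
        %p = getelementptr inbounds [3 x [4 x i32]], ptr @grid, i64 0, i64 %row, i64 %col
        %v = load i32, ptr %p, align 4
        ret i32 %v
      }

      define i32 @second([4 x i32] %a) {
        ; Arrays in registers are read with extractvalue and a constant index.
        %e = extractvalue [4 x i32] %a, 1
        ret i32 %e
      }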
4142 There is no restriction on indexing beyond the end of the array implied
4143 by a static type (though there are restrictions on indexing beyond the
4144 bounds of an allocated object in some cases). This means that
4145 single-dimension 'variable sized array' addressing can be implemented in
4146 LLVM with a zero length array type. An implementation of 'pascal style
4147 arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
4157 The structure type is used to represent a collection of data members
4158 together in memory. The elements of a structure may be any type that has
4161 Structures in memory are accessed using '``load``' and '``store``' by
4162 getting a pointer to a field with the '``getelementptr``' instruction.
4163 Structures in registers are accessed using the '``extractvalue``' and
4164 '``insertvalue``' instructions.
4166 Structures may optionally be "packed" structures, which indicate that
4167 the alignment of the struct is one byte, and that there is no padding
4168 between the elements. In non-packed structs, padding between field types
4169 is inserted as defined by the DataLayout string in the module, which is
4170 required to match what the underlying code generator expects.
4172 Structures can either be "literal" or "identified". A literal structure
4173 is defined inline with other types (e.g. ``[2 x {i32, i32}]``) whereas
4174 identified types are always defined at the top level with a name.
4175 Literal types are uniqued by their contents and can never be recursive
4176 or opaque since there is no way to write one. Identified types can be
4177 recursive, can be opaque, and are never uniqued.
4183 %T1 = type { <type list> } ; Identified normal struct type
4184 %T2 = type <{ <type list> }> ; Identified packed struct type
4188 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4189 | ``{ i32, i32, i32 }`` | A triple of three ``i32`` values |
4190 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4191 | ``{ float, ptr }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>`. |
4192 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4193 | ``<{ i8, i32 }>`` | A packed struct known to be 5 bytes in size. |
4194 +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
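The access patterns described above can be sketched as follows, using two
identified types that mirror the table entries. The type and function names
are hypothetical.

.. code-block:: llvm

      %Pair   = type { float, ptr }
      %Packed = type <{ i8, i32 }>

      define float @first_in_memory(ptr %p) {
        ; In memory: compute the field address, then load.
        %field = getelementptr inbounds %Pair, ptr %p, i32 0, i32 0
        %v = load float, ptr %field, align 4
        ret float %v
      }

      define ptr @second_in_register(%Pair %agg) {
        ; In a register: extractvalue with a constant field index.
        %v = extractvalue %Pair %agg, 1
        ret ptr %v
      }

      define i32 @packed_field(ptr %p) {
        ; No padding follows the i8, so the i32 sits at byte offset 1 and
        ; the load cannot assume natural alignment.
        %field = getelementptr inbounds %Packed, ptr %p, i32 0, i32 1
        %v = load i32, ptr %field, align 1
        ret i32 %v
      }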
4198 Opaque Structure Types
4199 """"""""""""""""""""""
4203 Opaque structure types are used to represent structure types that
4204 do not have a body specified. This corresponds (for example) to the C
4205 notion of a forward declared structure. They can be named (``%X``) or
4217 +--------------+-------------------+
4218 | ``opaque`` | An opaque type. |
4219 +--------------+-------------------+
4226 LLVM has several different basic types of constants. This section
4227 describes them all and their syntax.
4232 **Boolean constants**
4233 The two strings '``true``' and '``false``' are both valid constants
4235 **Integer constants**
4236 Standard integers (such as '4') are constants of the :ref:`integer
4237 <t_integer>` type. They can be either decimal or
4238 hexadecimal. Decimal integers can be prefixed with - to represent
4239 negative integers, e.g. '``-1234``'. Hexadecimal integers must be
4240 prefixed with either u or s to indicate whether they are unsigned
4241 or signed respectively, e.g. '``u0x8000``' gives 32768, whilst
4242 '``s0x8000``' gives -32768.
4244 Note that hexadecimal integers are sign extended from the number
4245 of active bits, i.e. the bit width minus the number of leading
4246 zeros. So '``s0x0001``' of type '``i16``' will be -1, not 1.
4247 **Floating-point constants**
4248 Floating-point constants use standard decimal notation (e.g.
4249 123.421), exponential notation (e.g. 1.23421e+2), or a more precise
4250 hexadecimal notation (see below). The assembler requires the exact
4251 decimal value of a floating-point constant. For example, the
4252 assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
4253 decimal in binary. Floating-point constants must have a
4254 :ref:`floating-point <t_floating>` type.
4255 **Null pointer constants**
4256 The identifier '``null``' is recognized as a null pointer constant
4257 and must be of :ref:`pointer type <t_pointer>`.
4259 The identifier '``none``' is recognized as an empty token constant
4260 and must be of :ref:`token type <t_token>`.
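For illustration, the simple constants above can appear as global
initializers like this. The global names are hypothetical, and the values in
the comments are the ones stated in the descriptions above.

.. code-block:: llvm

      @flag    = global i1 true
      @count   = global i32 -1234
      @uhex    = global i32 u0x8000     ; 32768
      @shex    = global i32 s0x8000     ; -32768
      @quarter = global double 1.25     ; exactly representable in binary
      @nothing = global ptr null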
4262 The one non-intuitive notation for constants is the hexadecimal form of
4263 floating-point constants. For example, the form
4264 '``double 0x432ff973cafa8000``' is equivalent to (but harder to read
4265 than) '``double 4.5e+15``'. The only time hexadecimal floating-point
4266 constants are required (and the only time that they are generated by the
4267 disassembler) is when a floating-point constant must be emitted but it
4268 cannot be represented as a decimal floating-point number in a reasonable
4269 number of digits. For example, NaN's, infinities, and other special
4270 values are represented in their IEEE hexadecimal format so that assembly
4271 and disassembly do not cause any bits to change in the constants.
4273 When using the hexadecimal form, constants of types bfloat, half, float, and
4274 double are represented using the 16-digit form shown above (which matches the
4275 IEEE754 representation for double); bfloat, half and float values must, however,
4276 be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
4277 precision respectively. Hexadecimal format is always used for long double, and
4278 there are three forms of long double. The 80-bit format used by x86 is
4279 represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
4280 used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
4281 hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
4282 by 32 hexadecimal digits. Long doubles will only work if they match the long
4283 double format on your target. The IEEE 16-bit format (half precision) is
4284 represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
4285 format is represented by ``0xR`` followed by 4 hexadecimal digits. All
4286 hexadecimal formats are big-endian (sign bit at the left).
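A few hedged examples of the hexadecimal forms described above, written as
global initializers. The global names are made up, and the decimal values in
the comments come from decoding the bit patterns by hand.

.. code-block:: llvm

      @big    = global double 0x432ff973cafa8000          ; 4.5e+15
      @inf    = global double 0x7FF0000000000000          ; +infinity
      @f_125  = global float 0x3FF4000000000000           ; 1.25, in the 16-digit double form
      @h_one  = global half 0xH3C00                       ; 1.0
      @b_one  = global bfloat 0xR3F80                     ; 1.0
      @x_one  = global x86_fp80 0xK3FFF8000000000000000   ; 1.0

Note that the ``float`` constant is written with the bit pattern of the
equivalent ``double``; this is accepted only because 1.25 is exactly
representable in single precision.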
4288 There are no constants of type x86_mmx or x86_amx.
4290 .. _complexconstants:
4295 Complex constants are a (potentially recursive) combination of simple
4296 constants and smaller complex constants.
4298 **Structure constants**
4299 Structure constants are represented with notation similar to
4300 structure type definitions (a comma separated list of elements,
4301 surrounded by braces (``{}``)). For example:
4302 "``{ i32 4, float 17.0, ptr @G }``", where "``@G``" is declared as
4303 "``@G = external global i32``". Structure constants must have
4304 :ref:`structure type <t_struct>`, and the number and types of elements
4305 must match those specified by the type.
4307 Array constants are represented with notation similar to array type
4308 definitions (a comma separated list of elements, surrounded by
4309 square brackets (``[]``)). For example:
4310 "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
4311 :ref:`array type <t_array>`, and the number and types of elements must
4312 match those specified by the type. As a special case, character array
4313 constants may also be represented as a double-quoted string using the ``c``
4314 prefix. For example: "``c"Hello World\0A\00"``".
4315 **Vector constants**
4316 Vector constants are represented with notation similar to vector
4317 type definitions (a comma separated list of elements, surrounded by
4318 less-than/greater-than's (``<>``)). For example:
4319 "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
4320 must have :ref:`vector type <t_vector>`, and the number and types of
4321 elements must match those specified by the type.
    When creating a vector whose elements have the same constant value, the
    preferred syntax is ``splat (<Ty> Val)``. For example: "``splat (i32 11)``".
    These vector constants must have :ref:`vector type <t_vector>` with an
    element type that matches the ``splat`` operand.
4327 **Zero initialization**
    The string '``zeroinitializer``' can be used to initialize a value of
    *any* type to zero, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
4334 A metadata node is a constant tuple without types. For example:
4335 "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
4336 for example: "``!{!0, i32 0, ptr @global, ptr @function, !"str"}``".
4337 Unlike other typed constants that are meant to be interpreted as part of
4338 the instruction stream, metadata is a place to attach additional
4339 information such as debug info.
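As a minimal sketch, the following module-level definitions use each of the
constant forms described above. The global and metadata names (``@S``, ``@A``,
``@Str``, ``@V``, ``@Splat``, ``@Big``, ``!example``) are chosen purely for
illustration, and the ``splat`` line assumes an LLVM version whose assembly
parser accepts that syntax.

.. code-block:: llvm

    @G = external global i32

    ; Structure constant: element count and types match the struct type.
    @S = global { i32, float, ptr } { i32 4, float 17.0, ptr @G }

    ; Array constant, and the equivalent c"..." shorthand for i8 arrays.
    @A = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @Str = constant [13 x i8] c"Hello World\0A\00"

    ; Vector constant, and a splat where every element is the same value.
    @V = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >
    @Splat = global <4 x i32> splat (i32 11)

    ; zeroinitializer avoids spelling out 1024 explicit zeros.
    @Big = global [1024 x i32] zeroinitializer

    ; Metadata node referencing constant values and a string.
    !example = !{!0}
    !0 = !{i32 0, ptr @G, !"str"}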
4341 Global Variable and Function Addresses
4342 --------------------------------------
4344 The addresses of :ref:`global variables <globalvars>` and
4345 :ref:`functions <functionstructure>` are always implicitly valid
4346 (link-time) constants. These constants are explicitly referenced when
4347 the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:
4351 .. code-block:: llvm
4355 @Z = global [2 x ptr] [ ptr @X, ptr @Y ]
4362 The string '``undef``' can be used anywhere a constant is expected, and
4363 indicates that the user of the value may receive an unspecified
4364 bit-pattern. Undefined values may be of any type (other than '``label``'
4365 or '``void``') and be used anywhere a constant is permitted.
4369 A '``poison``' value (described in the next section) should be used instead of
4370 '``undef``' whenever possible. Poison values are stronger than undef, and
4371 enable more optimizations. Just the existence of '``undef``' blocks certain
4372 optimizations (see the examples below).
4374 Undefined values are useful because they indicate to the compiler that
4375 the program is well defined no matter what value is used. This gives the
4376 compiler more freedom to optimize. Here are some examples of
4377 (potentially surprising) transformations that are valid (in pseudo IR):
4379 .. code-block:: llvm
4389 This is safe because all of the output bits are affected by the undef
4390 bits. Any output bit can have a zero or one depending on the input bits.
4392 .. code-block:: llvm
4400 %A = %X ;; By choosing undef as 0
4401 %B = %X ;; By choosing undef as -1
4406 These logical operations have bits that are not always affected by the
4407 input. For example, if ``%X`` has a zero bit, then the output of the
4408 '``and``' operation will always be a zero for that bit, no matter what
4409 the corresponding bit from the '``undef``' is. As such, it is unsafe to
4410 optimize or assume that the result of the '``and``' is '``undef``'.
4411 However, it is safe to assume that all bits of the '``undef``' could be
4412 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
4413 all the bits of the '``undef``' operand to the '``or``' could be set,
4414 allowing the '``or``' to be folded to -1.
4416 .. code-block:: llvm
4418 %A = select undef, %X, %Y
4419 %B = select undef, 42, %Y
4420 %C = select %X, %Y, undef
4424 %C = %Y (if %Y is provably not poison; unsafe otherwise)
4430 This set of examples shows that undefined '``select``' (and conditional
4431 branch) conditions can go *either way*, but they have to come from one
4432 of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
4433 both known to have a clear low bit, then ``%A`` would have to have a
4434 cleared low bit. However, in the ``%C`` example, the optimizer is
4435 allowed to assume that the '``undef``' operand could be the same as
4436 ``%Y`` if ``%Y`` is provably not '``poison``', allowing the whole '``select``'
4437 to be eliminated. This is because '``poison``' is stronger than '``undef``'.
4439 .. code-block:: llvm
4441 %A = xor undef, undef
This example points out that two '``undef``' operands are not
necessarily the same. This can surprise people who assume that "``X^X``" is
always zero, even if ``X`` is undefined (the behavior matches C semantics).
This isn't true for a number of reasons, but the
4462 short answer is that an '``undef``' "variable" can arbitrarily change
4463 its value over its "live range". This is true because the variable
4464 doesn't actually *have a live range*. Instead, the value is logically
4465 read from arbitrary registers that happen to be around when needed, so
4466 the value is not necessarily consistent over time. In fact, ``%A`` and
4467 ``%C`` need to have the same semantics or the core LLVM "replace all
4468 uses with" concept would not hold.
4470 To ensure all uses of a given register observe the same value (even if
4471 '``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
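A small sketch of that use of ``freeze`` follows. The function name
``@xor_frozen`` is hypothetical; the point is only that both uses of the frozen
value must agree, which is not guaranteed for two uses of '``undef``'.

.. code-block:: llvm

    define i32 @xor_frozen() {
      ; %f is an arbitrary but fixed i32 value.
      %f = freeze i32 undef
      ; Both operands observe the same value, so the result is always 0.
      %a = xor i32 %f, %f
      ret i32 %a
    }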
4473 .. code-block:: llvm
4481 These examples show the crucial difference between an *undefined value*
4482 and *undefined behavior*. An undefined value (like '``undef``') is
4483 allowed to have an arbitrary bit-pattern. This means that the ``%A``
4484 operation can be constant folded to '``0``', because the '``undef``'
4485 could be zero, and zero divided by any value is zero.
4486 However, in the second example, we can make a more aggressive
4487 assumption: because the ``undef`` is allowed to be an arbitrary value,
4488 we are allowed to assume that it could be zero. Since a divide by zero
4489 has *undefined behavior*, we are allowed to assume that the operation
4490 does not execute at all. This allows us to delete the divide and all
4491 code after it. Because the undefined operation "can't happen", the
4492 optimizer can assume that it occurs in dead code.
4494 .. code-block:: text
4496 a: store undef -> %X
4497 b: store %X -> undef
4499 a: <deleted> (if the stored value in %X is provably not poison)
4502 A store *of* an undefined value can be assumed to not have any effect;
4503 we can assume that the value is overwritten with bits that happen to
4504 match what was already there. This argument is only valid if the stored value
is provably not ``poison``. However, a store *to* an undefined
location could clobber arbitrary memory; therefore, it has undefined
behavior.
4509 Branching on an undefined value is undefined behavior.
4510 This explains optimizations that depend on branch conditions to construct
4511 predicates, such as Correlated Value Propagation and Global Value Numbering.
In the case of a switch instruction, the branch condition should be frozen;
otherwise, it is undefined behavior.
4515 .. code-block:: llvm
4518 br undef, BB1, BB2 ; UB
4520 %X = and i32 undef, 255
4521 switch %X, label %ret [ .. ] ; UB
4523 store undef, ptr %ptr
4524 %X = load ptr %ptr ; %X is undef
4525 switch i8 %X, label %ret [ .. ] ; UB
4528 %X = or i8 undef, 255 ; always 255
4529 switch i8 %X, label %ret [ .. ] ; Well-defined
4531 %X = freeze i1 undef
4532 br %X, BB1, BB2 ; Well-defined (non-deterministic jump)
4541 A poison value is a result of an erroneous operation.
4542 In order to facilitate speculative execution, many instructions do not
4543 invoke immediate undefined behavior when provided with illegal operands,
4544 and return a poison value instead.
The string '``poison``' can be used anywhere a constant is expected, and
operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
a poison value.
4549 Most instructions return '``poison``' when one of their arguments is
4550 '``poison``'. A notable exception is the :ref:`select instruction <i_select>`.
4551 Propagation of poison can be stopped with the
4552 :ref:`freeze instruction <i_freeze>`.
4554 It is correct to replace a poison value with an
4555 :ref:`undef value <undefvalues>` or any value of the type.
4557 This means that immediate undefined behavior occurs if a poison value is
4558 used as an instruction operand that has any values that trigger undefined
4559 behavior. Notably this includes (but is not limited to):
- The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
  any other pointer dereferencing instruction (independent of address
  space).
- The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
  instruction.
- The condition operand of a :ref:`br <i_br>` instruction.
- The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
  instruction.
- The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
  instruction, when the function or invoking call site has a ``noundef``
  attribute in the corresponding position.
- The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
  call site has a ``noundef`` attribute in the return value position.
4575 Here are some examples:
4577 .. code-block:: llvm
4580 %poison = sub nuw i32 0, 1 ; Results in a poison value.
4581 %poison2 = sub i32 poison, 1 ; Also results in a poison value.
4582 %still_poison = and i32 %poison, 0 ; 0, but also poison.
4583 %poison_yet_again = getelementptr i32, ptr @h, i32 %still_poison
4584 store i32 0, ptr %poison_yet_again ; Undefined behavior due to
4587 store i32 %poison, ptr @g ; Poison value stored to memory.
4588 %poison3 = load i32, ptr @g ; Poison value loaded back from memory.
4590 %poison4 = load i16, ptr @g ; Returns a poison value.
4591 %poison5 = load i64, ptr @g ; Returns a poison value.
4593 %cmp = icmp slt i32 %poison, 0 ; Returns a poison value.
4594 br i1 %cmp, label %end, label %end ; undefined behavior
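The following sketch illustrates the two mechanisms mentioned above that limit
poison propagation: ``select`` and ``freeze``. The function name
``@limit_poison`` and its parameters are illustrative assumptions, and the
comments assume ``%c`` itself is not poison.

.. code-block:: llvm

    define i32 @limit_poison(i32 %x, i1 %c) {
    entry:
      ; Poison if %x is the maximum signed value (signed overflow with nsw).
      %p = add nsw i32 %x, 1
      ; Only the chosen operand matters: %s is poison only if %c is true
      ; and %p is poison.
      %s = select i1 %c, i32 %p, i32 0
      ; %f is never poison; if %p was poison, %f is an arbitrary fixed value.
      %f = freeze i32 %p
      ; Poison exactly when %s is poison, since %f is always well defined.
      %r = and i32 %s, %f
      ret i32 %r
    }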
4598 .. _welldefinedvalues:
4603 Given a program execution, a value is *well defined* if the value does not
4604 have an undef bit and is not poison in the execution.
4605 An aggregate value or vector is well defined if its elements are well defined.
4606 The padding of an aggregate isn't considered, since it isn't visible
4607 without storing it into memory and loading it with a different type.
A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
The result of a :ref:`freeze instruction <i_freeze>` is well defined regardless
of its operand.
4616 Addresses of Basic Blocks
4617 -------------------------
4619 ``blockaddress(@function, %block)``
4621 The '``blockaddress``' constant computes the address of the specified
4622 basic block in the specified function.
It always has a ``ptr addrspace(P)`` type, where ``P`` is the address space
4625 of the function containing ``%block`` (usually ``addrspace(0)``).
4627 Taking the address of the entry block is illegal.
4629 This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' or for comparisons against null. Pointer
equality tests between label addresses result in undefined behavior ---
4632 though, again, comparison against null is ok, and no label is equal to the null
4633 pointer. This may be passed around as an opaque pointer sized value as long as
4634 the bits are not inspected. This allows ``ptrtoint`` and arithmetic to be
4635 performed on these values so long as the original value is reconstituted before
4636 the ``indirectbr`` instruction.
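As a sketch of the intended use, the hypothetical function ``@dispatch`` below
selects between two block addresses and jumps through ``indirectbr``. Neither
address refers to the entry block, and the list of possible destinations names
every block whose address may reach the branch.

.. code-block:: llvm

    define i32 @dispatch(i1 %c) {
    entry:
      ; Pick one of two block addresses; both blocks belong to @dispatch.
      %target = select i1 %c, ptr blockaddress(@dispatch, %one),
                              ptr blockaddress(@dispatch, %two)
      indirectbr ptr %target, [label %one, label %two]

    one:
      ret i32 1

    two:
      ret i32 2
    }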
4638 Finally, some targets may provide defined semantics when using the value
4639 as the operand to an inline assembly, but that is target specific.
4641 .. _dso_local_equivalent:
4643 DSO Local Equivalent
4644 --------------------
4646 ``dso_local_equivalent @func``
4648 A '``dso_local_equivalent``' constant represents a function which is
4649 functionally equivalent to a given function, but is always defined in the
4650 current linkage unit. The resulting pointer has the same type as the underlying
function. The resulting pointer is permitted, but not required, to be different
from a pointer to the function, and it may have different values in different
translation units.
4655 The target function may not have ``extern_weak`` linkage.
4657 ``dso_local_equivalent`` can be implemented as such:
- If the function has local linkage, hidden visibility, or is
  ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
  to the function.
- ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
  function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on whether the function is defined
  within the linkage unit; LLVM will use this when available. (This is commonly
  called a "PLT stub".) On other targets, the stub may need to be emitted
  explicitly.
4668 This can be used wherever a ``dso_local`` instance of a function is needed without
4669 needing to explicitly make the original function ``dso_local``. An instance where
4670 this can be used is for static offset calculations between a function and some other
4671 ``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
4672 where dynamic relocations for function pointers in VTables can be replaced with
4673 static relocations for offsets between the VTable and virtual functions which
4674 may not be ``dso_local``.
4676 This is currently only supported for ELF binary formats.
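The static offset calculation described above might look roughly like the
following sketch. The names ``@vfunc`` and ``@vtable`` are hypothetical, and the
single-entry layout is a simplification rather than what any particular
frontend emits.

.. code-block:: llvm

    ; A function that is not known to be dso_local.
    declare void @vfunc()

    ; A one-entry table holding the 32-bit offset from the table to a
    ; dso_local equivalent of @vfunc, instead of an absolute function pointer.
    @vtable = dso_local constant [1 x i32] [
      i32 trunc (i64 sub (i64 ptrtoint (ptr dso_local_equivalent @vfunc to i64),
                          i64 ptrtoint (ptr @vtable to i64)) to i32)
    ]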
4685 With `Control-Flow Integrity (CFI)
4686 <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, a '``no_cfi``'
4687 constant represents a function reference that does not get replaced with a
4688 reference to the CFI jump table in the ``LowerTypeTests`` pass. These constants
4689 may be useful in low-level programs, such as operating system kernels, which
4690 need to refer to the actual function body.
4694 Constant Expressions
4695 --------------------
4697 Constant expressions are used to allow expressions involving other
4698 constants to be used as constants. Constant expressions may be of any
4699 :ref:`first class <t_firstclass>` type and may involve any LLVM operation
4700 that does not have side effects (e.g. load and call are not supported).
4701 The following is the syntax for constant expressions:
4703 ``trunc (CST to TYPE)``
4704 Perform the :ref:`trunc operation <i_trunc>` on constants.
4705 ``ptrtoint (CST to TYPE)``
4706 Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
4707 ``inttoptr (CST to TYPE)``
4708 Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
4709 This one is *really* dangerous!
4710 ``bitcast (CST to TYPE)``
4711 Convert a constant, CST, to another TYPE.
4712 The constraints of the operands are the same as those for the
4713 :ref:`bitcast instruction <i_bitcast>`.
4714 ``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
4718 ``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
4719 Perform the :ref:`getelementptr operation <i_getelementptr>` on
4720 constants. As with the :ref:`getelementptr <i_getelementptr>`
4721 instruction, the index list may have one or more indexes, which are
4722 required to make sense for the type of "pointer to TY". These indexes
4723 may be implicitly sign-extended or truncated to match the index size
4724 of CSTPTR's address space.
4725 ``icmp COND (VAL1, VAL2)``
4726 Perform the :ref:`icmp operation <i_icmp>` on constants.
4727 ``fcmp COND (VAL1, VAL2)``
4728 Perform the :ref:`fcmp operation <i_fcmp>` on constants.
``extractelement (VAL, IDX)``
    Perform the :ref:`extractelement operation <i_extractelement>` on
    constants.
``insertelement (VAL, ELT, IDX)``
    Perform the :ref:`insertelement operation <i_insertelement>` on
    constants.
``shufflevector (VEC1, VEC2, IDXMASK)``
    Perform the :ref:`shufflevector operation <i_shufflevector>` on
    constants.
``add (LHS, RHS)``
    Perform an addition on constants.
``sub (LHS, RHS)``
    Perform a subtraction on constants.
``mul (LHS, RHS)``
    Perform a multiplication on constants.
``shl (LHS, RHS)``
    Perform a left shift on constants.
``xor (LHS, RHS)``
    Perform a bitwise xor on constants.
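A brief sketch of constant expressions used as global initializers follows. The
global names are illustrative only; each initializer is built from the address
of ``@arr``, which is a link-time constant.

.. code-block:: llvm

    @arr = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of the third element of @arr.
    @third = global ptr getelementptr inbounds ([4 x i32], ptr @arr, i64 0, i64 2)

    ; The address of @arr as an integer.
    @addr = global i64 ptrtoint (ptr @arr to i64)

    ; Constant expressions nest: the integer address plus 8 bytes.
    @addr_plus_8 = global i64 add (i64 ptrtoint (ptr @arr to i64), i64 8)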
4754 Inline Assembler Expressions
4755 ----------------------------
4757 LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
4758 Inline Assembly <moduleasm>`) through the use of a special value. This value
4759 represents the inline assembler as a template string (containing the
4760 instructions to emit), a list of operand constraints (stored as a string), a
4761 flag that indicates whether or not the inline asm expression has side effects,
4762 and a flag indicating whether the function containing the asm needs to align its
4763 stack conservatively.
4765 The template string supports argument substitution of the operands using "``$``"
4766 followed by a number, to indicate substitution of the given register/memory
4767 location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
4768 be used, where ``MODIFIER`` is a target-specific annotation for how to print the
4769 operand (See :ref:`inline-asm-modifiers`).
4771 A literal "``$``" may be included by using "``$$``" in the template. To include
4772 other special characters into the output, the usual "``\XX``" escapes may be
4773 used, just as in other strings. Note that after template substitution, the
4774 resulting assembly string is parsed by LLVM's integrated assembler unless it is
4775 disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
4776 syntax known to LLVM.
4778 LLVM also supports a few more substitutions useful for writing inline assembly:
4780 - ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
4781 This substitution is useful when declaring a local label. Many standard
4782 compiler optimizations, such as inlining, may duplicate an inline asm blob.
4783 Adding a blob-unique identifier ensures that the two labels will not conflict
4784 during assembly. This is used to implement `GCC's %= special format
4785 string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
4786 - ``${:comment}``: Expands to the comment character of the current target's
4787 assembly dialect. This is usually ``#``, but many targets use other strings,
4788 such as ``;``, ``//``, or ``!``.
- ``${:private}``: Expands to the assembler private label prefix. Labels with
  this prefix will not appear in the symbol table of the assembled object.
  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
  relatively popular.
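The substitutions listed above can be combined to declare a local label that
stays unique even if the blob is duplicated. This is a sketch that assumes an
x86 target with AT&T syntax; the function name ``@local_label`` and the label
stem ``skip`` are hypothetical.

.. code-block:: llvm

    define void @local_label() {
      ; ${:private} gives the private label prefix, ${:uid} a blob-unique
      ; number, and ${:comment} the target's comment character.
      call void asm sideeffect "jmp ${:private}skip${:uid}\0A\09nop\0A${:private}skip${:uid}: ${:comment} end of blob", ""()
      ret void
    }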
4794 LLVM's support for inline asm is modeled closely on the requirements of Clang's
4795 GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
4796 modifier codes listed here are similar or identical to those in GCC's inline asm
4797 support. However, to be clear, the syntax of the template and constraint strings
4798 described here is *not* the same as the syntax accepted by GCC and Clang, and,
while most constraint letters are passed through as-is by Clang, some get
translated to other codes when converting from the C source to the LLVM
IR.
4803 An example inline assembler expression is:
4805 .. code-block:: llvm
4807 i32 (i32) asm "bswap $0", "=r,r"
4809 Inline assembler expressions may **only** be used as the callee operand
4810 of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
4811 Thus, typically we have:
4813 .. code-block:: llvm
4815 %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
4817 Inline asms with side effects not visible in the constraint list must be
4818 marked as having side effects. This is done through the use of the
4819 '``sideeffect``' keyword, like so:
4821 .. code-block:: llvm
4823 call void asm sideeffect "eieio", ""()
4825 In some cases inline asms will contain code that will not work unless
4826 the stack is aligned in some way, such as calls or SSE instructions on
4827 x86, yet will not contain code that does that alignment within the asm.
4828 The compiler should make conservative assumptions about what the asm
4829 might contain and should generate its usual stack alignment code in the
4830 prologue if the '``alignstack``' keyword is present:
4832 .. code-block:: llvm
4834 call void asm alignstack "eieio", ""()
4836 Inline asms also support using non-standard assembly dialects. The
4837 assumed dialect is ATT. When the '``inteldialect``' keyword is present,
4838 the inline asm is using the Intel dialect. Currently, ATT and Intel are
4839 the only supported dialects. An example is:
4841 .. code-block:: llvm
4843 call void asm inteldialect "eieio", ""()
4845 In the case that the inline asm might unwind the stack,
4846 the '``unwind``' keyword must be used, so that the compiler emits
4847 unwinding information:
4849 .. code-block:: llvm
4851 call void asm unwind "call func", ""()
4853 If the inline asm unwinds the stack and isn't marked with
4854 the '``unwind``' keyword, the behavior is undefined.
4856 If multiple keywords appear, the '``sideeffect``' keyword must come
4857 first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
4858 third and the '``unwind``' keyword last.
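For illustration, a call that uses all four keywords in the required order
could be written as follows. The function name ``@all_keywords`` is
hypothetical, and ``nop`` stands in for a real instruction sequence.

.. code-block:: llvm

    define void @all_keywords() {
      ; Order: sideeffect, alignstack, inteldialect, unwind.
      call void asm sideeffect alignstack inteldialect unwind "nop", ""()
      ret void
    }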
4860 Inline Asm Constraint String
4861 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4863 The constraint list is a comma-separated string, each element containing one or
4864 more constraint codes.
For each element in the constraint list an appropriate register or memory
operand will be chosen, and it will be made available to assembly template
string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
second, etc.
4871 There are three different types of constraints, which are distinguished by a
4872 prefix symbol in front of the constraint code: Output, Input, and Clobber. The
4873 constraints must always be given in that order: outputs first, then inputs, then
4874 clobbers. They cannot be intermingled.
4876 There are also three different categories of constraint codes:
4878 - Register constraint. This is either a register class, or a fixed physical
4879 register. This kind of constraint will allocate a register, and if necessary,
4880 bitcast the argument or result to the appropriate type.
4881 - Memory constraint. This kind of constraint is for use with an instruction
4882 taking a memory operand. Different constraints allow for different addressing
4883 modes used by the target.
4884 - Immediate value constraint. This kind of constraint is for an integer or other
4885 immediate value which can be rendered directly into an instruction. The
4886 various target-specific constraints allow the selection of a value in the
4887 proper range for the instruction you wish to use it with.
4892 Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
4893 indicates that the assembly will write to this operand, and the operand will
4894 then be made available as a return value of the ``asm`` expression. Output
4895 constraints do not consume an argument from the call instruction. (Except, see
4896 below about indirect outputs).
4898 Normally, it is expected that no output locations are written to by the assembly
4899 expression until *all* of the inputs have been read. As such, LLVM may assign
4900 the same register to an output and an input. If this is not safe (e.g. if the
4901 assembly contains two instructions, where the first writes to one output, and
4902 the second reads an input and writes to a second output), then the "``&``"
4903 modifier must be used (e.g. "``=&r``") to specify that the output is an
"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
will not use the same register for any inputs (other than an input tied to this
output).
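The two-instruction situation described above can be sketched as follows,
assuming an x86 target with AT&T syntax. The function name ``@copy_pair`` is
hypothetical. The first ``movl`` writes output ``$0`` before the second one
reads input ``$3``, so ``$0`` is marked early-clobber to keep it from sharing a
register with an input.

.. code-block:: llvm

    define { i32, i32 } @copy_pair(i32 %a, i32 %b) {
      ; $0 and $1 are outputs; $2 and $3 are the inputs %a and %b.
      %res = call { i32, i32 } asm "movl $2, $0\0A\09movl $3, $1", "=&r,=r,r,r"(i32 %a, i32 %b)
      ret { i32, i32 } %res
    }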
4911 Input constraints do not have a prefix -- just the constraint codes. Each input
4912 constraint will consume one argument from the call instruction. It is not
4913 permitted for the asm to write to any input register or memory location (unless
4914 that input is tied to an output). Note also that multiple inputs may all be
4915 assigned to the same register, if LLVM can determine that they necessarily all
4916 contain the same value.
4918 Instead of providing a Constraint Code, input constraints may also "tie"
4919 themselves to an output constraint, by providing an integer as the constraint
4920 string. Tied inputs still consume an argument from the call instruction, and
4921 take up a position in the asm template numbering as is usual -- they will simply
4922 be constrained to always use the same register as the output they've been tied
to. For example, a constraint string of "``=r,0``" says to assign a register for
output, and use that register as an input as well (it being the 0'th
constraint).
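A sketch of a tied input, assuming an x86 target with AT&T syntax: the
hypothetical function ``@add_in_place`` passes ``%x`` in the same register that
receives the result, so the two-operand ``addl`` can update it in place.

.. code-block:: llvm

    define i32 @add_in_place(i32 %x, i32 %y) {
      ; $0 is the output, $1 is %x tied to $0, and $2 is %y.
      %sum = call i32 asm "addl $2, $0", "=r,0,r,~{flags}"(i32 %x, i32 %y)
      ret i32 %sum
    }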
4927 It is permitted to tie an input to an "early-clobber" output. In that case, no
4928 *other* input may share the same register as the input tied to the early-clobber
4929 (even when the other input has the same value).
4931 You may only tie an input to an output which has a register constraint, not a
4932 memory constraint. Only a single input may be tied to an output.
4934 There is also an "interesting" feature which deserves a bit of explanation: if a
4935 register class constraint allocates a register which is too small for the value
4936 type operand provided as input, the input value will be split into multiple
4937 registers, and all of them passed to the inline asm.
4939 However, this feature is often not as useful as you might think.
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (For example, the
32-bit SparcV8 has a 64-bit load instruction that takes a single 32-bit
register; the hardware then loads into both the named register and the next
register. This feature of inline asm would not be useful to support that.)
4948 A few of the targets provide a template string modifier allowing explicit access
4949 to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
4950 ``D``). On such an architecture, you can actually access the second allocated
4951 register (yet, still, not any subsequent ones). But, in that case, you're still
4952 probably better off simply splitting the value into two separate operands, for
clarity. (e.g. see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use.)
4957 Indirect inputs and outputs
4958 """""""""""""""""""""""""""
4960 Indirect output or input constraints can be specified by the "``*``" modifier
4961 (which goes after the "``=``" in case of an output). This indicates that the asm
4962 will write to or read from the contents of an *address* provided as an input
4963 argument. (Note that in this way, indirect outputs act more like an *input* than
4964 an output: just like an input, they consume an argument of the call expression,
4965 rather than producing a return value. An indirect output constraint is an
4966 "output" only in that the asm is expected to write to the contents of the input
4967 memory location, instead of just read from it).
This is most typically used for memory constraints, e.g. "``=*m``", to pass the
4970 address of a variable as a value.
4972 It is also possible to use an indirect *register* constraint, but only on output
4973 (e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
4974 value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Using it is not recommended.)
Call arguments for indirect constraints must have pointer type and must specify
the :ref:`elementtype <attr_elementtype>` attribute to indicate the pointer
element type.
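A minimal sketch of an indirect memory output, assuming an x86 target with AT&T
syntax; the function name ``@store_via_asm`` is hypothetical. The pointer
argument is consumed like an input, and the asm writes through it.

.. code-block:: llvm

    define void @store_via_asm(ptr %p, i32 %v) {
      ; $0 is the memory operand addressed by %p; $1 is the register holding %v.
      call void asm sideeffect "movl $1, $0", "=*m,r"(ptr elementtype(i32) %p, i32 %v)
      ret void
    }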
4987 A clobber constraint is indicated by a "``~``" prefix. A clobber does not
4988 consume an input operand, nor generate an output. Clobbers cannot use any of the
4989 general constraint code letters -- they may use only explicit register
constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
memory locations -- not only the memory pointed to by a declared indirect
output.
4995 Note that clobbering named registers that are also present in output
4996 constraints is not legal.
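Two short sketches of clobber lists follow; the function names are
hypothetical and the second one assumes an x86 target with AT&T syntax. The
first is an empty asm that only declares a memory clobber. The second names the
registers its instruction overwrites.

.. code-block:: llvm

    define void @memory_clobber_only() {
      ; No instructions are emitted; the asm only claims to write memory.
      call void asm sideeffect "", "~{memory}"()
      ret void
    }

    define void @zero_eax() {
      ; The asm overwrites eax and the flags without declaring an output.
      call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()
      ret void
    }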
5001 A label constraint is indicated by a "``!``" prefix and typically used in the
5002 form ``"!i"``. Instead of consuming call arguments, label constraints consume
5003 indirect destination labels of ``callbr`` instructions.
5005 Label constraints can only be used in conjunction with ``callbr`` and the
5006 number of label constraints must match the number of indirect destination
5007 labels in the ``callbr`` instruction.
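The following sketch shows one label constraint paired with one indirect
destination, assuming an x86 target with AT&T syntax. The function name
``@asm_goto`` and the block names are hypothetical; the ``l`` modifier in
``${1:l}`` is the x86 modifier that prints the operand as a plain label.

.. code-block:: llvm

    define i32 @asm_goto(i32 %x) {
    entry:
      ; $0 is the register input %x; $1 is the label %nonzero.
      callbr void asm sideeffect "testl $0, $0\0A\09jne ${1:l}", "r,!i,~{flags}"(i32 %x)
              to label %fallthrough [label %nonzero]

    fallthrough:
      ret i32 0

    nonzero:
      ret i32 1
    }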
After a potential prefix comes the constraint code, or codes.
A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
(e.g. "``{eax}``").
5018 The one and two letter constraint codes are typically chosen to be the same as
5019 GCC's constraint codes.
A single constraint may include more than one constraint code in it, leaving
5022 it up to LLVM to choose which one to use. This is included mainly for
5023 compatibility with the translation of GCC inline asm coming from clang.
5025 There are two ways to specify alternatives, and either or both may be used in an
5026 inline asm constraint list:
1) Append the codes to each other, making a constraint code set. E.g. "``im``"
   or "``{eax}m``". This means "choose any of the options in the set". The
   choice of constraint is made independently for each constraint in the
   constraint list.
5033 2) Use "``|``" between constraint code sets, creating alternatives. Every
5034 constraint in the constraint list must have the same number of alternative
5035 sets. With this syntax, the same alternative in *all* of the items in the
5036 constraint list will be chosen together.
5038 Putting those together, you might have a two operand constraint string like
5039 ``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
5040 operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type ``m``.
5043 However, the use of either of the alternatives features is *NOT* recommended, as
5044 LLVM is not able to make an intelligent choice about which one to use. (At the
5045 point it currently needs to choose, not enough information is available to do so
5046 in a smart way.) Thus, it simply tries to make a choice that's most likely to
5047 compile, not one that will be optimal performance. (e.g., given "``rm``", it'll
5048 always choose to use memory, not registers). And, if given multiple registers,
5049 or multiple register classes, it will simply choose the first one. (In fact, it
5050 doesn't currently even ensure explicitly specified physical registers are
unique, so specifying multiple physical registers as alternatives, like
``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
intended.)
5055 Supported Constraint Code List
5056 """"""""""""""""""""""""""""""
5058 The constraint codes are, in general, expected to behave the same way they do in
5059 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
5060 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
5061 and GCC likely indicates a bug in LLVM.
5063 Some constraint codes are typically supported by all targets:
5065 - ``r``: A register in the target's general purpose register class.
- ``m``: A memory address operand. The supported addressing modes are
  target-specific; typical examples are register, register + register offset,
  or register + immediate offset (of some target-specific size).
5069 - ``p``: An address operand. Similar to ``m``, but used by "load address"
5070 type instructions without touching memory.
5071 - ``i``: An integer constant (of target-specific width). Allows either a simple
5072 immediate, or a relocatable value.
5073 - ``n``: An integer constant -- *not* including relocatable values.
5074 - ``s``: An integer constant, but allowing *only* relocatable values.
5075 - ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
5076 useful to pass a label for an asm branch or call.
5078 .. FIXME: but that surely isn't actually okay to jump out of an asm
5079 block without telling llvm about the control transfer???)
5081 - ``{register-name}``: Requires exactly the named physical register.
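
A minimal sketch of two of these codes, assuming an X86 target with AT&T syntax
(the function name ``@load_const`` is hypothetical): the output is pinned to a
physical register with ``{ax}``, and the input must be an integer constant
because of ``i``.

.. code-block:: llvm

    define i32 @load_const() {
      ; "={ax}": output in the named physical register.
      ; "i": input must be an integer constant.
      %r = call i32 asm "movl $1, $0", "={ax},i"(i32 42)
      ret i32 %r
    }
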
5083 Other constraints are target-specific:
AArch64:

- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
5088 - ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
5089 i.e. 0 to 4095 with optional shift by 12.
5090 - ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
5091 ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
5092 - ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
5093 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
5094 - ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
5095 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
5100 - ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
5101 64-bit register. This is a superset of ``L``.
- ``Q``: Memory address operand must be in a single register (no
  offsets). (However, LLVM currently does this for the ``m`` constraint as
  well.)
5105 - ``r``: A 32 or 64-bit integer register (W* or X*).
5106 - ``Uci``: Like r, but restricted to registers 8 to 11 inclusive.
5107 - ``Ucj``: Like r, but restricted to registers 12 to 15 inclusive.
5108 - ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
5109 - ``x``: Like w, but restricted to registers 0 to 15 inclusive.
5110 - ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
5111 - ``Uph``: One of the upper eight SVE predicate registers (P8 to P15)
5112 - ``Upl``: One of the lower eight SVE predicate registers (P0 to P7)
5113 - ``Upa``: Any of the SVE predicate registers (P0 to P15)
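
As a small sketch of a target-specific immediate constraint, assuming an
AArch64 target (the function name ``@add_small`` is hypothetical), the ``I``
code below only accepts a constant that is encodable in an ``ADD`` instruction:

.. code-block:: llvm

    define i64 @add_small(i64 %x) {
      ; "I" requires an immediate valid for ADD/SUB; 4095 is within range.
      %r = call i64 asm "add $0, $1, $2", "=r,r,I"(i64 %x, i64 4095)
      ret i64 %r
    }
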
AMDGPU:

- ``r``: A 32 or 64-bit integer register.
5118 - ``[0-9]v``: The 32-bit VGPR register, number 0-9.
5119 - ``[0-9]s``: The 32-bit SGPR register, number 0-9.
5120 - ``[0-9]a``: The 32-bit AGPR register, number 0-9.
5121 - ``I``: An integer inline constant in the range from -16 to 64.
5122 - ``J``: A 16-bit signed integer constant.
5123 - ``A``: An integer or a floating-point inline constant.
5124 - ``B``: A 32-bit signed integer constant.
5125 - ``C``: A 32-bit unsigned integer constant or an integer inline constant in the range from -16 to 64.
5126 - ``DA``: A 64-bit constant that can be split into two "A" constants.
5127 - ``DB``: A 64-bit constant that can be split into two "B" constants.
All ARM modes:

- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
  operand. Treated the same as operand ``m``, at the moment.
5133 - ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
5134 - ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
5136 ARM and ARM's Thumb2 mode:
5138 - ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
5139 - ``I``: An immediate integer valid for a data-processing instruction.
5140 - ``J``: An immediate integer between -4095 and 4095.
5141 - ``K``: An immediate integer whose bitwise inverse is valid for a
5142 data-processing instruction. (Can be used with template modifier "``B``" to
5143 print the inverted value).
- ``L``: An immediate integer whose negation is valid for a data-processing
  instruction. (Can be used with template modifier "``n``" to print the negated
  value).
5147 - ``M``: A power of two or an integer between 0 and 32.
5148 - ``N``: Invalid immediate constraint.
5149 - ``O``: Invalid immediate constraint.
5150 - ``r``: A general-purpose 32-bit integer register (``r0-r15``).
- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
  as ``r``.
- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
  invalid.
5155 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5156 ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
5157 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5158 ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
5159 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5160 ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
ARM's Thumb1 mode:

- ``I``: An immediate integer between 0 and 255.
5165 - ``J``: An immediate integer between -255 and -1.
- ``K``: An immediate integer between 0 and 255, with optional left-shift by
  some amount.
5168 - ``L``: An immediate integer between -7 and 7.
5169 - ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
5170 - ``N``: An immediate integer between 0 and 31.
5171 - ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
5172 - ``r``: A low 32-bit GPR register (``r0-r7``).
5173 - ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
5175 - ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5176 ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
5177 - ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5178 ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
5179 - ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
5180 ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
Hexagon:

- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
  at the moment.
5186 - ``r``: A 32 or 64-bit register.
LoongArch:

- ``f``: A floating-point register (if available).
5191 - ``k``: A memory operand whose address is formed by a base register and
5192 (optionally scaled) index register.
5193 - ``l``: A signed 16-bit constant.
5194 - ``m``: A memory operand whose address is formed by a base register and
5195 offset that is suitable for use in instructions with the same addressing
5196 mode as st.w and ld.w.
5197 - ``I``: A signed 12-bit constant (for arithmetic instructions).
5198 - ``J``: An immediate integer zero.
5199 - ``K``: An unsigned 12-bit constant (for logic instructions).
- ``ZB``: An address that is held in a general-purpose register. The offset
  is zero.
5202 - ``ZC``: A memory operand whose address is formed by a base register and
5203 offset that is suitable for use in instructions with the same addressing
5204 mode as ll.w and sc.w.
MSP430:

- ``r``: An 8 or 16-bit register.
MIPS:

- ``I``: An immediate signed 16-bit integer.
5213 - ``J``: An immediate integer zero.
5214 - ``K``: An immediate unsigned 16-bit integer.
5215 - ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
5216 - ``N``: An immediate integer between -65535 and -1.
5217 - ``O``: An immediate signed 15-bit integer.
5218 - ``P``: An immediate integer between 1 and 65535.
5219 - ``m``: A memory address operand. In MIPS-SE mode, allows a base address
5220 register plus 16-bit immediate offset. In MIPS mode, just a base register.
- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
  ``m``.
5224 - ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
5225 ``sc`` instruction on the given subtarget (details vary).
5226 - ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
5227 - ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
5228 (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
5229 argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
5232 - ``l``: The ``lo`` register, 32 or 64-bit.
NVPTX:

- ``b``: A 1-bit integer register.
5238 - ``c`` or ``h``: A 16-bit integer register.
5239 - ``r``: A 32-bit integer register.
5240 - ``l`` or ``N``: A 64-bit integer register.
5241 - ``f``: A 32-bit float register.
5242 - ``d``: A 64-bit float register.
PowerPC:

- ``I``: An immediate signed 16-bit integer.
5248 - ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
5249 - ``K``: An immediate unsigned 16-bit integer.
5250 - ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
5251 - ``M``: An immediate integer greater than 31.
5252 - ``N``: An immediate integer that is an exact power of 2.
5253 - ``O``: The immediate integer constant 0.
- ``P``: An immediate integer constant whose negation is a signed 16-bit
  constant.
5256 - ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
5257 treated the same as ``m``.
5258 - ``r``: A 32 or 64-bit integer register.
- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
5262 - ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
5263 register (``V0-V31``).
5265 - ``y``: Condition register (``CR0-CR7``).
5266 - ``wc``: An individual CR bit in a CR register.
5267 - ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
5268 register set (overlapping both the floating-point and vector register files).
- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
  set.
RISC-V:

- ``A``: An address operand (using a general-purpose register, without an
  offset).
5276 - ``I``: A 12-bit signed integer immediate operand.
5277 - ``J``: A zero integer immediate operand.
5278 - ``K``: A 5-bit unsigned integer immediate operand.
5279 - ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
- ``r``: A 32- or 64-bit general-purpose register (depending on the platform
  ``XLEN``).
5282 - ``vr``: A vector register. (requires V extension).
5283 - ``vm``: A vector register for masking operand. (requires V extension).
Sparc:

- ``I``: An immediate 13-bit signed integer.
5288 - ``r``: A 32-bit integer register.
5289 - ``f``: Any floating-point register on SparcV8, or a floating-point
5290 register in the "low" half of the registers on SparcV9.
5291 - ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
SystemZ:

- ``I``: An immediate unsigned 8-bit integer.
5296 - ``J``: An immediate unsigned 12-bit integer.
5297 - ``K``: An immediate signed 16-bit integer.
5298 - ``L``: An immediate signed 20-bit integer.
5299 - ``M``: An immediate integer 0x7fffffff.
5300 - ``Q``: A memory address operand with a base address and a 12-bit immediate
5301 unsigned displacement.
5302 - ``R``: A memory address operand with a base address, a 12-bit immediate
5303 unsigned displacement, and an index register.
5304 - ``S``: A memory address operand with a base address and a 20-bit immediate
5305 signed displacement.
5306 - ``T``: A memory address operand with a base address, a 20-bit immediate
5307 signed displacement, and an index register.
5308 - ``r`` or ``d``: A 32, 64, or 128-bit integer register.
5309 - ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
5310 address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
5313 - ``f``: A 32, 64, or 128-bit floating-point register.
X86:

- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
5319 - ``K``: An immediate signed 8-bit integer.
- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
  0xffffffff.
5322 - ``M``: An immediate integer between 0 and 3.
5323 - ``N``: An immediate unsigned 8-bit integer.
5324 - ``O``: An immediate integer between 0 and 127.
5325 - ``e``: An immediate 32-bit signed integer.
5326 - ``Z``: An immediate 32-bit unsigned integer.
5327 - ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
5328 ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
5329 registers, and on X86-64, it is all of the integer registers.
5330 - ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
5331 ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
5332 - ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
5333 - ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
5334 existed since i386, and can be accessed without the REX prefix.
5335 - ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
5336 - ``y``: A 64-bit MMX register, if MMX is enabled.
5337 - ``v``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
5338 operand in a SSE register. If AVX is also enabled, can also be a 256-bit
5339 vector operand in an AVX register. If AVX-512 is also enabled, can also be a
5340 512-bit vector operand in an AVX512 register. Otherwise, an error.
- ``Ws``: A symbolic reference with an optional constant addend or a label
  reference.
5343 - ``x``: The same as ``v``, except that when AVX-512 is enabled, the ``x`` code
5344 only allocates into the first 16 AVX-512 registers, while the ``v`` code
5345 allocates into any of the 32 AVX-512 registers.
5346 - ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
  32-bit mode, a 64-bit integer operand will get split into two registers). It
  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
  operand will get allocated only to RAX. If two 32-bit operands are needed,
  you're better off splitting the value yourself before passing it to the asm
  statement.
XCore:

- ``r``: A 32-bit integer register.
5359 .. _inline-asm-modifiers:
5361 Asm template argument modifiers
5362 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In the asm template string, modifiers can be used on the operand reference, like
"``${0:n}``".

5367 The modifiers are, in general, expected to behave the same way they do in
5368 GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
5369 inline asm code which was supported by GCC. A mismatch in behavior between LLVM
5370 and GCC likely indicates a bug in LLVM.
Target-independent:

- ``c``: Print an immediate integer constant unadorned, without
  the target-specific immediate punctuation (e.g. no ``$`` prefix).
5376 - ``n``: Negate and print immediate integer constant unadorned, without the
5377 target-specific immediate punctuation (e.g. no ``$`` prefix).
5378 - ``l``: Print as an unadorned label, without the target-specific label
5379 punctuation (e.g. no ``$`` prefix).
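
A minimal sketch of the ``c`` modifier, assuming an X86 target with AT&T syntax,
where an immediate operand normally prints with a ``$`` prefix. The function
name ``@show_constant`` is hypothetical, and the template only emits an
assembler comment so that no instruction bytes are affected:

.. code-block:: llvm

    define void @show_constant() {
      ; "$0" would print "$42"; "${0:c}" prints the unadorned "42".
      call void asm sideeffect "# constant is ${0:c}", "i"(i32 42)
      ret void
    }
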
AArch64:

- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
  instead of ``x30``, print ``w30``.
5385 - ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
  ``v*``.
ARM:

- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
  register).
5400 - ``y``: Print a VFP single-precision register as an indexed double (e.g. print
5401 as ``d4[1]`` instead of ``s9``)
- ``B``: Bitwise invert and print an immediate integer constant without ``#``
  prefix.
5404 - ``L``: Print the low 16-bits of an immediate integer constant.
5405 - ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
5406 register operands subsequent to the specified one (!), so use carefully.
5407 - ``Q``: Print the low-order register of a register-pair, or the low-order
5408 register of a two-register operand.
5409 - ``R``: Print the high-order register of a register-pair, or the high-order
5410 register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
  equivalent to ``R``.)
5415 .. FIXME: H doesn't currently support printing the second register
5416 of a two-register operand.
5418 - ``e``: Print the low doubleword register of a NEON quad register.
5419 - ``f``: Print the high doubleword register of a NEON quad register.
- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
  adornment.
Hexagon:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.
5428 .. FIXME: why is it restricted to consecutive ones? And there's
5429 nothing that ensures that happens, is there?
5431 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5432 nothing. Used to print 'addi' vs 'add' instructions.
LoongArch:

- ``z``: Print $zero register if operand is zero, otherwise print it normally.
MSP430:

No additional modifiers.
MIPS:

- ``X``: Print an immediate integer as hexadecimal.
5445 - ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
5446 - ``d``: Print an immediate integer as decimal.
5447 - ``m``: Subtract one and print an immediate integer as decimal.
5448 - ``z``: Print $0 if an immediate zero, otherwise print normally.
5449 - ``L``: Print the low-order register of a two-register operand, or prints the
5450 address of the low-order word of a double-word memory operand.
5452 .. FIXME: L seems to be missing memory operand support.
5454 - ``M``: Print the high-order register of a two-register operand, or prints the
5455 address of the high-order word of a double-word memory operand.
5457 .. FIXME: M seems to be missing memory operand support.
- ``D``: Print the second register of a two-register operand, or prints the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
- ``w``: No effect. Provided for compatibility with GCC which requires this
  modifier in order to print MSA registers (``W0-W31``) with the ``f``
  constraint.
PowerPC:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.
5476 .. FIXME: why is it restricted to consecutive ones? And there's
5477 nothing that ensures that happens, is there?
5479 - ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5480 nothing. Used to print 'addi' vs 'add' instructions.
5481 - ``y``: For a memory operand, prints formatter for a two-register X-form
5482 instruction. (Currently always prints ``r0,OPERAND``).
5483 - ``U``: Prints 'u' if the memory operand is an update form, and nothing
5484 otherwise. (NOTE: LLVM does not support update form, so this will currently
5485 always print nothing)
5486 - ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
5487 not support indexed form, so this will currently always print nothing)
RISC-V:

- ``i``: Print the letter 'i' if the operand is not a register, otherwise print
  nothing. Used to print 'addi' vs 'add' instructions, etc.
- ``z``: Print the register ``zero`` if an immediate zero, otherwise print
  normally.
SystemZ:

SystemZ implements only ``n``, and does *not* support any of the other
target-independent modifiers.
X86:

- ``c``: Print an unadorned integer or symbol name. (The latter is
  target-specific behavior for this typically target-independent modifier).
5509 - ``A``: Print a register name with a '``*``' before it.
- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
  operand.
- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
  memory operand.
- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
  operand.
- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
  operand.
5518 - ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5519 available, otherwise the 32-bit register name; do nothing on a memory operand.
5520 - ``n``: Negate and print an unadorned integer, or, for operands other than an
5521 immediate integer (e.g. a relocatable symbol expression), print a '-' before
5522 the operand. (The behavior for relocatable symbol expressions is a
5523 target-specific behavior for this typically target-independent modifier)
5524 - ``H``: Print a memory reference with additional offset +8.
5525 - ``p``: Print a raw symbol name (without syntax-specific prefixes).
- ``P``: Print a memory reference used as the argument of a call instruction or
  used with explicit base reg and index reg as its offset. In these cases no
  additional registers can be used to form the memory reference (e.g.
  ``(rip)`` is omitted, even though the reference is PC-relative).
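
A short sketch of a register-size modifier, assuming an X86 target with AT&T
syntax (the function name ``@zext_low_byte`` is hypothetical). The ``q``
constraint selects a register whose low byte is addressable, and the ``b``
modifier prints that register's 8-bit name:

.. code-block:: llvm

    define i32 @zext_low_byte(i32 %x) {
      ; "${1:b}" prints the 8-bit name of operand 1 (e.g. "al"),
      ; while "$0" prints the 32-bit name of operand 0.
      %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
      ret i32 %r
    }
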
XCore:

No additional modifiers.

Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = !{ i32 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.

.. _metadata:

Metadata
========

LLVM IR allows metadata to be attached to instructions and global objects in the
5564 program that can convey extra information about the code to the optimizers and
5565 code generator. One example application of metadata is source-level
5566 debug information. There are two metadata primitives: strings and nodes.
5568 Metadata does not have a type, and is not a value. If referenced from a
5569 ``call`` instruction, it uses the ``metadata`` type.
5571 All metadata are identified in syntax by an exclamation point ('``!``').
5573 .. _metadata-string:
5575 Metadata Nodes and Metadata Strings
5576 -----------------------------------
A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two digit hex code. For example:
"``!"test\00"``".

5583 Metadata nodes are represented with notation similar to structure
5584 constants (a comma separated list of elements, surrounded by braces and
5585 preceded by an exclamation point). Metadata nodes can have any values as
5586 their operand. For example:
5588 .. code-block:: llvm
5590 !{ !"test\00", i32 10}
5592 Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
5594 .. code-block:: text
5596 !0 = distinct !{!"test\00", i32 10}
5598 ``distinct`` nodes are useful when nodes shouldn't be merged based on their
5599 content. They can also occur when transformations cause uniquing collisions
5600 when metadata operands change.
A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

5610 Metadata can be used as function arguments. Here the ``llvm.dbg.value``
5611 intrinsic is using three metadata arguments:
5613 .. code-block:: llvm
5615 call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
5617 Metadata can be attached to an instruction. Here metadata ``!21`` is attached
5618 to the ``add`` instruction using the ``!dbg`` identifier:
5620 .. code-block:: llvm
5622 %indvar.next = add i64 %indvar, 1, !dbg !21
Instructions may not have multiple metadata attachments with the same
identifier.

5627 Metadata can also be attached to a function or a global variable. Here metadata
5628 ``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
5629 and ``g2`` using the ``!dbg`` identifier:
.. code-block:: llvm

    declare !dbg !22 void @f1()
    define void @f2() !dbg !22 {
      ret void
    }

    @g1 = global i32 0, !dbg !22
    @g2 = external global i32, !dbg !22

5641 Unlike instructions, global objects (functions and global variables) may have
5642 multiple metadata attachments with the same identifier.
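
For instance, ``!type`` metadata is one kind of attachment that can appear more
than once on the same global. A minimal sketch (the global name ``@table`` and
the type identifier strings are hypothetical):

.. code-block:: llvm

    ; Two attachments with the same identifier on one global variable.
    @table = constant [2 x i32] zeroinitializer, !type !0, !type !1

    !0 = !{i64 0, !"typeid.first"}
    !1 = !{i64 4, !"typeid.second"}
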
A transformation is required to drop any metadata attachment that it
does not recognize or that it knows it cannot preserve. Currently there is an
5646 exception for metadata attachment to globals for ``!func_sanitize``,
5647 ``!type``, ``!absolute_symbol`` and ``!associated`` which can't be
5648 unconditionally dropped unless the global is itself deleted.
5650 Metadata attached to a module using named metadata may not be dropped, with
5651 the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
5653 More information about specific metadata nodes recognized by the
5654 optimizers and code generator is found below.
5656 .. _specialized-metadata:
5658 Specialized Metadata Nodes
5659 ^^^^^^^^^^^^^^^^^^^^^^^^^^
Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.

5665 These aren't inherently debug info centric, but currently all the specialized
5666 metadata nodes are related to debug info.
.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
5674 ``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
5675 containing the debug info to be emitted along with the compile unit, regardless
5676 of code optimizations (some nodes are only emitted if there are references to
5677 them from instructions). The ``debugInfoForProfiling:`` field is a boolean
5678 indicating whether or not line-table discriminators are updated to provide
5679 more-accurate debug info for profiling results.
5681 .. code-block:: text
5683 !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
5684 isOptimized: true, flags: "-O2", runtimeVersion: 2,
5685 splitDebugFilename: "abc.debug", emissionKind: FullDebug,
5686 enums: !2, retainedTypes: !3, globals: !4, imports: !5,
5687 macros: !6, dwoId: 0x0abcd)
Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope. These
descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
track of global variables, type information, and imported entities (declarations
and namespaces).

.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.
5702 .. code-block:: none
5704 !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
5705 checksumkind: CSK_MD5,
5706 checksum: "000102030405060708090a0b0c0d0e0f")
5708 Files are sometimes used in ``scope:`` fields, and are the only valid target
5709 for ``file:`` fields.
The ``checksum:`` and ``checksumkind:`` fields are optional. If one of these
fields is present, then the other is required to be present as well. Valid
values for the ``checksumkind:`` field are: {CSK_MD5, CSK_SHA1, CSK_SHA256}.

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
5721 ``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
5723 .. code-block:: text
5725 !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5726 encoding: DW_ATE_unsigned_char)
5727 !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

5742 .. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
5748 refers to a tuple; the first operand is the return type, while the rest are the
5749 types of the formal arguments in order. If the first operand is ``null``, that
5750 represents a function with no return value (such as ``void foo() {}`` in C++).
5752 .. code-block:: text
    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed_char)
5756 !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
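
For comparison, a sketch of a subroutine type that does return a value. The
node numbering is illustrative only; here the first operand of the ``types:``
tuple is the return type rather than ``null``:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{!0, !1}) ; int (char)
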
.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

5766 .. code-block:: text
5768 !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5769 encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

5773 The following ``tag:`` values are valid:
.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55
    DW_TAG_atomic_type        = 71
    DW_TAG_immutable_type     = 75

5790 .. _DIDerivedTypeMember:
5792 ``DW_TAG_member`` is used to define a member of a :ref:`composite type
5793 <DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.
``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

5802 ``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
5804 ``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
5805 ``DW_TAG_volatile_type``, ``DW_TAG_restrict_type``, ``DW_TAG_atomic_type`` and
5806 ``DW_TAG_immutable_type`` are used to qualify the ``baseType:``.
5808 Note that the ``void *`` type is expressed as a type derived from NULL.
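
A minimal sketch of such a node, assuming 64-bit pointers (the node number is
illustrative only):

.. code-block:: text

    ; void *: a pointer type whose baseType is null.
    !0 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
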
5810 .. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
5816 structures and unions. ``elements:`` points to a tuple of the composed types.
5818 If the source language supports ODR, the ``identifier:`` field gives the unique
5819 identifier used for type merging between modules. When specified,
5820 :ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
5821 derived types <DIDerivedTypeMember>` that reference the ODR-type in their
5822 ``scope:`` change uniquing rules.
5824 For a given ``identifier:``, there should only be a single composite type that
5825 does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
5826 together will unique such definitions at parse time via the ``identifier:``
5827 field, even if the nodes are ``distinct``.
5829 .. code-block:: text
    !0 = !DIEnumerator(name: "SixKind", value: 6)
5832 !1 = !DIEnumerator(name: "SevenKind", value: 7)
5833 !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5834 !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
5835 line: 2, size: 32, align: 32, identifier: "_M4Enum",
5836 elements: !{!0, !1, !2})
5838 The following ``tag:`` values are valid:
5840 .. code-block:: text
5842 DW_TAG_array_type = 1
5843 DW_TAG_class_type = 2
5844 DW_TAG_enumeration_type = 4
5845 DW_TAG_structure_type = 19
5846 DW_TAG_union_type = 23
For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector. The optional ``dataLocation`` is a
DIExpression that describes how to get from an object's address to the actual
raw data, if they aren't equivalent. This is only supported for array types,
particularly to describe Fortran arrays, which have an array descriptor in
addition to the array data. Alternatively, it can be a DIVariable which
holds the address of the actual raw data. The Fortran language supports pointer
arrays which can be attached to actual arrays; this attachment between pointer
and pointee is called association. The optional ``associated`` is a
DIExpression that describes whether the pointer array is currently associated.
The optional ``allocated`` is a DIExpression that describes whether the
allocatable array is currently allocated. The optional ``rank`` is a
DIExpression that describes the rank (number of dimensions) of a Fortran
assumed-rank array (whose rank is known only at runtime).

5865 For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
5866 descriptors <DIEnumerator>`, each representing the definition of an enumeration
5867 value for the set. All enumeration type descriptors are collected in the
5868 ``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
5870 For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
5871 ``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
5872 <DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
5873 ``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
5874 ``isDefinition: false``.
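
As a minimal sketch of the structure case, the following hypothetical metadata
describes a C struct with two ``int`` members. The names (``Point``, ``x``,
``y``), the file, and the node numbers are assumptions made for illustration,
not output from a real compilation:

.. code-block:: text

    ; struct Point { int x; int y; };
    !0 = !DIFile(filename: "point.c", directory: "/path/to/file")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !0, line: 1, size: 64,
                                   elements: !{!3, !4})
    !3 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 2, baseType: !1, size: 32)
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 3, baseType: !1, size: 32, offset: 32)

Each member is a ``DW_TAG_member`` derived type whose ``scope:`` points back at
the composite type. The ``size:`` and ``offset:`` values are given in bits.
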
5881 ``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
5882 :ref:`DICompositeType`.
5884 - ``count: -1`` indicates an empty array.
5885 - ``count: !10`` describes the count with a :ref:`DILocalVariable`.
5886 - ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
5888 .. code-block:: text
5890 !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
5891 !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
5892 !2 = !DISubrange(count: -1) ; empty array.
5894 ; Scopes used in rest of example
5895 !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
5896 !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
5897 !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
5899 ; Use of local variable as count value
5900 !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5901 !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
5902 !11 = !DISubrange(count: !10, lowerBound: 0)
5904 ; Use of global variable as count value
5905 !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
5906 !13 = !DISubrange(count: !12, lowerBound: 0)
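
A sketch of how subranges might be referenced from an array type, assuming a
32-bit ``int`` element type. The node numbers and the source declarations in the
comments are illustrative:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

    ; int a[5];
    !1 = !DISubrange(count: 5, lowerBound: 0)
    !2 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 160,
                          elements: !{!1})

    ; int m[2][3];  one subrange per level of indexing, outermost first
    !3 = !DISubrange(count: 2, lowerBound: 0)
    !4 = !DISubrange(count: 3, lowerBound: 0)
    !5 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 192,
                          elements: !{!3, !4})

The ``size:`` of the array type is in bits, so five 32-bit elements give
``size: 160`` and six give ``size: 192``.
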
5913 ``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
5914 variants of :ref:`DICompositeType`.
5916 .. code-block:: text
    !0 = !DIEnumerator(name: "SixKind", value: 6)
5919 !1 = !DIEnumerator(name: "SevenKind", value: 7)
5920 !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5922 DITemplateTypeParameter
5923 """""""""""""""""""""""
5925 ``DITemplateTypeParameter`` nodes represent type parameters to generic source
5926 language constructs. They are used (optionally) in :ref:`DICompositeType` and
5927 :ref:`DISubprogram` ``templateParams:`` fields.
5929 .. code-block:: text
5931 !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
5933 DITemplateValueParameter
5934 """"""""""""""""""""""""
5936 ``DITemplateValueParameter`` nodes represent value parameters to generic source
5937 language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
5938 but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
5939 ``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
5940 :ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
5942 .. code-block:: text
5944 !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
5949 ``DINamespace`` nodes represent namespaces in the source language.
5951 .. code-block:: text
5953 !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)
5955 .. _DIGlobalVariable:
5960 ``DIGlobalVariable`` nodes represent global variables in the source language.
5962 .. code-block:: text
5964 @foo = global i32, !dbg !0
5965 !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
5966 !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
5967 file: !3, line: 7, type: !4, isLocal: true,
5968 isDefinition: false, declaration: !5)
5971 DIGlobalVariableExpression
5972 """"""""""""""""""""""""""
5974 ``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
5975 with a :ref:`DIExpression`.
.. code-block:: text

    @lower = global i32, !dbg !0
    @upper = global i32, !dbg !1
    !0 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
             )
    !1 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
             )
    !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
                           file: !4, line: 8, type: !5, declaration: !6)

All global variable expressions should be referenced by the ``globals:`` field of
5993 a :ref:`compile unit <DICompileUnit>`.
6000 ``DISubprogram`` nodes represent functions from the source language. A distinct
6001 ``DISubprogram`` may be attached to a function definition using ``!dbg``
6002 metadata. A unique ``DISubprogram`` may be attached to a function declaration
6003 used for call site debug info. The ``retainedNodes:`` field is a list of
6004 :ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
6005 retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.
6008 .. _DISubprogramDeclaration:
6010 When ``spFlags: DISPFlagDefinition`` is not present, subprograms describe a
6011 declaration in the type tree as opposed to a definition of a function. In this
6012 case, the ``declaration`` field must be empty. If the scope is a composite type
with an ODR ``identifier:`` and that does not set ``flags: DIFlagFwdDecl``, then
6014 the subprogram declaration is uniqued based only on its ``linkageName:`` and
6017 .. code-block:: text
6019 define void @_Z3foov() !dbg !0 {
    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
6024 file: !2, line: 7, type: !3,
6025 spFlags: DISPFlagDefinition | DISPFlagLocalToUnit,
6026 scopeLine: 8, containingType: !4,
6027 virtuality: DW_VIRTUALITY_pure_virtual,
6028 virtualIndex: 10, flags: DIFlagPrototyped,
6029 isOptimized: true, unit: !5, templateParams: !6,
6030 declaration: !7, retainedNodes: !8,
6038 ``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
6039 <DISubprogram>`. The line number and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
6043 .. code-block:: text
6045 !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
6047 Usually lexical blocks are ``distinct`` to prevent node merging based on
6050 .. _DILexicalBlockFile:
6055 ``DILexicalBlockFile`` nodes are used to discriminate between sections of a
6056 :ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
6057 indicate textual inclusion, or the ``discriminator:`` field can be used to
6058 discriminate between control flow within a single block in the source language.
6060 .. code-block:: text
6062 !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
6063 !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
6064 !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
6071 ``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6075 .. code-block:: text
6077 !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
6079 .. _DILocalVariable:
6084 ``DILocalVariable`` nodes represent local variables in the source language. If
6085 the ``arg:`` field is set to non-zero, then this variable is a subprogram
6086 parameter, and it will be included in the ``retainedNodes:`` field of its
6087 :ref:`DISubprogram`.
6089 .. code-block:: text
6091 !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
6092 type: !3, flags: DIFlagArtificial)
6093 !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
6095 !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
6102 ``DIExpression`` nodes represent expressions that are inspired by the DWARF
6103 expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
6104 (such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
referenced LLVM variable relates to the source language variable. Debug
expressions are interpreted left-to-right: start by pushing the value/address
operand of the intrinsic onto a stack, then repeatedly push and evaluate
opcodes from the DIExpression until the final variable description is produced.

The currently supported opcode vocabulary is limited:
6112 - ``DW_OP_deref`` dereferences the top of the expression stack.
6113 - ``DW_OP_plus`` pops the last two entries from the expression stack, adds
6114 them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
  the last entry from the second last entry and appends the result to the
  expression stack.
6118 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
6119 - ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
6120 here, respectively) of the variable fragment from the working expression. Note
6121 that contrary to DW_OP_bit_piece, the offset is describing the location
6122 within the described source variable.
6123 - ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
6124 (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
6125 expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
6126 that references a base type constructed from the supplied values.
6127 - ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
6128 optionally applied to the pointer. The memory tag is derived from the
6129 given tag offset in an implementation-defined manner.
6130 - ``DW_OP_swap`` swaps top two stack entries.
6131 - ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
6132 of the stack is treated as an address. The second stack entry is treated as an
6133 address space identifier.
6134 - ``DW_OP_stack_value`` marks a constant value.
6135 - ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
  function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
  DIExpression(DW_OP_LLVM_entry_value, 1, ...))`` is lowered to
6138 ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
6139 function entry onto the DWARF expression stack.
6141 The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
6142 block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
6143 DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
6144 the entry value of ``reg`` is pushed onto the stack, and is added with 123.
6145 Due to framework limitations ``N`` must be 1, in other words,
6146 ``DW_OP_entry_value`` always refers to the value/address operand of the
6149 Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
6150 usually used in MIR, but it is also allowed in LLVM IR when targeting a
6151 :ref:`swiftasync <swiftasync>` argument. The operation is introduced by:
6153 - ``LiveDebugValues`` pass, which applies it to function parameters that
6154 are unmodified throughout the function. Support is limited to simple
6155 register location descriptions, or as indirect locations (e.g.,
6156 parameters passed-by-value to a callee via a pointer to a temporary copy
6157 made in the caller).
6158 - ``AsmPrinter`` pass when a call site parameter value
6159 (``DW_AT_call_site_parameter_value``) is represented as entry value of
6161 - ``CoroSplit`` pass, which may move variables from allocas into a
6162 coroutine frame. If the coroutine frame is a
6163 :ref:`swiftasync <swiftasync>` argument, the variable is described with
      a ``DW_OP_LLVM_entry_value`` operation.
6166 - ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
6167 value, such as one that calculates the sum of two registers. This is always
6168 used in combination with an ordered list of values, such that
6169 ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
6170 example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
6171 DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
6173 intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the content at the provided
  signed offset from the specified register. The opcode is only generated by the
  ``AsmPrinter`` pass to describe a call site parameter value that requires an
  expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object, which can then
  serve as a descriptor in subsequent calculations. This opcode can be used to
  calculate the bounds of a Fortran allocatable array that has an array
  descriptor.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array, whose rank is known at run time and whose current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can be
  used to represent pointer variables that are optimized out but whose pointee
  value is known. This operator is required because it differs from the DWARF
  operator ``DW_OP_implicit_pointer`` in representation and specification
  (number and types of operands), and the latter cannot be used for multiple
  levels of indirection.
6191 .. code-block:: text
6195 call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
6196 !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
6198 !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
6199 !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)
6204 call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
6205 !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
6207 !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
6208 !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
6209 !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
6210 !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
6211 DW_OP_LLVM_implicit_pointer))
DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions. Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
6216 change over the course of the program. Register and memory location
6217 descriptions describe the *concrete location* of a source variable (in the
6218 sense that a debugger might modify its value), whereas *implicit locations*
6219 describe merely the actual *value* of a source variable which might not exist
6220 in registers or in memory (see ``DW_OP_stack_value``).
An ``llvm.dbg.declare`` intrinsic describes an indirect value (the address) of a
6223 source variable. The first operand of the intrinsic must be an address of some
6224 kind. A DIExpression attached to the intrinsic refines this address to produce a
6225 concrete location for the source variable.
An ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
6228 The first operand of the intrinsic may be a direct or indirect value. A
6229 DIExpression attached to the intrinsic refines the first operand to produce a
6230 direct value. For example, if the first operand is an indirect value, it may be
6231 necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
6232 valid debug intrinsic.
6236 A DIExpression is interpreted in the same way regardless of which kind of
6237 debug intrinsic it's attached to.
.. code-block:: text

    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !3 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
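
As an illustrative walk-through of left-to-right evaluation, the following
sketch assumes a hypothetical function ``@get`` whose source variable ``x``
lives 8 bytes past the pointer ``%p``. The function, file name, line numbers,
and metadata numbers are all assumptions made for the example:

.. code-block:: llvm

    define i32 @get(ptr %p) !dbg !6 {
    entry:
      ; Evaluation of the expression attached to the intrinsic below:
      ;   initial stack (first operand of the intrinsic):  [%p]
      ;   after DW_OP_plus_uconst, 8:                      [%p + 8]
      ;   after DW_OP_deref:                               [value stored at %p + 8]
      call void @llvm.dbg.value(metadata ptr %p, metadata !7,
                                metadata !DIExpression(DW_OP_plus_uconst, 8, DW_OP_deref)), !dbg !8
      %addr = getelementptr inbounds i8, ptr %p, i64 8
      %v = load i32, ptr %addr, align 4, !dbg !8
      ret i32 %v, !dbg !8
    }

    declare void @llvm.dbg.value(metadata, metadata, metadata)

    !llvm.dbg.cu = !{!0}
    !llvm.module.flags = !{!2}

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, emissionKind: FullDebug)
    !1 = !DIFile(filename: "get.c", directory: "/path/to/file")
    !2 = !{i32 2, !"Debug Info Version", i32 3}
    !3 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !4 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
    !5 = !DISubroutineType(types: !{!3, !4})
    !6 = distinct !DISubprogram(name: "get", scope: !1, file: !1, line: 1, type: !5,
                                scopeLine: 1, spFlags: DISPFlagDefinition, unit: !0)
    !7 = !DILocalVariable(name: "x", scope: !6, file: !1, line: 2, type: !3)
    !8 = !DILocation(line: 2, column: 7, scope: !6)

Because the first operand here is an address rather than the value itself, the
trailing ``DW_OP_deref`` is what turns it into the direct value that
``llvm.dbg.value`` is expected to describe.
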
6252 ``DIAssignID`` nodes have no operands and are always distinct. They are used to
link together ``@llvm.dbg.assign`` intrinsics (:ref:`debug
6254 intrinsics<dbg_intrinsics>`) and instructions that store in IR. See `Debug Info
6255 Assignment Tracking <AssignmentTracking.html>`_ for more info.
6257 .. code-block:: llvm
6259 store i32 %a, ptr %a.addr, align 4, !DIAssignID !2
6260 llvm.dbg.assign(metadata %a, metadata !1, metadata !DIExpression(), !2, metadata %a.addr, metadata !DIExpression()), !dbg !3
6262 !2 = distinct !DIAssignID()
6267 ``DIArgList`` nodes hold a list of constant or SSA value references. These are
6268 used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in
6269 ``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the
6270 ``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
6271 within a function, it must only be used as a function argument, must always be
6272 inlined, and cannot appear in named metadata.
6274 .. code-block:: text
6276 llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b),
6278 metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus))
6283 These flags encode various properties of DINodes.
The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether the ``DW_AT_export_symbols``
attribute can be used for the structure type.
6293 ``DIObjCProperty`` nodes represent Objective-C property nodes.
6295 .. code-block:: text
6297 !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
6298 getter: "getFoo", attributes: 7, type: !2)
6303 ``DIImportedEntity`` nodes represent entities (such as modules) imported into a
6304 compile unit. The ``elements`` field is a list of renamed entities (such as
variables and subprograms) in the imported entity (such as a module).
6307 .. code-block:: text
6309 !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
6310 entity: !1, line: 7, elements: !3)
6312 !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
6313 entity: !5, line: 7)
``DIMacro`` nodes represent the definition or undefinition of macro identifiers.
The ``name:`` field is the macro identifier, followed by macro parameters when
defining a function-like macro, and the ``value:`` field is the token-string
used to expand the macro identifier.
6323 .. code-block:: text
6325 !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
6327 !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")
6332 ``DIMacroFile`` nodes represent inclusion of source files.
6333 The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
6334 appear in the included source file.
6336 .. code-block:: text
6338 !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2,
6346 ``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
a ``DILabel`` are mandatory. The ``scope:`` field must be a
6348 :ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
6349 The ``name:`` field is the label identifier. The ``file:`` field is the
6350 :ref:`DIFile` the label is present in. The ``line:`` field is the source line
6351 within the file where the label is declared.
6353 .. code-block:: text
6355 !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
6360 In LLVM IR, memory does not have types, so LLVM's own type system is not
6361 suitable for doing type based alias analysis (TBAA). Instead, metadata is
6362 added to the IR to describe a type system of a higher level language. This
6363 can be used to implement C/C++ strict type aliasing rules, but it can also
6364 be used to implement custom alias analysis behavior for other languages.
6366 This description of LLVM's TBAA system is broken into two parts:
6367 :ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
6368 :ref:`Representation<tbaa_node_representation>` talks about the metadata
6369 encoding of various entities.
6371 It is always possible to trace any TBAA node to a "root" TBAA node (details
6372 in the :ref:`Representation<tbaa_node_representation>` section). TBAA
6373 nodes with different roots have an unknown aliasing relationship, and LLVM
6374 conservatively infers ``MayAlias`` between them. The rules mentioned in
6375 this section only pertain to TBAA nodes living under the same root.
6377 .. _tbaa_node_semantics:
6382 The TBAA metadata system, referred to as "struct path TBAA" (not to be
6383 confused with ``tbaa.struct``), consists of the following high level
6384 concepts: *Type Descriptors*, further subdivided into scalar type
6385 descriptors and struct type descriptors; and *Access Tags*.
6387 **Type descriptors** describe the type system of the higher level language
6388 being compiled. **Scalar type descriptors** describe types that do not
6389 contain other types. Each scalar type has a parent type, which must also
6390 be a scalar type or the TBAA root. Via this parent relation, scalar types
6391 within a TBAA root form a tree. **Struct type descriptors** denote types
6392 that contain a sequence of other type descriptors, at known offsets. These
6393 contained type descriptors can either be struct type descriptors themselves
6394 or scalar type descriptors.
6396 **Access tags** are metadata nodes attached to load and store instructions.
6397 Access tags use type descriptors to describe the *location* being accessed
6398 in terms of the type system of the higher level language. Access tags are
6399 tuples consisting of a base type, an access type and an offset. The base
6400 type is a scalar type descriptor or a struct type descriptor, the access
6401 type is a scalar type descriptor, and the offset is a constant integer.
6403 The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
6406 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
6407 or store) of a value of type ``AccessTy`` contained in the struct type
6408 ``BaseTy`` at offset ``Offset``.
6410 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
6411 ``AccessTy`` must be the same; and the access tag describes a scalar
6412 access with scalar type ``AccessTy``.
6414 We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
6417 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
6418 ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
6419 described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
6420 undefined if ``Offset`` is non-zero.
6422 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
6423 is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
6424 ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
6425 to be relative within that inner type.
6427 A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
6428 aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via repeated application of the ``ImmediateParent`` relation or
vice versa.
As a concrete example, the type descriptor graph for the following program

.. code-block:: c

    struct Inner {
      int i;    // offset 0
      float f;  // offset 4
    };

    struct Outer {
      float f;  // offset 0
      double d;  // offset 4
      struct Inner inner_a;  // offset 12
    };

    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
    }

6454 is (note that in C and C++, ``char`` can be used to access any arbitrary
6457 .. code-block:: text
6460 CharScalarTy = ("char", Root, 0)
6461 FloatScalarTy = ("float", CharScalarTy, 0)
6462 DoubleScalarTy = ("double", CharScalarTy, 0)
6463 IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
6465 OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
6466 (InnerStructTy, 12)}
6469 with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
6470 0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
6471 ``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
6473 .. _tbaa_node_representation:
6478 The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
6479 with exactly one ``MDString`` operand.
Scalar type descriptors are represented as ``MDNode`` s with two
operands. The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root. Scalar
type descriptors can have an optional third operand, but that must be the
constant integer zero.
6490 Struct type descriptors are represented as ``MDNode`` s with an odd number
6491 of operands greater than 1. The first operand is an ``MDString`` denoting
6492 the name of the struct type. Like in scalar type descriptors the actual
6493 value of this name operand is irrelevant to LLVM. After the name operand,
6494 the struct type descriptors have a sequence of alternating ``MDNode`` and
6495 ``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
6496 an ``MDNode``, denotes a contained field, and the 2N th operand, a
6497 ``ConstantInt``, is the offset of the said contained field. The offsets
6498 must be in non-decreasing order.
6500 Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
6501 The first operand is an ``MDNode`` pointing to the node representing the
6502 base type. The second operand is an ``MDNode`` pointing to the node
6503 representing the access type. The third operand is a ``ConstantInt`` that
6504 states the offset of the access. If a fourth field is present, it must be
6505 a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
6506 that the location being accessed is "constant" (meaning
6507 ``pointsToConstantMemory`` should return true; see `other useful
6508 AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
6509 the access type and the base type of an access tag must be the same, and
6510 that is the TBAA root of the access tag.
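
As a sketch, the type descriptors and access tags from the example in the
:ref:`Semantics<tbaa_node_semantics>` section could be encoded as follows. The
packed ``%struct.Outer`` layout is an assumption chosen so that the IR offsets
match the offsets used in that example, and the metadata numbering is
arbitrary:

.. code-block:: llvm

    %struct.Inner = type { i32, float }
    %struct.Outer = type <{ float, double, %struct.Inner }>

    define void @f(ptr %outer, ptr %f) {
    entry:
      ; outer->f = 0            tag0: (OuterStructTy, FloatScalarTy, 0)
      store float 0.000000e+00, ptr %outer, align 4, !tbaa !7
      ; outer->inner_a.i = 0    tag1: (OuterStructTy, IntScalarTy, 12)
      %i = getelementptr inbounds %struct.Outer, ptr %outer, i32 0, i32 2, i32 0
      store i32 0, ptr %i, align 4, !tbaa !8
      ; outer->inner_a.f = 0.0  tag2: (OuterStructTy, FloatScalarTy, 16)
      %g = getelementptr inbounds %struct.Outer, ptr %outer, i32 0, i32 2, i32 1
      store float 0.000000e+00, ptr %g, align 4, !tbaa !9
      ; *f = 0.0                tag3: (FloatScalarTy, FloatScalarTy, 0)
      store float 0.000000e+00, ptr %f, align 4, !tbaa !10
      ret void
    }

    !0 = !{!"TBAA Root"}                                ; root
    !1 = !{!"char", !0, i64 0}                          ; CharScalarTy
    !2 = !{!"float", !1, i64 0}                         ; FloatScalarTy
    !3 = !{!"double", !1, i64 0}                        ; DoubleScalarTy
    !4 = !{!"int", !1, i64 0}                           ; IntScalarTy
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}              ; InnerStructTy
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}  ; OuterStructTy
    !7 = !{!6, !2, i64 0}                               ; tag0
    !8 = !{!6, !4, i64 12}                              ; tag1
    !9 = !{!6, !2, i64 16}                              ; tag2
    !10 = !{!2, !2, i64 0}                              ; tag3

Under the aliasing rule from the Semantics section, ``tag0`` reaches
``(FloatScalarTy, 0)`` and therefore aliases ``tag3``, whereas ``tag1`` only
reaches ``(InnerStructTy, 0)``, ``(IntScalarTy, 0)`` and ``(CharScalarTy, 0)``,
so the stores tagged ``tag1`` and ``tag3`` may be treated as not aliasing.
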
6512 '``tbaa.struct``' Metadata
6513 ^^^^^^^^^^^^^^^^^^^^^^^^^^
The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
6518 strictly necessary for aggregate types which contain holes due to
6519 padding. Also, it doesn't contain any TBAA information about the fields
6522 ``!tbaa.struct`` metadata can describe which memory subregions in a
6523 memcpy are padding and what the TBAA tags of the struct are.
6525 The current metadata format is very simple. ``!tbaa.struct`` metadata
6526 nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
6528 field in bytes, the second gives its size in bytes, and the third gives
6531 .. code-block:: llvm
6533 !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has TBAA tag ``!1``. The second is at offset 8 bytes,
has size 4 bytes, and has TBAA tag ``!2``.
6539 Note that the fields need not be contiguous. In this example, there is a
6540 4 byte gap between the two fields. This gap represents padding which
6541 does not carry useful data and need not be preserved.
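
A sketch of how such a node might be attached to a copy, assuming a 12-byte
aggregate whose fields are an ``int`` at offset 0 and a ``float`` at offset 8.
The function name and the TBAA type names are illustrative:

.. code-block:: llvm

    define void @copy(ptr %dst, ptr %src) {
    entry:
      ; Bytes [4, 8) are not covered by any field, so they are padding.
      call void @llvm.memcpy.p0.p0.i64(ptr align 4 %dst, ptr align 4 %src,
                                       i64 12, i1 false), !tbaa.struct !6
      ret void
    }

    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    !0 = !{!"TBAA Root"}
    !1 = !{!"char", !0, i64 0}
    !2 = !{!"int", !1, i64 0}
    !3 = !{!"float", !1, i64 0}
    !4 = !{!2, !2, i64 0}   ; access tag for the int field
    !5 = !{!3, !3, i64 0}   ; access tag for the float field
    !6 = !{i64 0, i64 4, !4, i64 8, i64 4, !5}
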
6543 '``noalias``' and '``alias.scope``' Metadata
6544 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6546 ``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
6551 Each type of metadata specifies a list of scopes where each scope has an id and
6554 When evaluating an aliasing query, if for some domain, the set
6555 of scopes with that domain in one instruction's ``alias.scope`` list is a
6556 subset of (or equal to) the set of scopes for that domain in another
6557 instruction's ``noalias`` list, then the two memory accesses are assumed not to
6560 Because scopes in one domain don't affect scopes in other domains, separate
6561 domains can be used to compose multiple independent noalias sets. This is
6562 used for example during inlining. As the noalias function parameters are
6563 turned into noalias scope metadata, a new domain is used every time the
6564 function is inlined.
6566 The metadata identifying each domain is itself a list containing one or two
6567 entries. The first entry is the name of the domain. Note that if the name is a
6568 string then it can be combined across functions and translation units. A
6569 self-reference can be used to create globally unique domain names. A
6570 descriptive string may optionally be provided as a second list entry.
6572 The metadata identifying each scope is also itself a list containing two or
6573 three entries. The first entry is the name of the scope. Note that if the name
6574 is a string then it can be combined across functions and translation units. A
6575 self-reference can be used to create globally unique scope names. A metadata
6576 reference to the scope's domain is the second entry. A descriptive string may
6577 optionally be provided as a third list entry.
6581 .. code-block:: llvm
6583 ; Two scope domains:
6587 ; Some scopes in these domains:
6593 !5 = !{!4} ; A list containing only scope !4
6597 ; These two instructions don't alias:
6598 %0 = load float, ptr %c, align 4, !alias.scope !5
6599 store float %0, ptr %arrayidx.i, align 4, !noalias !5
6601 ; These two instructions also don't alias (for domain !1, the set of scopes
6602 ; in the !alias.scope equals that in the !noalias list):
6603 %2 = load float, ptr %c, align 4, !alias.scope !5
6604 store float %2, ptr %arrayidx.i2, align 4, !noalias !6
6606 ; These two instructions may alias (for domain !0, the set of scopes in
6607 ; the !noalias list is not a superset of, or equal to, the scopes in the
6608 ; !alias.scope list):
   %4 = load float, ptr %c, align 4, !alias.scope !6
   store float %4, ptr %arrayidx.i, align 4, !noalias !7
6612 '``fpmath``' Metadata
6613 ^^^^^^^^^^^^^^^^^^^^^
6615 ``fpmath`` metadata may be attached to any instruction of floating-point
6616 type. It can be used to express the maximum acceptable error in the
6617 result of that instruction, in ULPs, thus potentially allowing the
6618 compiler to use a more efficient but less accurate method of computing
6619 it. ULP is defined as follows:
6621 If ``x`` is a real number that lies between two finite consecutive
6622 floating-point numbers ``a`` and ``b``, without being equal to one
6623 of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
6624 distance between the two non-equal finite floating-point numbers
6625 nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
6627 The metadata node shall consist of a single positive float type number
6628 representing the maximum relative error, for example:
6630 .. code-block:: llvm
6632 !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
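As a minimal sketch of how such a node might be attached, the fragment below
marks a single division as tolerating up to 2.5 ULPs of error. The function
name ``@approx_div`` and its parameters are hypothetical and chosen only for
illustration:

.. code-block:: llvm

   define float @approx_div(float %x, float %y) {
     ; The result of this division may be off by up to 2.5 ULPs.
     %q = fdiv float %x, %y, !fpmath !0
     ret float %q
   }

   !0 = !{ float 2.5 }

Whether a cheaper division sequence is actually selected depends on the target
and the optimization pipeline.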
6636 '``range``' Metadata
6637 ^^^^^^^^^^^^^^^^^^^^
6639 ``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
6640 integer or vector of integer types. It expresses the possible ranges the loaded
6641 value or the value returned by the called function at this call site is in. If
6642 the loaded or returned value is not in the specified range, a poison value is
6643 returned instead. The ranges are represented with a flattened list of integers.
6644 The loaded value or the value returned is known to be in the union of the ranges
6645 defined by each consecutive pair. Each pair has the following properties:
6647 - The type must match the scalar type of the instruction.
6648 - The pair ``a,b`` represents the range ``[a,b)``.
6649 - Both ``a`` and ``b`` are constants.
6650 - The range is allowed to wrap.
- The range should not represent the full or empty set. That is,
  ``a != b``.
6654 In addition, the pairs must be in signed order of the lower bound and
6655 they must be non-contiguous.
6657 For vector-typed instructions, the range is applied element-wise.
6661 .. code-block:: llvm
6663 %a = load i8, ptr %x, align 1, !range !0 ; Can only be 0 or 1
6664 %b = load i8, ptr %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
6665 %c = call i8 @foo(), !range !2 ; Can only be 0, 1, 3, 4 or 5
6666 %d = invoke i8 @bar() to label %cont
6667 unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
   %e = load <2 x i8>, ptr %x, !range !0 ; Can only be <0 or 1, 0 or 1>
6670 !0 = !{ i8 0, i8 2 }
6671 !1 = !{ i8 255, i8 2 }
6672 !2 = !{ i8 0, i8 2, i8 3, i8 6 }
6673 !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
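To illustrate how an optimizer might exploit this information, the sketch
below compares a loaded value against the upper bound of its range. The
function name ``@is_bool`` is hypothetical. Given the ``[0,2)`` range, a pass
would be allowed to fold the comparison to ``true``:

.. code-block:: llvm

   define i1 @is_bool(ptr %x) {
     %a = load i8, ptr %x, align 1, !range !0
     ; %a is either in [0,2) or poison, so this comparison may be
     ; folded to true.
     %in = icmp ult i8 %a, 2
     ret i1 %in
   }

   !0 = !{ i8 0, i8 2 }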
6675 '``absolute_symbol``' Metadata
6676 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6678 ``absolute_symbol`` metadata may be attached to a global variable
6679 declaration. It marks the declaration as a reference to an absolute symbol,
6680 which causes the backend to use absolute relocations for the symbol even
6681 in position independent code, and expresses the possible ranges that the
6682 global variable's *address* (not its value) is in, in the same format as
6683 ``range`` metadata, with the extension that the pair ``all-ones,all-ones``
6684 may be used to represent the full set.
6686 Example (assuming 64-bit pointers):
6688 .. code-block:: llvm
6690 @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
6691 @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
6694 !0 = !{ i64 0, i64 256 }
6695 !1 = !{ i64 -1, i64 -1 }
6697 '``callees``' Metadata
6698 ^^^^^^^^^^^^^^^^^^^^^^
6700 ``callees`` metadata may be attached to indirect call sites. If ``callees``
6701 metadata is attached to a call site, and any callee is not among the set of
6702 functions provided by the metadata, the behavior is undefined. The intent of
6703 this metadata is to facilitate optimizations such as indirect-call promotion.
6704 For example, in the code below, the call instruction may only target the
6705 ``add`` or ``sub`` functions:
6707 .. code-block:: llvm
6709 %result = call i64 %binop(i64 %x, i64 %y), !callees !0
6712 !0 = !{ptr @add, ptr @sub}
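As a rough illustration of the kind of rewrite this metadata is meant to
enable, the sketch below shows what indirect-call promotion might produce for
such a call. The wrapper function ``@apply`` and the block names are
hypothetical, and an actual pass may generate different code:

.. code-block:: llvm

   declare i64 @add(i64, i64)
   declare i64 @sub(i64, i64)

   define i64 @apply(ptr %binop, i64 %x, i64 %y) {
   entry:
     %is.add = icmp eq ptr %binop, @add
     br i1 %is.add, label %call.add, label %call.sub

   call.add:
     %r.add = call i64 @add(i64 %x, i64 %y)
     br label %done

   call.sub:
     ; The callee set was {@add, @sub}, so the only remaining target is @sub.
     %r.sub = call i64 @sub(i64 %x, i64 %y)
     br label %done

   done:
     %result = phi i64 [ %r.add, %call.add ], [ %r.sub, %call.sub ]
     ret i64 %result
   }

Both calls are now direct, which makes them candidates for inlining.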
6714 '``callback``' Metadata
6715 ^^^^^^^^^^^^^^^^^^^^^^^
``callback`` metadata may be attached to a function declaration or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with metadata as a broker
function. The metadata describes how the arguments of a call to the broker are
in turn passed to the callback function specified by the metadata. Thus, the
``callback`` metadata provides a partial description of a call site inside the
broker function with regard to the arguments of a call to the broker. The only
semantic restriction on the broker function itself is that it is not allowed to
inspect or modify arguments referenced in the ``callback`` metadata as
pass-through to the callback function.
6728 The broker is not required to actually invoke the callback function at runtime.
6729 However, the assumptions about not inspecting or modifying arguments that would
6730 be passed to the specified callback function still hold, even if the callback
6731 function is not dynamically invoked. The broker is allowed to invoke the
6732 callback function more than once per invocation of the broker. The broker is
6733 also allowed to invoke (directly or indirectly) the function passed as a
6734 callback through another use. Finally, the broker is also allowed to relay the
6735 callback callee invocation to a different thread.
The metadata is structured as follows: At the outer level, ``callback``
metadata is a list of ``callback`` encodings. Each encoding starts with a
constant ``i64`` which describes the argument position (counted from zero) of
the callback function in the call to the broker. The following elements, except
the last, describe what arguments are passed to the callback function. Each
element is again an ``i64`` constant identifying the argument of the broker
that is passed through, or ``i64 -1`` to indicate an unknown or inspected
argument. The order in which they are listed has to match the order in which
they are passed to the callback callee. The last element of the encoding is a
boolean which specifies how variadic arguments of the broker are handled. If it
is true, all variadic arguments of the broker are passed through to the
callback function *after* the arguments encoded explicitly before.
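The structure described above can be sketched with a small, self-contained
declaration. The broker ``@run_with`` is hypothetical: it is assumed to take
the callback as its first argument (position 0) and to forward its second
argument (position 1) to that callback unchanged:

.. code-block:: text

   ; Hypothetical broker: void run_with(void (*fn)(void *), void *payload)
   declare !callback !0 void @run_with(ptr, ptr)

   ; Outer list holding a single callback encoding.
   !0 = !{!1}
   ; Callback is argument 0; it receives broker argument 1;
   ; variadic arguments are not forwarded.
   !1 = !{i64 0, i64 1, i1 false}

With this annotation, an interprocedural analysis may treat a call to
``@run_with`` as if it also contained a call of the first argument with the
second argument as its operand.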
In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. This encoding
identifies the callback function as the broker argument at position 2 (``i64
2``, counting from zero) and the sole argument of the callback function as the
broker argument at position 3 (``i64 3``).
6757 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6758 error if the below is set to highlight as 'llvm', despite that we
6759 have misc.highlighting_failure set?
6761 .. code-block:: text
6763 declare !callback !1 dso_local i32 @pthread_create(ptr, ptr, ptr, ptr)
6766 !2 = !{i64 2, i64 3, i1 false}
Another example is shown below. The callback callee is the argument at position
2 of the ``__kmpc_fork_call`` function (``i64 2``, counting from zero). The
callee is given two unknown values (each identified by a ``i64 -1``) and
afterwards all variadic arguments that are passed to the ``__kmpc_fork_call``
call (due to the final ``i1 true``).
6775 .. FIXME why does the llvm-sphinx-docs builder give a highlighting
6776 error if the below is set to highlight as 'llvm', despite that we
6777 have misc.highlighting_failure set?
6779 .. code-block:: text
6781 declare !callback !0 dso_local void @__kmpc_fork_call(ptr, i32, ptr, ...)
6784 !1 = !{i64 2, i64 -1, i64 -1, i1 true}
6787 '``exclude``' Metadata
6788 ^^^^^^^^^^^^^^^^^^^^^^
6790 ``exclude`` metadata may be attached to a global variable to signify that its
6791 section should not be included in the final executable or shared library. This
6792 option is only valid for global variables with an explicit section targeting ELF
6793 or COFF. This is done using the ``SHF_EXCLUDE`` flag on ELF targets and the
6794 ``IMAGE_SCN_LNK_REMOVE`` and ``IMAGE_SCN_MEM_DISCARDABLE`` flags for COFF
6795 targets. Additionally, this metadata is only used as a flag, so the associated
6796 node must be empty. The explicit section should not conflict with any other
6797 sections that the user does not want removed after linking.
6799 .. code-block:: text
   @object = private constant [1 x i8] c"\00", section ".foo", !exclude !0
6806 '``unpredictable``' Metadata
6807 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6809 ``unpredictable`` metadata may be attached to any branch or switch
6810 instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
6812 optimizations related to compare and branch instructions. The metadata
6813 is treated as a boolean value; if it exists, it signals that the branch
6814 or switch that it is attached to is completely unpredictable.
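A minimal sketch of a branch carrying this metadata is shown below. Since the
metadata acts as a flag, an empty node is used here. The function and block
names are hypothetical:

.. code-block:: llvm

   define i32 @select_path(i1 %cond) {
   entry:
     ; The direction of this branch is declared to be unpredictable.
     br i1 %cond, label %then, label %else, !unpredictable !0

   then:
     ret i32 1

   else:
     ret i32 2
   }

   !0 = !{}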
6816 .. _md_dereferenceable:
6818 '``dereferenceable``' Metadata
6819 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable,
otherwise the behavior is undefined.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values.
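For illustration, the sketch below loads a pointer that is declared to be
dereferenceable for 4 bytes and then reads through it. The function name
``@read_field`` and the byte count are assumptions made for this example:

.. code-block:: llvm

   define i32 @read_field(ptr %slot) {
     ; The loaded pointer is known to be dereferenceable for 4 bytes.
     %p = load ptr, ptr %slot, align 8, !dereferenceable !0
     %v = load i32, ptr %p, align 4
     ret i32 %v
   }

   !0 = !{i64 4}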
6828 .. _md_dereferenceable_or_null:
6830 '``dereferenceable_or_null``' Metadata
6831 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null, otherwise the behavior is undefined.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values.
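A minimal sketch follows. Because the loaded pointer may be null, the
hypothetical function ``@read_field_or_zero`` checks it before reading through
it:

.. code-block:: llvm

   define i32 @read_field_or_zero(ptr %slot) {
   entry:
     ; The loaded pointer is either null or dereferenceable for 4 bytes.
     %p = load ptr, ptr %slot, align 8, !dereferenceable_or_null !0
     %is.null = icmp eq ptr %p, null
     br i1 %is.null, label %done, label %read

   read:
     %v = load i32, ptr %p, align 4
     br label %done

   done:
     %r = phi i32 [ 0, %entry ], [ %v, %read ]
     ret i32 %r
   }

   !0 = !{i64 4}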
6845 It is sometimes useful to attach information to loop constructs. Currently,
6846 loop metadata is implemented as metadata attached to the branch instruction
6847 in the loop latch block. The loop metadata node is a list of
6848 other metadata nodes, each representing a property of the loop. Usually,
the first item of the property node is a string. For example, the
``llvm.loop.unroll.count`` suggests an unroll factor to the loop
unroller:
6853 .. code-block:: llvm
6855 br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
6858 !1 = !{!"llvm.loop.unroll.enable"}
6859 !2 = !{!"llvm.loop.unroll.count", i32 4}
6861 For legacy reasons, the first item of a loop metadata node must be a
6862 reference to itself. Before the advent of the 'distinct' keyword, this
6863 forced the preservation of otherwise identical metadata nodes. Since
6864 the loop-metadata node can be attached to multiple nodes, the 'distinct'
6865 keyword has become unnecessary.
6867 Prior to the property nodes, one or two ``DILocation`` (debug location)
6868 nodes can be present in the list. The first, if present, identifies the
6869 source-code location where the loop begins. The second, if present,
6870 identifies the source-code location where the loop ends.
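As a sketch of this layout, the fragment below places two debug locations
between the self-reference and a property node. The line and column numbers
are made up, and ``!4`` is assumed to be a ``DISubprogram`` scope defined
elsewhere in the module:

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2, !3}
   !1 = !DILocation(line: 10, column: 3, scope: !4) ; where the loop begins
   !2 = !DILocation(line: 14, column: 3, scope: !4) ; where the loop ends
   !3 = !{!"llvm.loop.unroll.count", i32 4}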
6872 Loop metadata nodes cannot be used as unique identifiers. They are
6873 neither persistent for the same loop through transformations nor
6874 necessarily unique to just one loop.
6876 '``llvm.loop.disable_nonforced``'
6877 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata disables all optional loop transformations unless
explicitly instructed using other transformation metadata such as
``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the
loop from being transformed into a different loop before an explicitly
requested (forced) transformation is applied. For instance, loop fusion can
make other transformations impossible. Mandatory loop canonicalizations such
as loop rotation are still applied.
It is recommended to use this metadata in addition to any ``llvm.loop.*``
transformation directive. Also, any loop should have at most one
directive applied to it (and a sequence of transformations built using
followup-attributes). Otherwise, which transformation will be applied
depends on implementation details such as the pass pipeline order.
6894 See :ref:`transformation-metadata` for details.
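The recommendation above can be sketched as follows: a loop requests one
forced transformation (unrolling by 4) and uses
``llvm.loop.disable_nonforced`` to keep heuristics from applying anything
else first. The function ``@scale`` is hypothetical and assumes ``%n`` is
greater than zero:

.. code-block:: llvm

   define void @scale(ptr %a, i64 %n) {
   entry:
     br label %loop

   loop:
     %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
     %addr = getelementptr inbounds float, ptr %a, i64 %i
     %v = load float, ptr %addr, align 4
     %w = fmul float %v, 2.0
     store float %w, ptr %addr, align 4
     %i.next = add nuw i64 %i, 1
     %done = icmp eq i64 %i.next, %n
     br i1 %done, label %exit, label %loop, !llvm.loop !0

   exit:
     ret void
   }

   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.disable_nonforced"}
   !2 = !{!"llvm.loop.unroll.count", i32 4}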
6896 '``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
6897 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6899 Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
6900 used to control per-loop vectorization and interleaving parameters such as
6901 vectorization width and interleave count. These metadata should be used in
6902 conjunction with ``llvm.loop`` loop identification metadata. The
6903 ``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
6904 optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata,
which contains information about loop-carried memory dependencies, can be helpful
in determining the safety of these transformations.
6909 '``llvm.loop.interleave.count``' Metadata
6910 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6912 This metadata suggests an interleave count to the loop interleaver.
6913 The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For
example:
6917 .. code-block:: llvm
6919 !0 = !{!"llvm.loop.interleave.count", i32 4}
6921 Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
6922 multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
6923 then the interleave count will be determined automatically.
6925 '``llvm.loop.vectorize.enable``' Metadata
6926 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6928 This metadata selectively enables or disables vectorization for the loop. The
6929 first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
6931 0 disables vectorization:
6933 .. code-block:: llvm
6935 !0 = !{!"llvm.loop.vectorize.enable", i1 0}
6936 !1 = !{!"llvm.loop.vectorize.enable", i1 1}
6938 '``llvm.loop.vectorize.predicate.enable``' Metadata
6939 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6941 This metadata selectively enables or disables creating predicated instructions
6942 for the loop, which can enable folding of the scalar epilogue loop into the
6943 main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1, creating predicated instructions is enabled. A
value of 0 disables it:
6948 .. code-block:: llvm
6950 !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
6951 !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
6953 '``llvm.loop.vectorize.scalable.enable``' Metadata
6954 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6956 This metadata selectively enables or disables scalable vectorization for the
6957 loop, and only has any effect if vectorization for the loop is already enabled.
6958 The first operand is the string ``llvm.loop.vectorize.scalable.enable``
and the second operand is a bit. If the bit operand value is 1, scalable
6960 vectorization is enabled, whereas a value of 0 reverts to the default fixed
6961 width vectorization:
6963 .. code-block:: llvm
6965 !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
6966 !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
6968 '``llvm.loop.vectorize.width``' Metadata
6969 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6971 This metadata sets the target width of the vectorizer. The first
6972 operand is the string ``llvm.loop.vectorize.width`` and the second
6973 operand is an integer specifying the width. For example:
6975 .. code-block:: llvm
6977 !0 = !{!"llvm.loop.vectorize.width", i32 4}
6979 Note that setting ``llvm.loop.vectorize.width`` to 1 disables
vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0 or if the loop does not have this metadata, the width will be
6982 determined automatically.
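The vectorization and interleaving hints are typically combined in a single
loop metadata node. The fragment below is a sketch of such a combination. The
block names and the particular width and count are chosen only for
illustration:

.. code-block:: llvm

   br i1 %exitcond, label %exit, label %loop, !llvm.loop !0

   !0 = distinct !{!0, !1, !2, !3}
   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
   !2 = !{!"llvm.loop.vectorize.width", i32 4}
   !3 = !{!"llvm.loop.interleave.count", i32 2}

As with the individual hints, the optimizer remains free to ignore these
values if it cannot prove that the transformation is safe.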
6984 '``llvm.loop.vectorize.followup_vectorized``' Metadata
6985 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6987 This metadata defines which loop attributes the vectorized loop will
6988 have. See :ref:`transformation-metadata` for details.
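As a sketch, assuming the followup node lists the property nodes that the new
loop should receive after its name string, the fragment below asks for the
vectorized loop to be unrolled by a factor of 2:

.. code-block:: llvm

   !0 = distinct !{!0, !1, !2}
   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
   !2 = !{!"llvm.loop.vectorize.followup_vectorized", !3}
   !3 = !{!"llvm.loop.unroll.count", i32 2}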
6990 '``llvm.loop.vectorize.followup_epilogue``' Metadata
6991 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6993 This metadata defines which loop attributes the epilogue will have. The
6994 epilogue is not vectorized and is executed when either the vectorized
6995 loop is not known to preserve semantics (because e.g., it processes two
6996 arrays that are found to alias by a runtime check) or for the last
6997 iterations that do not fill a complete set of vector lanes. See
6998 :ref:`Transformation Metadata <transformation-metadata>` for details.
7000 '``llvm.loop.vectorize.followup_all``' Metadata
7001 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Attributes in the metadata will be added to both the vectorized and
epilogue loop.
7005 See :ref:`Transformation Metadata <transformation-metadata>` for details.
7007 '``llvm.loop.unroll``'
7008 ^^^^^^^^^^^^^^^^^^^^^^
7010 Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
7011 optimization hints such as the unroll factor. ``llvm.loop.unroll``
7012 metadata should be used in conjunction with ``llvm.loop`` loop
7013 identification metadata. The ``llvm.loop.unroll`` metadata are only
7014 optimization hints and the unrolling will only be performed if the
7015 optimizer believes it is safe to do so.
7017 '``llvm.loop.unroll.count``' Metadata
7018 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7020 This metadata suggests an unroll factor to the loop unroller. The
7021 first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:
7025 .. code-block:: llvm
7027 !0 = !{!"llvm.loop.unroll.count", i32 4}
If the trip count of the loop is less than the unroll count, the loop
7030 will be partially unrolled.
7032 '``llvm.loop.unroll.disable``' Metadata
7033 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7035 This metadata disables loop unrolling. The metadata has a single operand
7036 which is the string ``llvm.loop.unroll.disable``. For example:
7038 .. code-block:: llvm
7040 !0 = !{!"llvm.loop.unroll.disable"}
7042 '``llvm.loop.unroll.runtime.disable``' Metadata
7043 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7045 This metadata disables runtime loop unrolling. The metadata has a single
7046 operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
7048 .. code-block:: llvm
7050 !0 = !{!"llvm.loop.unroll.runtime.disable"}
7052 '``llvm.loop.unroll.enable``' Metadata
7053 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7055 This metadata suggests that the loop should be fully unrolled if the trip count
7056 is known at compile time and partially unrolled if the trip count is not known
7057 at compile time. The metadata has a single operand which is the string
7058 ``llvm.loop.unroll.enable``. For example:
7060 .. code-block:: llvm
7062 !0 = !{!"llvm.loop.unroll.enable"}
7064 '``llvm.loop.unroll.full``' Metadata
7065 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7067 This metadata suggests that the loop should be unrolled fully. The
7068 metadata has a single operand which is the string ``llvm.loop.unroll.full``.
7071 .. code-block:: llvm
7073 !0 = !{!"llvm.loop.unroll.full"}
7075 '``llvm.loop.unroll.followup``' Metadata
7076 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7078 This metadata defines which loop attributes the unrolled loop will have.
7079 See :ref:`Transformation Metadata <transformation-metadata>` for details.
7081 '``llvm.loop.unroll.followup_remainder``' Metadata
7082 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7084 This metadata defines which loop attributes the remainder loop after
7085 partial/runtime unrolling will have. See
7086 :ref:`Transformation Metadata <transformation-metadata>` for details.
7088 '``llvm.loop.unroll_and_jam``'
7089 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, plus ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
7098 The metadata for unroll and jam otherwise is the same as for ``unroll``.
7099 ``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
7100 ``llvm.loop.unroll_and_jam.count`` do the same as for unroll.
7101 ``llvm.loop.unroll_and_jam.full`` is not supported. Again these are only hints
7102 and the normal safety checks will still be performed.
7104 '``llvm.loop.unroll_and_jam.count``' Metadata
7105 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7107 This metadata suggests an unroll and jam factor to use, similarly to
7108 ``llvm.loop.unroll.count``. The first operand is the string
7109 ``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
7110 specifying the unroll factor. For example:
7112 .. code-block:: llvm
7114 !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.
7119 '``llvm.loop.unroll_and_jam.disable``' Metadata
7120 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata disables loop unroll and jam. The metadata has a single
7123 operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
7125 .. code-block:: llvm
7127 !0 = !{!"llvm.loop.unroll_and_jam.disable"}
7129 '``llvm.loop.unroll_and_jam.enable``' Metadata
7130 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata suggests that the loop should be fully unrolled and jammed if the
trip count is known at compile time and partially unrolled if the trip count is
7134 not known at compile time. The metadata has a single operand which is the
7135 string ``llvm.loop.unroll_and_jam.enable``. For example:
7137 .. code-block:: llvm
7139 !0 = !{!"llvm.loop.unroll_and_jam.enable"}
7141 '``llvm.loop.unroll_and_jam.followup_outer``' Metadata
7142 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7144 This metadata defines which loop attributes the outer unrolled loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.
7148 '``llvm.loop.unroll_and_jam.followup_inner``' Metadata
7149 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7151 This metadata defines which loop attributes the inner jammed loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.
7155 '``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
7156 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7158 This metadata defines which attributes the epilogue of the outer loop
7159 will have. This loop is usually unrolled, meaning there is no such
7160 loop. This attribute will be ignored in this case. See
7161 :ref:`Transformation Metadata <transformation-metadata>` for details.
7163 '``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
7164 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7166 This metadata defines which attributes the inner loop of the epilogue
7167 will have. The outer epilogue will usually be unrolled, meaning there
7168 can be multiple inner remainder loops. See
7169 :ref:`Transformation Metadata <transformation-metadata>` for details.
7171 '``llvm.loop.unroll_and_jam.followup_all``' Metadata
7172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Attributes specified in the metadata are added to all
7175 ``llvm.loop.unroll_and_jam.*`` loops. See
7176 :ref:`Transformation Metadata <transformation-metadata>` for details.
7178 '``llvm.loop.licm_versioning.disable``' Metadata
7179 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7181 This metadata indicates that the loop should not be versioned for the purpose
7182 of enabling loop-invariant code motion (LICM). The metadata has a single operand
7183 which is the string ``llvm.loop.licm_versioning.disable``. For example:
7185 .. code-block:: llvm
7187 !0 = !{!"llvm.loop.licm_versioning.disable"}
7189 '``llvm.loop.distribute.enable``' Metadata
7190 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7192 Loop distribution allows splitting a loop into multiple loops. Currently,
7193 this is only performed if the entire loop cannot be vectorized due to unsafe
7194 memory dependencies. The transformation will attempt to isolate the unsafe
7195 dependencies into their own loop.
7197 This metadata can be used to selectively enable or disable distribution of the
7198 loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
7200 enabled. A value of 0 disables distribution:
7202 .. code-block:: llvm
7204 !0 = !{!"llvm.loop.distribute.enable", i1 0}
7205 !1 = !{!"llvm.loop.distribute.enable", i1 1}
7207 This metadata should be used in conjunction with ``llvm.loop`` loop
7208 identification metadata.
7210 '``llvm.loop.distribute.followup_coincident``' Metadata
7211 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which attributes extracted loops with no cyclic
dependencies (i.e. loops that can be vectorized) will have. See
7215 :ref:`Transformation Metadata <transformation-metadata>` for details.
7217 '``llvm.loop.distribute.followup_sequential``' Metadata
7218 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7220 This metadata defines which attributes the isolated loops with unsafe
7221 memory dependencies will have. See
7222 :ref:`Transformation Metadata <transformation-metadata>` for details.
7224 '``llvm.loop.distribute.followup_fallback``' Metadata
7225 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If loop versioning is necessary, this metadata defines the attributes
7228 the non-distributed fallback version will have. See
7229 :ref:`Transformation Metadata <transformation-metadata>` for details.
7231 '``llvm.loop.distribute.followup_all``' Metadata
7232 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The attributes in this metadata are added to all followup loops of the
7235 loop distribution pass. See
7236 :ref:`Transformation Metadata <transformation-metadata>` for details.
7238 '``llvm.licm.disable``' Metadata
7239 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7241 This metadata indicates that loop-invariant code motion (LICM) should not be
7242 performed on this loop. The metadata has a single operand which is the string
7243 ``llvm.licm.disable``. For example:
7245 .. code-block:: llvm
7247 !0 = !{!"llvm.licm.disable"}
Note that although it operates per loop, it isn't given the ``llvm.loop`` prefix
7250 as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.
7252 '``llvm.access.group``' Metadata
7253 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7255 ``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, illustrated by the following example.
7262 .. code-block:: llvm
7264 %val = load i32, ptr %arrayidx, !llvm.access.group !0
7270 It is illegal for the list node to be empty since it might be confused
7271 with an access group.
7273 The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which can be used to distinguish an access group
metadata node from a list of access groups. Being empty avoids the
situation that the content must be updated which, because metadata is
immutable by design, would require finding and updating all references
7279 to the access group node.
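The two forms can be sketched side by side. In the hypothetical function
below, the first load belongs to a single access group, while the second
refers to a non-distinct list node naming two groups:

.. code-block:: llvm

   define i32 @sum2(ptr %p, ptr %q) {
     %a = load i32, ptr %p, align 4, !llvm.access.group !0 ; one group
     %b = load i32, ptr %q, align 4, !llvm.access.group !2 ; two groups
     %s = add i32 %a, %b
     ret i32 %s
   }

   !0 = distinct !{}   ; an access group: distinct and empty
   !1 = distinct !{}   ; another access group
   !2 = !{!0, !1}      ; a list of access groups: non-empty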
7281 The access group can be used to refer to a memory access instruction
7282 without pointing to it directly (which is not possible in global
7283 metadata). Currently, the only metadata making use of it is
7284 ``llvm.loop.parallel_accesses``.
7286 '``llvm.loop.parallel_accesses``' Metadata
7287 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7289 The ``llvm.loop.parallel_accesses`` metadata refers to one or more
7290 access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between it and other instructions
7292 in the loop with this metadata.
7294 Let ``m1`` and ``m2`` be two instructions that both have the
7295 ``llvm.access.group`` metadata to the access group ``g1``, respectively
7296 ``g2`` (which might be identical). If a loop contains both access groups
7297 in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
7298 assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered to have this property if at least one of the access groups
7301 matches the ``llvm.loop.parallel_accesses`` list.
7303 If all memory-accessing instructions in a loop have
7304 ``llvm.access.group`` metadata that each refer to one of the access
7305 groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependences and is considered to be a
parallel loop.
7309 Note that if not all memory access instructions belong to an access
7310 group referred to by ``llvm.loop.parallel_accesses``, then the loop must
7311 not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be considered
7314 sequential (if optimization passes that are unaware of the parallel semantics
7315 insert new memory instructions into the loop body).
Example of a loop that is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata types:
7321 .. code-block:: llvm
7325 %val0 = load i32, ptr %arrayidx, !llvm.access.group !1
7327 store i32 %val0, ptr %arrayidx1, !llvm.access.group !1
7329 br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
7333 !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
7336 It is also possible to have nested parallel loops:
7338 .. code-block:: llvm
7342 %val1 = load i32, ptr %arrayidx3, !llvm.access.group !4
7344 br label %inner.for.body
7348 %val0 = load i32, ptr %arrayidx1, !llvm.access.group !3
7350 store i32 %val0, ptr %arrayidx2, !llvm.access.group !3
7352 br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
7356 store i32 %val1, ptr %arrayidx4, !llvm.access.group !4
7358 br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
7360 outer.for.end: ; preds = %for.body
7362 !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}} ; metadata for the inner loop
7363 !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
7364 !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
7365 !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
7367 .. _langref_llvm_loop_mustprogress:
7369 '``llvm.loop.mustprogress``' Metadata
7370 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7372 The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
7373 terminate, unwind, or interact with the environment in an observable way, e.g.
7374 via a volatile memory access, I/O, or other synchronization. If such a loop is
7375 not found to interact with the environment in an observable way, the loop may
7376 be removed. This corresponds to the ``mustprogress`` function attribute.
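
A minimal sketch of a loop carrying this metadata is shown below. The function
and value names are assumptions made for this example. The loop has no
observable side effects, so an optimizer is permitted to remove it:

.. code-block:: llvm

    define void @spin(i32 %n) {
    entry:
      br label %loop

    loop:
      ; No volatile access, I/O, or synchronization occurs in this loop.
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %i.next = add i32 %i, 1
      %done = icmp eq i32 %i.next, %n
      br i1 %done, label %exit, label %loop, !llvm.loop !0

    exit:
      ret void
    }

    !0 = distinct !{!0, !1}
    !1 = !{!"llvm.loop.mustprogress"}
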
7378 '``irr_loop``' Metadata
7379 ^^^^^^^^^^^^^^^^^^^^^^^
7381 ``irr_loop`` metadata may be attached to the terminator instruction of a basic
7382 block that's an irreducible loop header (note that an irreducible loop has more
7383 than one header basic block). If ``irr_loop`` metadata is attached to the
7384 terminator instruction of a basic block that is not really an irreducible loop
7385 header, the behavior is undefined. The intent of this metadata is to improve the
7386 accuracy of the block frequency propagation. For example, in the code below, the
7387 block ``header0`` may have a loop header weight (relative to the other headers of
7388 the irreducible loop) of 100:
7390 .. code-block:: llvm
7394 br i1 %cmp, label %t1, label %t2, !irr_loop !0
7397 !0 = !{!"loop_header_weight", i64 100}
7399 Irreducible loop header weights are typically based on profile data.
7401 .. _md_invariant.group:
7403 '``invariant.group``' Metadata
7404 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7406 The experimental ``invariant.group`` metadata may be attached to
7407 ``load``/``store`` instructions referencing a single metadata with no entries.
7408 The existence of the ``invariant.group`` metadata on the instruction tells
7409 the optimizer that every ``load`` and ``store`` to the same pointer operand
7410 can be assumed to load or store the same
7411 value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
7412 when two pointers are considered the same). Pointers returned by bitcast or
7413 getelementptr with only zero indices are considered the same.
7417 .. code-block:: llvm
7419 @unknownPtr = external global i8
7422 store i8 42, ptr %ptr, !invariant.group !0
7423 call void @foo(ptr %ptr)
7425 %a = load i8, ptr %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
7426 call void @foo(ptr %ptr)
7428 %newPtr = call ptr @getPointer(ptr %ptr)
7429 %c = load i8, ptr %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
7431 %unknownValue = load i8, ptr @unknownPtr
7432 store i8 %unknownValue, ptr %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
7434 call void @foo(ptr %ptr)
7435 %newPtr2 = call ptr @llvm.launder.invariant.group.p0(ptr %ptr)
7436 %d = load i8, ptr %newPtr2, !invariant.group !0 ; Can't step through launder.invariant.group to get value of %ptr
7439 declare void @foo(ptr)
7440 declare ptr @getPointer(ptr)
7441 declare ptr @llvm.launder.invariant.group.p0(ptr)
7445 The ``invariant.group`` metadata must be dropped when replacing one pointer by
7446 another based on aliasing information. This is because ``invariant.group`` is tied
7447 to the SSA value of the pointer operand.
7449 .. code-block:: llvm
7451 %v = load i8, ptr %x, !invariant.group !0
7452 ; if %x mustalias %y then we can replace the above instruction with
7453 %v = load i8, ptr %y
7455 Note that this is an experimental feature, which means that its semantics might
7456 change in the future.
7461 See :doc:`TypeMetadata`.
7463 '``associated``' Metadata
7464 ^^^^^^^^^^^^^^^^^^^^^^^^^
7466 The ``associated`` metadata may be attached to a global variable definition with
7467 a single argument that references a global object (optionally through an alias).
7469 This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
7470 discarding of the global variable in linker GC unless the referenced object is
7471 also discarded. The linker support for this feature is spotty. For best
7472 compatibility, globals carrying this metadata should:
7474 - Be in ``@llvm.compiler.used``.
7475 - If the referenced global variable is in a comdat, be in the same comdat.
7477 ``!associated`` cannot express a many-to-one relationship. A global variable with
7478 the metadata should generally not be referenced by a function: the function may
7479 be inlined into other functions, leading to more references to the metadata.
7480 Ideally we would want to keep metadata alive as long as any inline location is
7481 alive, but this many-to-one relationship is not representable. Moreover, if the
7482 metadata is retained while the function is discarded, the linker will report an
7483 error of a relocation referencing a discarded section.
7485 The metadata is often used with an explicit section consisting of valid C
7486 identifiers so that the runtime can find the metadata section with
7487 linker-defined encapsulation symbols ``__start_<section_name>`` and
7488 ``__stop_<section_name>``.
7490 It does not have any effect on non-ELF targets.
7494 .. code-block:: text
7497 @a = global i32 1, comdat $a
7498 @b = internal global i32 2, comdat $a, section "abc", !associated !0
7505 The ``prof`` metadata is used to record profile data in the IR.
7506 The first operand of the metadata node indicates the profile metadata
7507 type. There are currently 3 types:
7508 :ref:`branch_weights<prof_node_branch_weights>`,
7509 :ref:`function_entry_count<prof_node_function_entry_count>`, and
7510 :ref:`VP<prof_node_VP>`.
7512 .. _prof_node_branch_weights:
7517 Branch weight metadata attached to a branch, select, switch or call instruction
7518 represents the likelihood of the associated branch being taken.
7519 For more information, see :doc:`BranchWeightMetadata`.
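
As a sketch, branch weights on a conditional branch might look as follows. The
function name, block names, and weight values are assumptions made for this
example. The first weight applies to the first successor:

.. code-block:: llvm

    define i32 @f(i1 %cond) {
    entry:
      ; %likely is weighted 2000 against 1 for %unlikely.
      br i1 %cond, label %likely, label %unlikely, !prof !0

    likely:
      ret i32 1

    unlikely:
      ret i32 0
    }

    !0 = !{!"branch_weights", i32 2000, i32 1}
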
7521 .. _prof_node_function_entry_count:
7523 function_entry_count
7524 """"""""""""""""""""
7526 Function entry count metadata can be attached to function definitions
7527 to record the number of times the function is called. Used with BFI
7528 information, it is also used to derive the basic block profile count.
7529 For more information, see :doc:`BranchWeightMetadata`.
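
For illustration, a function definition that was entered 1000 times according
to the profile could be annotated as sketched below. The function name and the
count are assumptions made for this example:

.. code-block:: llvm

    define void @g() !prof !0 {
      ret void
    }

    !0 = !{!"function_entry_count", i64 1000}
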
7536 VP (value profile) metadata can be attached to instructions that have
7537 value profile information. Currently this is indirect calls (where it
7538 records the hottest callees) and calls to memory intrinsics such as memcpy,
7539 memmove, and memset (where it records the hottest byte lengths).
7541 Each VP metadata node contains the "VP" string, then a uint32_t value for the value
7542 profiling kind, a uint64_t value for the total number of times the instruction
7543 is executed, followed by uint64_t value and execution count pairs.
7544 The value profiling kind is 0 for indirect call targets and 1 for memory
7545 operations. For indirect call targets, each profile value is a hash
7546 of the callee function name, and for memory operations each value is the
7549 Note that the value counts do not need to add up to the total count
7550 listed in the third operand (in practice only the top hottest values
7551 are tracked and reported).
7553 Indirect call example:
7555 .. code-block:: llvm
7557 call void %f(), !prof !1
7558 !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
7560 Note that the VP type is 0 (the second operand), which indicates that this is
7561 indirect call value profile data. The third operand indicates that the
7562 indirect call executed 1600 times. The 4th and 6th operands give the
7563 hashes of the 2 hottest target functions' names (this is the same hash used
7564 to represent function names in the profile database), and the 5th and 7th
7565 operands give the number of times each of the respective preceding target
7566 functions was called.
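
Memory operation example (a sketch following the operand layout described
above; the function name and all counts are assumptions made for illustration):

.. code-block:: llvm

    declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

    define void @copy(ptr %dst, ptr %src, i64 %n) {
      call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 %n, i1 false), !prof !0
      ret void
    }

    ; Kind 1 (memory operation), executed 1000 times in total:
    ; a length of 8 bytes was seen 600 times and 16 bytes 300 times.
    !0 = !{!"VP", i32 1, i64 1000, i64 8, i64 600, i64 16, i64 300}

Here the value counts add up to 900 rather than 1000, which is allowed because
only the hottest values are recorded.
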
7570 '``annotation``' Metadata
7571 ^^^^^^^^^^^^^^^^^^^^^^^^^
7573 The ``annotation`` metadata can be used to attach a tuple of annotation strings
7574 or a tuple of a tuple of annotation strings to any instruction. This metadata does
7575 not impact the semantics of the program and may only be used to provide additional
7576 insight about the program and transformations to users.
7580 .. code-block:: text
7582 %a.addr = alloca ptr, align 8, !annotation !0
7583 !0 = !{!"auto-init"}
7585 Embedding tuple of strings example:
7587 .. code-block:: text
7589 %a.ptr = getelementptr ptr, ptr %base, i64 0, !annotation !0
7591 !1 = !{!"gep offset", !"0"}
7593 '``func_sanitize``' Metadata
7594 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7596 The ``func_sanitize`` metadata is used to attach two values for the function
7597 sanitizer instrumentation. The first value is the ubsan function signature.
7598 The second value is the address of the proxy variable which stores the address
7599 of the RTTI descriptor. If :ref:`prologue <prologuedata>` and '``func_sanitize``'
7600 are used at the same time, :ref:`prologue <prologuedata>` is emitted before
7601 '``func_sanitize``' in the output.
7605 .. code-block:: text
7607 @__llvm_rtti_proxy = private unnamed_addr constant ptr @_ZTIFvvE
7608 define void @_Z3funv() !func_sanitize !0 {
7611 !0 = !{i32 846595819, ptr @__llvm_rtti_proxy}
7615 '``kcfi_type``' Metadata
7616 ^^^^^^^^^^^^^^^^^^^^^^^^
7618 The ``kcfi_type`` metadata can be used to attach a type identifier to
7619 functions that can be called indirectly. The type data is emitted before the
7620 function entry in the assembly. Indirect calls with the :ref:`kcfi operand
7621 bundle<ob_kcfi>` will emit a check that compares the type identifier to the
7626 .. code-block:: text
7628 define dso_local i32 @f() !kcfi_type !0 {
7631 !0 = !{i32 12345678}
7633 Clang emits ``kcfi_type`` metadata nodes for address-taken functions with
7634 ``-fsanitize=kcfi``.
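
The following sketch pairs a function carrying ``kcfi_type`` metadata with an
indirect call that uses the ``kcfi`` operand bundle. The function names and the
identifier value are assumptions made for this example:

.. code-block:: llvm

    define i32 @f() !kcfi_type !0 {
      ret i32 0
    }

    define i32 @caller(ptr %fp) {
      ; The bundle operand is the type identifier expected at the call target.
      %r = call i32 %fp() [ "kcfi"(i32 12345678) ]
      ret i32 %r
    }

    !0 = !{i32 12345678}
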
7638 '``memprof``' Metadata
7639 ^^^^^^^^^^^^^^^^^^^^^^^^
7641 The ``memprof`` metadata is used to record memory profile data on heap
7642 allocation calls. Multiple context-sensitive profiles can be represented
7643 with a single ``memprof`` metadata attachment.
7647 .. code-block:: text
7649 %call = call ptr @_Znam(i64 10), !memprof !0, !callsite !5
7652 !2 = !{i64 4854880825882961848, i64 1905834578520680781}
7653 !3 = !{!4, !"notcold"}
7654 !4 = !{i64 4854880825882961848, i64 -6528110295079665978}
7655 !5 = !{i64 4854880825882961848}
7657 Each operand in the ``memprof`` metadata attachment describes the profiled
7658 behavior of memory allocated by the associated allocation for a given context.
7659 In the above example, there were 2 profiled contexts, one allocating memory
7660 that was typically cold and one allocating memory that was typically not cold.
7662 The format of the metadata describing a context specific profile (e.g.
7663 ``!1`` and ``!3`` above) requires a first operand that is a metadata node
7664 describing the context, followed by a list of string metadata tags describing
7665 the profile behavior (e.g. ``cold`` and ``notcold`` above). The metadata nodes
7666 describing the context (e.g. ``!2`` and ``!4`` above) are unique ids
7667 corresponding to callsites, which can be matched to associated IR calls via
7668 :ref:`callsite metadata<md_callsite>`. In practice these ids are formed via
7669 a hash of the callsite's debug info, and the associated call may be in a
7670 different module. The contexts are listed in order from leaf-most call (the
7671 allocation itself) to the outermost callsite context required for uniquely
7672 identifying the described profile behavior (note this may not be the top of
7673 the profiled call stack).
7677 '``callsite``' Metadata
7678 ^^^^^^^^^^^^^^^^^^^^^^^^
7680 The ``callsite`` metadata is used to identify callsites involved in memory
7681 profile contexts described in :ref:`memprof metadata<md_memprof>`.
7683 It is attached both to the profiled allocation calls (see the example in
7684 :ref:`memprof metadata<md_memprof>`), as well as to other callsites
7685 in profiled contexts described in heap allocation ``memprof`` metadata.
7689 .. code-block:: text
7691 %call = call ptr @_Z1Bb(void), !callsite !0
7692 !0 = !{i64 -6528110295079665978, i64 5462047985461644151}
7694 Each operand in the ``callsite`` metadata attachment is a unique id
7695 corresponding to a callsite (possibly inlined). In practice these ids are
7696 formed via a hash of the callsite's debug info. If the call was not inlined
7697 into any callers it will contain a single operand (id). If it was inlined
7698 it will contain a list of ids, including the ids of the callsites in the
7699 full inline sequence, in order from the leaf-most call's id to the outermost
7702 Module Flags Metadata
7703 =====================
7705 Information about the module as a whole is difficult to convey to LLVM's
7706 subsystems. The LLVM IR isn't sufficient to transmit this information.
7707 The ``llvm.module.flags`` named metadata exists in order to facilitate
7708 this. These flags are in the form of key / value pairs --- much like a
7709 dictionary --- making it easy for any subsystem that cares about a flag to
7712 The ``llvm.module.flags`` metadata contains a list of metadata triplets.
7713 Each triplet has the following form:
7715 - The first element is a *behavior* flag, which specifies the behavior
7716 when two (or more) modules are merged together, and the linker encounters two
7717 (or more) metadata with the same ID. The supported behaviors are
7719 - The second element is a metadata string that is a unique ID for the
7720 metadata. Each module may only have one flag entry for each unique ID (not
7721 including entries with the **Require** behavior).
7722 - The third element is the value of the flag.
7724 When two (or more) modules are merged together, the resulting
7725 ``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
7726 each unique metadata ID string, there will be exactly one entry in the merged
7727 module's ``llvm.module.flags`` metadata table, and the value for that entry will
7728 be determined by the merge behavior flag, as described below. The only exception
7729 is that entries with the *Require* behavior are always preserved.
7731 The following behaviors are supported:
7742 Emits an error if two values disagree, otherwise the resulting value
7743 is that of the operands.
7747 Emits a warning if two values disagree. The result value will be the
7748 operand for the flag from the first module being linked, unless the
7749 other module uses **Min** or **Max**, in which case the result will
7750 be **Min** (with the min value) or **Max** (with the max value),
7755 Adds a requirement that another module flag be present and have a
7756 specified value after linking is performed. The value must be a
7757 metadata pair, where the first element of the pair is the ID of the
7758 module flag to be restricted, and the second element of the pair is
7759 the value the module flag should be restricted to. This behavior can
7760 be used to restrict the allowable results (via triggering of an
7761 error) of linking IDs with the **Override** behavior.
7765 Uses the specified value, regardless of the behavior or value of the
7766 other module. If both modules specify **Override**, but the values
7767 differ, an error will be emitted.
7771 Appends the two values, which are required to be metadata nodes.
7775 Appends the two values, which are required to be metadata
7776 nodes. However, duplicate entries in the second list are dropped
7777 during the append operation.
7781 Takes the max of the two values, which are required to be integers.
7785 Takes the min of the two values, which are required to be non-negative integers.
7786 An absent module flag is treated as having the value 0.
7788 It is an error for a particular unique flag ID to have multiple behaviors,
7789 except in the case of **Require** (which adds restrictions on another metadata
7790 value) or **Override**.
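
As a sketch of how merging works for the **Max** behavior, consider two modules
that define the same flag with different values. The flag ID
``!"example.level"`` is hypothetical, and the behavior value ``7`` for **Max**
is an assumption here, since the numeric behavior values are not repeated in
this passage:

.. code-block:: llvm

    ; Module A
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example.level", i32 1 }

    ; Module B
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example.level", i32 2 }

    ; Result of linking A and B: the larger of the two values is kept.
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example.level", i32 2 }
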
7792 An example of module flags:
7794 .. code-block:: llvm
7796 !0 = !{ i32 1, !"foo", i32 1 }
7797 !1 = !{ i32 4, !"bar", i32 37 }
7798 !2 = !{ i32 2, !"qux", i32 42 }
7799 !3 = !{ i32 3, !"qux",
7804 !llvm.module.flags = !{ !0, !1, !2, !3 }
7806 - Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
7807 if two or more ``!"foo"`` flags are seen is to emit an error if their
7808 values are not equal.
7810 - Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
7811 behavior if two or more ``!"bar"`` flags are seen is to use the value
7814 - Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
7815 behavior if two or more ``!"qux"`` flags are seen is to emit a
7816 warning if their values are not equal.
7818 - Metadata ``!3`` has the ID ``!"qux"`` and the value:
7824 The behavior is to emit an error if the ``llvm.module.flags`` does not
7825 contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
7828 Synthesized Functions Module Flags Metadata
7829 -------------------------------------------
7831 These metadata specify the default attributes synthesized functions should have.
7832 These metadata are currently respected by a few instrumentation passes, such as
7835 These metadata correspond to a few function attributes with significant code
7836 generation behaviors. Function attributes with just optimization purposes
7837 should not be listed because the performance impact of these synthesized
7840 - "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
7841 will get the "frame-pointer" function attribute, with value being "none",
7842 "non-leaf", or "all", respectively.
7843 - "function_return_thunk_extern": The synthesized function will get the
7844 ``fn_return_thunk_extern`` function attribute.
7845 - "uwtable": **Max**. The value can be 0, 1, or 2. If the value is 1, a synthesized
7846 function will get the ``uwtable(sync)`` function attribute; if the value is 2,
7847 a synthesized function will get the ``uwtable(async)`` function attribute.
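
For illustration, a module requesting that synthesized functions keep all frame
pointers and use asynchronous unwind tables might contain the flags sketched
below. The behavior value ``7`` is assumed to denote **Max**:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 7, !"frame-pointer", i32 2}   ; "frame-pointer"="all"
    !1 = !{i32 7, !"uwtable", i32 2}         ; uwtable(async)
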
7849 Objective-C Garbage Collection Module Flags Metadata
7850 ----------------------------------------------------
7852 On the Mach-O platform, Objective-C stores metadata about garbage
7853 collection in a special section called "image info". The metadata
7854 consists of a version number and a bitmask specifying what types of
7855 garbage collection are supported (if any) by the file. If two or more
7856 modules are linked together, their garbage collection metadata needs to
7857 be merged rather than appended together.
7859 The Objective-C garbage collection module flags metadata consists of the
7860 following key-value pairs:
7869 * - ``Objective-C Version``
7870 - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
7872 * - ``Objective-C Image Info Version``
7873 - **[Required]** --- The version of the image info section. Currently
7876 * - ``Objective-C Image Info Section``
7877 - **[Required]** --- The section in which to place the metadata. Valid values are
7878 ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
7879 ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
7880 Objective-C ABI version 2.
7882 * - ``Objective-C Garbage Collection``
7883 - **[Required]** --- Specifies whether garbage collection is supported or
7884 not. Valid values are 0, for no garbage collection, and 2, for garbage
7885 collection supported.
7887 * - ``Objective-C GC Only``
7888 - **[Optional]** --- Specifies that only garbage collection is supported.
7889 If present, its value must be 6. This flag requires that the
7890 ``Objective-C Garbage Collection`` flag have the value 2.
7892 Some important flag interactions:
7894 - If a module with ``Objective-C Garbage Collection`` set to 0 is
7895 merged with a module with ``Objective-C Garbage Collection`` set to
7896 2, then the resulting module has the
7897 ``Objective-C Garbage Collection`` flag set to 0.
7898 - A module with ``Objective-C Garbage Collection`` set to 0 cannot be
7899 merged with a module with ``Objective-C GC Only`` set to 6.
7901 C type width Module Flags Metadata
7902 ----------------------------------
7904 The ARM backend emits a section into each generated object file describing the
7905 options that it was compiled with (in a compiler-independent way) to prevent
7906 linking incompatible objects, and to allow automatic library selection. Some
7907 of these options are not visible at the IR level, namely wchar_t width and enum
7910 To pass this information to the backend, these options are encoded in module
7911 flags metadata, using the following key-value pairs:
7921 - * 0 --- sizeof(wchar_t) == 4
7922 * 1 --- sizeof(wchar_t) == 2
7925 - * 0 --- Enums are at least as large as an ``int``.
7926 * 1 --- Enums are stored in the smallest integer type which can
7927 represent all of its values.
7929 For example, the following metadata section specifies that the module was
7930 compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
7931 enum is the smallest type which can represent all of its values::
7933 !llvm.module.flags = !{!0, !1}
7934 !0 = !{i32 1, !"short_wchar", i32 0}
7935 !1 = !{i32 1, !"short_enum", i32 1}
7937 Stack Alignment Metadata
7938 ------------------------
7940 Changes the default stack alignment from the target ABI's implicit default
7941 stack alignment. Takes an i32 value in bytes. It is considered an error to link
7942 two modules together with different values for this metadata.
7946 !llvm.module.flags = !{!0}
7947 !0 = !{i32 1, !"override-stack-alignment", i32 8}
7949 This will change the stack alignment to 8 bytes.
7951 Embedded Objects Names Metadata
7952 ===============================
7954 Offloading compilations need to embed device code into the host section table to
7955 create a fat binary. This metadata node references each global that will be
7956 embedded in the module. The primary use for this is to make referencing these
7957 globals more efficient in the IR. The metadata references nodes containing
7958 pointers to the global to be embedded followed by the section name it will be
7961 !llvm.embedded.objects = !{!0}
7962 !0 = !{ptr @object, !".section"}
7964 Automatic Linker Flags Named Metadata
7965 =====================================
7967 Some targets support embedding of flags to the linker inside individual object
7968 files. Typically this is used in conjunction with language extensions which
7969 allow source files to contain linker command line options, and have these
7970 automatically be transmitted to the linker via object files.
7972 These flags are encoded in the IR using named metadata with the name
7973 ``!llvm.linker.options``. Each operand is expected to be a metadata node
7974 which should be a list of other metadata nodes, each of which should be a
7975 list of metadata strings defining linker options.
7977 For example, the following metadata section specifies two separate sets of
7978 linker options, presumably to link against ``libz`` and the ``Cocoa``
7982 !1 = !{ !"-framework", !"Cocoa" }
7983 !llvm.linker.options = !{ !0, !1 }
7985 The metadata encoding as lists of lists of options, as opposed to a collapsed
7986 list of options, is chosen so that the IR encoding can use multiple option
7987 strings to specify e.g., a single library, while still having that specifier be
7988 preserved as an atomic element that can be recognized by a target specific
7989 assembly writer or object file emitter.
7991 Each individual option is required to be either a valid option for the target's
7992 linker, or an option that is reserved by the target specific assembly writer or
7993 object file emitter. No other aspect of these options is defined by the IR.
7995 Dependent Libs Named Metadata
7996 =============================
7998 Some targets support embedding of strings into object files to indicate
7999 a set of libraries to add to the link. Typically this is used in conjunction
8000 with language extensions which allow source files to explicitly declare the
8001 libraries they depend on, and have these automatically be transmitted to the
8002 linker via object files.
8004 The list is encoded in the IR using named metadata with the name
8005 ``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
8006 which should contain a single string operand.
8008 For example, the following metadata section contains two library specifiers::
8010 !0 = !{!"a library specifier"}
8011 !1 = !{!"another library specifier"}
8012 !llvm.dependent-libraries = !{ !0, !1 }
8014 Each library specifier will be handled independently by the consuming linker.
8015 The effect of the library specifiers is defined by the consuming linker.
8022 Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
8023 causes the building of a compact summary of the module that is emitted into
8024 the bitcode. The summary is emitted into the LLVM assembly and identified
8025 in syntax by a caret ('``^``').
8027 The summary is parsed into a bitcode output, along with the Module
8028 IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
8029 of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
8030 summary entries (just as they currently ignore summary entries in a bitcode
8033 Eventually, the summary will be parsed into a ModuleSummaryIndex object under
8034 the same conditions where summary index is currently built from bitcode.
8035 Specifically, this applies to tools that test the Thin Link portion of a ThinLTO
8036 compile (i.e. llvm-lto and llvm-lto2), and to parsing a combined index
8037 for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag
8038 (this part is not yet implemented, use llvm-as to create a bitcode object
8039 before feeding into thin link tools for now).
8041 There are currently 3 types of summary entries in the LLVM assembly:
8042 :ref:`module paths<module_path_summary>`,
8043 :ref:`global values<gv_summary>`, and
8044 :ref:`type identifiers<typeid_summary>`.
8046 .. _module_path_summary:
8048 Module Path Summary Entry
8049 -------------------------
8051 Each module path summary entry lists a module containing global values included
8052 in the summary. For a single IR module there will be one such entry, but
8053 in a combined summary index produced during the thin link, there will be
8054 one module path entry per linked module with summary.
8058 .. code-block:: text
8060 ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
8062 The ``path`` field is a string path to the bitcode file, and the ``hash``
8063 field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
8064 incremental builds and caching.
8068 Global Value Summary Entry
8069 --------------------------
8071 Each global value summary entry corresponds to a global value defined or
8072 referenced by a summarized module.
8076 .. code-block:: text
8078 ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
8080 For declarations, there will not be a summary list. For definitions, a
8081 global value will contain a list of summaries, one per module containing
8082 a definition. There can be multiple entries in a combined summary index
8083 for symbols with weak linkage.
8085 Each ``Summary`` format will depend on whether the global value is a
8086 :ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
8087 :ref:`alias<alias_summary>`.
8089 .. _function_summary:
8094 If the global value is a function, the ``Summary`` entry will look like:
8096 .. code-block:: text
8098 function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
8100 The ``module`` field includes the summary entry id for the module containing
8101 this definition, and the ``flags`` field contains information such as
8102 the linkage type, a flag indicating whether it is legal to import the
8103 definition, whether it is globally live and whether the linker resolved it
8104 to a local definition (the latter two are populated during the thin link).
8105 The ``insts`` field contains the number of IR instructions in the function.
8106 Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
8107 :ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
8108 :ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.
8110 .. _variable_summary:
8112 Global Variable Summary
8113 ^^^^^^^^^^^^^^^^^^^^^^^
8115 If the global value is a variable, the ``Summary`` entry will look like:
8117 .. code-block:: text
8119 variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
8121 The variable entry contains a subset of the fields in a
8122 :ref:`function summary <function_summary>`, see the descriptions there.
8129 If the global value is an alias, the ``Summary`` entry will look like:
8131 .. code-block:: text
8133 alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
8135 The ``module`` and ``flags`` fields are as described for a
8136 :ref:`function summary <function_summary>`. The ``aliasee`` field
8137 contains a reference to the global value summary entry of the aliasee.
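
Putting the templates above together, a small summary for a module that defines
a function ``f`` and an alias ``g`` of it could be sketched as follows. The
path, the hash, and the names are placeholders, and actual tool output may carry
additional fields:

.. code-block:: text

    ^0 = module: (path: "a.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2)))
    ^2 = gv: (name: "g", summaries: (alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^1)))
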
8139 .. _funcflags_summary:
8144 The optional ``FuncFlags`` field looks like:
8146 .. code-block:: text
8148 funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)
8150 If unspecified, flags are assumed to hold the conservative ``false`` value of
8158 The optional ``Calls`` field looks like:
8160 .. code-block:: text
8162 calls: ((Callee)[, (Callee)]*)
8164 where each ``Callee`` looks like:
8166 .. code-block:: text
8168 callee: ^1[, hotness: None]?[, relbf: 0]?
8170 The ``callee`` refers to the summary entry id of the callee. At most one
8171 of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
8172 ``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
8173 branch frequency relative to the entry frequency, scaled down by 2^8)
8174 may be specified. The defaults are ``Unknown`` and ``0``, respectively.
8181 The optional ``Params`` is used by ``StackSafety`` and looks like:
8183 .. code-block:: text
8185 params: ((Param)[, (Param)]*)
8187 where each ``Param`` describes pointer parameter access inside of the
8188 function and looks like:
8190 .. code-block:: text
8192 param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?
8194 where the first ``param`` is the number of the parameter it describes,
8195 ``offset`` is the inclusive range of offsets from the pointer parameter to bytes
8196 which can be accessed by the function. This range does not include accesses by
8197 function calls from ``calls`` list.
8199 Each ``Callee`` describes how the parameter is forwarded into other
8200 functions and looks like:
8202 .. code-block:: text
8204 callee: ^3, param: 5, offset: [-3, 3]
8206 The ``callee`` refers to the summary entry id of the callee, ``param`` is
8207 the number of the callee parameter which points into the caller's parameter
8208 with offset known to be inside of the ``offset`` range. ``calls`` will be
8209 consumed and removed by the thin link stage to update ``Param::offset`` so it
8210 covers all accesses possible by ``calls``.
8212 A pointer parameter without a corresponding ``Param`` is considered unsafe and we
8213 assume that access with any offset is possible.
8217 If we have the following function:
8219 .. code-block:: text
    define i64 @foo(ptr %0, ptr %1, ptr %2, i8 %3) {
      store ptr %1, ptr @x
      %5 = getelementptr inbounds i8, ptr %2, i64 5
      %6 = load i8, ptr %5
      %7 = getelementptr inbounds i8, ptr %2, i8 %3
      tail call void @bar(i8 %3, ptr %7)
      %8 = load i64, ptr %0
      ret i64 %8
    }
We can expect a record like this:
8233 .. code-block:: text
8235 params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
The function may access just 8 bytes of parameter %0. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter %1 escapes, so access is unknown.
The function itself can access just a single byte of parameter %2. Additional
access is possible inside ``@bar``, that is, ``^3``. The function adds a signed
offset to the pointer and passes the result as argument %1 into ``^3``.
This record itself does not tell us how ``^3`` will access the parameter.
Parameter %3 is not a pointer.
8252 The optional ``Refs`` field looks like:
8254 .. code-block:: text
8256 refs: ((Ref)[, (Ref)]*)
8258 where each ``Ref`` contains a reference to the summary id of the referenced
8259 value (e.g. ``^1``).
8261 .. _typeidinfo_summary:
8266 The optional ``TypeIdInfo`` field, used for
8267 `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
8270 .. code-block:: text
8272 typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
8274 These optional fields have the following forms:
8279 .. code-block:: text
8281 typeTests: (TypeIdRef[, TypeIdRef]*)
8283 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
8284 by summary id or ``GUID``.
8286 TypeTestAssumeVCalls
8287 """"""""""""""""""""
8289 .. code-block:: text
8291 typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
8293 Where each VFuncId has the format:
8295 .. code-block:: text
8297 vFuncId: (TypeIdRef, offset: 16)
8299 Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
8300 by summary id or ``GUID`` preceded by a ``guid:`` tag.
8302 TypeCheckedLoadVCalls
8303 """""""""""""""""""""
8305 .. code-block:: text
8307 typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
8309 Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
8311 TypeTestAssumeConstVCalls
8312 """""""""""""""""""""""""
8314 .. code-block:: text
8316 typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
8318 Where each ConstVCall has the format:
8320 .. code-block:: text
8322 (VFuncId, args: (Arg[, Arg]*))
8324 and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
8325 and each Arg is an integer argument number.
8327 TypeCheckedLoadConstVCalls
8328 """"""""""""""""""""""""""
8330 .. code-block:: text
8332 typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
8334 Where each ConstVCall has the format described for
8335 ``TypeTestAssumeConstVCalls``.
8339 Type ID Summary Entry
8340 ---------------------
8342 Each type id summary entry corresponds to a type identifier resolution
8343 which is generated during the LTO link portion of the compile when building
8344 with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
8345 so these are only present in a combined summary index.
8349 .. code-block:: text
8351 ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
8353 The ``typeTestRes`` gives the type test resolution ``kind`` (which may
8354 be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
8355 the ``size-1`` bit width. It is followed by optional flags, which default to 0,
8356 and an optional WpdResolutions (whole program devirtualization resolution)
8357 field that looks like:
8359 .. code-block:: text
    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
8363 where each entry is a mapping from the given byte offset to the whole-program
8364 devirtualization resolution WpdRes, that has one of the following formats:
8366 .. code-block:: text
8368 wpdRes: (kind: branchFunnel)
8369 wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
8370 wpdRes: (kind: indir)
8372 Additionally, each wpdRes has an optional ``resByArg`` field, which
8373 describes the resolutions for calls with all constant integer arguments:
8375 .. code-block:: text
8377 resByArg: (ResByArg[, ResByArg]*)
8381 .. code-block:: text
8383 args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])
8385 Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
8386 or ``VirtualConstProp``. The ``info`` field is only used if the kind
8387 is ``UniformRetVal`` (indicates the uniform return value), or
8388 ``UniqueRetVal`` (holds the return value associated with the unique vtable
8389 (0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
8390 not support the use of absolute symbols to store constants.
8392 .. _intrinsicglobalvariables:
8394 Intrinsic Global Variables
8395 ==========================
8397 LLVM has a number of "magic" global variables that contain data that
8398 affect code generation or other IR semantics. These are documented here.
8399 All globals of this sort should have a section specified as
8400 "``llvm.metadata``". This section and all globals that start with
8401 "``llvm.``" are reserved for use by LLVM.
8405 The '``llvm.used``' Global Variable
8406 -----------------------------------
8408 The ``@llvm.used`` global is an array which has
8409 :ref:`appending linkage <linkage_appending>`. This array contains a list of
8410 pointers to named global variables, functions and aliases which may optionally
8411 have a pointer cast formed of bitcast or getelementptr. For example, a legal
8414 .. code-block:: llvm
8419 @llvm.used = appending global [2 x ptr] [
8422 ], section "llvm.metadata"
8424 If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
8425 and linker are required to treat the symbol as if there is a reference to the
8426 symbol that it cannot see (which is why they have to be named). For example, if
8427 a variable has internal linkage and no references other than that from the
8428 ``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
8429 references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.
8432 On some targets, the code generator must emit a directive to the
8433 assembler or object file to prevent the assembler and linker from
8434 removing the symbol.
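As a minimal sketch of the rule above, the following module keeps two
internal symbols alive only through ``@llvm.used``. The names ``@counter``
and ``@handler`` are hypothetical and chosen purely for illustration.

.. code-block:: llvm

    ; Hypothetical internal symbols with no other references in the module.
    @counter = internal global i32 0

    define internal void @handler() {
      ret void
    }

    ; Listing them here means they must be treated as referenced from
    ; somewhere the compiler, assembler, and linker cannot see.
    @llvm.used = appending global [2 x ptr] [
      ptr @counter,
      ptr @handler
    ], section "llvm.metadata"

Without the ``@llvm.used`` entry, an optimizer would be free to delete both
symbols, since they have internal linkage and no visible uses.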
8436 .. _gv_llvmcompilerused:
8438 The '``llvm.compiler.used``' Global Variable
8439 --------------------------------------------
8441 The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
8442 directive, except that it only prevents the compiler from touching the
8443 symbol. On targets that support it, this allows an intelligent linker to
8444 optimize references to the symbol without being impeded as it would be
This construct should only be used in rare circumstances,
and should not be exposed to source languages.
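For comparison with ``@llvm.used``, a sketch of the same pattern using
``@llvm.compiler.used`` might look as follows. The symbol ``@probe_table``
is a hypothetical name used only for illustration.

.. code-block:: llvm

    ; Hypothetical internal global that the compiler must not remove.
    @probe_table = internal constant [2 x i32] [i32 1, i32 2]

    ; Only the compiler is constrained here. A linker remains free to
    ; discard or optimize the symbol on targets that support this.
    @llvm.compiler.used = appending global [1 x ptr] [
      ptr @probe_table
    ], section "llvm.metadata"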
8450 .. _gv_llvmglobalctors:
8452 The '``llvm.global_ctors``' Global Variable
8453 -------------------------------------------
8455 .. code-block:: llvm
8457 %0 = type { i32, ptr, ptr }
8458 @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, ptr @ctor, ptr @data }]
8460 The ``@llvm.global_ctors`` array contains a list of constructor
8461 functions, priorities, and an associated global or function.
8462 The functions referenced by this array will be called in ascending order
8463 of priority (i.e. lowest first) when the module is loaded. The order of
8464 functions with the same priority is not defined.
8466 If the third field is non-null, and points to a global variable
8467 or function, the initializer function will only run if the associated
8468 data from the current module is not discarded.
8469 On ELF the referenced global variable or function must be in a comdat.
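The ordering rule can be illustrated with a sketch that registers two
constructors at different priorities. The function names and the named
struct type ``%ctor_entry`` are hypothetical, and the third field is left
null so that neither constructor is tied to associated data.

.. code-block:: llvm

    %ctor_entry = type { i32, ptr, ptr }

    ; Hypothetical constructors, for illustration only.
    define internal void @init_early() {
      ret void
    }

    define internal void @init_late() {
      ret void
    }

    ; Ascending priority order: @init_early (101) is called before
    ; @init_late (65535) when the module is loaded.
    @llvm.global_ctors = appending global [2 x %ctor_entry] [
      %ctor_entry { i32 101, ptr @init_early, ptr null },
      %ctor_entry { i32 65535, ptr @init_late, ptr null }
    ]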
8471 .. _llvmglobaldtors:
8473 The '``llvm.global_dtors``' Global Variable
8474 -------------------------------------------
8476 .. code-block:: llvm
8478 %0 = type { i32, ptr, ptr }
8479 @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, ptr @dtor, ptr @data }]
8481 The ``@llvm.global_dtors`` array contains a list of destructor
8482 functions, priorities, and an associated global or function.
8483 The functions referenced by this array will be called in descending
8484 order of priority (i.e. highest first) when the module is unloaded. The
8485 order of functions with the same priority is not defined.
8487 If the third field is non-null, and points to a global variable
8488 or function, the destructor function will only run if the associated
8489 data from the current module is not discarded.
8490 On ELF the referenced global variable or function must be in a comdat.
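A corresponding sketch for destructors, again with hypothetical function
names and a hypothetical named struct type, shows the reversed ordering:

.. code-block:: llvm

    %dtor_entry = type { i32, ptr, ptr }

    ; Hypothetical destructors, for illustration only.
    define internal void @flush_logs() {
      ret void
    }

    define internal void @close_files() {
      ret void
    }

    ; Descending priority order: @flush_logs (200) is called before
    ; @close_files (100) when the module is unloaded.
    @llvm.global_dtors = appending global [2 x %dtor_entry] [
      %dtor_entry { i32 200, ptr @flush_logs, ptr null },
      %dtor_entry { i32 100, ptr @close_files, ptr null }
    ]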
8492 Instruction Reference
8493 =====================
8495 The LLVM instruction set consists of several different classifications
8496 of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
8497 instructions <binaryops>`, :ref:`bitwise binary
8498 instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
8499 :ref:`other instructions <otherops>`.
8503 Terminator Instructions
8504 -----------------------
8506 As mentioned :ref:`previously <functionstructure>`, every basic block in a
8507 program ends with a "Terminator" instruction, which indicates which
8508 block should be executed after the current block is finished. These
terminator instructions typically yield a '``void``' value: they produce
control flow, not values (the exceptions being the
':ref:`invoke <i_invoke>`', ':ref:`callbr <i_callbr>`', and
':ref:`catchswitch <i_catchswitch>`' instructions, which can produce a result).
8513 The terminator instructions are: ':ref:`ret <i_ret>`',
8514 ':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
8515 ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
8517 ':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
8518 ':ref:`catchret <i_catchret>`',
8519 ':ref:`cleanupret <i_cleanupret>`',
8520 and ':ref:`unreachable <i_unreachable>`'.
8524 '``ret``' Instruction
8525 ^^^^^^^^^^^^^^^^^^^^^
8532 ret <type> <value> ; Return a value from a non-void function
8533 ret void ; Return from void function
8538 The '``ret``' instruction is used to return control flow (and optionally
8539 a value) from a function back to the caller.
8541 There are two forms of the '``ret``' instruction: one that returns a
8542 value and then causes control flow, and one that just causes control
8548 The '``ret``' instruction optionally accepts a single argument, the
8549 return value. The type of the return value must be a ':ref:`first
8550 class <t_firstclass>`' type.
8552 A function is not :ref:`well formed <wellformed>` if it has a non-void
8553 return type and contains a '``ret``' instruction with no return value or
8554 a return value with a type that does not match its type, or if it has a
8555 void return type and contains a '``ret``' instruction with a return
8561 When the '``ret``' instruction is executed, control flow returns back to
8562 the calling function's context. If the caller is a
8563 ":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller is an
8565 ":ref:`invoke <i_invoke>`" instruction, execution continues at the
8566 beginning of the "normal" destination block. If the instruction returns
8567 a value, that value shall set the call or invoke instruction's return
8573 .. code-block:: llvm
8575 ret i32 5 ; Return an integer value of 5
8576 ret void ; Return from a void function
8577 ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
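To show the fragments above in context, here is a small sketch of a
function that returns a struct and a caller that consumes the returned
value. The function names ``@make_pair`` and ``@use_pair`` are hypothetical.

.. code-block:: llvm

    ; Returns a first-class struct value.
    define { i32, i8 } @make_pair() {
      ret { i32, i8 } { i32 4, i8 2 }
    }

    ; After the callee's 'ret', execution continues at the instruction
    ; following the call, and %pair holds the returned value.
    define i32 @use_pair() {
      %pair = call { i32, i8 } @make_pair()
      %first = extractvalue { i32, i8 } %pair, 0
      ret i32 %first
    }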
8581 '``br``' Instruction
8582 ^^^^^^^^^^^^^^^^^^^^
8589 br i1 <cond>, label <iftrue>, label <iffalse>
8590 br label <dest> ; Unconditional branch
8595 The '``br``' instruction is used to cause control flow to transfer to a
8596 different basic block in the current function. There are two forms of
8597 this instruction, corresponding to a conditional branch and an
8598 unconditional branch.
8603 The conditional branch form of the '``br``' instruction takes a single
8604 '``i1``' value and two '``label``' values. The unconditional form of the
8605 '``br``' instruction takes a single '``label``' value as a target.
Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If the value is ``false``, control flows
to the '``iffalse``' ``label`` argument.
8614 If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
8620 .. code-block:: llvm
8623 %cond = icmp eq i32 %a, %b
8624 br i1 %cond, label %IfEqual, label %IfUnequal
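As a self-contained sketch, both forms of '``br``' can be combined in one
function. The function name ``@max`` and its block labels are hypothetical.

.. code-block:: llvm

    define i32 @max(i32 %a, i32 %b) {
    entry:
      %cond = icmp sgt i32 %a, %b
      ; Conditional form: one i1 value and two labels.
      br i1 %cond, label %take_a, label %take_b

    take_a:
      ; Unconditional form: a single label.
      br label %done

    take_b:
      br label %done

    done:
      %r = phi i32 [ %a, %take_a ], [ %b, %take_b ]
      ret i32 %r
    }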
8632 '``switch``' Instruction
8633 ^^^^^^^^^^^^^^^^^^^^^^^^
8640 switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
8645 The '``switch``' instruction is used to transfer control flow to one of
8646 several different places. It is a generalization of the '``br``'
8647 instruction, allowing a branch to occur to one of many possible
8653 The '``switch``' instruction uses three parameters: an integer
8654 comparison value '``value``', a default '``label``' destination, and an
8655 array of pairs of comparison value constants and '``label``'s. The table
8656 is not allowed to contain duplicate constant entries.
8661 The ``switch`` instruction specifies a table of values and destinations.
8662 When the '``switch``' instruction is executed, this table is searched
8663 for the given value. If the value is found, control flow is transferred
8664 to the corresponding destination; otherwise, control flow is transferred
8665 to the default destination.
8666 If '``value``' is ``poison`` or ``undef``, this instruction has undefined
8672 Depending on properties of the target machine and the particular
8673 ``switch`` instruction, this instruction may be code generated in
8674 different ways. For example, it could be generated as a series of
8675 chained conditional branches or with a lookup table.
8680 .. code-block:: llvm
8682 ; Emulate a conditional br instruction
8683 %Val = zext i1 %value to i32
8684 switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
8686 ; Emulate an unconditional br instruction
8687 switch i32 0, label %dest [ ]
8689 ; Implement a jump table:
8690 switch i32 %val, label %otherwise [ i32 0, label %onzero
8692 i32 2, label %ontwo ]
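A complete function built around a '``switch``' might look like the
following sketch. The function name ``@classify``, the block labels, and the
returned constants are hypothetical.

.. code-block:: llvm

    define i32 @classify(i32 %val) {
    entry:
      ; The table is searched for %val. Values not listed go to %otherwise.
      switch i32 %val, label %otherwise [ i32 0, label %onzero
                                          i32 1, label %onone
                                          i32 2, label %ontwo ]

    onzero:
      ret i32 10

    onone:
      ret i32 20

    ontwo:
      ret i32 30

    otherwise:
      ret i32 -1
    }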
8696 '``indirectbr``' Instruction
8697 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8704 indirectbr ptr <address>, [ label <dest1>, label <dest2>, ... ]
8709 The '``indirectbr``' instruction implements an indirect branch to a
8710 label within the current function, whose address is specified by
8711 "``address``". Address must be derived from a
8712 :ref:`blockaddress <blockaddress>` constant.
8717 The '``address``' argument is the address of the label to jump to. The
8718 rest of the arguments indicate the full set of possible destinations
8719 that the address may point to. Blocks are allowed to occur multiple
8720 times in the destination list, though this isn't particularly useful.
8722 This destination list is required so that dataflow analysis has an
8723 accurate understanding of the CFG.
8728 Control transfers to the block specified in the address argument. All
8729 possible destination blocks must be listed in the label list, otherwise
8730 this instruction has undefined behavior. This implies that jumps to
8731 labels defined in other functions have undefined behavior as well.
8732 If '``address``' is ``poison`` or ``undef``, this instruction has undefined
8738 This is typically implemented with a jump through a register.
8743 .. code-block:: llvm
8745 indirectbr ptr %Addr, [ label %bb1, label %bb2, label %bb3 ]
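The following sketch shows where such an address can come from. The target
is selected from two ``blockaddress`` constants, and every possible
destination is listed. The function name ``@dispatch`` and its labels are
hypothetical.

.. code-block:: llvm

    define i32 @dispatch(i1 %pick) {
    entry:
      ; The address is derived from blockaddress constants of this function.
      %addr = select i1 %pick,
                     ptr blockaddress(@dispatch, %bb1),
                     ptr blockaddress(@dispatch, %bb2)
      ; Both possible targets appear in the destination list.
      indirectbr ptr %addr, [ label %bb1, label %bb2 ]

    bb1:
      ret i32 1

    bb2:
      ret i32 2
    }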
8749 '``invoke``' Instruction
8750 ^^^^^^^^^^^^^^^^^^^^^^^^
8757 <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8758 [operand bundles] to label <normal label> unwind label <exception label>
8763 The '``invoke``' instruction causes control to transfer to a specified
8764 function, with the possibility of control flow transfer to either the
8765 '``normal``' label or the '``exception``' label. If the callee function
8766 returns with the "``ret``" instruction, control flow will return to the
8767 "normal" label. If the callee (or any indirect callees) returns via the
8768 ":ref:`resume <i_resume>`" instruction or other exception handling
8769 mechanism, control is interrupted and continued at the dynamically
8770 nearest "exception" label.
8772 The '``exception``' label is a `landing
8773 pad <ExceptionHandling.html#overview>`_ for the exception. As such,
8774 '``exception``' label is required to have the
8775 ":ref:`landingpad <i_landingpad>`" instruction, which contains the
8776 information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
8785 This instruction requires several arguments:
8787 #. The optional "cconv" marker indicates which :ref:`calling
8788 convention <callingconv>` the call should use. If none is
8789 specified, the call defaults to using C calling conventions.
8790 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8791 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8793 #. The optional addrspace attribute can be used to indicate the address space
8794 of the called function. If it is not specified, the program address space
8795 from the :ref:`datalayout string<langref_datalayout>` will be used.
8796 #. '``ty``': the type of the call instruction itself which is also the
8797 type of the return value. Functions that return no value are marked
8799 #. '``fnty``': shall be the signature of the function being invoked. The
8800 argument types must match the types implied by this signature. This
8801 type can be omitted if the function is not varargs.
8802 #. '``fnptrval``': An LLVM value containing a pointer to a function to
8803 be invoked. In most cases, this is a direct function invocation, but
8804 indirect ``invoke``'s are just as possible, calling an arbitrary pointer
8806 #. '``function args``': argument list whose types match the function
8807 signature argument types and parameter attributes. All arguments must
8808 be of :ref:`first class <t_firstclass>` type. If the function signature
8809 indicates the function accepts a variable number of arguments, the
8810 extra arguments can be specified.
8811 #. '``normal label``': the label reached when the called function
8812 executes a '``ret``' instruction.
8813 #. '``exception label``': the label reached when a callee returns via
8814 the :ref:`resume <i_resume>` instruction or other exception handling
8816 #. The optional :ref:`function attributes <fnattrs>` list.
8817 #. The optional :ref:`operand bundles <opbundles>` list.
8822 This instruction is designed to operate as a standard '``call``'
8823 instruction in most regards. The primary difference is that it
8824 establishes an association with a label, which is used by the runtime
8825 library to unwind the stack.
8827 This instruction is used in languages with destructors to ensure that
8828 proper cleanup is performed in the case of either a ``longjmp`` or a
8829 thrown exception. Additionally, this is important for implementation of
8830 '``catch``' clauses in high-level languages that support them.
8832 For the purposes of the SSA form, the definition of the value returned
8833 by the '``invoke``' instruction is deemed to occur on the edge from the
8834 current block to the "normal" label. If the callee unwinds then no
8835 return value is available.
8840 .. code-block:: llvm
8842 %retval = invoke i32 @Test(i32 15) to label %Continue
8843 unwind label %TestCleanup ; i32:retval set
8844 %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
8845 unwind label %TestCleanup ; i32:retval set
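A fuller sketch shows how the normal and exception labels fit together with
'``landingpad``' and '``resume``'. It assumes an Itanium-style personality
routine (``@__gxx_personality_v0``), and the callee ``@may_throw`` is a
hypothetical declaration.

.. code-block:: llvm

    declare i32 @may_throw(i32)            ; hypothetical callee
    declare i32 @__gxx_personality_v0(...) ; assumed personality routine

    define i32 @caller() personality ptr @__gxx_personality_v0 {
    entry:
      %retval = invoke i32 @may_throw(i32 15)
                  to label %continue unwind label %cleanup

    continue:
      ; %retval is only available along the normal edge.
      ret i32 %retval

    cleanup:
      ; The landingpad must be the first non-PHI instruction in this block.
      %lp = landingpad { ptr, i32 }
              cleanup
      ; No cleanup work in this sketch: continue unwinding.
      resume { ptr, i32 } %lp
    }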
8849 '``callbr``' Instruction
8850 ^^^^^^^^^^^^^^^^^^^^^^^^
8857 <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8858 [operand bundles] to label <fallthrough label> [indirect labels]
8863 The '``callbr``' instruction causes control to transfer to a specified
8864 function, with the possibility of control flow transfer to either the
8865 '``fallthrough``' label or one of the '``indirect``' labels.
8867 This instruction should only be used to implement the "goto" feature of gcc
8868 style inline assembly. Any other usage is an error in the IR verifier.
8870 Note that in order to support outputs along indirect edges, LLVM may need to
8871 split critical edges, which may require synthesizing a replacement block for
8872 the ``indirect labels``. Therefore, the address of a label as seen by another
8873 ``callbr`` instruction, or for a :ref:`blockaddress <blockaddress>` constant,
8874 may not be equal to the address provided for the same block to this
8875 instruction's ``indirect labels`` operand. The assembly code may only transfer
8876 control to addresses provided via this instruction's ``indirect labels``.
8881 This instruction requires several arguments:
8883 #. The optional "cconv" marker indicates which :ref:`calling
8884 convention <callingconv>` the call should use. If none is
8885 specified, the call defaults to using C calling conventions.
8886 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8887 values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8889 #. The optional addrspace attribute can be used to indicate the address space
8890 of the called function. If it is not specified, the program address space
8891 from the :ref:`datalayout string<langref_datalayout>` will be used.
8892 #. '``ty``': the type of the call instruction itself which is also the
8893 type of the return value. Functions that return no value are marked
8895 #. '``fnty``': shall be the signature of the function being called. The
8896 argument types must match the types implied by this signature. This
8897 type can be omitted if the function is not varargs.
8898 #. '``fnptrval``': An LLVM value containing a pointer to a function to
8899 be called. In most cases, this is a direct function call, but
8900 other ``callbr``'s are just as possible, calling an arbitrary pointer
8902 #. '``function args``': argument list whose types match the function
8903 signature argument types and parameter attributes. All arguments must
8904 be of :ref:`first class <t_firstclass>` type. If the function signature
8905 indicates the function accepts a variable number of arguments, the
8906 extra arguments can be specified.
8907 #. '``fallthrough label``': the label reached when the inline assembly's
8908 execution exits the bottom.
8909 #. '``indirect labels``': the labels reached when a callee transfers control
8910 to a location other than the '``fallthrough label``'. Label constraints
8911 refer to these destinations.
8912 #. The optional :ref:`function attributes <fnattrs>` list.
8913 #. The optional :ref:`operand bundles <opbundles>` list.
8918 This instruction is designed to operate as a standard '``call``'
8919 instruction in most regards. The primary difference is that it
8920 establishes an association with additional labels to define where control
8921 flow goes after the call.
8923 The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).
8926 The only use of this today is to implement the "goto" feature of gcc inline
8927 assembly where additional labels can be provided as locations for the inline
8928 assembly to jump to.
8933 .. code-block:: llvm
8935 ; "asm goto" without output constraints.
8936 callbr void asm "", "r,!i"(i32 %x)
8937 to label %fallthrough [label %indirect]
8939 ; "asm goto" with output constraints.
8940 <result> = callbr i32 asm "", "=r,r,!i"(i32 %x)
8941 to label %fallthrough [label %indirect]
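Placed inside a function, the first form above could be sketched as
follows. The function name ``@asm_goto`` and the returned constants are
hypothetical, and the empty assembly string never actually jumps to the
indirect label.

.. code-block:: llvm

    define i32 @asm_goto(i32 %x) {
    entry:
      ; The "!i" constraint refers to the indirect label destination.
      callbr void asm "", "r,!i"(i32 %x)
                  to label %fallthrough [label %indirect]

    fallthrough:
      ; Reached when the inline assembly exits the bottom.
      ret i32 0

    indirect:
      ; Reached only if the assembly transfers control here.
      ret i32 1
    }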
8945 '``resume``' Instruction
8946 ^^^^^^^^^^^^^^^^^^^^^^^^
8953 resume <type> <value>
8958 The '``resume``' instruction is a terminator instruction that has no
8964 The '``resume``' instruction requires one argument, which must have the
8965 same type as the result of any '``landingpad``' instruction in the same
8971 The '``resume``' instruction resumes propagation of an existing
8972 (in-flight) exception whose unwinding was interrupted with a
8973 :ref:`landingpad <i_landingpad>` instruction.
8978 .. code-block:: llvm
8980 resume { ptr, i32 } %exn
8984 '``catchswitch``' Instruction
8985 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8992 <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
8993 <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
8998 The '``catchswitch``' instruction is used by `LLVM's exception handling system
8999 <ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
9000 that may be executed by the :ref:`EH personality routine <personalityfn>`.
9005 The ``parent`` argument is the token of the funclet that contains the
9006 ``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
9007 this operand may be the token ``none``.
9009 The ``default`` argument is the label of another basic block beginning with
9010 either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
9011 must be a legal target with respect to the ``parent`` links, as described in
9012 the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
9014 The ``handlers`` are a nonempty list of successor blocks that each begin with a
9015 :ref:`catchpad <i_catchpad>` instruction.
9020 Executing this instruction transfers control to one of the successors in
9021 ``handlers``, if appropriate, or continues to unwind via the unwind label if
9024 The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
9025 it must be both the first non-phi instruction and last instruction in the basic
9026 block. Therefore, it must be the only non-phi instruction in the block.
9031 .. code-block:: text
9034 %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
9036 %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
9040 '``catchret``' Instruction
9041 ^^^^^^^^^^^^^^^^^^^^^^^^^^
9048 catchret from <token> to label <normal>
9053 The '``catchret``' instruction is a terminator instruction that has a
9060 The first argument to a '``catchret``' indicates which ``catchpad`` it
9061 exits. It must be a :ref:`catchpad <i_catchpad>`.
9062 The second argument to a '``catchret``' specifies where control will
9068 The '``catchret``' instruction ends an existing (in-flight) exception whose
9069 unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
9070 :ref:`personality function <personalityfn>` gets a chance to execute arbitrary
9071 code to, for example, destroy the active exception. Control then transfers to
9074 The ``token`` argument must be a token produced by a ``catchpad`` instruction.
9075 If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
9076 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9077 the ``catchret``'s behavior is undefined.
9082 .. code-block:: text
9084 catchret from %catch to label %continue
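The funclet instructions are easier to follow in a complete function. The
sketch below assumes the MSVC C++ personality (``@__CxxFrameHandler3``) and
its catch-all ``catchpad`` argument convention. ``@may_throw`` is a
hypothetical callee.

.. code-block:: llvm

    declare void @may_throw()              ; hypothetical callee
    declare i32 @__CxxFrameHandler3(...)   ; assumed personality routine

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      ; The only non-phi instruction in its block. Not inside a funclet,
      ; so the parent token is 'none'.
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      ; Assumed catch-all form for this personality.
      %catch = catchpad within %cs [ptr null, i32 64, ptr null]
      ; Exits the catchpad and resumes normal control flow.
      catchret from %catch to label %done

    done:
      ret void
    }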
9088 '``cleanupret``' Instruction
9089 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9096 cleanupret from <value> unwind label <continue>
9097 cleanupret from <value> unwind to caller
9102 The '``cleanupret``' instruction is a terminator instruction that has
9103 an optional successor.
9109 The '``cleanupret``' instruction requires one argument, which indicates
9110 which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
9111 If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
9112 funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9113 the ``cleanupret``'s behavior is undefined.
9115 The '``cleanupret``' instruction also has an optional successor, ``continue``,
9116 which must be the label of another basic block beginning with either a
9117 ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
9118 be a legal target with respect to the ``parent`` links, as described in the
9119 `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
9124 The '``cleanupret``' instruction indicates to the
9125 :ref:`personality function <personalityfn>` that one
9126 :ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
9127 It transfers control to ``continue`` or unwinds out of the function.
9132 .. code-block:: text
9134 cleanupret from %cleanup unwind to caller
9135 cleanupret from %cleanup unwind label %continue
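A sketch of a cleanup funclet that ends with '``cleanupret``' is shown
below. It assumes the MSVC C++ personality (``@__CxxFrameHandler3``), and
``@may_throw`` and ``@release`` are hypothetical functions.

.. code-block:: llvm

    declare void @may_throw()              ; hypothetical callee
    declare void @release()                ; hypothetical cleanup routine
    declare i32 @__CxxFrameHandler3(...)   ; assumed personality routine

    define void @g() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      %cp = cleanuppad within none []
      ; Calls inside the funclet carry a "funclet" operand bundle.
      call void @release() [ "funclet"(token %cp) ]
      ; Exits the cleanuppad and continues unwinding out of the function.
      cleanupret from %cp unwind to caller

    done:
      ret void
    }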
9139 '``unreachable``' Instruction
9140 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9152 The '``unreachable``' instruction has no defined semantics. This
9153 instruction is used to inform the optimizer that a particular portion of
9154 the code is not reachable. This can be used to indicate that the code
9155 after a no-return function cannot be reached, and other facts.
9160 The '``unreachable``' instruction has no defined semantics.
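One common use, sketched below, is to terminate a block that follows a call
to a function that does not return. The function name ``@checked_div`` is
hypothetical, and ``@abort`` is assumed to be declared ``noreturn``.

.. code-block:: llvm

    declare void @abort() noreturn

    define i32 @checked_div(i32 %a, i32 %b) {
    entry:
      %is_zero = icmp eq i32 %b, 0
      br i1 %is_zero, label %fail, label %ok

    fail:
      call void @abort()
      ; Control never continues past the call above.
      unreachable

    ok:
      %q = udiv i32 %a, %b
      ret i32 %q
    }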
9167 Unary operators require a single operand, execute an operation on
9168 it, and produce a single value. The operand might represent multiple
9169 data, as is the case with the :ref:`vector <t_vector>` data type. The
9170 result value has the same type as its operand.
9174 '``fneg``' Instruction
9175 ^^^^^^^^^^^^^^^^^^^^^^
9182 <result> = fneg [fast-math flags]* <ty> <op1> ; yields ty:result
9187 The '``fneg``' instruction returns the negation of its operand.
9192 The argument to the '``fneg``' instruction must be a
9193 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9194 floating-point values.
9199 The value produced is a copy of the operand with its sign bit flipped.
9200 The value is otherwise completely identical; in particular, if the input is a
9201 NaN, then the quiet/signaling bit and payload are perfectly preserved.
9203 This instruction can also take any number of :ref:`fast-math
9204 flags <fastmath>`, which are optimization hints to enable otherwise
9205 unsafe floating-point optimizations:
9210 .. code-block:: text
    <result> = fneg float %val          ; yields float:result = -%val
9219 Binary operators are used to do most of the computation in a program.
9220 They require two operands of the same type, execute an operation on
9221 them, and produce a single value. The operands might represent multiple
9222 data, as is the case with the :ref:`vector <t_vector>` data type. The
9223 result value has the same type as its operands.
9225 There are several different binary operators:
9229 '``add``' Instruction
9230 ^^^^^^^^^^^^^^^^^^^^^
9237 <result> = add <ty> <op1>, <op2> ; yields ty:result
9238 <result> = add nuw <ty> <op1>, <op2> ; yields ty:result
9239 <result> = add nsw <ty> <op1>, <op2> ; yields ty:result
9240 <result> = add nuw nsw <ty> <op1>, <op2> ; yields ty:result
9245 The '``add``' instruction returns the sum of its two operands.
9250 The two arguments to the '``add``' instruction must be
9251 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9252 arguments must have identical types.
9257 The value produced is the integer sum of the two operands.
If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.
9263 Because LLVM integers use a two's complement representation, this
9264 instruction is appropriate for both signed and unsigned integers.
9266 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9267 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9268 result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
9269 unsigned and/or signed overflow, respectively, occurs.
9274 .. code-block:: text
9276 <result> = add i32 4, %var ; yields i32:result = 4 + %var
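The wrapping and ``nuw``/``nsw`` rules above can be checked with a small
model. This is a sketch under stated assumptions rather than LLVM's
implementation: operands are Python integers interpreted as ``iN`` values, and
``POISON``, ``to_signed`` and ``add`` are hypothetical names used only for
illustration.

.. code-block:: python

   POISON = "poison"  # hypothetical stand-in for an LLVM poison value


   def to_signed(x: int, n: int) -> int:
       """Reinterpret the low n bits of x as a two's complement integer."""
       x &= (1 << n) - 1
       return x - (1 << n) if x >> (n - 1) else x


   def add(a: int, b: int, n: int, nuw: bool = False, nsw: bool = False):
       """Model of 'add' on two iN operands."""
       mask = (1 << n) - 1
       if nuw and (a & mask) + (b & mask) > mask:
           return POISON  # unsigned wrap while nuw is present
       signed_sum = to_signed(a, n) + to_signed(b, n)
       if nsw and not -(1 << (n - 1)) <= signed_sum < (1 << (n - 1)):
           return POISON  # signed wrap while nsw is present
       return (a + b) & mask  # mathematical result modulo 2**n

For ``i32``, ``add(0xFFFFFFFF, 1, 32)`` wraps to ``0``. The same operands are
poison under ``nuw`` (unsigned 4294967295 + 1 wraps) but not under ``nsw``
(signed -1 + 1 does not overflow).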
9280 '``fadd``' Instruction
9281 ^^^^^^^^^^^^^^^^^^^^^^
9288 <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9293 The '``fadd``' instruction returns the sum of its two operands.
9298 The two arguments to the '``fadd``' instruction must be
9299 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9300 floating-point values. Both arguments must have identical types.
9305 The value produced is the floating-point sum of the two operands.
9306 This instruction is assumed to execute in the default :ref:`floating-point
9307 environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9315 .. code-block:: text
9317 <result> = fadd float 4.0, %var ; yields float:result = 4.0 + %var
9321 '``sub``' Instruction
9322 ^^^^^^^^^^^^^^^^^^^^^
9329 <result> = sub <ty> <op1>, <op2> ; yields ty:result
9330 <result> = sub nuw <ty> <op1>, <op2> ; yields ty:result
9331 <result> = sub nsw <ty> <op1>, <op2> ; yields ty:result
9332 <result> = sub nuw nsw <ty> <op1>, <op2> ; yields ty:result
9337 The '``sub``' instruction returns the difference of its two operands.
9339 Note that the '``sub``' instruction is used to represent the '``neg``'
9340 instruction present in most other intermediate representations.
9345 The two arguments to the '``sub``' instruction must be
9346 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9347 arguments must have identical types.
9352 The value produced is the integer difference of the two operands.
If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.
9358 Because LLVM integers use a two's complement representation, this
9359 instruction is appropriate for both signed and unsigned integers.
9361 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9362 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9363 result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
9364 unsigned and/or signed overflow, respectively, occurs.
.. code-block:: text

   <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
   <result> = sub i32 0, %val          ; yields i32:result = -%val
9376 '``fsub``' Instruction
9377 ^^^^^^^^^^^^^^^^^^^^^^
9384 <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9389 The '``fsub``' instruction returns the difference of its two operands.
9394 The two arguments to the '``fsub``' instruction must be
9395 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9396 floating-point values. Both arguments must have identical types.
9401 The value produced is the floating-point difference of the two operands.
9402 This instruction is assumed to execute in the default :ref:`floating-point
9403 environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
.. code-block:: text

   <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
   <result> = fsub float -0.0, %val          ; yields float:result = -%val
9418 '``mul``' Instruction
9419 ^^^^^^^^^^^^^^^^^^^^^
9426 <result> = mul <ty> <op1>, <op2> ; yields ty:result
9427 <result> = mul nuw <ty> <op1>, <op2> ; yields ty:result
9428 <result> = mul nsw <ty> <op1>, <op2> ; yields ty:result
9429 <result> = mul nuw nsw <ty> <op1>, <op2> ; yields ty:result
9434 The '``mul``' instruction returns the product of its two operands.
9439 The two arguments to the '``mul``' instruction must be
9440 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9441 arguments must have identical types.
9446 The value produced is the integer product of the two operands.
9448 If the result of the multiplication has unsigned overflow, the result
9449 returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
9450 bit width of the result.
9452 Because LLVM integers use a two's complement representation, and the
9453 result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.
9459 ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
9460 respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
9461 result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
9462 unsigned and/or signed overflow, respectively, occurs.
9467 .. code-block:: text
9469 <result> = mul i32 4, %var ; yields i32:result = 4 * %var
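The remark about full products can be made concrete with a small model. This
is an illustrative Python sketch, not LLVM code; ``to_signed``, ``mul`` and
``full_product`` are hypothetical helper names, and operands are Python
integers interpreted as ``iN`` values.

.. code-block:: python

   def to_signed(x: int, n: int) -> int:
       """Reinterpret the low n bits of x as a two's complement integer."""
       x &= (1 << n) - 1
       return x - (1 << n) if x >> (n - 1) else x


   def mul(a: int, b: int, n: int) -> int:
       """Model of 'mul' on iN: the product modulo 2**n."""
       return (a * b) & ((1 << n) - 1)


   def full_product(a: int, b: int, n: int, signed: bool) -> int:
       """Extend both operands to 2n bits, then multiply in 2n bits."""
       narrow = (1 << n) - 1
       wide = (1 << (2 * n)) - 1
       if signed:
           a, b = to_signed(a, n), to_signed(b, n)  # models sext
       else:
           a, b = a & narrow, b & narrow  # models zext
       return (a * b) & wide

With ``i8`` operands ``0xFF`` and ``2``, the narrow product is ``0xFE`` in
both interpretations, while the ``i16`` full product is ``0xFFFE`` after sign
extension and ``0x01FE`` after zero extension.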
9473 '``fmul``' Instruction
9474 ^^^^^^^^^^^^^^^^^^^^^^
9481 <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9486 The '``fmul``' instruction returns the product of its two operands.
9491 The two arguments to the '``fmul``' instruction must be
9492 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9493 floating-point values. Both arguments must have identical types.
9498 The value produced is the floating-point product of the two operands.
9499 This instruction is assumed to execute in the default :ref:`floating-point
9500 environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9508 .. code-block:: text
9510 <result> = fmul float 4.0, %var ; yields float:result = 4.0 * %var
9514 '``udiv``' Instruction
9515 ^^^^^^^^^^^^^^^^^^^^^^
9522 <result> = udiv <ty> <op1>, <op2> ; yields ty:result
9523 <result> = udiv exact <ty> <op1>, <op2> ; yields ty:result
9528 The '``udiv``' instruction returns the quotient of its two operands.
9533 The two arguments to the '``udiv``' instruction must be
9534 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9535 arguments must have identical types.
9540 The value produced is the unsigned integer quotient of the two operands.
9542 Note that unsigned integer division and signed integer division are
9543 distinct operations; for signed integer division, use '``sdiv``'.
9545 Division by zero is undefined behavior. For vectors, if any element
9546 of the divisor is zero, the operation has undefined behavior.
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2``
(as such, ``((a udiv exact b) mul b) == a``).
9556 .. code-block:: text
9558 <result> = udiv i32 4, %var ; yields i32:result = 4 / %var
9562 '``sdiv``' Instruction
9563 ^^^^^^^^^^^^^^^^^^^^^^
9570 <result> = sdiv <ty> <op1>, <op2> ; yields ty:result
9571 <result> = sdiv exact <ty> <op1>, <op2> ; yields ty:result
9576 The '``sdiv``' instruction returns the quotient of its two operands.
9581 The two arguments to the '``sdiv``' instruction must be
9582 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9583 arguments must have identical types.
9588 The value produced is the signed integer quotient of the two operands
9589 rounded towards zero.
9591 Note that signed integer division and unsigned integer division are
9592 distinct operations; for unsigned integer division, use '``udiv``'.
9594 Division by zero is undefined behavior. For vectors, if any element
9595 of the divisor is zero, the operation has undefined behavior.
9596 Overflow also leads to undefined behavior; this is a rare case, but can
9597 occur, for example, by doing a 32-bit division of -2147483648 by -1.
9599 If the ``exact`` keyword is present, the result value of the ``sdiv`` is
9600 a :ref:`poison value <poisonvalues>` if the result would be rounded.
9605 .. code-block:: text
9607 <result> = sdiv i32 4, %var ; yields i32:result = 4 / %var
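The rounding, overflow, and ``exact`` rules above can be sketched as a small
model. It is an illustration only: ``UB`` and ``POISON`` are hypothetical
markers, and ``sdiv`` returns the quotient as a signed Python integer rather
than as an ``iN`` bit pattern.

.. code-block:: python

   UB = "undefined behavior"  # hypothetical marker, not an LLVM value
   POISON = "poison"          # hypothetical stand-in for a poison value


   def to_signed(x: int, n: int) -> int:
       """Reinterpret the low n bits of x as a two's complement integer."""
       x &= (1 << n) - 1
       return x - (1 << n) if x >> (n - 1) else x


   def sdiv(a: int, b: int, n: int, exact: bool = False):
       """Model of 'sdiv' on two iN operands."""
       sa, sb = to_signed(a, n), to_signed(b, n)
       if sb == 0:
           return UB  # division by zero
       if sa == -(1 << (n - 1)) and sb == -1:
           return UB  # quotient 2**(n-1) is not representable in iN
       q = abs(sa) // abs(sb)  # magnitude, rounded toward zero
       if (sa < 0) != (sb < 0):
           q = -q
       if exact and q * sb != sa:
           return POISON  # the quotient would have been rounded
       return q

Note that Python's own ``//`` rounds toward negative infinity, so
``-7 // 2`` is ``-4`` whereas ``sdiv(-7, 2, 32)`` is ``-3``.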
9611 '``fdiv``' Instruction
9612 ^^^^^^^^^^^^^^^^^^^^^^
9619 <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
9624 The '``fdiv``' instruction returns the quotient of its two operands.
9629 The two arguments to the '``fdiv``' instruction must be
9630 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9631 floating-point values. Both arguments must have identical types.
9636 The value produced is the floating-point quotient of the two operands.
9637 This instruction is assumed to execute in the default :ref:`floating-point
9638 environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9646 .. code-block:: text
9648 <result> = fdiv float 4.0, %var ; yields float:result = 4.0 / %var
9652 '``urem``' Instruction
9653 ^^^^^^^^^^^^^^^^^^^^^^
9660 <result> = urem <ty> <op1>, <op2> ; yields ty:result
9665 The '``urem``' instruction returns the remainder from the unsigned
9666 division of its two arguments.
9671 The two arguments to the '``urem``' instruction must be
9672 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9673 arguments must have identical types.
This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.
9682 Note that unsigned integer remainder and signed integer remainder are
9683 distinct operations; for signed integer remainder, use '``srem``'.
Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
9692 .. code-block:: text
9694 <result> = urem i32 4, %var ; yields i32:result = 4 % %var
9698 '``srem``' Instruction
9699 ^^^^^^^^^^^^^^^^^^^^^^
9706 <result> = srem <ty> <op1>, <op2> ; yields ty:result
The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.
9719 The two arguments to the '``srem``' instruction must be
9720 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9721 arguments must have identical types.
9726 This instruction returns the *remainder* of a division (where the result
9727 is either zero or has the same sign as the dividend, ``op1``), not the
9728 *modulo* operator (where the result is either zero or has the same sign
9729 as the divisor, ``op2``) of a value. For more information about the
9730 difference, see `The Math
9731 Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
9736 Note that signed integer remainder and unsigned integer remainder are
9737 distinct operations; for unsigned integer remainder, use '``urem``'.
Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
9742 Overflow also leads to undefined behavior; this is a rare case, but can
9743 occur, for example, by taking the remainder of a 32-bit division of
9744 -2147483648 by -1. (The remainder doesn't actually overflow, but this
9745 rule lets srem be implemented using instructions that return both the
9746 result of the division and the remainder.)
9751 .. code-block:: text
9753 <result> = srem i32 4, %var ; yields i32:result = 4 % %var
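The difference between *remainder* and *modulo* described above is easy to
see side by side. The sketch below is a model on mathematical (signed) integer
values, assuming a non-zero divisor and no overflow; ``srem`` is a
hypothetical helper name, and Python's built-in ``%`` plays the role of the
modulo operator.

.. code-block:: python

   def srem(a: int, b: int) -> int:
       """Remainder of truncated division: zero or the sign of the dividend."""
       r = abs(a) % abs(b)
       return -r if a < 0 else r

For example, ``srem(-7, 3)`` is ``-1`` (sign of the dividend), while Python's
``-7 % 3`` is ``2`` (sign of the divisor).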
9757 '``frem``' Instruction
9758 ^^^^^^^^^^^^^^^^^^^^^^
9765 <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result
The '``frem``' instruction returns the remainder from the division of
its two operands.

On some targets the instruction is implemented as a call to libm's
'``fmod``', so using it may require linking libm.
9782 The two arguments to the '``frem``' instruction must be
9783 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9784 floating-point values. Both arguments must have identical types.
The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
9793 This instruction is assumed to execute in the default :ref:`floating-point
9794 environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9802 .. code-block:: text
9804 <result> = frem float 4.0, %var ; yields float:result = 4.0 % %var
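Since the result matches libm's ``fmod``, Python's ``math.fmod`` can serve as
a rough model. The helper name ``frem`` below is hypothetical, and the sketch
assumes a finite dividend and a non-zero divisor, because ``math.fmod`` raises
``ValueError`` in the remaining cases instead of producing a NaN.

.. code-block:: python

   import math


   def frem(a: float, b: float) -> float:
       """Model of 'frem' for finite a and non-zero b, via C-style fmod."""
       return math.fmod(a, b)

The result takes the sign of the dividend: ``frem(-5.5, 2.0)`` is ``-1.5``,
whereas Python's ``%`` operator gives ``0.5`` for the same operands.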
9808 Bitwise Binary Operations
9809 -------------------------
9811 Bitwise binary operators are used to do various forms of bit-twiddling
9812 in a program. They are generally very efficient instructions and can
9813 commonly be strength reduced from other instructions. They require two
9814 operands of the same type, execute an operation on them, and produce a
9815 single value. The resulting value is the same type as its operands.
9819 '``shl``' Instruction
9820 ^^^^^^^^^^^^^^^^^^^^^
9827 <result> = shl <ty> <op1>, <op2> ; yields ty:result
9828 <result> = shl nuw <ty> <op1>, <op2> ; yields ty:result
9829 <result> = shl nsw <ty> <op1>, <op2> ; yields ty:result
9830 <result> = shl nuw nsw <ty> <op1>, <op2> ; yields ty:result
9835 The '``shl``' instruction returns the first operand shifted to the left
9836 a specified number of bits.
9841 Both arguments to the '``shl``' instruction must be the same
9842 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9843 '``op2``' is treated as an unsigned value.
9848 The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
9849 where ``n`` is the width of the result. If ``op2`` is (statically or
9850 dynamically) equal to or larger than the number of bits in
9851 ``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
9852 If the arguments are vectors, each vector element of ``op1`` is shifted
9853 by the corresponding shift amount in ``op2``.
9855 If the ``nuw`` keyword is present, then the shift produces a poison
9856 value if it shifts out any non-zero bits.
9857 If the ``nsw`` keyword is present, then the shift produces a poison
9858 value if it shifts out any bits that disagree with the resultant sign bit.
.. code-block:: text

   <result> = shl i32 4, %var   ; yields i32: 4 << %var
   <result> = shl i32 4, 2      ; yields i32: 16
   <result> = shl i32 1, 10     ; yields i32: 1024
   <result> = shl i32 1, 32     ; yields poison (shift amount equals the bit width)
   <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
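The poison conditions for ``shl`` can be summarized in a small model. This is
a sketch under stated assumptions, not LLVM's implementation: operands are
Python integers interpreted as ``iN`` values, the shift amount is already
unsigned, and ``POISON`` and ``shl`` are hypothetical names.

.. code-block:: python

   POISON = "poison"  # hypothetical stand-in for an LLVM poison value


   def shl(a: int, s: int, n: int, nuw: bool = False, nsw: bool = False):
       """Model of 'shl' on an iN value a with unsigned shift amount s."""
       mask = (1 << n) - 1
       a &= mask
       if s >= n:
           return POISON  # shift amount is not smaller than the bit width
       result = (a << s) & mask
       shifted_out = a >> (n - s)  # the s high bits that leave the value
       if nuw and shifted_out != 0:
           return POISON  # a non-zero bit was shifted out
       if nsw:
           sign = result >> (n - 1)
           expected = ((1 << s) - 1) if sign else 0
           if shifted_out != expected:
               return POISON  # a shifted-out bit disagrees with the sign bit
       return result

For ``i32``, shifting ``0x40000000`` left by one is fine under ``nuw`` (no set
bit is lost) but poison under ``nsw`` (the sign bit changes).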
9874 '``lshr``' Instruction
9875 ^^^^^^^^^^^^^^^^^^^^^^
9882 <result> = lshr <ty> <op1>, <op2> ; yields ty:result
9883 <result> = lshr exact <ty> <op1>, <op2> ; yields ty:result
9888 The '``lshr``' instruction (logical shift right) returns the first
9889 operand shifted to the right a specified number of bits with zero fill.
9894 Both arguments to the '``lshr``' instruction must be the same
9895 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9896 '``op2``' is treated as an unsigned value.
9901 This instruction always performs a logical shift right operation. The
9902 most significant bits of the result will be filled with zero bits after
9903 the shift. If ``op2`` is (statically or dynamically) equal to or larger
9904 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9905 value <poisonvalues>`. If the arguments are vectors, each vector element
9906 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9908 If the ``exact`` keyword is present, the result value of the ``lshr`` is
9909 a poison value if any of the bits shifted out are non-zero.
.. code-block:: text

   <result> = lshr i32 4, 1   ; yields i32:result = 2
   <result> = lshr i32 4, 2   ; yields i32:result = 1
   <result> = lshr i8  4, 3   ; yields i8:result = 0
   <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
   <result> = lshr i32 1, 32  ; yields poison (shift amount equals the bit width)
   <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
9925 '``ashr``' Instruction
9926 ^^^^^^^^^^^^^^^^^^^^^^
9933 <result> = ashr <ty> <op1>, <op2> ; yields ty:result
9934 <result> = ashr exact <ty> <op1>, <op2> ; yields ty:result
The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.
9946 Both arguments to the '``ashr``' instruction must be the same
9947 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
9948 '``op2``' is treated as an unsigned value.
This instruction always performs an arithmetic shift right operation.
9954 The most significant bits of the result will be filled with the sign bit
9955 of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
9956 than the number of bits in ``op1``, this instruction returns a :ref:`poison
9957 value <poisonvalues>`. If the arguments are vectors, each vector element
9958 of ``op1`` is shifted by the corresponding shift amount in ``op2``.
9960 If the ``exact`` keyword is present, the result value of the ``ashr`` is
9961 a poison value if any of the bits shifted out are non-zero.
.. code-block:: text

   <result> = ashr i32 4, 1   ; yields i32:result = 2
   <result> = ashr i32 4, 2   ; yields i32:result = 1
   <result> = ashr i8  4, 3   ; yields i8:result = 0
   <result> = ashr i8 -2, 1   ; yields i8:result = -1
   <result> = ashr i32 1, 32  ; yields poison (shift amount equals the bit width)
   <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
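The contrast between the zero fill of ``lshr`` and the sign fill of ``ashr``
can be modeled as follows. This is an illustrative sketch: results are
returned as unsigned ``iN`` bit patterns, the shift amount is assumed to be
unsigned already, and ``POISON``, ``lshr`` and ``ashr`` are hypothetical
Python names.

.. code-block:: python

   POISON = "poison"  # hypothetical stand-in for an LLVM poison value


   def lshr(a: int, s: int, n: int, exact: bool = False):
       """Model of 'lshr': shift right, filling the high bits with zeros."""
       a &= (1 << n) - 1
       if s >= n:
           return POISON
       if exact and a & ((1 << s) - 1):
           return POISON  # a non-zero bit was shifted out
       return a >> s


   def ashr(a: int, s: int, n: int, exact: bool = False):
       """Model of 'ashr': shift right, filling with copies of the sign bit."""
       mask = (1 << n) - 1
       a &= mask
       if s >= n:
           return POISON
       if exact and a & ((1 << s) - 1):
           return POISON  # a non-zero bit was shifted out
       signed = a - (1 << n) if a >> (n - 1) else a
       return (signed >> s) & mask  # Python's >> sign-fills negative ints

For the ``i8`` value ``-2`` (``0xFE``) shifted by one, ``lshr`` gives ``0x7F``
and ``ashr`` gives ``0xFF``, that is ``-1``.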
9977 '``and``' Instruction
9978 ^^^^^^^^^^^^^^^^^^^^^
9985 <result> = and <ty> <op1>, <op2> ; yields ty:result
The '``and``' instruction returns the bitwise logical and of its two
operands.
9996 The two arguments to the '``and``' instruction must be
9997 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9998 arguments must have identical types.
10003 The truth table used for the '``and``' instruction is:
+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+
10020 .. code-block:: text
10022 <result> = and i32 4, %var ; yields i32:result = 4 & %var
10023 <result> = and i32 15, 40 ; yields i32:result = 8
10024 <result> = and i32 4, 8 ; yields i32:result = 0
10028 '``or``' Instruction
10029 ^^^^^^^^^^^^^^^^^^^^
10036 <result> = or <ty> <op1>, <op2> ; yields ty:result
10037 <result> = or disjoint <ty> <op1>, <op2> ; yields ty:result
The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.
10048 The two arguments to the '``or``' instruction must be
10049 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10050 arguments must have identical types.
10055 The truth table used for the '``or``' instruction is:
+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+
``disjoint`` means that for each bit, that bit is zero in at least one of the
inputs. This allows the ``or`` to be treated as an ``add`` since no carry can
occur from any bit. If the ``disjoint`` keyword is present, the result value of
the ``or`` is a :ref:`poison value <poisonvalues>` if both inputs have a one in
the same bit position. For vectors, only the element containing the bit is
poison.
.. code-block:: text

   <result> = or i32 4, %var         ; yields i32:result = 4 | %var
   <result> = or i32 15, 40          ; yields i32:result = 47
   <result> = or i32 4, 8            ; yields i32:result = 12
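The claim that a ``disjoint`` ``or`` behaves like an ``add`` can be checked
with a small model. This is a sketch, not LLVM code; ``POISON`` and ``or_``
are hypothetical names, and operands are Python integers interpreted as
``iN`` values.

.. code-block:: python

   POISON = "poison"  # hypothetical stand-in for an LLVM poison value


   def or_(a: int, b: int, n: int, disjoint: bool = False):
       """Model of 'or' on two iN operands."""
       mask = (1 << n) - 1
       a &= mask
       b &= mask
       if disjoint and (a & b) != 0:
           return POISON  # some bit position is one in both inputs
       return a | b

With ``4`` and ``8`` no bit is shared, so the disjoint ``or`` yields ``12``,
the same as ``4 + 8``. With ``15`` and ``40`` bit 3 is set in both inputs, so
the plain ``or`` yields ``47`` and the disjoint form is poison.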
10086 '``xor``' Instruction
10087 ^^^^^^^^^^^^^^^^^^^^^
10094 <result> = xor <ty> <op1>, <op2> ; yields ty:result
10099 The '``xor``' instruction returns the bitwise logical exclusive or of
10100 its two operands. The ``xor`` is used to implement the "one's
10101 complement" operation, which is the "~" operator in C.
10106 The two arguments to the '``xor``' instruction must be
10107 :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
10108 arguments must have identical types.
10113 The truth table used for the '``xor``' instruction is:
+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+
10130 .. code-block:: text
10132 <result> = xor i32 4, %var ; yields i32:result = 4 ^ %var
10133 <result> = xor i32 15, 40 ; yields i32:result = 39
10134 <result> = xor i32 4, 8 ; yields i32:result = 12
10135 <result> = xor i32 %V, -1 ; yields i32:result = ~%V
Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
10141 target-independent manner. These instructions cover the element-access
10142 and vector-specific operations needed to process vectors effectively.
10143 While LLVM does directly support these vector operations, many
10144 sophisticated algorithms will want to use target-specific intrinsics to
10145 take full advantage of a specific target.
10147 .. _i_extractelement:
10149 '``extractelement``' Instruction
10150 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10157 <result> = extractelement <n x <ty>> <val>, <ty2> <idx> ; yields <ty>
10158 <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>
10163 The '``extractelement``' instruction extracts a single scalar element
10164 from a vector at a specified index.
10169 The first operand of an '``extractelement``' instruction is a value of
10170 :ref:`vector <t_vector>` type. The second operand is an index indicating
10171 the position from which to extract the element. The index may be a
10172 variable of any integer type, and will be treated as an unsigned integer.
10177 The result is a scalar of the same type as the element type of ``val``.
10178 Its value is the value at position ``idx`` of ``val``. If ``idx``
10179 exceeds the length of ``val`` for a fixed-length vector, the result is a
10180 :ref:`poison value <poisonvalues>`. For a scalable vector, if the value
10181 of ``idx`` exceeds the runtime length of the vector, the result is a
10182 :ref:`poison value <poisonvalues>`.
10187 .. code-block:: text
10189 <result> = extractelement <4 x i32> %vec, i32 0 ; yields i32
10191 .. _i_insertelement:
10193 '``insertelement``' Instruction
10194 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10201 <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <n x <ty>>
10202 <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>
10207 The '``insertelement``' instruction inserts a scalar element into a
10208 vector at a specified index.
10213 The first operand of an '``insertelement``' instruction is a value of
10214 :ref:`vector <t_vector>` type. The second operand is a scalar value whose
10215 type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type, and will be treated as an
unsigned integer.
10223 The result is a vector of the same type as ``val``. Its element values
10224 are those of ``val`` except at position ``idx``, where it gets the value
10225 ``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector,
10226 the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
10227 if the value of ``idx`` exceeds the runtime length of the vector, the result
10228 is a :ref:`poison value <poisonvalues>`.
10233 .. code-block:: text
10235 <result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>
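The element-access semantics of ``extractelement`` and ``insertelement`` can
be modeled with Python lists standing in for fixed-length vectors. This is an
illustrative sketch with hypothetical names; it assumes a non-negative index
and treats any index that does not name an existing element as producing
poison.

.. code-block:: python

   POISON = "poison"  # hypothetical stand-in for an LLVM poison value


   def extractelement(vec: list, idx: int):
       """Return the element at idx, or poison if idx is out of range."""
       return vec[idx] if 0 <= idx < len(vec) else POISON


   def insertelement(vec: list, elt, idx: int):
       """Return a copy of vec with position idx replaced by elt."""
       if not 0 <= idx < len(vec):
           return POISON
       return vec[:idx] + [elt] + vec[idx + 1:]

As in SSA form, ``insertelement`` does not modify its operand: it yields a new
vector value and leaves ``vec`` unchanged.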
10237 .. _i_shufflevector:
10239 '``shufflevector``' Instruction
10240 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10247 <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>>
<result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask> ; yields <vscale x m x <ty>>
10253 The '``shufflevector``' instruction constructs a permutation of elements
10254 from two input vectors, returning a vector with the same element type as
10255 the input and length that is the same as the shuffle mask.
10260 The first two operands of a '``shufflevector``' instruction are vectors
10261 with the same type. The third argument is a shuffle mask vector constant
10262 whose element type is ``i32``. The mask vector elements must be constant
10263 integers or ``poison`` values. The result of the instruction is a vector
10264 whose length is the same as the shuffle mask and whose element type is the
10265 same as the element type of the first two operands.
10270 The elements of the two input vectors are numbered from left to right
10271 across both of the vectors. For each element of the result vector, the
10272 shuffle mask selects an element from one of the input vectors to copy
10273 to the result. Non-negative elements in the mask represent an index
10274 into the concatenated pair of input vectors.
A ``poison`` element in the mask vector specifies that the resulting element
is ``poison``.
10278 For backwards-compatibility reasons, LLVM temporarily also accepts ``undef``
10279 mask elements, which will be interpreted the same way as ``poison`` elements.
10280 If the shuffle mask selects an ``undef`` element from one of the input
10281 vectors, the resulting element is ``undef``.
10283 For scalable vectors, the only valid mask values at present are
10284 ``zeroinitializer``, ``undef`` and ``poison``, since we cannot write all indices as
10285 literals for a vector with a length unknown at compile time.
.. code-block:: text

   <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                            <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
   <result> = shufflevector <4 x i32> %v1, <4 x i32> poison,
                            <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
   <result> = shufflevector <8 x i32> %v1, <8 x i32> poison,
                            <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
   <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                            <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
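The mask indexing described above, where indices run across the concatenation
of both inputs, can be modeled with Python lists for fixed-length vectors.
This is a sketch with hypothetical names; mask elements are assumed to be
either in-range non-negative integers or the ``POISON`` marker.

.. code-block:: python

   POISON = "poison"  # hypothetical stand-in for an LLVM poison value


   def shufflevector(v1: list, v2: list, mask: list) -> list:
       """Select elements from the concatenation v1 + v2 according to mask."""
       assert len(v1) == len(v2), "both input vectors must have the same type"
       both = v1 + v2
       return [POISON if m == POISON else both[m] for m in mask]

With ``v1 = [1, 2, 3, 4]`` and ``v2 = [5, 6, 7, 8]``, the mask
``[0, 4, 1, 5]`` interleaves the low halves and gives ``[1, 5, 2, 6]``. The
result length always follows the mask, not the inputs.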
10301 Aggregate Operations
10302 --------------------
10304 LLVM supports several instructions for working with
10305 :ref:`aggregate <t_aggregate>` values.
10307 .. _i_extractvalue:
10309 '``extractvalue``' Instruction
10310 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10317 <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
10322 The '``extractvalue``' instruction extracts the value of a member field
10323 from an :ref:`aggregate <t_aggregate>` value.
10328 The first operand of an '``extractvalue``' instruction is a value of
10329 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
10330 constant indices to specify which value to extract in a similar manner
10331 as indices in a '``getelementptr``' instruction.
10333 The major differences to ``getelementptr`` indexing are:
10335 - Since the value being indexed is not a pointer, the first index is
10336 omitted and assumed to be zero.
10337 - At least one index must be specified.
10338 - Not only struct indices but also array indices must be in bounds.
10343 The result is the value at the position in the aggregate specified by
10344 the index operands.
10349 .. code-block:: text
10351 <result> = extractvalue {i32, float} %agg, 0 ; yields i32
10355 '``insertvalue``' Instruction
10356 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10363 <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}* ; yields <aggregate type>
10368 The '``insertvalue``' instruction inserts a value into a member field in
10369 an :ref:`aggregate <t_aggregate>` value.
10374 The first operand of an '``insertvalue``' instruction is a value of
10375 :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
10376 a first-class value to insert. The following operands are constant
10377 indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.
10385 The result is an aggregate of the same type as ``val``. Its value is
10386 that of ``val`` except that the value at the position specified by the
10387 indices is that of ``elt``.
10392 .. code-block:: llvm
10394 %agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef}
10395 %agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val}
10396 %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0 ; yields {i32 undef, {float %val}}
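The index-path behavior of ``extractvalue`` and ``insertvalue`` can be modeled
with nested Python tuples standing in for aggregates. This is an illustrative
sketch: ``UNDEF`` and both function names are hypothetical, and an
out-of-bounds index raises ``IndexError`` here, whereas in LLVM such an
instruction is simply not well formed.

.. code-block:: python

   UNDEF = "undef"  # hypothetical stand-in for an LLVM undef value


   def extractvalue(agg: tuple, *idxs: int):
       """Follow the index path into a nested aggregate."""
       assert idxs, "at least one index must be specified"
       for i in idxs:
           if not 0 <= i < len(agg):
               raise IndexError("aggregate index out of bounds")
           agg = agg[i]
       return agg


   def insertvalue(agg: tuple, elt, *idxs: int) -> tuple:
       """Return a copy of agg with the member at the index path set to elt."""
       assert idxs, "at least one index must be specified"
       i, rest = idxs[0], idxs[1:]
       if not 0 <= i < len(agg):
           raise IndexError("aggregate index out of bounds")
       new = insertvalue(agg[i], elt, *rest) if rest else elt
       return agg[:i] + (new,) + agg[i + 1:]

Mirroring the ``%agg3`` example, ``insertvalue((UNDEF, (UNDEF,)), 2.5, 1, 0)``
yields ``(UNDEF, (2.5,))``, and ``extractvalue`` with the same index path
reads the value back.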
10400 Memory Access and Addressing Operations
10401 ---------------------------------------
10403 A key design point of an SSA-based representation is how it represents
10404 memory. In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.
10410 '``alloca``' Instruction
10411 ^^^^^^^^^^^^^^^^^^^^^^^^
10418 <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)] ; yields type addrspace(num)*:result
10423 The '``alloca``' instruction allocates memory on the stack frame of the
10424 currently executing function, to be automatically released when this
10425 function returns to its caller. If the address space is not explicitly
10426 specified, the object is allocated in the alloca address space from the
10427 :ref:`datalayout string<langref_datalayout>`.
10432 The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
10433 bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise "NumElements" defaults
to one.
If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
10440 alignment may not be greater than ``1 << 32``.
10442 The alignment is only optional when parsing textual IR; for in-memory IR,
10443 it is always present. If not specified, the target can choose to align the
10444 allocation on any convenient boundary compatible with the type.
10446 '``type``' may be any sized type.
10448 Structs containing scalable vectors cannot be used in allocas unless all
10449 fields are the same scalable vector type (e.g. ``{<vscale x 2 x i32>,
10450 <vscale x 2 x i32>}`` contains the same type while ``{<vscale x 2 x i32>,
10451 <vscale x 2 x i64>}`` doesn't).
10456 Memory is allocated; a pointer is returned. The allocated memory is
10457 uninitialized, and loading from uninitialized memory produces an undefined
10458 value. The operation itself is undefined if there is insufficient stack
space for the allocation. '``alloca``'d memory is automatically released
10460 when the function returns. The '``alloca``' instruction is commonly used
10461 to represent automatic variables that must have an address available. When
10462 the function returns (either with the ``ret`` or ``resume`` instructions),
10463 the memory is reclaimed. Allocating zero bytes is legal, but the returned
pointer may not be unique. The order in which memory is allocated (i.e.,
10465 which way the stack grows) is not specified.
10467 Note that '``alloca``' outside of the alloca address space from the
10468 :ref:`datalayout string<langref_datalayout>` is meaningful only if the
10469 target has assigned it a semantics.
10471 If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
10472 the returned object is initially dead.
10473 See :ref:`llvm.lifetime.start <int_lifestart>` and
10474 :ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
10475 lifetime-manipulating intrinsics.
10480 .. code-block:: llvm
10482 %ptr = alloca i32 ; yields ptr
10483 %ptr = alloca i32, i32 4 ; yields ptr
10484 %ptr = alloca i32, i32 4, align 1024 ; yields ptr
10485 %ptr = alloca i32, align 1024 ; yields ptr
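
As a slightly larger sketch, the following hypothetical function (the name
``@local_slot`` and its operands are illustrative only) allocates a fixed-size
slot and a run-time-sized array in the same frame. Both allocations are
released when the function returns.

.. code-block:: llvm

    define i32 @local_slot(i32 %n) {
    entry:
      ; One i32 slot; the size is known statically.
      %slot = alloca i32, align 4
      ; %n i32 elements, sized at run time. The contents are uninitialized.
      %buf = alloca i32, i32 %n, align 16
      ; Initialize the slot before reading it back.
      store i32 %n, ptr %slot, align 4
      %v = load i32, ptr %slot, align 4
      ret i32 %v
    }

The sketch never reads or writes ``%buf``, because ``%n`` may be zero and a
zero-byte allocation provides no accessible memory.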
10489 '``load``' Instruction
10490 ^^^^^^^^^^^^^^^^^^^^^^
10497 <result> = load [volatile] <ty>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
10498 <result> = load atomic [volatile] <ty>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
10499 !<nontemp_node> = !{ i32 1 }
10500 !<empty_node> = !{}
10501 !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
10502 !<align_node> = !{ i64 <value_alignment> }
10507 The '``load``' instruction is used to read from memory.
10512 The argument to the ``load`` instruction specifies the memory address from which
10513 to load. The type specified must be a :ref:`first class <t_firstclass>` type of
10514 known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
10515 the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
10516 modify the number or order of execution of this ``load`` with other
10517 :ref:`volatile operations <volatile>`.
10519 If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
10520 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
10521 ``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
10522 Atomic loads produce :ref:`defined <memmodel>` results when they may see
10523 multiple atomic stores. The type of the pointee must be an integer, pointer, or
10524 floating-point type whose bit width is a power of two greater than or equal to
10525 eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic loads. Note: if the alignment is not greater than
or equal to the size of the ``<ty>`` type, the atomic operation is likely to
10528 require a lock and have poor performance. ``!nontemporal`` does not have any
10529 defined semantics for atomic loads.
10531 The optional constant ``align`` argument specifies the alignment of the
10532 operation (that is, the alignment of the memory address). It is the
10533 responsibility of the code emitter to ensure that the alignment information is
10534 correct. Overestimating the alignment results in undefined behavior.
10535 Underestimating the alignment may produce less efficient code. An alignment of
10536 1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
10537 value higher than the size of the loaded type implies memory up to the
10538 alignment value bytes can be safely loaded without trapping in the default
10539 address space. Access of the high bytes can interfere with debugging tools, so
10540 should not be accessed if the function has the ``sanitize_thread`` or
10541 ``sanitize_address`` attributes.
10543 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10544 always present. An omitted ``align`` argument means that the operation has the
10545 ABI alignment for the target.
10547 The optional ``!nontemporal`` metadata must reference a single
10548 metadata name ``<nontemp_node>`` corresponding to a metadata node with one
10549 ``i32`` entry of value 1. The existence of the ``!nontemporal``
10550 metadata on the instruction tells the optimizer and code generator
10551 that this load is not expected to be reused in the cache. The code
10552 generator may select special instructions to save cache bandwidth, such
10553 as the ``MOVNT`` instruction on x86.
10555 The optional ``!invariant.load`` metadata must reference a single
10556 metadata name ``<empty_node>`` corresponding to a metadata node with no
10557 entries. If a load instruction tagged with the ``!invariant.load``
10558 metadata is executed, the memory location referenced by the load has
10559 to contain the same value at all points in the program where the
10560 memory location is dereferenceable; otherwise, the behavior is
10563 The optional ``!invariant.group`` metadata must reference a single metadata name
10564 ``<empty_node>`` corresponding to a metadata node with no entries.
10565 See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.
10567 The optional ``!nonnull`` metadata must reference a single
10568 metadata name ``<empty_node>`` corresponding to a metadata node with no
10569 entries. The existence of the ``!nonnull`` metadata on the
10570 instruction tells the optimizer that the value loaded is known to
10571 never be null. If the value is null at runtime, a poison value is returned
10572 instead. This is analogous to the ``nonnull`` attribute on parameters and
10573 return values. This metadata can only be applied to loads of a pointer type.
10575 The optional ``!dereferenceable`` metadata must reference a single metadata
10576 name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
10578 See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.
10580 The optional ``!dereferenceable_or_null`` metadata must reference a single
10581 metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
10583 See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
10584 <md_dereferenceable_or_null>`.
10586 The optional ``!align`` metadata must reference a single metadata name
10587 ``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
10588 The existence of the ``!align`` metadata on the instruction tells the
10589 optimizer that the value loaded is known to be aligned to a boundary specified
10590 by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
10592 This metadata can only be applied to loads of a pointer type. If the returned
10593 value is not appropriately aligned at runtime, a poison value is returned
10596 The optional ``!noundef`` metadata must reference a single metadata name
10597 ``<empty_node>`` corresponding to a node with no entries. The existence of
10598 ``!noundef`` metadata on the instruction tells the optimizer that the value
10599 loaded is known to be :ref:`well defined <welldefinedvalues>`.
10600 If the value isn't well defined, the behavior is undefined. If the ``!noundef``
10601 metadata is combined with poison-generating metadata like ``!nonnull``,
10602 violation of that metadata constraint will also result in undefined behavior.
10607 The location of memory pointed to is loaded. If the value being loaded
10608 is of scalar type then the number of bytes read does not exceed the
10609 minimum number of bytes needed to hold all bits of the type. For
10610 example, loading an ``i24`` reads at most three bytes. When loading a
10611 value of a type like ``i20`` with a size that is not an integral number
10612 of bytes, the result is undefined if the value was not originally
10613 written using a store of the same type.
10614 If the value being loaded is of aggregate type, the bytes that correspond to
10615 padding may be accessed but are ignored, because it is impossible to observe
10616 padding from the loaded aggregate value.
10617 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
10622 .. code-block:: llvm
10624 %ptr = alloca i32 ; yields ptr
10625 store i32 3, ptr %ptr ; yields void
10626 %val = load i32, ptr %ptr ; yields i32:val = i32 3
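
The optional modifiers described above can be combined on a single ``load``.
The following is a minimal sketch; the function name, operand names, and
metadata node numbers are assumptions chosen for illustration.

.. code-block:: llvm

    define ptr @load_variants(ptr %pp, ptr %ip) {
      ; Volatile load: not reordered with respect to other volatile operations.
      %v = load volatile i32, ptr %ip, align 4
      ; Atomic load: an ordering and an explicit alignment are required.
      %a = load atomic i32, ptr %ip acquire, align 4
      ; Pointer load annotated as non-null, 16-byte aligned, and well defined.
      ; Because !noundef is present, violating !nonnull or !align is undefined
      ; behavior rather than merely producing poison.
      %p = load ptr, ptr %pp, align 8, !nonnull !0, !align !1, !noundef !0
      ret ptr %p
    }

    !0 = !{}
    !1 = !{i64 16}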
10630 '``store``' Instruction
10631 ^^^^^^^^^^^^^^^^^^^^^^^
10638 store [volatile] <ty> <value>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>] ; yields void
10639 store atomic [volatile] <ty> <value>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
10640 !<nontemp_node> = !{ i32 1 }
10641 !<empty_node> = !{}
10646 The '``store``' instruction is used to write to memory.
10651 There are two arguments to the ``store`` instruction: a value to store and an
10652 address at which to store it. The type of the ``<pointer>`` operand must be a
10653 pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
10654 operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
10655 allowed to modify the number or order of execution of this ``store`` with other
10656 :ref:`volatile operations <volatile>`. Only values of :ref:`first class
10657 <t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
10658 structural type <t_opaque>`) can be stored.
10660 If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
10661 <ordering>` and optional ``syncscope("<target-scope>")`` argument. The
10662 ``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
10663 Atomic loads produce :ref:`defined <memmodel>` results when they may see
10664 multiple atomic stores. The type of the pointee must be an integer, pointer, or
10665 floating-point type whose bit width is a power of two greater than or equal to
10666 eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic stores. Note: if the alignment is not greater than
or equal to the size of the ``<value>`` type, the atomic operation is likely to
10669 require a lock and have poor performance. ``!nontemporal`` does not have any
10670 defined semantics for atomic stores.
10672 The optional constant ``align`` argument specifies the alignment of the
10673 operation (that is, the alignment of the memory address). It is the
10674 responsibility of the code emitter to ensure that the alignment information is
10675 correct. Overestimating the alignment results in undefined behavior.
10676 Underestimating the alignment may produce less efficient code. An alignment of
10677 1 is always safe. The maximum possible alignment is ``1 << 32``. An alignment
value higher than the size of the stored type implies memory up to the
alignment value bytes can be stored to without trapping in the default
10680 address space. Access of the high bytes can interfere with debugging tools, so
10681 should not be accessed if the function has the ``sanitize_thread`` or
10682 ``sanitize_address`` attributes.
10684 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10685 always present. An omitted ``align`` argument means that the operation has the
10686 ABI alignment for the target.
10688 The optional ``!nontemporal`` metadata must reference a single metadata
10689 name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
10690 of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
10692 be reused in the cache. The code generator may select special
10693 instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
10696 The optional ``!invariant.group`` metadata must reference a
10697 single metadata name ``<empty_node>``. See ``invariant.group`` metadata.
10702 The contents of memory are updated to contain ``<value>`` at the
10703 location specified by the ``<pointer>`` operand. If ``<value>`` is
10704 of scalar type then the number of bytes written does not exceed the
10705 minimum number of bytes needed to hold all bits of the type. For
10706 example, storing an ``i24`` writes at most three bytes. When writing a
10707 value of a type like ``i20`` with a size that is not an integral number
10708 of bytes, it is unspecified what happens to the extra bits that do not
10709 belong to the type, but they will typically be overwritten.
10710 If ``<value>`` is of aggregate type, padding is filled with
10711 :ref:`undef <undefvalues>`.
10712 If ``<pointer>`` is not a well-defined value, the behavior is undefined.
10717 .. code-block:: llvm
10719 %ptr = alloca i32 ; yields ptr
10720 store i32 3, ptr %ptr ; yields void
10721 %val = load i32, ptr %ptr ; yields i32:val = i32 3
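
The ``store`` forms described above can be sketched side by side as follows.
The function name and metadata node number are illustrative assumptions.

.. code-block:: llvm

    define void @store_variants(ptr %p, i32 %v) {
      ; Volatile store: not reordered with respect to other volatile operations.
      store volatile i32 %v, ptr %p, align 4
      ; Atomic store: 'acquire' and 'acq_rel' are not valid orderings here,
      ; and the alignment must be given explicitly.
      store atomic i32 %v, ptr %p release, align 4
      ; Non-temporal hint: the stored data is not expected to be reused soon.
      store i32 %v, ptr %p, align 4, !nontemporal !0
      ret void
    }

    !0 = !{i32 1}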
10725 '``fence``' Instruction
10726 ^^^^^^^^^^^^^^^^^^^^^^^
10733 fence [syncscope("<target-scope>")] <ordering> ; yields void
10738 The '``fence``' instruction is used to introduce happens-before edges
10739 between operations.
10744 '``fence``' instructions take an :ref:`ordering <ordering>` argument which
10745 defines what *synchronizes-with* edges they add. They can only be given
10746 ``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
10751 A fence A which has (at least) ``release`` ordering semantics
10752 *synchronizes with* a fence B with (at least) ``acquire`` ordering
10753 semantics if and only if there exist atomic operations X and Y, both
10754 operating on some atomic object M, such that A is sequenced before X, X
10755 modifies M (either directly or through some side effect of a sequence
10756 headed by X), Y is sequenced before B, and Y observes M. This provides a
10757 *happens-before* dependency between A and B. Rather than an explicit
10758 ``fence``, one (but not both) of the atomic operations X or Y might
10759 provide a ``release`` or ``acquire`` (resp.) ordering constraint and
10760 still *synchronize-with* the explicit ``fence`` and establish the
10761 *happens-before* edge.
10763 A ``fence`` which has ``seq_cst`` ordering, in addition to having both
10764 ``acquire`` and ``release`` semantics specified above, participates in
10765 the global program order of other ``seq_cst`` operations and/or fences.
10767 A ``fence`` instruction can also take an optional
10768 ":ref:`syncscope <syncscope>`" argument.
10773 .. code-block:: text
10775 fence acquire ; yields void
10776 fence syncscope("singlethread") seq_cst ; yields void
10777 fence syncscope("agent") seq_cst ; yields void
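
The synchronization rule above can be illustrated with a message-passing
sketch. The function and value names are hypothetical; the comments map each
instruction to the fences A and B and the atomic operations X and Y used in
the semantics description.

.. code-block:: llvm

    define void @producer(ptr %data, ptr %flag) {
      store i32 42, ptr %data, align 4
      fence release                                      ; fence A
      store atomic i32 1, ptr %flag monotonic, align 4   ; X modifies M (%flag)
      ret void
    }

    define i32 @consumer(ptr %data, ptr %flag) {
    entry:
      br label %spin

    spin:
      %f = load atomic i32, ptr %flag monotonic, align 4 ; Y observes M (%flag)
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      fence acquire                                      ; fence B
      ; A synchronizes with B, so the plain store of 42 happens before this load.
      %v = load i32, ptr %data, align 4
      ret i32 %v
    }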
10781 '``cmpxchg``' Instruction
10782 ^^^^^^^^^^^^^^^^^^^^^^^^^
10789 cmpxchg [weak] [volatile] ptr <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields { ty, i1 }
10794 The '``cmpxchg``' instruction is used to atomically modify memory. It
10795 loads a value in memory and compares it to a given value. If they are
10796 equal, it tries to store a new value into the memory.
10801 There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
10803 address, and a new value to place at that address if the compared values
10804 are equal. The type of '<cmp>' must be an integer or pointer type whose
10805 bit width is a power of two greater than or equal to eight and less
10806 than or equal to a target-specific size limit. '<cmp>' and '<new>' must
10807 have the same type, and the type of '<pointer>' must be a pointer to
10808 that type. If the ``cmpxchg`` is marked as ``volatile``, then the
10809 optimizer is not allowed to modify the number or order of execution of
10810 this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
10812 The success and failure :ref:`ordering <ordering>` arguments specify how this
10813 ``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``; the failure ordering cannot be either
10815 ``release`` or ``acq_rel``.
10817 A ``cmpxchg`` instruction can also take an optional
10818 ":ref:`syncscope <syncscope>`" argument.
Note: if the alignment is not greater than or equal to the size of the ``<value>``
10821 type, the atomic operation is likely to require a lock and have poor
10824 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10825 always present. If unspecified, the alignment is assumed to be equal to the
10826 size of the '<value>' type. Note that this default alignment assumption is
10827 different from the alignment used for the load/store instructions when align
10830 The pointer passed into cmpxchg must have alignment greater than or
10831 equal to the size in memory of the operand.
10836 The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
10838 written to the location. The original value at the location is returned,
10839 together with a flag indicating success (true) or failure (false).
10841 If the cmpxchg operation is marked as ``weak`` then a spurious failure is
10842 permitted: the operation may not write ``<new>`` even if the comparison
10845 If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
10846 if the value loaded equals ``cmp``.
10848 A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
10849 identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
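
As a minimal sketch of a strong ``cmpxchg`` used outside a retry loop, the
hypothetical function below attempts to take a lock word exactly once. The
name ``@try_lock`` and the convention that 0 means unlocked and 1 means locked
are assumptions made for illustration.

.. code-block:: llvm

    define i1 @try_lock(ptr %lock) {
      ; Success ordering 'acquire'; failure ordering 'monotonic' (the failure
      ; ordering may not be 'release' or 'acq_rel').
      %pair = cmpxchg ptr %lock, i32 0, i32 1 acquire monotonic, align 4
      ; Because the operation is strong, the i1 is true if and only if the
      ; value loaded from %lock was 0.
      %acquired = extractvalue { i32, i1 } %pair, 1
      ret i1 %acquired
    }

With ``weak`` added, the same instruction could fail spuriously, so a caller
would normally wrap it in a loop.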
10855 .. code-block:: llvm
10858 %orig = load atomic i32, ptr %ptr unordered, align 4 ; yields i32
10862 %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
10863 %squared = mul i32 %cmp, %cmp
10864 %val_success = cmpxchg ptr %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields { i32, i1 }
10865 %value_loaded = extractvalue { i32, i1 } %val_success, 0
10866 %success = extractvalue { i32, i1 } %val_success, 1
10867 br i1 %success, label %done, label %loop
10874 '``atomicrmw``' Instruction
10875 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
10882 atomicrmw [volatile] <operation> ptr <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>] ; yields ty
10887 The '``atomicrmw``' instruction is used to atomically modify memory.
10892 There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
10894 operation. The operation must be one of the following keywords:
10914 For most of these operations, the type of '<value>' must be an integer
10915 type whose bit width is a power of two greater than or equal to eight
10916 and less than or equal to a target-specific size limit. For xchg, this
10917 may also be a floating point or a pointer type with the same size constraints
10918 as integers. For fadd/fsub/fmax/fmin, this must be a floating point type. The
10919 type of the '``<pointer>``' operand must be a pointer to that type. If
10920 the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
10921 allowed to modify the number or order of execution of this
10922 ``atomicrmw`` with other :ref:`volatile operations <volatile>`.
Note: if the alignment is not greater than or equal to the size of the ``<value>``
10925 type, the atomic operation is likely to require a lock and have poor
10928 The alignment is only optional when parsing textual IR; for in-memory IR, it is
10929 always present. If unspecified, the alignment is assumed to be equal to the
10930 size of the '<value>' type. Note that this default alignment assumption is
10931 different from the alignment used for the load/store instructions when align
An ``atomicrmw`` instruction can also take an optional
10935 ":ref:`syncscope <syncscope>`" argument.
10940 The contents of memory at the location specified by the '``<pointer>``'
10941 operand are atomically read, modified, and written back. The original
10942 value at the location is returned. The modification is specified by the
10943 operation argument:
10945 - xchg: ``*ptr = val``
10946 - add: ``*ptr = *ptr + val``
10947 - sub: ``*ptr = *ptr - val``
10948 - and: ``*ptr = *ptr & val``
10949 - nand: ``*ptr = ~(*ptr & val)``
10950 - or: ``*ptr = *ptr | val``
10951 - xor: ``*ptr = *ptr ^ val``
10952 - max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
10953 - min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
10954 - umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
10955 - umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
10956 - fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
10957 - fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
- fmax: ``*ptr = maxnum(*ptr, val)`` (matches the ``llvm.maxnum.*`` intrinsic)
- fmin: ``*ptr = minnum(*ptr, val)`` (matches the ``llvm.minnum.*`` intrinsic)
10960 - uinc_wrap: ``*ptr = (*ptr u>= val) ? 0 : (*ptr + 1)`` (increment value with wraparound to zero when incremented above input value)
10961 - udec_wrap: ``*ptr = ((*ptr == 0) || (*ptr u> val)) ? val : (*ptr - 1)`` (decrement with wraparound to input value when decremented below zero).
10967 .. code-block:: llvm
10969 %old = atomicrmw add ptr %ptr, i32 1 acquire ; yields i32
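
A few more operations from the list above, as a hedged sketch. The function
and operand names are illustrative; each instruction returns the value that
was in memory before the update.

.. code-block:: llvm

    define void @rmw_variants(ptr %counter, ptr %fsum, ptr %ring) {
      ; Integer add with an explicit alignment.
      %old.add = atomicrmw add ptr %counter, i32 1 seq_cst, align 4
      ; Unsigned maximum: memory becomes max(*ptr, 100), compared as unsigned.
      %old.max = atomicrmw umax ptr %counter, i32 100 monotonic
      ; Floating-point add requires a floating-point operand type.
      %old.sum = atomicrmw fadd ptr %fsum, float 1.0 monotonic, align 4
      ; Wrapping increment: steps 0, 1, ..., 7 and then returns to 0.
      %old.idx = atomicrmw uinc_wrap ptr %ring, i32 7 monotonic
      ret void
    }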
10971 .. _i_getelementptr:
10973 '``getelementptr``' Instruction
10974 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10981 <result> = getelementptr <ty>, ptr <ptrval>{, [inrange] <ty> <idx>}*
10982 <result> = getelementptr inbounds <ty>, ptr <ptrval>{, [inrange] <ty> <idx>}*
10983 <result> = getelementptr <ty>, <N x ptr> <ptrval>, [inrange] <vector index type> <idx>
10988 The '``getelementptr``' instruction is used to get the address of a
10989 subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
10990 address calculation only and does not access memory. The instruction can also
10991 be used to calculate a vector of such addresses.
10996 The first argument is always a type used as the basis for the calculations.
10997 The second argument is always a pointer or a vector of pointers, and is the
10998 base address to start from. The remaining arguments are indices
10999 that indicate which of the elements of the aggregate object are indexed.
11000 The interpretation of each index is dependent on the type being indexed
11001 into. The first index always indexes the pointer value given as the
11002 second argument, the second index indexes a value of the type pointed to
11003 (not necessarily the value directly pointed to, since the first index
11004 can be non-zero), etc. The first type indexed into must be a pointer
11005 value, subsequent types can be arrays, vectors, and structs. Note that
11006 subsequent types being indexed into can never be pointers, since that
11007 would require loading the pointer before continuing calculation.
11009 The type of each index argument depends on the type it is indexing into.
11010 When indexing into a (optionally packed) structure, only ``i32`` integer
11011 **constants** are allowed (when using a vector of indices they must all
11012 be the **same** ``i32`` integer constant). When indexing into an array,
11013 pointer or vector, integers of any width are allowed, and they are not
11014 required to be constant. These integers are treated as signed values
11017 For example, let's consider a C code fragment and how it gets compiled
11033 int *foo(struct ST *s) {
11034 return &s[1].Z.B[5][13];
11037 The LLVM code generated by Clang is approximately:
11039 .. code-block:: llvm
11041 %struct.RT = type { i8, [10 x [20 x i32]], i8 }
11042 %struct.ST = type { i32, double, %struct.RT }
11044 define ptr @foo(ptr %s) {
11046 %arrayidx = getelementptr inbounds %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13
11053 In the example above, the first index is indexing into the
11054 '``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
11055 = '``{ i32, double, %struct.RT }``' type, a structure. The second index
11056 indexes into the third element of the structure, yielding a
11057 '``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
11058 structure. The third index indexes into the second element of the
11059 structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
11060 dimensions of the array are subscripted into, yielding an '``i32``'
11061 type. The '``getelementptr``' instruction returns a pointer to this
11064 Note that it is perfectly legal to index partially through a structure,
11065 returning a pointer to an inner element. Because of this, the LLVM code
11066 for the given testcase is equivalent to:
11068 .. code-block:: llvm
11070 define ptr @foo(ptr %s) {
11071 %t1 = getelementptr %struct.ST, ptr %s, i32 1
11072 %t2 = getelementptr %struct.ST, ptr %t1, i32 0, i32 2
11073 %t3 = getelementptr %struct.RT, ptr %t2, i32 0, i32 1
11074 %t4 = getelementptr [10 x [20 x i32]], ptr %t3, i32 0, i32 5
11075 %t5 = getelementptr [20 x i32], ptr %t4, i32 0, i32 13
11079 The indices are first converted to offsets in the pointer's index type. If the
11080 currently indexed type is a struct type, the struct offset corresponding to the
11081 index is sign-extended or truncated to the pointer index type. Otherwise, the
11082 index itself is sign-extended or truncated, and then multiplied by the type
11083 allocation size (that is, the size rounded up to the ABI alignment) of the
11084 currently indexed type.
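
To make the offset computation concrete, the ``@foo`` example can be worked
through by hand. The sketch below assumes a data layout in which ``i32`` is
4-byte aligned and ``double`` is 8-byte aligned (typical for many 64-bit
targets, but an assumption here); other layouts give different numbers.

.. code-block:: llvm

    ; Assumed layout (depends on the datalayout string):
    ;   %struct.RT: i8 at 0, [10 x [20 x i32]] at 4, i8 at 804, alloc size 808
    ;   %struct.ST: i32 at 0, double at 8, %struct.RT at 16, alloc size 824
    ;
    ;   i64 1  ->  1 * 824 = 824   (scaled by the alloc size of %struct.ST)
    ;   i32 2  ->  16              (struct offset of field 2 in %struct.ST)
    ;   i32 1  ->  4               (struct offset of field 1 in %struct.RT)
    ;   i64 5  ->  5 * 80  = 400   (scaled by the alloc size of [20 x i32])
    ;   i64 13 ->  13 * 4  = 52    (scaled by the alloc size of i32)
    ;   total  =   1296 bytes
    ;
    ; Under these assumptions the same address is produced by:
    %arrayidx.bytes = getelementptr inbounds i8, ptr %s, i64 1296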
11086 The offsets are then added to the low bits of the base address up to the index
11087 type width, with silently-wrapping two's complement arithmetic. If the pointer
11088 size is larger than the index size, this means that the bits outside the index
11089 type width will not be affected.
11091 The result value of the ``getelementptr`` may be outside the object pointed
11092 to by the base pointer. The result value may not necessarily be used to access
11093 memory though, even if it happens to point into allocated storage. See the
11094 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
11097 If the ``inbounds`` keyword is present, the result value of a
11098 ``getelementptr`` with any non-zero indices is a
11099 :ref:`poison value <poisonvalues>` if one of the following rules is violated:
11101 * The base pointer has an *in bounds* address of an allocated object, which
11102 means that it points into an allocated object, or to its end. Note that the
11103 object does not have to be live anymore; being in-bounds of a deallocated
11104 object is sufficient.
11105 * If the type of an index is larger than the pointer index type, the
11106 truncation to the pointer index type preserves the signed value.
11107 * The multiplication of an index by the type size does not wrap the pointer
11108 index type in a signed sense (``nsw``).
11109 * The successive addition of each offset (without adding the base address) does
11110 not wrap the pointer index type in a signed sense (``nsw``).
11111 * The successive addition of the current address, interpreted as an unsigned
11112 number, and each offset, interpreted as a signed number, does not wrap the
11113 unsigned address space and remains *in bounds* of the allocated object.
11114 As a corollary, if the added offset is non-negative, the addition does not
11115 wrap in an unsigned sense (``nuw``).
11116 * In cases where the base is a vector of pointers, the ``inbounds`` keyword
11117 applies to each of the computations element-wise.
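
The following sketch applies these rules to a 16-byte stack object. The
function and value names are hypothetical.

.. code-block:: llvm

    define void @inbounds_cases() {
      %buf = alloca [4 x i32], align 4
      ; Offset 12: points at the last element, in bounds.
      %last = getelementptr inbounds [4 x i32], ptr %buf, i64 0, i64 3
      ; Offset 16: one past the end, still in bounds (but not loadable).
      %end = getelementptr inbounds [4 x i32], ptr %buf, i64 0, i64 4
      ; Offset 20: beyond the end of the object, so the result is poison.
      %bad = getelementptr inbounds [4 x i32], ptr %buf, i64 0, i64 5
      ; Without 'inbounds' the same computation is not poison, although the
      ; result still may not be used to access memory.
      %far = getelementptr [4 x i32], ptr %buf, i64 0, i64 5
      ret void
    }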
11119 Note that ``getelementptr`` with all-zero indices is always considered to be
11120 ``inbounds``, even if the base pointer does not point to an allocated object.
11121 As a corollary, the only pointer in bounds of the null pointer in the default
11122 address space is the null pointer itself.
11124 These rules are based on the assumption that no allocated object may cross
11125 the unsigned address space boundary, and no allocated object may be larger
11126 than half the pointer index type space.
11128 If the ``inrange`` keyword is present before any index, loading from or
11129 storing to any pointer derived from the ``getelementptr`` has undefined
11130 behavior if the load or store would access memory outside of the bounds of
11131 the element selected by the index marked as ``inrange``. The result of a
11132 pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
11133 involving memory) involving a pointer derived from a ``getelementptr`` with
11134 the ``inrange`` keyword is undefined, with the exception of comparisons
11135 in the case where both operands are in the range of the element selected
11136 by the ``inrange`` keyword, inclusive of the address one past the end of
11137 that element. Note that the ``inrange`` keyword is currently only allowed
11138 in constant ``getelementptr`` expressions.
11140 The getelementptr instruction is often confusing. For some more insight
11141 into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
11146 .. code-block:: llvm
11148 %aptr = getelementptr {i32, [12 x i8]}, ptr %saptr, i64 0, i32 1
11149 %vptr = getelementptr {i32, <2 x i8>}, ptr %svptr, i64 0, i32 1, i32 1
11150 %eptr = getelementptr [12 x i8], ptr %aptr, i64 0, i32 1
11151 %iptr = getelementptr [10 x i32], ptr @arr, i16 0, i16 0
11153 Vector of pointers:
11154 """""""""""""""""""
11156 The ``getelementptr`` returns a vector of pointers, instead of a single address,
11157 when one or more of its arguments is a vector. In such cases, all vector
11158 arguments should have the same number of elements, and every scalar argument
11159 will be effectively broadcast into a vector during address calculation.
11161 .. code-block:: llvm
11163 ; All arguments are vectors:
11164 ; A[i] = ptrs[i] + offsets[i]*sizeof(i8)
    %A = getelementptr i8, <4 x ptr> %ptrs, <4 x i64> %offsets
11167 ; Add the same scalar offset to each pointer of a vector:
11168 ; A[i] = ptrs[i] + offset*sizeof(i8)
11169 %A = getelementptr i8, <4 x ptr> %ptrs, i64 %offset
11171 ; Add distinct offsets to the same pointer:
11172 ; A[i] = ptr + offsets[i]*sizeof(i8)
11173 %A = getelementptr i8, ptr %ptr, <4 x i64> %offsets
11175 ; In all cases described above the type of the result is <4 x ptr>
11177 The two following instructions are equivalent:
11179 .. code-block:: llvm
11181 getelementptr %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
11182 <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
11183 <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
11185 <4 x i64> <i64 13, i64 13, i64 13, i64 13>
11187 getelementptr %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
11188 i32 2, i32 1, <4 x i32> %ind4, i64 13
11190 Let's look at the C code, where the vector version of ``getelementptr``
11195 // Let's assume that we vectorize the following loop:
11196 double *A, *B; int *C;
11197 for (int i = 0; i < size; ++i) {
11201 .. code-block:: llvm
11203 ; get pointers for 8 elements from array B
11204 %ptrs = getelementptr double, ptr %B, <8 x i32> %C
11205 ; load 8 elements from array B into A
    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0(<8 x ptr> %ptrs,
         i32 8, <8 x i1> %mask, <8 x double> %passthru)
11209 Conversion Operations
11210 ---------------------
11212 The instructions in this category are the conversion instructions
11213 (casting) which all take a single operand and a type. They perform
11214 various bit conversions on the operand.
11218 '``trunc .. to``' Instruction
11219 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11226 <result> = trunc <ty> <value> to <ty2> ; yields ty2
11231 The '``trunc``' instruction truncates its operand to the type ``ty2``.
11236 The '``trunc``' instruction takes a value to trunc, and a type to trunc
11237 it to. Both types must be of :ref:`integer <t_integer>` types, or vectors
11238 of the same number of integers. The bit size of the ``value`` must be
11239 larger than the bit size of the destination type, ``ty2``. Equal sized
11240 types are not allowed.
11245 The '``trunc``' instruction truncates the high order bits in ``value``
11246 and converts the remaining bits to ``ty2``. Since the source size must
11247 be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
11248 It will always truncate bits.
11253 .. code-block:: llvm
11255 %X = trunc i32 257 to i8 ; yields i8:1
11256 %Y = trunc i32 123 to i1 ; yields i1:true
11257 %Z = trunc i32 122 to i1 ; yields i1:false
11258 %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
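The semantics above amount to keeping the low-order bits of the operand. A minimal Python sketch, assuming integers are represented as non-negative bit patterns (the function name ``trunc`` and its parameters are chosen for this illustration only):

```python
def trunc(value, dst_bits):
    """Keep the low `dst_bits` bits of `value` and discard the high-order bits."""
    return value & ((1 << dst_bits) - 1)
```

With this model, ``trunc(257, 8)`` gives ``1`` and ``trunc(122, 1)`` gives ``0``, matching the examples above.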
11262 '``zext .. to``' Instruction
11263 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11270 <result> = zext <ty> <value> to <ty2> ; yields ty2
11275 The '``zext``' instruction zero extends its operand to type ``ty2``.
11277 The ``nneg`` (non-negative) flag, if present, specifies that the operand is
11278 non-negative. This property may be used by optimization passes to later
11279 convert the ``zext`` into a ``sext``.
11284 The '``zext``' instruction takes a value to cast, and a type to cast it
11285 to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
11286 the same number of integers. The bit size of the ``value`` must be
11287 smaller than the bit size of the destination type, ``ty2``.
11292 The ``zext`` fills the high order bits of the ``value`` with zero bits
11293 until it reaches the size of the destination type, ``ty2``.
11295 When zero extending from i1, the result will always be either 0 or 1.
If the ``nneg`` flag is set, and the ``zext`` argument is negative, the result
is a poison value.
11303 .. code-block:: llvm
11305 %X = zext i32 257 to i64 ; yields i64:257
11306 %Y = zext i1 true to i32 ; yields i32:1
11307 %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
11309 %a = zext nneg i8 127 to i16 ; yields i16 127
11310 %b = zext nneg i8 -1 to i16 ; yields i16 poison
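The effect of ``zext`` and of the ``nneg`` flag can be sketched in Python. This is an illustrative model, not LLVM code: the name ``zext``, the ``POISON`` placeholder, and the representation of values as bit patterns are assumptions of the example.

```python
POISON = None  # hypothetical stand-in for an LLVM poison value


def zext(bits, src_bits, nneg=False):
    """Zero extend the `src_bits`-wide bit pattern `bits`.

    The new high-order bits are zero, so the unsigned value is unchanged.
    With `nneg`, an operand whose sign bit is set yields POISON.
    """
    bits &= (1 << src_bits) - 1
    sign_bit_set = (bits >> (src_bits - 1)) & 1
    if nneg and sign_bit_set:
        return POISON
    return bits
```

Here ``zext(0xFF, 8)`` gives ``255``, whereas ``zext(0xFF, 8, nneg=True)`` gives ``POISON`` because the bit pattern ``0xFF`` is ``-1`` when read as a signed ``i8``.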
11314 '``sext .. to``' Instruction
11315 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11322 <result> = sext <ty> <value> to <ty2> ; yields ty2
11327 The '``sext``' sign extends ``value`` to the type ``ty2``.
11332 The '``sext``' instruction takes a value to cast, and a type to cast it
11333 to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
11334 the same number of integers. The bit size of the ``value`` must be
11335 smaller than the bit size of the destination type, ``ty2``.
11340 The '``sext``' instruction performs a sign extension by copying the sign
11341 bit (highest order bit) of the ``value`` until it reaches the bit size
11342 of the type ``ty2``.
11344 When sign extending from i1, the extension always results in -1 or 0.
11349 .. code-block:: llvm
%X = sext i8 -1 to i16 ; yields i16:65535 (the same bit pattern as i16 -1)
11352 %Y = sext i1 true to i32 ; yields i32:-1
11353 %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
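Sign extension copies the sign bit into every new high-order bit. The following Python sketch models this on bit patterns; the names ``sext`` and ``to_signed`` are hypothetical helpers introduced for the illustration.

```python
def sext(bits, src_bits, dst_bits):
    """Sign extend a `src_bits`-wide bit pattern to `dst_bits` bits."""
    bits &= (1 << src_bits) - 1
    if bits >> (src_bits - 1):  # sign bit is set
        # Fill every bit between src_bits and dst_bits with ones.
        bits |= ((1 << dst_bits) - 1) & ~((1 << src_bits) - 1)
    return bits


def to_signed(bits, width):
    """Read a `width`-bit pattern as a two's complement signed integer."""
    return bits - (1 << width) if bits >> (width - 1) else bits
```

In this model ``sext(0xFF, 8, 16)`` is ``65535``, which ``to_signed`` reads back as ``-1``, and ``sext(1, 1, 32)`` is the all-ones pattern, i.e. ``-1`` as an ``i32``.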
11355 '``fptrunc .. to``' Instruction
11356 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11363 <result> = fptrunc <ty> <value> to <ty2> ; yields ty2
11368 The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
11373 The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
11374 value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
11375 The size of ``value`` must be larger than the size of ``ty2``. This
11376 implies that ``fptrunc`` cannot be used to make a *no-op cast*.
11381 The '``fptrunc``' instruction casts a ``value`` from a larger
11382 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
11383 <t_floating>` type.
11384 This instruction is assumed to execute in the default :ref:`floating-point
11385 environment <floatenv>`.
NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that *if* a
11388 NaN payload is propagated from the input ("Quieting NaN propagation" or
11389 "Unchanged NaN propagation" cases), then the low order bits of the NaN payload
11390 which cannot fit in the resulting type are discarded. Note that if discarding
11391 the low order bits leads to an all-0 payload, this cannot be represented as a
11392 signaling NaN (it would represent an infinity instead), so in that case
11393 "Unchanged NaN propagation" is not possible.
11398 .. code-block:: llvm
11400 %X = fptrunc double 16777217.0 to float ; yields float:16777216.0
11401 %Y = fptrunc double 1.0E+300 to half ; yields half:+infinity
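The rounding in the first example can be reproduced with Python's standard ``struct`` module, which converts a Python float (an IEEE double) to single precision when packing with the ``f`` format. This is a sketch for illustration; the helper name is an assumption, and it covers only the ``double`` to ``float`` case.

```python
import struct


def fptrunc_double_to_float(x):
    """Round an IEEE double to the nearest IEEE single and return it as a float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]
```

Note that this helper is not a complete model: for finite inputs that are too large for single precision, ``struct.pack`` raises ``OverflowError`` instead of producing an infinity.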
11403 '``fpext .. to``' Instruction
11404 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11411 <result> = fpext <ty> <value> to <ty2> ; yields ty2
The '``fpext``' extends a floating-point ``value`` to a larger floating-point
value.
11422 The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
11423 ``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
11424 to. The source type must be smaller than the destination type.
11429 The '``fpext``' instruction extends the ``value`` from a smaller
11430 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
11431 <t_floating>` type. The ``fpext`` cannot be used to make a
11432 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
11433 *no-op cast* for a floating-point cast.
NaN values follow the usual :ref:`NaN behaviors <floatnan>`, except that *if* a
11436 NaN payload is propagated from the input ("Quieting NaN propagation" or
11437 "Unchanged NaN propagation" cases), then it is copied to the high order bits of
11438 the resulting payload, and the remaining low order bits are zero.
11443 .. code-block:: llvm
11445 %X = fpext float 3.125 to double ; yields double:3.125000e+00
11446 %Y = fpext double %X to fp128 ; yields fp128:0xL00000000000000004000900000000000
11448 '``fptoui .. to``' Instruction
11449 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11456 <result> = fptoui <ty> <value> to <ty2> ; yields ty2
11461 The '``fptoui``' converts a floating-point ``value`` to its unsigned
11462 integer equivalent of type ``ty2``.
11467 The '``fptoui``' instruction takes a value to cast, which must be a
11468 scalar or vector :ref:`floating-point <t_floating>` value, and a type to
11469 cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
11470 ``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
11476 The '``fptoui``' instruction converts its :ref:`floating-point
11477 <t_floating>` operand into the nearest (rounding towards zero)
11478 unsigned integer value. If the value cannot fit in ``ty2``, the result
11479 is a :ref:`poison value <poisonvalues>`.
11484 .. code-block:: llvm
11486 %X = fptoui double 123.0 to i32 ; yields i32:123
%Y = fptoui float 1.0E+300 to i1 ; yields poison (value does not fit in i1)
%Z = fptoui float 1.04E+17 to i8 ; yields poison (value does not fit in i8)
11490 '``fptosi .. to``' Instruction
11491 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11498 <result> = fptosi <ty> <value> to <ty2> ; yields ty2
11503 The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
11504 ``value`` to type ``ty2``.
11509 The '``fptosi``' instruction takes a value to cast, which must be a
11510 scalar or vector :ref:`floating-point <t_floating>` value, and a type to
11511 cast it to ``ty2``, which must be an :ref:`integer <t_integer>` type. If
11512 ``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
11518 The '``fptosi``' instruction converts its :ref:`floating-point
11519 <t_floating>` operand into the nearest (rounding towards zero)
11520 signed integer value. If the value cannot fit in ``ty2``, the result
11521 is a :ref:`poison value <poisonvalues>`.
11526 .. code-block:: llvm
11528 %X = fptosi double -123.0 to i32 ; yields i32:-123
11529 %Y = fptosi float 1.0E-247 to i1 ; yields undefined:1
%Z = fptosi float 1.04E+17 to i8 ; yields poison (value does not fit in i8)
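Both ``fptoui`` and ``fptosi`` round towards zero and produce poison when the rounded value does not fit in the destination type. A Python sketch of that rule follows; ``fp_to_int`` and ``POISON`` are hypothetical names, and treating NaN and infinities as not fitting is an assumption of this model.

```python
import math

POISON = None  # hypothetical stand-in for an LLVM poison value


def fp_to_int(x, bits, signed):
    """Convert `x` by rounding towards zero; POISON if the result does not fit."""
    if math.isnan(x) or math.isinf(x):
        return POISON
    n = math.trunc(x)  # rounds towards zero
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return n if lo <= n <= hi else POISON
```

For instance, ``fp_to_int(-2.9, 8, signed=True)`` is ``-2``, while ``fp_to_int(1.04e17, 8, signed=False)`` is ``POISON``.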
11532 '``uitofp .. to``' Instruction
11533 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11540 <result> = uitofp <ty> <value> to <ty2> ; yields ty2
11545 The '``uitofp``' instruction regards ``value`` as an unsigned integer
11546 and converts that value to the ``ty2`` type.
11551 The '``uitofp``' instruction takes a value to cast, which must be a
11552 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
11560 The '``uitofp``' instruction interprets its operand as an unsigned
11561 integer quantity and converts it to the corresponding floating-point
11562 value. If the value cannot be exactly represented, it is rounded using
11563 the default rounding mode.
11569 .. code-block:: llvm
11571 %X = uitofp i32 257 to float ; yields float:257.0
11572 %Y = uitofp i8 -1 to double ; yields double:255.0
11574 '``sitofp .. to``' Instruction
11575 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11582 <result> = sitofp <ty> <value> to <ty2> ; yields ty2
11587 The '``sitofp``' instruction regards ``value`` as a signed integer and
11588 converts that value to the ``ty2`` type.
11593 The '``sitofp``' instruction takes a value to cast, which must be a
11594 scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
11602 The '``sitofp``' instruction interprets its operand as a signed integer
11603 quantity and converts it to the corresponding floating-point value. If the
value cannot be exactly represented, it is rounded using the default rounding
mode.
11610 .. code-block:: llvm
11612 %X = sitofp i32 257 to float ; yields float:257.0
11613 %Y = sitofp i8 -1 to double ; yields double:-1.0
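The only difference between ``uitofp`` and ``sitofp`` is how the operand's bit pattern is interpreted before conversion. A Python sketch, assuming a ``double`` destination and hypothetical helper names:

```python
def uitofp(bits, width):
    """Interpret a `width`-bit pattern as unsigned and convert it to a float."""
    return float(bits & ((1 << width) - 1))


def sitofp(bits, width):
    """Interpret a `width`-bit pattern as two's complement and convert it to a float."""
    bits &= (1 << width) - 1
    if bits >> (width - 1):  # sign bit set: value is negative
        bits -= 1 << width
    return float(bits)
```

The ``i8`` pattern ``0xFF`` (written ``-1`` in the examples above) therefore converts to ``255.0`` with ``uitofp`` and to ``-1.0`` with ``sitofp``.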
11617 '``ptrtoint .. to``' Instruction
11618 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11625 <result> = ptrtoint <ty> <value> to <ty2> ; yields ty2
11630 The '``ptrtoint``' instruction converts the pointer or a vector of
11631 pointers ``value`` to the integer (or vector of integers) type ``ty2``.
11636 The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
11637 a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
11638 type to cast it to ``ty2``, which must be an :ref:`integer <t_integer>` or
11639 a vector of integers type.
11644 The '``ptrtoint``' instruction converts ``value`` to integer type
11645 ``ty2`` by interpreting the pointer value as an integer and either
11646 truncating or zero extending that value to the size of the integer type.
11647 If ``value`` is smaller than ``ty2`` then a zero extension is done. If
11648 ``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.
11655 .. code-block:: llvm
11657 %X = ptrtoint ptr %P to i8 ; yields truncation on 32-bit architecture
11658 %Y = ptrtoint ptr %P to i64 ; yields zero extension on 32-bit architecture
11659 %Z = ptrtoint <4 x ptr> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture
11663 '``inttoptr .. to``' Instruction
11664 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11671 <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>] ; yields ty2
11676 The '``inttoptr``' instruction converts an integer ``value`` to a
11677 pointer type, ``ty2``.
11682 The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.
11686 The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
11689 See ``dereferenceable`` metadata.
11691 The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
11694 See ``dereferenceable_or_null`` metadata.
11699 The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
11700 applying either a zero extension or a truncation depending on the size
11701 of the integer ``value``. If ``value`` is larger than the size of a
11702 pointer then a truncation is done. If ``value`` is smaller than the size
11703 of a pointer then a zero extension is done. If they are the same size,
11704 nothing is done (*no-op cast*).
11709 .. code-block:: llvm
11711 %X = inttoptr i32 255 to ptr ; yields zero extension on 64-bit architecture
11712 %Y = inttoptr i32 255 to ptr ; yields no-op on 32-bit architecture
11713 %Z = inttoptr i64 0 to ptr ; yields truncation on 32-bit architecture
%W = inttoptr <4 x i32> %G to <4 x ptr> ; yields a vector of four pointers, converted element by element
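The size rules for ``ptrtoint`` and ``inttoptr`` are the same unsigned resize applied in opposite directions. The following Python sketch models addresses as integers; the function names and the explicit pointer-width parameters are assumptions of the example, since in LLVM the pointer size comes from the target's data layout.

```python
def resize_unsigned(value, src_bits, dst_bits):
    """Truncate or zero extend an unsigned `src_bits`-wide value to `dst_bits` bits."""
    value &= (1 << src_bits) - 1           # read the source as unsigned
    return value & ((1 << dst_bits) - 1)   # drops high bits only when narrowing


def ptrtoint(addr, ptr_bits, int_bits):
    return resize_unsigned(addr, ptr_bits, int_bits)


def inttoptr(value, int_bits, ptr_bits):
    return resize_unsigned(value, int_bits, ptr_bits)
```

With 32-bit pointers, ``ptrtoint(0x12345678, 32, 8)`` truncates to ``0x78``; with 64-bit pointers, ``inttoptr(255, 32, 64)`` zero extends and stays ``255``.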
11718 '``bitcast .. to``' Instruction
11719 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11726 <result> = bitcast <ty> <value> to <ty2> ; yields ty2
The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.
11737 The '``bitcast``' instruction takes a value to cast, which must be a
11738 non-aggregate first class value, and a type to cast it to, which must
11739 also be a non-aggregate :ref:`first class <t_firstclass>` type. The
11740 bit sizes of ``value`` and the destination type, ``ty2``, must be
11741 identical. If the source type is a pointer, the destination type must
11742 also be a pointer of the same size. This instruction supports bitwise
11743 conversion of vectors to integers and to vectors of other types (as
11744 long as they have the same size).
11749 The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
11750 is always a *no-op cast* because no bits change with this
11751 conversion. The conversion is done as if the ``value`` had been stored
11752 to memory and read back as type ``ty2``. Pointer (or vector of
11753 pointers) types may only be converted to other pointer (or vector of
11754 pointers) types with the same address space through this instruction.
11755 To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
11756 or :ref:`ptrtoint <i_ptrtoint>` instructions first.
11758 There is a caveat for bitcasts involving vector types in relation to
11759 endianness. For example ``bitcast <2 x i8> <value> to i16`` puts element zero
11760 of the vector in the least significant bits of the i16 for little-endian while
11761 element zero ends up in the most significant bits for big-endian.
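The endianness caveat follows from the "store then reload" rule. A small Python sketch using the standard ``struct`` module illustrates it; the helper name and the ``little_endian`` parameter are assumptions of the example.

```python
import struct


def bitcast_2xi8_to_i16(elements, little_endian):
    """Store two i8 elements to memory in order, then reload them as one i16."""
    memory = bytes(elements)  # element zero sits at the lowest address
    fmt = "<H" if little_endian else ">H"
    return struct.unpack(fmt, memory)[0]
```

For the vector ``<i8 1, i8 2>`` this gives ``0x0201`` on a little-endian target (element zero in the least significant bits) and ``0x0102`` on a big-endian target.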
11766 .. code-block:: text
11768 %X = bitcast i8 255 to i8 ; yields i8 :-1
11769 %Y = bitcast i32* %x to i16* ; yields i16*:%x
%Z = bitcast <2 x i32> %V to i64 ; yields i64: %V (depends on endianness)
%W = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>
11773 .. _i_addrspacecast:
11775 '``addrspacecast .. to``' Instruction
11776 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11783 <result> = addrspacecast <pty> <ptrval> to <pty2> ; yields pty2
11788 The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
11789 address space ``n`` to type ``pty2`` in address space ``m``.
11794 The '``addrspacecast``' instruction takes a pointer or vector of pointer value
to cast and a pointer type to cast it to, which must have a different
address space.
11801 The '``addrspacecast``' instruction converts the pointer value
11802 ``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
11803 value modification, depending on the target and the address space
11804 pair. Pointer conversions within the same address space must be
11805 performed with the ``bitcast`` instruction. Note that if the address
11806 space conversion produces a dereferenceable result then both result
11807 and operand refer to the same memory location. The conversion must
11808 have no side effects, and must not capture the value of the pointer.
11810 If the source is :ref:`poison <poisonvalues>`, the result is
11811 :ref:`poison <poisonvalues>`.
11813 If the source is not :ref:`poison <poisonvalues>`, and both source and
11814 destination are :ref:`integral pointers <nointptrtype>`, and the
11815 result pointer is dereferenceable, the cast is assumed to be
11816 reversible (i.e. casting the result back to the original address space
11817 should yield the original bit pattern).
11822 .. code-block:: llvm
11824 %X = addrspacecast ptr %x to ptr addrspace(1)
11825 %Y = addrspacecast ptr addrspace(1) %y to ptr addrspace(2)
11826 %Z = addrspacecast <4 x ptr> %z to <4 x ptr addrspace(3)>
11833 The instructions in this category are the "miscellaneous" instructions,
11834 which defy better classification.
11838 '``icmp``' Instruction
11839 ^^^^^^^^^^^^^^^^^^^^^^
11846 <result> = icmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
11851 The '``icmp``' instruction returns a boolean value or a vector of
11852 boolean values based on comparison of its two integer, integer vector,
11853 pointer, or pointer vector operands.
11858 The '``icmp``' instruction takes three operands. The first operand is
11859 the condition code indicating the kind of comparison to perform. It is
11860 not a value, just a keyword. The possible condition codes are:
#. ``eq``: equal
#. ``ne``: not equal
11866 #. ``ugt``: unsigned greater than
11867 #. ``uge``: unsigned greater or equal
11868 #. ``ult``: unsigned less than
11869 #. ``ule``: unsigned less or equal
11870 #. ``sgt``: signed greater than
11871 #. ``sge``: signed greater or equal
11872 #. ``slt``: signed less than
11873 #. ``sle``: signed less or equal
11875 The remaining two arguments must be :ref:`integer <t_integer>` or
11876 :ref:`pointer <t_pointer>` or integer :ref:`vector <t_vector>` typed. They
11877 must also be identical types.
11882 The '``icmp``' compares ``op1`` and ``op2`` according to the condition
11883 code given as ``cond``. The comparison performed always yields either an
11884 :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
11886 .. _icmp_md_cc_sem:
11888 #. ``eq``: yields ``true`` if the operands are equal, ``false``
11889 otherwise. No sign interpretation is necessary or performed.
11890 #. ``ne``: yields ``true`` if the operands are unequal, ``false``
11891 otherwise. No sign interpretation is necessary or performed.
11892 #. ``ugt``: interprets the operands as unsigned values and yields
11893 ``true`` if ``op1`` is greater than ``op2``.
11894 #. ``uge``: interprets the operands as unsigned values and yields
11895 ``true`` if ``op1`` is greater than or equal to ``op2``.
11896 #. ``ult``: interprets the operands as unsigned values and yields
11897 ``true`` if ``op1`` is less than ``op2``.
11898 #. ``ule``: interprets the operands as unsigned values and yields
11899 ``true`` if ``op1`` is less than or equal to ``op2``.
11900 #. ``sgt``: interprets the operands as signed values and yields ``true``
11901 if ``op1`` is greater than ``op2``.
11902 #. ``sge``: interprets the operands as signed values and yields ``true``
11903 if ``op1`` is greater than or equal to ``op2``.
11904 #. ``slt``: interprets the operands as signed values and yields ``true``
11905 if ``op1`` is less than ``op2``.
11906 #. ``sle``: interprets the operands as signed values and yields ``true``
11907 if ``op1`` is less than or equal to ``op2``.
11909 If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
11910 are compared as if they were integers.
11912 If the operands are integer vectors, then they are compared element by
11913 element. The result is an ``i1`` vector with the same number of elements
11914 as the values being compared. Otherwise, the result is an ``i1``.
11919 .. code-block:: text
11921 <result> = icmp eq i32 4, 5 ; yields: result=false
11922 <result> = icmp ne ptr %X, %X ; yields: result=false
11923 <result> = icmp ult i16 4, 5 ; yields: result=true
11924 <result> = icmp sgt i16 4, 5 ; yields: result=false
11925 <result> = icmp ule i16 -4, 5 ; yields: result=false
11926 <result> = icmp sge i16 4, 5 ; yields: result=false
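The difference between the signed and unsigned condition codes can be sketched in Python by reinterpreting the same bit patterns. The function ``icmp`` and its ``width`` parameter are hypothetical and exist only for this illustration.

```python
import operator

_OPS = {"eq": operator.eq, "ne": operator.ne,
        "gt": operator.gt, "ge": operator.ge,
        "lt": operator.lt, "le": operator.le}


def icmp(cond, a, b, width):
    """Compare two `width`-bit integers according to an icmp condition code."""
    mask = (1 << width) - 1
    a, b = a & mask, b & mask              # unsigned view of both operands
    if cond in ("eq", "ne"):
        return _OPS[cond](a, b)            # no sign interpretation needed
    signedness, op = cond[0], cond[1:]
    if signedness == "s":
        sign = 1 << (width - 1)
        a = (a ^ sign) - sign              # reinterpret as two's complement
        b = (b ^ sign) - sign
    return _OPS[op](a, b)
```

As in the examples above, ``icmp("ule", -4, 5, 16)`` is false because ``-4`` is ``65532`` when read as an unsigned ``i16``, whereas ``icmp("sle", -4, 5, 16)`` is true.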
11930 '``fcmp``' Instruction
11931 ^^^^^^^^^^^^^^^^^^^^^^
11938 <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result
11943 The '``fcmp``' instruction returns a boolean value or vector of boolean
11944 values based on comparison of its operands.
11946 If the operands are floating-point scalars, then the result type is a
11947 boolean (:ref:`i1 <t_integer>`).
11949 If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.
11956 The '``fcmp``' instruction takes three operands. The first operand is
11957 the condition code indicating the kind of comparison to perform. It is
11958 not a value, just a keyword. The possible condition codes are:
11960 #. ``false``: no comparison, always returns false
11961 #. ``oeq``: ordered and equal
11962 #. ``ogt``: ordered and greater than
11963 #. ``oge``: ordered and greater than or equal
11964 #. ``olt``: ordered and less than
11965 #. ``ole``: ordered and less than or equal
11966 #. ``one``: ordered and not equal
11967 #. ``ord``: ordered (no nans)
11968 #. ``ueq``: unordered or equal
11969 #. ``ugt``: unordered or greater than
11970 #. ``uge``: unordered or greater than or equal
11971 #. ``ult``: unordered or less than
11972 #. ``ule``: unordered or less than or equal
11973 #. ``une``: unordered or not equal
11974 #. ``uno``: unordered (either nans)
11975 #. ``true``: no comparison, always returns true
11977 *Ordered* means that neither operand is a QNAN while *unordered* means
11978 that either operand may be a QNAN.
Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating-point
11981 <t_floating>` type or a :ref:`vector <t_vector>` of floating-point type.
11982 They must have identical types.
11987 The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
11988 condition code given as ``cond``. If the operands are vectors, then the
11989 vectors are compared element by element. Each comparison performed
11990 always yields an :ref:`i1 <t_integer>` result, as follows:
11992 #. ``false``: always yields ``false``, regardless of operands.
11993 #. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
11994 is equal to ``op2``.
11995 #. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
11996 is greater than ``op2``.
11997 #. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
11998 is greater than or equal to ``op2``.
11999 #. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
12000 is less than ``op2``.
12001 #. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
12002 is less than or equal to ``op2``.
12003 #. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
12004 is not equal to ``op2``.
12005 #. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
equal to ``op2``.
12008 #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
12009 greater than ``op2``.
12010 #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
12011 greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
less than ``op2``.
12014 #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
12015 less than or equal to ``op2``.
12016 #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
12017 not equal to ``op2``.
12018 #. ``uno``: yields ``true`` if either operand is a QNAN.
12019 #. ``true``: always yields ``true``, regardless of operands.
12021 The ``fcmp`` instruction can also optionally take any number of
12022 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
12023 otherwise unsafe floating-point optimizations.
12025 Any set of fast-math flags are legal on an ``fcmp`` instruction, but the
12026 only flags that have any effect on its semantics are those that allow
12027 assumptions to be made about the values of input arguments; namely
12028 ``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
12033 .. code-block:: text
12035 <result> = fcmp oeq float 4.0, 5.0 ; yields: result=false
12036 <result> = fcmp one float 4.0, 5.0 ; yields: result=true
12037 <result> = fcmp olt float 4.0, 5.0 ; yields: result=true
12038 <result> = fcmp ueq double 1.0, 2.0 ; yields: result=false
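The ordered and unordered condition codes differ only in the result they give when an operand is a NaN. A Python sketch of the table above, using a hypothetical ``fcmp`` helper on Python floats and ignoring fast-math flags:

```python
import math
import operator

_BASE = {"eq": operator.eq, "ne": operator.ne,
         "gt": operator.gt, "ge": operator.ge,
         "lt": operator.lt, "le": operator.le}


def fcmp(cond, a, b):
    """Evaluate an fcmp condition code on two Python floats."""
    if cond in ("false", "true"):
        return cond == "true"
    unordered = math.isnan(a) or math.isnan(b)
    if cond == "ord":
        return not unordered
    if cond == "uno":
        return unordered
    prefix, op = cond[0], cond[1:]
    if unordered:
        # 'o' codes are false and 'u' codes are true when a NaN is involved.
        return prefix == "u"
    return _BASE[op](a, b)
```

For example, ``fcmp("oeq", nan, nan)`` is false while ``fcmp("une", nan, nan)`` is true, where ``nan`` is ``float("nan")``.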
12042 '``phi``' Instruction
12043 ^^^^^^^^^^^^^^^^^^^^^
12050 <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
12055 The '``phi``' instruction is used to implement the φ node in the SSA
12056 graph representing the function.
12061 The type of the incoming values is specified with the first type field.
12062 After this, the '``phi``' instruction takes a list of pairs as
12063 arguments, with one pair for each predecessor basic block of the current
12064 block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.
12068 There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e. PHI instructions must be first in a basic
block.
12072 For the purposes of the SSA form, the use of each incoming value is
12073 deemed to occur on the edge from the corresponding predecessor block to
12074 the current block (but after any definition of an '``invoke``'
12075 instruction's return value on the same edge).
12077 The optional ``fast-math-flags`` marker indicates that the phi has one
12078 or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
12079 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
12080 are only valid for phis that return a floating-point scalar or vector
type, or an array (nested to any depth) of floating-point scalar or vector
types.
12087 At runtime, the '``phi``' instruction logically takes on the value
12088 specified by the pair corresponding to the predecessor basic block that
12089 executed just prior to the current block.
12094 .. code-block:: llvm
12096 Loop: ; Infinite loop that counts from 0 on up...
12097 %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
%nextindvar = add i32 %indvar, 1
br label %Loop
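The run-time behavior of the ``phi`` in this loop can be mimicked in Python by tracking which block executed last. This is a sketch only; ``phi`` and ``run_loop`` are hypothetical helpers, and the loop is cut off after a fixed number of iterations so that it terminates.

```python
def phi(incoming, predecessor):
    """Return the incoming value paired with the block that just executed."""
    return incoming[predecessor]


def run_loop(iterations):
    """Simulate the %Loop block above for a bounded number of iterations."""
    predecessor = "LoopHeader"  # first entry comes from %LoopHeader
    nextindvar = None           # not yet defined on the first entry
    observed = []
    for _ in range(iterations):
        indvar = phi({"LoopHeader": 0, "Loop": nextindvar}, predecessor)
        nextindvar = indvar + 1
        observed.append(indvar)
        predecessor = "Loop"    # the back edge comes from %Loop itself
    return observed
```

On the first iteration the value paired with ``%LoopHeader`` is selected; on every later iteration the value paired with ``%Loop`` is selected.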
12103 '``select``' Instruction
12104 ^^^^^^^^^^^^^^^^^^^^^^^^
12111 <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty
12113 selty is either i1 or {<N x i1>}
12118 The '``select``' instruction is used to choose one value based on a
12119 condition, without IR-level branching.
12124 The '``select``' instruction requires an 'i1' value or a vector of 'i1'
12125 values indicating the condition, and two values of the same :ref:`first
12126 class <t_firstclass>` type.
12128 #. The optional ``fast-math flags`` marker indicates that the select has one or more
12129 :ref:`fast-math flags <fastmath>`. These are optimization hints to enable
12130 otherwise unsafe floating-point optimizations. Fast-math flags are only valid
12131 for selects that return a floating-point scalar or vector type, or an array
12132 (nested to any depth) of floating-point scalar or vector types.
12137 If the condition is an i1 and it evaluates to 1, the instruction returns
the first value argument; otherwise, it returns the second value
argument.
12141 If the condition is a vector of i1, then the value arguments must be
12142 vectors of the same size, and the selection is done element by element.
12144 If the condition is an i1 and the value arguments are vectors of the
12145 same size, then an entire vector is selected.
12150 .. code-block:: llvm
12152 %X = select i1 true, i8 17, i8 42 ; yields i8:17
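The scalar and vector forms of ``select`` can be sketched in Python, with lists standing in for vectors. The helper name ``select`` mirrors the instruction but is otherwise an assumption of this illustration.

```python
def select(cond, val1, val2):
    """Model select with a scalar (bool) or vector (list of bool) condition."""
    if isinstance(cond, list):
        # Vector condition: choose element by element.
        if not (len(cond) == len(val1) == len(val2)):
            raise ValueError("condition and values must have the same length")
        return [a if c else b for c, a, b in zip(cond, val1, val2)]
    # Scalar condition: choose one whole value, even if the values are vectors.
    return val1 if cond else val2
```

For example, ``select([True, False], [1, 2], [3, 4])`` gives ``[1, 4]``, while ``select(False, [1, 2], [3, 4])`` selects the entire second vector.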
12157 '``freeze``' Instruction
12158 ^^^^^^^^^^^^^^^^^^^^^^^^
12165 <result> = freeze ty <val> ; yields ty:result
12170 The '``freeze``' instruction is used to stop propagation of
12171 :ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
12176 The '``freeze``' instruction takes a single argument.
12181 If the argument is ``undef`` or ``poison``, '``freeze``' returns an
12182 arbitrary, but fixed, value of type '``ty``'.
12183 Otherwise, this instruction is a no-op and returns the input argument.
12184 All uses of a value returned by the same '``freeze``' instruction are
12185 guaranteed to always observe the same value, while different '``freeze``'
12186 instructions may yield different values.
12188 While ``undef`` and ``poison`` pointers can be frozen, the result is a
12189 non-dereferenceable pointer. See the
12190 :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
12191 If an aggregate value or vector is frozen, the operand is frozen element-wise.
12192 The padding of an aggregate isn't considered, since it isn't visible
12193 without storing it into memory and loading it with a different type.
12199 .. code-block:: text
%w = i32 undef
%x = freeze i32 %w
%y = add i32 %w, %w ; undef
%z = add i32 %x, %x ; even number because all uses of %x observe
                    ; the same value
12206 %x2 = freeze i32 %w
12207 %cmp = icmp eq i32 %x, %x2 ; can be true or false
12209 ; example with vectors
12210 %v = <2 x i32> <i32 undef, i32 poison>
12211 %a = extractelement <2 x i32> %v, i32 0 ; undef
12212 %b = extractelement <2 x i32> %v, i32 1 ; poison
12213 %add = add i32 %a, %a ; undef
12215 %v.fr = freeze <2 x i32> %v ; element-wise freeze
12216 %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
12217 %add.f = add i32 %d, %d ; even number
12219 ; branching on frozen value
12220 %poison = add nsw i1 %k, undef ; poison
12221 %c = freeze i1 %poison
12222 br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar
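The difference between an ``undef`` value and a frozen value can be illustrated with a rough Python model. It is a loose analogy rather than LLVM semantics: the ``Undef`` class, its ``use`` method, and the use of ``random`` to pick arbitrary values are all assumptions of the example.

```python
import random


class Undef:
    """Hypothetical undef model: every use may observe a different value."""

    def __init__(self, bits):
        self.bits = bits

    def use(self):
        return random.getrandbits(self.bits)


def freeze(value):
    """Pick one arbitrary value for undef; pass any other value through."""
    return value.use() if isinstance(value, Undef) else value
```

In this model ``x = freeze(Undef(32))`` fixes a single integer, so ``x + x`` is always even, whereas two separate ``freeze`` calls on the same ``Undef`` may return different integers.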
12227 '``call``' Instruction
12228 ^^^^^^^^^^^^^^^^^^^^^^
12235 <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
12236 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
12241 The '``call``' instruction represents a simple function call.
12246 This instruction requires several arguments:
12248 #. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
12249 should perform tail call optimization. The ``tail`` marker is a hint that
12250 `can be ignored <CodeGenerator.html#tail-call-optimization>`_. The
12251 ``musttail`` marker means that the call must be tail call optimized in order
12252 for the program to be correct. This is true even in the presence of
attributes like "disable-tail-calls". The ``musttail`` marker provides these
guarantees:
12256 #. The call will not cause unbounded stack growth if it is part of a
12257 recursive cycle in the call graph.
12258 #. Arguments with the :ref:`inalloca <attr_inalloca>` or
12259 :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
12260 #. If the musttail call appears in a function with the ``"thunk"`` attribute
12261 and the caller and callee both have varargs, then any unprototyped
12262 arguments in register or memory are forwarded to the callee. Similarly,
12263 the return value of the callee is returned to the caller's caller, even
12264 if a void return type is in use.
12266 Both markers imply that the callee does not access allocas from the caller.
12267 The ``tail`` marker additionally implies that the callee does not access
varargs from the caller. Calls marked ``musttail`` must obey the following
rules:
12271 - The call must immediately precede a :ref:`ret <i_ret>` instruction,
12272 or a pointer bitcast followed by a ret instruction.
12273 - The ret instruction must return the (possibly bitcasted) value
12274 produced by the call, undef, or void.
12275 - The calling conventions of the caller and callee must match.
12276 - The callee must be varargs iff the caller is varargs. Bitcasting a
12277 non-varargs function to the appropriate varargs type is legal so
12278 long as the non-varargs prefixes obey the other rules.
- The return type must not undergo automatic conversion to an ``sret`` pointer.

In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:
12283 - All ABI-impacting function attributes, such as sret, byval, inreg,
12284 returned, and inalloca, must match.
12285 - The caller and callee prototypes must match. Pointer types of parameters
12286 or return types may differ in pointee type, but not in address space.
On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:

- Only these ABI-impacting attributes are allowed: sret, byval,
12291 swiftself, and swiftasync.
12292 - Prototypes are not required to match.
12294 Tail call optimization for calls marked ``tail`` is guaranteed to occur if
12295 the following conditions are met:
12297 - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
12298 - The call is in tail position (ret immediately follows call and ret
12299 uses value of call or is void).
12300 - Option ``-tailcallopt`` is enabled,
``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention is ``tailcc``.
12303 - `Platform-specific constraints are
12304 met. <CodeGenerator.html#tailcallopt>`_
12306 #. The optional ``notail`` marker indicates that the optimizers should not add
12307 ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
12308 call optimization from being performed on the call.
12310 #. The optional ``fast-math flags`` marker indicates that the call has one or more
12311 :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
12312 otherwise unsafe floating-point optimizations. Fast-math flags are only valid
12313 for calls that return a floating-point scalar or vector type, or an array
12314 (nested to any depth) of floating-point scalar or vector types.
12316 #. The optional "cconv" marker indicates which :ref:`calling
12317 convention <callingconv>` the call should use. If none is
12318 specified, the call defaults to using C calling conventions. The
12319 calling convention of the call must match the calling convention of
12320 the target function, or else the behavior is undefined.
12321 #. The optional :ref:`Parameter Attributes <paramattrs>` list for return
values. Only '``zeroext``', '``signext``', and '``inreg``' attributes are valid here.
12324 #. The optional addrspace attribute can be used to indicate the address space
12325 of the called function. If it is not specified, the program address space
12326 from the :ref:`datalayout string<langref_datalayout>` will be used.
12327 #. '``ty``': the type of the call instruction itself which is also the
type of the return value. Functions that return no value are marked ``void``.
12330 #. '``fnty``': shall be the signature of the function being called. The
12331 argument types must match the types implied by this signature. This
12332 type can be omitted if the function is not varargs.
12333 #. '``fnptrval``': An LLVM value containing a pointer to a function to
12334 be called. In most cases, this is a direct function call, but
indirect calls are equally possible, calling an arbitrary pointer-to-function value.
12337 #. '``function args``': argument list whose types match the function
12338 signature argument types and parameter attributes. All arguments must
12339 be of :ref:`first class <t_firstclass>` type. If the function signature
12340 indicates the function accepts a variable number of arguments, the
12341 extra arguments can be specified.
12342 #. The optional :ref:`function attributes <fnattrs>` list.
12343 #. The optional :ref:`operand bundles <opbundles>` list.
12348 The '``call``' instruction is used to cause control flow to transfer to
12349 a specified function, with its incoming arguments bound to the specified
12350 values. Upon a '``ret``' instruction in the called function, control
12351 flow continues with the instruction after the function call, and the
12352 return value of the function is bound to the result argument.
.. code-block:: llvm

      %retval = call i32 @test(i32 %argc)
      call i32 (ptr, ...) @printf(ptr %msg, i32 12, i8 42)        ; yields i32
      %X = tail call i32 @foo()                                    ; yields i32
      %Y = tail call fastcc i32 @foo()                             ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero-extended

LLVM treats calls to some functions with names and arguments that match
12373 the standard C99 library as being the C99 library functions, and may
12374 perform optimizations or generate code for them under that assumption.
12375 This is something we'd like to change in the future to provide better
12376 support for freestanding environments and non-C-based languages.
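
As a minimal sketch of the ``musttail`` rules described for the '``call``'
instruction, the following function forwards its arguments to a callee with an
identical prototype and the default calling convention. The names ``@caller``
and ``@callee`` are hypothetical and exist only for illustration.

.. code-block:: llvm

      declare i32 @callee(i32, ptr)

      define i32 @caller(i32 %x, ptr %p) {
      entry:
        ; The call immediately precedes the ret, the ret returns the value
        ; produced by the call, and the caller and callee prototypes and
        ; calling conventions match.
        %r = musttail call i32 @callee(i32 %x, ptr %p)
        ret i32 %r
      }

Placing any other instruction between the call and the ``ret`` would break the
first rule, so the marker would have to be dropped or weakened to ``tail``.
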
12380 '``va_arg``' Instruction
12381 ^^^^^^^^^^^^^^^^^^^^^^^^
12388 <resultval> = va_arg <va_list*> <arglist>, <argty>
12393 The '``va_arg``' instruction is used to access arguments passed through
12394 the "variable argument" area of a function call. It is used to implement
12395 the ``va_arg`` macro in C.
12400 This instruction takes a ``va_list*`` value and the type of the
12401 argument. It returns a value of the specified argument type and
12402 increments the ``va_list`` to point to the next argument. The actual
12403 type of ``va_list`` is target specific.
12408 The '``va_arg``' instruction loads an argument of the specified type
12409 from the specified ``va_list`` and causes the ``va_list`` to point to
12410 the next argument. For more information, see the variable argument
12411 handling :ref:`Intrinsic Functions <int_varargs>`.
12413 It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf`` function.
12417 ``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
12418 function <intrinsics>` because it takes a type as an argument.
12423 See the :ref:`variable argument processing <int_varargs>` section.
Note that the code generator does not yet fully support ``va_arg`` on many
targets. Also, it does not currently support ``va_arg`` with aggregate
types on any target.
12431 '``landingpad``' Instruction
12432 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12439 <resultval> = landingpad <resultty> <clause>+
12440 <resultval> = landingpad <resultty> cleanup <clause>*
12442 <clause> := catch <type> <value>
12443 <clause> := filter <array constant type> <array constant>
12448 The '``landingpad``' instruction is used by `LLVM's exception handling
12449 system <ExceptionHandling.html#overview>`_ to specify that a basic block
12450 is a landing pad --- one where the exception lands, and corresponds to the
12451 code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
12452 defines values supplied by the :ref:`personality function <personalityfn>` upon
12453 re-entry to the function. The ``resultval`` has the type ``resultty``.
The optional ``cleanup`` flag indicates that the landing pad block is a cleanup.
12461 A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
12462 contains the global variable representing the "type" that may be caught
12463 or filtered respectively. Unlike the ``catch`` clause, the ``filter``
12464 clause takes an array constant as its argument. Use
12465 "``[0 x ptr] undef``" for a filter which cannot throw. The
12466 '``landingpad``' instruction must contain *at least* one ``clause`` or
12467 the ``cleanup`` flag.
12472 The '``landingpad``' instruction defines the values which are set by the
12473 :ref:`personality function <personalityfn>` upon re-entry to the function, and
12474 therefore the "result type" of the ``landingpad`` instruction. As with
12475 calling conventions, how the personality function results are
12476 represented in LLVM IR is target specific.
12478 The clauses are applied in order from top to bottom. If two
12479 ``landingpad`` instructions are merged together through inlining, the
12480 clauses from the calling function are appended to the list of clauses.
12481 When the call stack is being unwound due to an exception being thrown,
12482 the exception is compared against each ``clause`` in turn. If it doesn't
12483 match any of the clauses, and the ``cleanup`` flag is not set, then
12484 unwinding continues further up the call stack.
12486 The ``landingpad`` instruction has several restrictions:
12488 - A landing pad block is a basic block which is the unwind destination
12489 of an '``invoke``' instruction.
12490 - A landing pad block must have a '``landingpad``' instruction as its
12491 first non-PHI instruction.
- There can be only one '``landingpad``' instruction within the landing pad block.
12494 - A basic block that is not a landing pad block may not include a
12495 '``landingpad``' instruction.
.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { ptr, i32 }
               catch ptr @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { ptr, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { ptr, i32 }
               catch ptr @_ZTIi
               filter [1 x ptr] [ptr @_ZTId]

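
The restrictions above can be sketched in a complete function: the landing pad
block is the unwind destination of an '``invoke``', and the '``landingpad``'
is its first instruction. This is only an illustration. The names
``@may_throw`` and ``@try_call`` are hypothetical, and the Itanium C++ ABI
personality ``@__gxx_personality_v0`` and type info ``@_ZTIi`` are assumed to
be provided by the runtime.

.. code-block:: llvm

      @_ZTIi = external constant ptr

      declare void @may_throw()
      declare i32 @__gxx_personality_v0(...)

      define i32 @try_call() personality ptr @__gxx_personality_v0 {
      entry:
        invoke void @may_throw()
                to label %cont unwind label %lpad

      cont:
        ret i32 0

      lpad:
        ; First non-PHI instruction of the unwind destination.
        %res = landingpad { ptr, i32 }
                 catch ptr @_ZTIi
        ; The second field is the selector value set by the personality.
        %sel = extractvalue { ptr, i32 } %res, 1
        ret i32 %sel
      }
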
12515 '``catchpad``' Instruction
12516 ^^^^^^^^^^^^^^^^^^^^^^^^^^
12523 <resultval> = catchpad within <catchswitch> [<args>*]
12528 The '``catchpad``' instruction is used by `LLVM's exception handling
12529 system <ExceptionHandling.html#overview>`_ to specify that a basic block
12530 begins a catch handler --- one where a personality routine attempts to transfer
12531 control to catch an exception.
12536 The ``catchswitch`` operand must always be a token produced by a
12537 :ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
12538 ensures that each ``catchpad`` has exactly one predecessor block, and it always
12539 terminates in a ``catchswitch``.
12541 The ``args`` correspond to whatever information the personality routine
12542 requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for the exception.
12546 The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH pads.
12553 When the call stack is being unwound due to an exception being thrown, the
12554 exception is compared against the ``args``. If it doesn't match, control will
12555 not reach the ``catchpad`` instruction. The representation of ``args`` is
12556 entirely target and personality function-specific.
12558 Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
12559 instruction must be the first non-phi of its parent basic block.
12561 The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
12562 instructions is described in the
12563 `Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
12565 When a ``catchpad`` has been "entered" but not yet "exited" (as
12566 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
12567 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
12568 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [ptr @_ZTIi]

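
For context, a hedged sketch of a whole function using '``catchpad``' follows.
It assumes the MSVC C++ personality ``@__CxxFrameHandler3`` and uses the
catch-all argument form ``[ptr null, i32 64, ptr null]`` that this personality
is understood to accept. The function names ``@may_throw`` and ``@with_catch``
are hypothetical.

.. code-block:: llvm

      declare void @may_throw()
      declare i32 @__CxxFrameHandler3(...)

      define i32 @with_catch() personality ptr @__CxxFrameHandler3 {
      entry:
        invoke void @may_throw()
                to label %done unwind label %dispatch

      dispatch:
        ; The only predecessor of the catchpad block ends in a catchswitch.
        %cs = catchswitch within none [label %handler0] unwind to caller

      handler0:
        ; The catchpad is the first non-PHI instruction of its block.
        %tok = catchpad within %cs [ptr null, i32 64, ptr null]
        catchret from %tok to label %caught

      caught:
        ret i32 1

      done:
        ret i32 0
      }
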
12583 '``cleanuppad``' Instruction
12584 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12591 <resultval> = cleanuppad within <parent> [<args>*]
12596 The '``cleanuppad``' instruction is used by `LLVM's exception handling
12597 system <ExceptionHandling.html#overview>`_ to specify that a basic block
12598 is a cleanup block --- one where a personality routine attempts to
12599 transfer control to run cleanup actions.
12600 The ``args`` correspond to whatever additional
12601 information the :ref:`personality function <personalityfn>` requires to
12602 execute the cleanup.
12603 The ``resultval`` has the type :ref:`token <t_token>` and is used to
12604 match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
12605 The ``parent`` argument is the token of the funclet that contains the
12606 ``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
12607 this operand may be the token ``none``.
12612 The instruction takes a list of arbitrary values which are interpreted
12613 by the :ref:`personality function <personalityfn>`.
12618 When the call stack is being unwound due to an exception being thrown,
12619 the :ref:`personality function <personalityfn>` transfers control to the
12620 ``cleanuppad`` with the aid of the personality-specific arguments.
12621 As with calling conventions, how the personality function results are
12622 represented in LLVM IR is target specific.
12624 The ``cleanuppad`` instruction has several restrictions:
12626 - A cleanup block is a basic block which is the unwind destination of
12627 an exceptional instruction.
12628 - A cleanup block must have a '``cleanuppad``' instruction as its
12629 first non-PHI instruction.
- There can be only one '``cleanuppad``' instruction within the cleanup block.
12632 - A basic block that is not a cleanup block may not include a
12633 '``cleanuppad``' instruction.
12635 When a ``cleanuppad`` has been "entered" but not yet "exited" (as
12636 described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
12637 it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
12638 that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
12643 .. code-block:: text
12645 %tok = cleanuppad within %cs []
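
The following sketch places a '``cleanuppad``' in a complete function. The
call inside the cleanup carries a ``"funclet"`` operand bundle naming the pad's
token, as required while the pad is entered. The personality
``@__CxxFrameHandler3`` is an assumption, and ``@may_throw``, ``@release``, and
``@with_cleanup`` are hypothetical names.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__CxxFrameHandler3(...)

      define void @with_cleanup() personality ptr @__CxxFrameHandler3 {
      entry:
        invoke void @may_throw()
                to label %done unwind label %cleanup

      done:
        ret void

      cleanup:
        ; Not nested in another funclet, so the parent token is none.
        %tok = cleanuppad within none []
        call void @release() [ "funclet"(token %tok) ]
        cleanupret from %tok unwind to caller
      }
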
12649 Intrinsic Functions
12650 ===================
12652 LLVM supports the notion of an "intrinsic function". These functions
12653 have well known names and semantics and are required to follow certain
12654 restrictions. Overall, these intrinsics represent an extension mechanism
12655 for the LLVM language that does not require changing all of the
12656 transformations in LLVM when adding to the language (or the bitcode
12657 reader/writer, the parser, etc...).
12659 Intrinsic function names must all start with an "``llvm.``" prefix. This
12660 prefix is reserved in LLVM for intrinsic names; thus, function names may
12661 not begin with this prefix. Intrinsic functions must always be external
12662 functions: you cannot define the body of intrinsic functions. Intrinsic
12663 functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any intrinsic that is
added must be documented here.
12668 Some intrinsic functions can be overloaded, i.e., the intrinsic
12669 represents a family of functions that perform the same operation but on
12670 different data types. Because LLVM can represent over 8 million
12671 different integer types, overloading is used commonly to allow an
12672 intrinsic function to operate on any integer type. One or more of the
12673 argument types or the result type can be overloaded to accept any
12674 integer type. Argument types may also be defined as exactly matching a
12675 previous argument's type or the result type. This allows an intrinsic
12676 function which accepts multiple arguments, but needs all of them to be
12677 of the same type, to only be overloaded with respect to a single
12678 argument or the result.
An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
12682 those types which are overloaded result in a name suffix. Arguments
12683 whose type is matched against another type do not. For example, the
12684 ``llvm.ctpop`` function can take an integer of any width and returns an
12685 integer of exactly the same integer width. This leads to a family of
12686 functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
12687 ``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
12688 overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own name suffix.
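
The naming scheme can be sketched with a few declarations. ``llvm.ctpop``
carries one suffix because only the result type is overloaded, while
``llvm.memcpy`` carries three because its two pointer types and its length
type are each overloaded. The function ``@popcount8`` is a hypothetical caller
added for illustration.

.. code-block:: llvm

      declare i8 @llvm.ctpop.i8(i8)
      declare i29 @llvm.ctpop.i29(i29)
      declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

      define i8 @popcount8(i8 %v) {
        ; The argument type is matched against the result type, so the
        ; intrinsic name has a single ".i8" suffix.
        %n = call i8 @llvm.ctpop.i8(i8 %v)
        ret i8 %n
      }
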
12692 :ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
that depend on an unnamed type in one of their overloaded argument types get an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
different unnamed types as arguments (for example,
``llvm.ssa.copy.p0s_s.2(%42*)``). The number is tracked in the LLVM module and
12697 it ensures unique names in the module. While linking together two modules, it is
12698 still possible to get a name clash. In that case one of the names will be
12699 changed by getting a new number.
12701 For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
12703 integer or floating point types should not be relied upon for correct
12704 code generation. In such cases, the recommended approach for target
12705 maintainers when defining intrinsics is to create separate integer and
12706 FP intrinsics rather than rely on overloading. For example, if different
12707 codegen is required for ``llvm.target.foo(<4 x i32>)`` and
12708 ``llvm.target.foo(<4 x float>)`` then these should be split into
12709 different intrinsics.
12711 To learn how to add an intrinsic function, please see the `Extending
12712 LLVM Guide <ExtendingLLVM.html>`_.
12716 Variable Argument Handling Intrinsics
12717 -------------------------------------
12719 Variable argument support is defined in LLVM with the
12720 :ref:`va_arg <i_va_arg>` instruction and these three intrinsic
12721 functions. These functions are related to the similarly named macros
12722 defined in the ``<stdarg.h>`` header file.
12724 All of these functions operate on arguments that use a target-specific
12725 value type "``va_list``". The LLVM assembly language reference manual
12726 does not define what this type is, so all transformations should be
12727 prepared to handle these functions regardless of the type used.
12729 This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
12730 variable argument handling intrinsic functions are used.
.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely a ptr.
    %struct.va_list = type { ptr }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, ptr, ptr }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      call void @llvm.va_start(ptr %ap)

      ; Read a single integer argument
      %tmp = va_arg ptr %ap, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca ptr
      call void @llvm.va_copy(ptr %aq, ptr %ap)
      call void @llvm.va_end(ptr %aq)

      ; Stop processing of arguments.
      call void @llvm.va_end(ptr %ap)
      ret i32 %tmp
    }

    declare void @llvm.va_start(ptr)
    declare void @llvm.va_copy(ptr, ptr)
    declare void @llvm.va_end(ptr)

12765 '``llvm.va_start``' Intrinsic
12766 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12773 declare void @llvm.va_start(ptr <arglist>)
12778 The '``llvm.va_start``' intrinsic initializes ``<arglist>`` for
12779 subsequent use by ``va_arg``.
12784 The argument is a pointer to a ``va_list`` element to initialize.
12789 The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
12790 available in C. In a target-dependent way, it initializes the
12791 ``va_list`` element to which the argument points, so that the next call
12792 to ``va_arg`` will produce the first variable argument passed to the
12793 function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure that out.
12797 '``llvm.va_end``' Intrinsic
12798 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12805 declare void @llvm.va_end(ptr <arglist>)
12810 The '``llvm.va_end``' intrinsic destroys ``<arglist>``, which has been
12811 initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
12816 The argument is a pointer to a ``va_list`` to destroy.
12821 The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
12822 available in C. In a target-dependent way, it destroys the ``va_list``
12823 element to which the argument points. Calls to
12824 :ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to ``llvm.va_end``.
12830 '``llvm.va_copy``' Intrinsic
12831 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12838 declare void @llvm.va_copy(ptr <destarglist>, ptr <srcarglist>)
12843 The '``llvm.va_copy``' intrinsic copies the current argument position
12844 from the source argument list to the destination argument list.
12849 The first argument is a pointer to a ``va_list`` element to initialize.
12850 The second argument is a pointer to a ``va_list`` element to copy from.
12855 The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
12856 available in C. In a target-dependent way, it copies the source
12857 ``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
12859 arbitrarily complex and require, for example, memory allocation.
12861 Accurate Garbage Collection Intrinsics
12862 --------------------------------------
12864 LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
12865 (GC) requires the frontend to generate code containing appropriate intrinsic
12866 calls and select an appropriate GC strategy which knows how to lower these
12867 intrinsics in a manner which is appropriate for the target collector.
12869 These intrinsics allow identification of :ref:`GC roots on the
12870 stack <int_gcroot>`, as well as garbage collector implementations that
12871 require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
12872 Frontends for type-safe garbage collected languages should generate
12873 these intrinsics to make use of the LLVM garbage collectors. For more
12874 details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
LLVM provides a second experimental set of intrinsics for describing garbage
12877 collection safepoints in compiled code. These intrinsics are an alternative
12878 to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
12879 :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
12880 differences in approach are covered in the `Garbage Collection with LLVM
12881 <GarbageCollection.html>`_ documentation. The intrinsics themselves are
12882 described in :doc:`Statepoints`.
12886 '``llvm.gcroot``' Intrinsic
12887 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12894 declare void @llvm.gcroot(ptr %ptrloc, ptr %metadata)
12899 The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
12900 the code generator, and allows some metadata to be associated with it.
12905 The first argument specifies the address of a stack object that contains
12906 the root pointer. The second pointer (which must be either a constant or
a global value address) contains the meta-data to be associated with the root.
12913 At runtime, a call to this intrinsic stores a null pointer into the
12914 "ptrloc" location. At compile-time, the code generator generates
12915 information to allow the runtime to find the pointer at GC safe points.
12916 The '``llvm.gcroot``' intrinsic may only be used in a function which
12917 :ref:`specifies a GC algorithm <gc>`.
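
A minimal sketch of registering a root, assuming the built-in
``"shadow-stack"`` GC strategy and no per-root metadata (a null second
argument). The names ``@allocate_object`` and ``@keep_alive`` are hypothetical.

.. code-block:: llvm

      declare void @llvm.gcroot(ptr, ptr)
      declare ptr @allocate_object()

      define void @keep_alive() gc "shadow-stack" {
      entry:
        ; The root is a stack slot that holds a pointer.
        %root = alloca ptr
        call void @llvm.gcroot(ptr %root, ptr null)

        ; Storing into the slot makes the object reachable from the root.
        %obj = call ptr @allocate_object()
        store ptr %obj, ptr %root
        ret void
      }
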
12921 '``llvm.gcread``' Intrinsic
12922 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
12929 declare ptr @llvm.gcread(ptr %ObjPtr, ptr %Ptr)
12934 The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read barriers.
12941 The second argument is the address to read from, which should be an
12942 address allocated from the garbage collector. The first object is a
12943 pointer to the start of the referenced object, if needed by the language
12944 runtime (otherwise null).
12949 The '``llvm.gcread``' intrinsic has the same semantics as a load
12950 instruction, but may be replaced with substantially more complex code by
12951 the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC algorithm <gc>`.
12957 '``llvm.gcwrite``' Intrinsic
12958 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12965 declare void @llvm.gcwrite(ptr %P1, ptr %Obj, ptr %P2)
12970 The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
12971 locations, allowing garbage collector implementations that require write
12972 barriers (such as generational or reference counting collectors).
12977 The first argument is the reference to store, the second is the start of
12978 the object to store it to, and the third is the address of the field of
12979 Obj to store to. If the runtime does not require a pointer to the
12980 object, Obj may be null.
12985 The '``llvm.gcwrite``' intrinsic has the same semantics as a store
12986 instruction, but may be replaced with substantially more complex code by
12987 the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC algorithm <gc>`.
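
The argument order of the two barrier intrinsics can be sketched as follows.
The function ``@swap_field`` is hypothetical, and the ``"shadow-stack"``
strategy is chosen only so that the function specifies a GC algorithm.

.. code-block:: llvm

      declare ptr @llvm.gcread(ptr, ptr)
      declare void @llvm.gcwrite(ptr, ptr, ptr)

      define ptr @swap_field(ptr %obj, ptr %field, ptr %new) gc "shadow-stack" {
        ; gcread: object start first, then the address to read from.
        %old = call ptr @llvm.gcread(ptr %obj, ptr %field)
        ; gcwrite: value to store, object start, then the field address.
        call void @llvm.gcwrite(ptr %new, ptr %obj, ptr %field)
        ret ptr %old
      }
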
12994 '``llvm.experimental.gc.statepoint``' Intrinsic
12995 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

      declare token
        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
                       ptr elementtype(func_type) <target>,
                       i64 <#call args>, i64 <flags>,
                       ... (call parameters),
                       i64 0, i64 0)

The statepoint intrinsic represents a call which is parse-able by the runtime.
13018 The 'id' operand is a constant integer that is reported as the ID
13019 field in the generated stackmap. LLVM does not interpret this
13020 parameter in any way and its meaning is up to the statepoint user to
13021 decide. Note that LLVM is free to duplicate code containing
13022 statepoint calls, and this may transform IR that had a unique 'id' per
13023 lexical call to statepoint to IR that does not.
13025 If 'num patch bytes' is non-zero then the call instruction
13026 corresponding to the statepoint is not emitted and LLVM emits 'num
13027 patch bytes' bytes of nops in its place. LLVM will emit code to
13028 prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
13030 sequence and the latter after the nop sequence. It is expected that
13031 the user will patch over the 'num patch bytes' bytes of nops with a
13032 calling sequence specific to their runtime before executing the
13033 generated machine code. There are no guarantees with respect to the
alignment of the nop sequence. Unlike :doc:`StackMaps`, statepoints do
13035 not have a concept of shadow bytes. Note that semantically the
13036 statepoint still represents a call or invoke to 'target', and the nop
13037 sequence after patching is expected to represent an operation
13038 equivalent to a call or invoke to 'target'.
13040 The 'target' operand is the function actually being called. The operand
13041 must have an :ref:`elementtype <attr_elementtype>` attribute specifying
13042 the function type of the target. The target can be specified as either
13043 a symbolic LLVM function, or as an arbitrary Value of pointer type. Note
13044 that the function type must match the signature of the callee and the
13045 types of the 'call parameters' arguments.
13047 The '#call args' operand is the number of arguments to the actual
13048 call. It must exactly match the number of arguments passed in the
13049 'call parameters' variable length section.
13051 The 'flags' operand is used to specify extra information about the
13052 statepoint. This is currently only used to mark certain statepoints
13053 as GC transitions. This operand is a 64-bit integer with the following
13054 layout, where bit 0 is the least significant bit:

+-------+---------------------------------------------------+
| Bit # | Usage                                             |
+=======+===================================================+
|     0 | Set if the statepoint is a GC transition, cleared |
|       | otherwise.                                        |
+-------+---------------------------------------------------+
|  1-63 | Reserved for future use; must be cleared.         |
+-------+---------------------------------------------------+

13065 The 'call parameters' arguments are simply the arguments which need to
13066 be passed to the call target. They will be lowered according to the
13067 specified calling convention and otherwise handled like a normal call
13068 instruction. The number of arguments must exactly match what is
specified in '# call args'. The types must match the signature of 'target'.
The 'call parameter' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
13075 entirely replaced with the corresponding operand bundles. In a future
13076 revision, these now redundant arguments will be removed.
13081 A statepoint is assumed to read and write all memory. As a result,
13082 memory operations can not be reordered past a statepoint. It is
13083 illegal to mark a statepoint as being either 'readonly' or 'readnone'.
13085 Note that legal IR can not perform any memory operation on a 'gc
13086 pointer' argument of the statepoint in a location statically reachable
13087 from the statepoint. Instead, the explicitly relocated value (from a
13088 ``gc.relocate``) must be used.
13090 '``llvm.experimental.gc.result``' Intrinsic
13091 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

      declare type
        @llvm.experimental.gc.result(token %statepoint_token)

13104 ``gc.result`` extracts the result of the original call instruction
13105 which was replaced by the ``gc.statepoint``. The ``gc.result``
13106 intrinsic is actually a family of three intrinsics due to an
13107 implementation limitation. Other than the type of the return value,
13108 the semantics are the same.
13113 The first and only argument is the ``gc.statepoint`` which starts
13114 the safepoint sequence of which this ``gc.result`` is a part.
13115 Despite the typing of this as a generic token, *only* the value defined
13116 by a ``gc.statepoint`` is legal here.
13121 The ``gc.result`` represents the return value of the call target of
13122 the ``statepoint``. The type of the ``gc.result`` must exactly match
13123 the type of the target. If the call target returns void, there will
13124 be no ``gc.result``.
13126 A ``gc.result`` is modeled as a 'readnone' pure function. It has no
13127 side effects since it is just a projection of the return value of the
13128 previous call represented by the ``gc.statepoint``.
13130 '``llvm.experimental.gc.relocate``' Intrinsic
13131 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::

      declare <pointer type>
        @llvm.experimental.gc.relocate(token %statepoint_token,
                                       i32 %base_offset,
                                       i32 %pointer_offset)

A ``gc.relocate`` returns the potentially relocated value of a pointer at the safepoint.
13152 The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
13154 Despite the typing of this as a generic token, *only* the value defined
13155 by a ``gc.statepoint`` is legal here.
13157 The second and third arguments are both indices into operands of the
13158 corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
13160 The second argument is an index which specifies the allocation for the pointer
13161 being relocated. The associated value must be within the object with which the
13162 pointer being relocated is associated. The optimizer is free to change *which*
13163 interior derived pointer is reported, provided that it does not replace an
13164 actual base pointer with another interior derived pointer. Collectors are
13165 allowed to rely on the base pointer operand remaining an actual base pointer if
The third argument is an index which specifies the (potentially) derived pointer
13169 being relocated. It is legal for this index to be the same as the second
13170 argument if-and-only-if a base pointer is being relocated.
13175 The return value of ``gc.relocate`` is the potentially relocated value
13176 of the pointer specified by its arguments. It is unspecified how the
13177 value of the returned pointer relates to the argument to the
13178 ``gc.statepoint`` other than that a) it points to the same source
13179 language object with the same offset, and b) the 'based-on'
13180 relationship of the newly relocated pointers is a projection of the
13181 unrelocated pointers. In particular, the integer value of the pointer
13182 returned is unspecified.
13184 A ``gc.relocate`` is modeled as a ``readnone`` pure function. It has no
13185 side effects since it is just a way to extract information about work
13186 done during the actual call modeled by the ``gc.statepoint``.
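As an illustrative sketch, the following function relocates a single base
pointer across a statepoint. The callee ``@foo``, the function name ``@test1``
and the ``statepoint-example`` GC strategy are assumptions chosen for the
example. Both indices are 0 because the only ``gc-live`` operand is a base
pointer that is itself being relocated.

.. code-block:: llvm

      declare void @foo()
      declare token @llvm.experimental.gc.statepoint.p0(i64, i32, ptr, i32, i32, ...)
      declare ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token, i32, i32)

      define ptr addrspace(1) @test1(ptr addrspace(1) %obj) gc "statepoint-example" {
      entry:
        ; %obj is operand 0 of the "gc-live" bundle.
        %tok = call token (i64, i32, ptr, i32, i32, ...) @llvm.experimental.gc.statepoint.p0(i64 0, i32 0, ptr elementtype(void ()) @foo, i32 0, i32 0, i32 0, i32 0) ["gc-live"(ptr addrspace(1) %obj)]
        ; Base index 0 and derived index 0: a base pointer is being relocated.
        %obj.relocated = call ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token %tok, i32 0, i32 0)
        ; Only the relocated value may be used after the statepoint.
        ret ptr addrspace(1) %obj.relocated
      }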
13188 .. _gc.get.pointer.base:
13190 '``llvm.experimental.gc.get.pointer.base``' Intrinsic
13191 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13198 declare <pointer type>
13199 @llvm.experimental.gc.get.pointer.base(
13200 <pointer type> readnone nocapture %derived_ptr)
13201 nounwind willreturn memory(none)
13206 ``gc.get.pointer.base`` for a derived pointer returns its base pointer.
13211 The only argument is a pointer which is based on some object with
13212 an unknown offset from the base of said object.
13217 This intrinsic is used in the abstract machine model for GC to represent
13218 the base pointer for an arbitrary derived pointer.
This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the base pointer value of the derived
pointer. The replacement is done as part of the lowering to the explicit
statepoint model.
13225 The return pointer type must be the same as the type of the parameter.
13228 '``llvm.experimental.gc.get.pointer.offset``' Intrinsic
13229 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13237 @llvm.experimental.gc.get.pointer.offset(
13238 <pointer type> readnone nocapture %derived_ptr)
13239 nounwind willreturn memory(none)
13244 ``gc.get.pointer.offset`` for a derived pointer returns the offset from its
13250 The only argument is a pointer which is based on some object with
13251 an unknown offset from the base of said object.
13256 This intrinsic is used in the abstract machine model for GC to represent
13257 the offset of an arbitrary derived pointer from its base pointer.
13259 This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
13260 replacing all uses of this callsite with the offset of a derived pointer from
13261 its base pointer value. The replacement is done as part of the lowering to the
13262 explicit statepoint model.
In effect, this call computes the difference between the derived pointer and
its base pointer (see :ref:`gc.get.pointer.base`), with both converted to
integers via ``ptrtoint``. Performing these casts outside the
:ref:`RewriteStatepointsForGC` pass, however, could cause the pointers to be
lost to the later lowering from the abstract model to the explicit physical
one.
13270 Code Generator Intrinsics
13271 -------------------------
13273 These intrinsics are provided by LLVM to expose special features that
13274 may only be implemented with code generator support.
13276 '``llvm.returnaddress``' Intrinsic
13277 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13284 declare ptr @llvm.returnaddress(i32 <level>)
13289 The '``llvm.returnaddress``' intrinsic attempts to compute a
13290 target-specific value indicating the return address of the current
13291 function or one of its callers.
13296 The argument to this intrinsic indicates which function to return the
13297 address for. Zero indicates the calling function, one indicates its
13298 caller, etc. The argument is **required** to be a constant integer
13304 The '``llvm.returnaddress``' intrinsic either returns a pointer
13305 indicating the return address of the specified call frame, or zero if it
13306 cannot be identified. The value returned by this intrinsic is likely to
13307 be incorrect or 0 for arguments other than zero, so it should only be
13308 used for debugging purposes.
13310 Note that calling this intrinsic does not prevent function inlining or
13311 other aggressive transformations, so the value returned may not be that
13312 of the obvious source-language caller.
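A minimal sketch of a call with level zero, which is the only level the
text above describes as reliable. The function name
``@current_return_address`` is hypothetical.

.. code-block:: llvm

      declare ptr @llvm.returnaddress(i32)

      define ptr @current_return_address() {
      entry:
        ; Level 0 asks for the return address of this function itself.
        ; The level must be a constant integer.
        %ra = call ptr @llvm.returnaddress(i32 0)
        ret ptr %ra
      }

Because inlining is not prevented, the value observed may belong to a
different frame than the source-level caller if ``@current_return_address``
is inlined.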
13314 '``llvm.addressofreturnaddress``' Intrinsic
13315 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13322 declare ptr @llvm.addressofreturnaddress()
13327 The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
13328 pointer to the place in the stack frame where the return address of the
13329 current function is stored.
13334 Note that calling this intrinsic does not prevent function inlining or
13335 other aggressive transformations, so the value returned may not be that
13336 of the obvious source-language caller.
13338 This intrinsic is only implemented for x86 and aarch64.
13340 '``llvm.sponentry``' Intrinsic
13341 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13348 declare ptr @llvm.sponentry()
13353 The '``llvm.sponentry``' intrinsic returns the stack pointer value at
13354 the entry of the current function calling this intrinsic.
13359 Note this intrinsic is only verified on AArch64 and ARM.
13361 '``llvm.frameaddress``' Intrinsic
13362 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13369 declare ptr @llvm.frameaddress(i32 <level>)
13374 The '``llvm.frameaddress``' intrinsic attempts to return the
13375 target-specific frame pointer value for the specified stack frame.
13380 The argument to this intrinsic indicates which function to return the
13381 frame pointer for. Zero indicates the calling function, one indicates
13382 its caller, etc. The argument is **required** to be a constant integer
13388 The '``llvm.frameaddress``' intrinsic either returns a pointer
13389 indicating the frame address of the specified call frame, or zero if it
13390 cannot be identified. The value returned by this intrinsic is likely to
13391 be incorrect or 0 for arguments other than zero, so it should only be
13392 used for debugging purposes.
13394 Note that calling this intrinsic does not prevent function inlining or
13395 other aggressive transformations, so the value returned may not be that
13396 of the obvious source-language caller.
13398 '``llvm.swift.async.context.addr``' Intrinsic
13399 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13406 declare ptr @llvm.swift.async.context.addr()
13411 The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
13412 the part of the extended frame record containing the asynchronous
13413 context of a Swift execution.
13418 If the caller has a ``swiftasync`` parameter, that argument will initially
13419 be stored at the returned address. If not, it will be initialized to null.
13421 '``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
13422 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13429 declare void @llvm.localescape(...)
13430 declare ptr @llvm.localrecover(ptr %func, ptr %fp, i32 %idx)
13435 The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
13436 allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
13437 live frame pointer to recover the address of the allocation. The offset is
13438 computed during frame layout of the caller of ``llvm.localescape``.
13443 All arguments to '``llvm.localescape``' must be pointers to static allocas or
13444 casts of static allocas. Each function can only call '``llvm.localescape``'
13445 once, and it can only do so from the entry block.
13447 The ``func`` argument to '``llvm.localrecover``' must be a constant
13448 bitcasted pointer to a function defined in the current module. The code
13449 generator cannot determine the frame allocation offset of functions defined in
13452 The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
13453 call frame that is currently live. The return value of '``llvm.localaddress``'
13454 is one way to produce such a value, but various runtimes also expose a suitable
13455 pointer in platform-specific ways.
13457 The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
13458 '``llvm.localescape``' to recover. It is zero-indexed.
13463 These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
13465 '``llvm.localescape``' intrinsic once from the function entry block, and the
13466 child functions can use '``llvm.localrecover``' to access the escaped allocas.
13467 The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
13468 the escaped allocas are allocated, which would break attempts to use
13469 '``llvm.localrecover``'.
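The sharing scheme can be sketched as follows. The names ``@parent``,
``@child`` and ``%counter`` are hypothetical, and the example assumes that the
parent obtains its own frame pointer with ``llvm.localaddress`` and passes it
to the child explicitly.

.. code-block:: llvm

      declare void @llvm.localescape(...)
      declare ptr @llvm.localrecover(ptr, ptr, i32)
      declare ptr @llvm.localaddress()

      define void @parent() {
      entry:
        %counter = alloca i32
        ; Escape the static alloca once, from the entry block. It gets index 0.
        call void (...) @llvm.localescape(ptr %counter)
        store i32 0, ptr %counter
        %fp = call ptr @llvm.localaddress()
        call void @child(ptr %fp)
        ret void
      }

      define void @child(ptr %parent_fp) {
      entry:
        ; Recover escaped alloca 0 of @parent from the live parent frame.
        %counter = call ptr @llvm.localrecover(ptr @parent, ptr %parent_fp, i32 0)
        %old = load i32, ptr %counter
        %new = add i32 %old, 1
        store i32 %new, ptr %counter
        ret void
      }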
13471 '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
13472 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13479 declare void @llvm.seh.try.begin()
13480 declare void @llvm.seh.try.end()
13485 The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a _try region for Windows SEH Asynchronous Exception Handling.
When a C function is compiled with the Windows SEH Asynchronous Exception option,
-feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark the _try
boundary and to prevent potential exceptions from being moved across the boundary.
13494 Any set of operations can then be confined to the region by reading their leaf
13495 inputs via volatile loads and writing their root outputs via volatile stores.
13497 '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
13498 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13505 declare void @llvm.seh.scope.begin()
13506 declare void @llvm.seh.scope.end()
13511 The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
13513 Handling (MSVC option -EHa).
13518 LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke`` instructions, which normally correspond only to call sites. To
13520 support arbitrary faulting instructions, it must be possible to recover the current
13521 EH scope for any instruction. Turning every operation in LLVM that could fault
13522 into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
13523 large number of intrinsics, impede optimization of those operations, and make
13524 compilation slower by introducing many extra basic blocks. These intrinsics can
13525 be used instead to mark the region protected by a cleanup, such as for a local
13526 C++ object with a non-trivial destructor. ``llvm.seh.scope.begin`` is used to mark
13527 the start of the region; it is always called with ``invoke``, with the unwind block
13528 being the desired unwind destination for any potentially-throwing instructions
within the region. ``llvm.seh.scope.end`` is used to mark when the scope ends
13530 and the EH cleanup is no longer required (e.g. because the destructor is being
13533 .. _int_read_register:
13534 .. _int_read_volatile_register:
13535 .. _int_write_register:
13537 '``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
13538 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13545 declare i32 @llvm.read_register.i32(metadata)
13546 declare i64 @llvm.read_register.i64(metadata)
13547 declare i32 @llvm.read_volatile_register.i32(metadata)
13548 declare i64 @llvm.read_volatile_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
13556 The '``llvm.read_register``', '``llvm.read_volatile_register``', and
13557 '``llvm.write_register``' intrinsics provide access to the named register.
13558 The register must be valid on the architecture being compiled to. The type
13559 needs to be compatible with the register being read.
13564 The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
13565 return the current value of the register, where possible. The
13566 '``llvm.write_register``' intrinsic sets the current value of the register,
13569 A call to '``llvm.read_volatile_register``' is assumed to have side-effects
13570 and possibly return a different value each time (e.g. for a timer register).
13572 This is useful to implement named register global variables that need
13573 to always be mapped to a specific register, as is common practice on
13574 bare-metal programs including OS kernels.
The compiler doesn't check whether the register is available, or whether it is
used by surrounding code, including inline assembly. Because of that,
allocatable registers are not supported.
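As a hedged sketch, reading the stack pointer on a 64-bit target could look
like the following. The register name ``sp`` is an assumption that fits a
target such as AArch64; other targets name the register differently, and the
function name ``@get_stack_pointer`` is hypothetical.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      define i64 @get_stack_pointer() {
      entry:
        ; The metadata operand names the register to read.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      ; Assumed register name; it must be valid for the target being compiled to.
      !0 = !{!"sp"}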
13580 Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
13582 work is needed to support other registers and even more so, allocatable
13587 '``llvm.stacksave``' Intrinsic
13588 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13595 declare ptr @llvm.stacksave.p0()
13596 declare ptr addrspace(5) @llvm.stacksave.p5()
13601 The '``llvm.stacksave``' intrinsic is used to remember the current state
13602 of the function stack, for use with
13603 :ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
13604 implementing language features like scoped automatic variable sized
13610 This intrinsic returns an opaque pointer value that can be passed to
13611 :ref:`llvm.stackrestore <int_stackrestore>`. When an
13612 ``llvm.stackrestore`` intrinsic is executed with a value saved from
13613 ``llvm.stacksave``, it effectively restores the state of the stack to
13614 the state it was in when the ``llvm.stacksave`` intrinsic executed. In
13615 practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack
13616 that were allocated after the ``llvm.stacksave`` was executed. The
13617 address space should typically be the
13618 :ref:`alloca address space <alloca_addrspace>`.
13620 .. _int_stackrestore:
13622 '``llvm.stackrestore``' Intrinsic
13623 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13630 declare void @llvm.stackrestore.p0(ptr %ptr)
13631 declare void @llvm.stackrestore.p5(ptr addrspace(5) %ptr)
13636 The '``llvm.stackrestore``' intrinsic is used to restore the state of
13637 the function stack to the state it was in when the corresponding
13638 :ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
13639 useful for implementing language features like scoped automatic
13640 variable sized arrays in C99. The address space should typically be
13641 the :ref:`alloca address space <alloca_addrspace>`.
13646 See the description for :ref:`llvm.stacksave <int_stacksave>`.
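A minimal sketch of the pairing, assuming address space 0 is the alloca
address space. The function ``@scoped_vla`` and the external ``@use`` are
hypothetical names.

.. code-block:: llvm

      declare ptr @llvm.stacksave.p0()
      declare void @llvm.stackrestore.p0(ptr)
      declare void @use(ptr)

      define void @scoped_vla(i64 %n) {
      entry:
        ; Remember the stack state before the variable sized allocation.
        %saved = call ptr @llvm.stacksave.p0()
        %buf = alloca i8, i64 %n
        call void @use(ptr %buf)
        ; Pop %buf (and any later allocas) by restoring the saved state.
        call void @llvm.stackrestore.p0(ptr %saved)
        ret void
      }

After the restore, ``%buf`` must not be accessed, since its storage has been
popped from the stack.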
13648 .. _int_get_dynamic_area_offset:
13650 '``llvm.get.dynamic.area.offset``' Intrinsic
13651 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13658 declare i32 @llvm.get.dynamic.area.offset.i32()
13659 declare i64 @llvm.get.dynamic.area.offset.i64()
13664 The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
13665 get the offset from native stack pointer to the address of the most
13666 recent dynamic alloca on the caller's stack. These intrinsics are
13667 intended for use in combination with
13668 :ref:`llvm.stacksave <int_stacksave>` to get a
13669 pointer to the most recent dynamic alloca. This is useful, for example,
13670 for AddressSanitizer's stack unpoisoning routines.
13675 These intrinsics return a non-negative integer value that can be used to
13676 get the address of the most recent dynamic alloca, allocated by :ref:`alloca <i_alloca>`
13677 on the caller's stack. In particular, for targets where stack grows downwards,
13678 adding this offset to the native stack pointer would get the address of the most
13679 recent dynamic alloca. For targets where stack grows upwards, the situation is a bit more
13680 complicated, because subtracting this value from stack pointer would get the address
13681 one past the end of the most recent dynamic alloca.
Although for most targets :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
13684 returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
13685 compile-time-known constant value.
13687 The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
13688 must match the target's default address space's (address space 0) pointer type.
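The downward-growing case can be sketched as follows. The example assumes a
target with 64-bit pointers in address space 0 and a stack that grows
downwards; the function name ``@most_recent_dynamic_alloca`` is hypothetical.

.. code-block:: llvm

      declare ptr @llvm.stacksave.p0()
      declare i64 @llvm.get.dynamic.area.offset.i64()

      define ptr @most_recent_dynamic_alloca() {
      entry:
        ; The saved stack state stands in for the native stack pointer.
        %sp = call ptr @llvm.stacksave.p0()
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        ; On a downward-growing stack, adding the offset yields the address
        ; of the most recent dynamic alloca.
        %addr = getelementptr i8, ptr %sp, i64 %off
        ret ptr %addr
      }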
13690 '``llvm.prefetch``' Intrinsic
13691 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13698 declare void @llvm.prefetch(ptr <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
13703 The '``llvm.prefetch``' intrinsic is a hint to the code generator to
13704 insert a prefetch instruction if supported; otherwise, it is a noop.
13705 Prefetches have no effect on the behavior of the program but can change
13706 its performance characteristics.
13711 ``address`` is the address to be prefetched, ``rw`` is the specifier
13712 determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from 0 (no locality)
to 3 (extremely local, keep in cache). The ``cache type``
13715 specifies whether the prefetch is performed on the data (1) or
13716 instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
13717 arguments must be constant integers.
13722 This intrinsic does not modify the behavior of the program. In
13723 particular, prefetches cannot trap and do not produce a value. On
13724 targets that support this intrinsic, the prefetch can provide hints to
13725 the processor cache for better performance.
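A small sketch of a read prefetch into the data cache with maximum locality.
The declaration mirrors the unsuffixed form given in the syntax for this
intrinsic; the function name ``@load_with_prefetch`` and its parameters are
hypothetical.

.. code-block:: llvm

      declare void @llvm.prefetch(ptr, i32, i32, i32)

      define i32 @load_with_prefetch(ptr %cur, ptr %next) {
      entry:
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
        call void @llvm.prefetch(ptr %next, i32 0, i32 3, i32 1)
        %v = load i32, ptr %cur
        ret i32 %v
      }

Removing the call changes only performance characteristics, never the value
returned by ``@load_with_prefetch``.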
13727 '``llvm.pcmarker``' Intrinsic
13728 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13735 declare void @llvm.pcmarker(i32 <id>)
13740 The '``llvm.pcmarker``' intrinsic is a method to export a Program
13741 Counter (PC) in a region of code to simulators and other tools. The
13742 method is target specific, but it is expected that the marker will use
13743 exported symbols to transmit the PC of the marker. The marker makes no
13744 guarantees that it will remain with any specific instruction after
13745 optimizations. It is possible that the presence of a marker will inhibit
13746 optimizations. The intended use is to be inserted after optimizations to
13747 allow correlations of simulation runs.
13752 ``id`` is a numerical id identifying the marker.
13757 This intrinsic does not modify the behavior of the program. Backends
13758 that do not support this intrinsic may ignore it.
13760 '``llvm.readcyclecounter``' Intrinsic
13761 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13768 declare i64 @llvm.readcyclecounter()
13773 The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
13774 counter register (or similar low latency, high accuracy clocks) on those
13775 targets that support it. On X86, it should map to RDTSC. On Alpha, it
13776 should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
13783 When directly supported, reading the cycle counter should not modify any
13784 memory. Implementations are allowed to either return an application
13785 specific value or a system wide value. On backends without support, this
13786 is lowered to a constant 0.
Note that runtime support may be conditional on the privilege level at which
the code is running and on the host platform.
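A typical use is to bracket a short region and subtract the two readings.
The sketch below assumes an external function ``@work`` to be timed;
``@cycles_for_work`` is likewise a hypothetical name.

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()

      define i64 @cycles_for_work() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed count; this is 0 on backends that lower the intrinsic to 0.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }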
13791 '``llvm.clear_cache``' Intrinsic
13792 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13799 declare void @llvm.clear_cache(ptr, ptr)
13804 The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
13805 in the specified range to the execution unit of the processor. On
13806 targets with non-unified instruction and data cache, the implementation
13807 flushes the instruction cache.
13812 On platforms with coherent instruction and data caches (e.g. x86), this
13813 intrinsic is a nop. On platforms with non-coherent instruction and data
13814 cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
13815 instructions or a system call, if cache flushing requires special
13818 The default behavior is to emit a call to ``__clear_cache`` from the run
13821 This intrinsic does *not* empty the instruction pipeline. Modifications
13822 of the current function are outside the scope of the intrinsic.
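For illustration, a runtime that has just written machine code into a buffer
might publish it as sketched below. The function name ``@publish_code`` and
its parameters, which delimit the modified range, are hypothetical.

.. code-block:: llvm

      declare void @llvm.clear_cache(ptr, ptr)

      define void @publish_code(ptr %begin, ptr %end) {
      entry:
        ; Make the bytes written between %begin and %end visible to
        ; instruction fetch before the new code is executed.
        call void @llvm.clear_cache(ptr %begin, ptr %end)
        ret void
      }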
13824 '``llvm.instrprof.increment``' Intrinsic
13825 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13832 declare void @llvm.instrprof.increment(ptr <name>, i64 <hash>,
13833 i32 <num-counters>, i32 <index>)
13838 The '``llvm.instrprof.increment``' intrinsic can be emitted by a
13839 frontend for use with instrumentation based profiling. These will be
13840 lowered by the ``-instrprof`` pass to generate execution counts of a
13841 program at runtime.
13846 The first argument is a pointer to a global variable containing the
13847 name of the entity being instrumented. This should generally be the
13848 (mangled) function name for a set of counters.
13850 The second argument is a hash value that can be used by the consumer
13851 of the profile data to detect changes to the instrumented source, and
13852 the third is the number of counters associated with ``name``. It is an
13853 error if ``hash`` or ``num-counters`` differ between two instances of
13854 ``instrprof.increment`` that refer to the same name.
13856 The last argument refers to which of the counters for ``name`` should
be incremented. It should be at least 0 and less than ``num-counters``.
13862 This intrinsic represents an increment of a profiling counter. It will
13863 cause the ``-instrprof`` pass to generate the appropriate data
13864 structures and the code to increment the appropriate value, in a
13865 format that can be written out by a compiler runtime and consumed via
13866 the ``llvm-profdata`` tool.
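As a sketch of what a frontend might emit, the function below uses two
counters: one for function entry and one for the ``then`` branch. The name
variable ``@__profn_foo`` and the hash value ``0`` are placeholders chosen for
the example, not values a real frontend would necessarily produce.

.. code-block:: llvm

      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(ptr, i64, i32, i32)

      define void @foo(i1 %cond) {
      entry:
        ; Counter 0 of 2: function entry.
        call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 2, i32 0)
        br i1 %cond, label %then, label %exit

      then:
        ; Counter 1 of 2: same name, hash and num-counters as above.
        call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0, i32 2, i32 1)
        br label %exit

      exit:
        ret void
      }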
13868 '``llvm.instrprof.increment.step``' Intrinsic
13869 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13876 declare void @llvm.instrprof.increment.step(ptr <name>, i64 <hash>,
13877 i32 <num-counters>,
13878 i32 <index>, i64 <step>)
13883 The '``llvm.instrprof.increment.step``' intrinsic is an extension to
13884 the '``llvm.instrprof.increment``' intrinsic with an additional fifth
13885 argument to specify the step of the increment.
13889 The first four arguments are the same as '``llvm.instrprof.increment``'
13892 The last argument specifies the value of the increment of the counter variable.
13896 See description of '``llvm.instrprof.increment``' intrinsic.
13898 '``llvm.instrprof.timestamp``' Intrinsic
13899 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      declare void @llvm.instrprof.timestamp(ptr <name>, i64 <hash>,
13907 i32 <num-counters>, i32 <index>)
13912 The '``llvm.instrprof.timestamp``' intrinsic is used to implement temporal
13917 The arguments are the same as '``llvm.instrprof.increment``'. The ``index`` is
13918 expected to always be zero.
13922 Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores a
13923 timestamp representing when this function was executed for the first time.
13925 '``llvm.instrprof.cover``' Intrinsic
13926 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13933 declare void @llvm.instrprof.cover(ptr <name>, i64 <hash>,
13934 i32 <num-counters>, i32 <index>)
13939 The '``llvm.instrprof.cover``' intrinsic is used to implement coverage
13944 The arguments are the same as the first four arguments of
13945 '``llvm.instrprof.increment``'.
13949 Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores zero to
13950 the profiling variable to signify that the function has been covered. We store
13951 zero because this is more efficient on some targets.
13953 '``llvm.instrprof.value.profile``' Intrinsic
13954 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13961 declare void @llvm.instrprof.value.profile(ptr <name>, i64 <hash>,
13962 i64 <value>, i32 <value_kind>,
13968 The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
13969 frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.
13976 The first argument is a pointer to a global variable containing the
13977 name of the entity being instrumented. ``name`` should generally be the
13978 (mangled) function name for a set of counters.
13980 The second argument is a hash value that can be used by the consumer
13981 of the profile data to detect changes to the instrumented source. It
13982 is an error if ``hash`` differs between two instances of
13983 ``llvm.instrprof.*`` that refer to the same name.
13985 The third argument is the value of the expression being profiled. The profiled
13986 expression's value should be representable as an unsigned 64-bit value. The
13987 fourth argument represents the kind of value profiling that is being done. The
13988 supported value profiling kinds are enumerated through the
13989 ``InstrProfValueKind`` type declared in the
13990 ``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
13991 index of the instrumented expression within ``name``. It should be >= 0.
13996 This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The ``-instrprof``
13998 pass will generate the appropriate data structures and replace the
13999 ``llvm.instrprof.value.profile`` intrinsic with the call to the profile
14000 runtime library with proper arguments.
14002 '``llvm.instrprof.mcdc.parameters``' Intrinsic
14003 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14010 declare void @llvm.instrprof.mcdc.parameters(ptr <name>, i64 <hash>,
14011 i32 <bitmap-bytes>)
14016 The '``llvm.instrprof.mcdc.parameters``' intrinsic is used to initiate MC/DC
14017 code coverage instrumentation for a function.
14022 The first argument is a pointer to a global variable containing the
14023 name of the entity being instrumented. This should generally be the
14024 (mangled) function name for a set of counters.
14026 The second argument is a hash value that can be used by the consumer
14027 of the profile data to detect changes to the instrumented source.
14029 The third argument is the number of bitmap bytes required by the function to
14030 record the number of test vectors executed for each boolean expression.
14035 This intrinsic represents basic MC/DC parameters initiating one or more MC/DC
14036 instrumentation sequences in a function. It will cause the ``-instrprof`` pass
14037 to generate the appropriate data structures and the code to instrument MC/DC
14038 test vectors in a format that can be written out by a compiler runtime and
14039 consumed via the ``llvm-profdata`` tool.
14041 '``llvm.instrprof.mcdc.condbitmap.update``' Intrinsic
14042 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14049 declare void @llvm.instrprof.mcdc.condbitmap.update(ptr <name>, i64 <hash>,
14050 i32 <condition-id>,
14051 ptr <mcdc-temp-addr>,
14057 The '``llvm.instrprof.mcdc.condbitmap.update``' intrinsic is used to track
14058 MC/DC condition evaluation for each condition in a boolean expression.
14063 The first argument is a pointer to a global variable containing the
14064 name of the entity being instrumented. This should generally be the
14065 (mangled) function name for a set of counters.
14067 The second argument is a hash value that can be used by the consumer
14068 of the profile data to detect changes to the instrumented source.
14070 The third argument is an ID of a condition to track. This value is used as a
14071 bit index into the condition bitmap.
14073 The fourth argument is the address of the condition bitmap.
14075 The fifth argument is the boolean value representing the evaluation of the
condition (true or false).
14081 This intrinsic represents the update of a condition bitmap that is local to a
14082 function and will cause the ``-instrprof`` pass to generate the code to
14083 instrument the control flow around each condition in a boolean expression. The
14084 ID of each condition corresponds to a bit index in the condition bitmap which
14085 is set based on the evaluation of the condition.
14087 '``llvm.instrprof.mcdc.tvbitmap.update``' Intrinsic
14088 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14095 declare void @llvm.instrprof.mcdc.tvbitmap.update(ptr <name>, i64 <hash>,
                                                        i32 <bitmap-bytes>,
14097 i32 <bitmap-index>,
14098 ptr <mcdc-temp-addr>)
14103 The '``llvm.instrprof.mcdc.tvbitmap.update``' intrinsic is used to track MC/DC
14104 test vector execution after each boolean expression has been fully executed.
14105 The overall value of the condition bitmap, after it has been successively
14106 updated using the '``llvm.instrprof.mcdc.condbitmap.update``' intrinsic with
14107 the true or false evaluation of each condition, uniquely identifies an executed
14108 MC/DC test vector and is used as a bit index into the global test vector
14114 The first argument is a pointer to a global variable containing the
14115 name of the entity being instrumented. This should generally be the
14116 (mangled) function name for a set of counters.
14118 The second argument is a hash value that can be used by the consumer
14119 of the profile data to detect changes to the instrumented source.
14121 The third argument is the number of bitmap bytes required by the function to
14122 record the number of test vectors executed for each boolean expression.
14124 The fourth argument is the byte index into the global test vector bitmap
14125 corresponding to the function.
14127 The fifth argument is the address of the condition bitmap, which contains a
14128 value representing an executed MC/DC test vector. It is loaded and used as the
14129 bit index of the test vector bitmap.
14134 This intrinsic represents the final operation of an MC/DC instrumentation
14135 sequence and will cause the ``-instrprof`` pass to generate the code to
14136 instrument an update of a function's global test vector bitmap to indicate that
14137 a test vector has been executed. The global test vector bitmap can be consumed
14138 by the ``llvm-profdata`` and ``llvm-cov`` tools.
14140 '``llvm.thread.pointer``' Intrinsic
14141 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14148 declare ptr @llvm.thread.pointer()
14153 The '``llvm.thread.pointer``' intrinsic returns the value of the thread
14159 The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
14160 for the current thread. The exact semantics of this value are target
14161 specific: it may point to the start of TLS area, to the end, or somewhere
14162 in the middle. Depending on the target, this intrinsic may read a register,
14163 call a helper function, read from an alternate memory space, or perform
14164 other operations necessary to locate the TLS area. Not all targets support
14167 '``llvm.call.preallocated.setup``' Intrinsic
14168 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14175 declare token @llvm.call.preallocated.setup(i32 %num_args)
14180 The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
14181 be used with a call's ``"preallocated"`` operand bundle to indicate that
14182 certain arguments are allocated and initialized before the call.
14187 The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
14188 associated with at most one call. The token can be passed to
'``llvm.call.preallocated.arg``' to get a pointer to the
corresponding argument. The token must be the parameter to a
14191 ``"preallocated"`` operand bundle for the corresponding call.
14193 Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]

is allowed, but not

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]

14212 .. _int_call_preallocated_arg:
14214 '``llvm.call.preallocated.arg``' Intrinsic
14215 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14222 declare ptr @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
14227 The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
14228 corresponding preallocated argument for the preallocated call.
The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
argument at index ``%arg_index`` among those with the ``preallocated``
attribute for the call associated with ``%setup_token``, which must be from
'``llvm.call.preallocated.setup``'.
14238 A call to '``llvm.call.preallocated.arg``' must have a call site
14239 ``preallocated`` attribute. The type of the ``preallocated`` attribute must
14240 match the type used by the ``preallocated`` attribute of the corresponding
14241 argument at the preallocated call. The type is used in the case that an
14242 ``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
14243 to DCE), where otherwise we cannot know how large the arguments are.
14245 It is undefined behavior if this is called with a token from an
14246 '``llvm.call.preallocated.setup``' if another
14247 '``llvm.call.preallocated.setup``' has already been called or if the
14248 preallocated call corresponding to the '``llvm.call.preallocated.setup``'
14249 has already been called.
14251 .. _int_call_preallocated_teardown:
14253 '``llvm.call.preallocated.teardown``' Intrinsic
14254 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      declare void @llvm.call.preallocated.teardown(token %setup_token)
14266 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
14267 created by a '``llvm.call.preallocated.setup``'.
The token argument must be the result of a call to
'``llvm.call.preallocated.setup``'.

14274 The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
14275 allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
14276 one of this or the preallocated call must be called to prevent stack leaks.
14277 It is undefined behavior to call both a '``llvm.call.preallocated.teardown``'
14278 and the preallocated call for a given '``llvm.call.preallocated.setup``'.
For example, if the stack is allocated for a preallocated call by a
'``llvm.call.preallocated.setup``' and an initializer function called on an
allocated argument then throws an exception, there should be a
'``llvm.call.preallocated.teardown``' in the exception handler to prevent
stack leaks.

Following the nesting rules in '``llvm.call.preallocated.setup``', nested
calls to '``llvm.call.preallocated.setup``' and
'``llvm.call.preallocated.teardown``' are allowed but must be properly
nested.
.. code-block:: llvm

      %cs = call token @llvm.call.preallocated.setup(i32 1)
      %x = call ptr @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
      invoke void @constructor(ptr %x) to label %conta unwind label %contb
  conta:
      call void @foo1(ptr preallocated(i32) %x) ["preallocated"(token %cs)]
      ret void
  contb:
      %s = catchswitch within none [label %catch] unwind to caller
  catch:
      %p = catchpad within %s []
      call void @llvm.call.preallocated.teardown(token %cs)
      ret void

14309 Standard C/C++ Library Intrinsics
14310 ---------------------------------
14312 LLVM provides intrinsics for a few important standard C/C++ library
14313 functions. These intrinsics allow source-language front-ends to pass
14314 information about the alignment of the pointer arguments to the code
14315 generator, providing opportunity for more efficient code generation.
14319 '``llvm.abs.*``' Intrinsic
14320 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14325 This is an overloaded intrinsic. You can use ``llvm.abs`` on any
14326 integer bit width or any vector of integer elements.
14330 declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
14331 declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
The '``llvm.abs``' family of intrinsic functions returns the absolute value
of an argument.
14342 The first argument is the value for which the absolute value is to be returned.
14343 This argument may be of any integer type or a vector with integer element type.
14344 The return type must match the first argument type.
14346 The second argument must be a constant and is a flag to indicate whether the
14347 result value of the '``llvm.abs``' intrinsic is a
14348 :ref:`poison value <poisonvalues>` if the argument is statically or dynamically
14349 an ``INT_MIN`` value.
The '``llvm.abs``' intrinsic returns the magnitude (non-negative, except as
noted for ``INT_MIN``) of the argument or of each element of a vector argument.
If the argument is ``INT_MIN``, then the result is also ``INT_MIN`` if
``is_int_min_poison == 0`` and ``poison`` otherwise.
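
As a minimal sketch of how the flag changes the result, the two functions
below differ only in the second argument of the call. The function names
``@abs_wrapping`` and ``@abs_poison`` are hypothetical and used only for
illustration.

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)

      define i32 @abs_wrapping(i32 %x) {
        ; is_int_min_poison is false: an INT_MIN input yields INT_MIN.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define i32 @abs_poison(i32 %x) {
        ; is_int_min_poison is true: an INT_MIN input yields poison.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 true)
        ret i32 %r
      }
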
14362 '``llvm.smax.*``' Intrinsic
14363 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14368 This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
14369 integer bit width or any vector of integer elements.
14373 declare i32 @llvm.smax.i32(i32 %a, i32 %b)
14374 declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
14379 Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
14380 Vector intrinsics operate on a per-element basis. The larger element of ``%a``
14381 and ``%b`` at a given index is returned for that index.
14386 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14387 integer element type. The argument types must match each other, and the return
14388 type must match the argument type.
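
For illustration, a signed maximum against zero clamps negative inputs to
zero. The function name ``@clamp_negative_to_zero`` is an assumption made for
this sketch and is not part of LLVM.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)

      define i32 @clamp_negative_to_zero(i32 %x) {
        ; Returns %x when %x is non-negative as a signed integer, otherwise 0.
        %r = call i32 @llvm.smax.i32(i32 %x, i32 0)
        ret i32 %r
      }
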
14393 '``llvm.smin.*``' Intrinsic
14394 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14399 This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
14400 integer bit width or any vector of integer elements.
14404 declare i32 @llvm.smin.i32(i32 %a, i32 %b)
14405 declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
14410 Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
14411 Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
14412 and ``%b`` at a given index is returned for that index.
14417 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14418 integer element type. The argument types must match each other, and the return
14419 type must match the argument type.
14424 '``llvm.umax.*``' Intrinsic
14425 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14430 This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
14431 integer bit width or any vector of integer elements.
14435 declare i32 @llvm.umax.i32(i32 %a, i32 %b)
14436 declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
14441 Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
14442 integers. Vector intrinsics operate on a per-element basis. The larger element
14443 of ``%a`` and ``%b`` at a given index is returned for that index.
14448 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14449 integer element type. The argument types must match each other, and the return
14450 type must match the argument type.
14455 '``llvm.umin.*``' Intrinsic
14456 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14461 This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
14462 integer bit width or any vector of integer elements.
14466 declare i32 @llvm.umin.i32(i32 %a, i32 %b)
14467 declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
14472 Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
14473 integers. Vector intrinsics operate on a per-element basis. The smaller element
14474 of ``%a`` and ``%b`` at a given index is returned for that index.
14479 The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
14480 integer element type. The argument types must match each other, and the return
14481 type must match the argument type.
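
The signed and unsigned variants above differ only in how the operand bits
are interpreted. The following sketch (with hypothetical function names)
applies ``llvm.smax`` and ``llvm.umax`` to the same pair of ``i8`` constants:

.. code-block:: llvm

      declare i8 @llvm.smax.i8(i8, i8)
      declare i8 @llvm.umax.i8(i8, i8)

      define i8 @signed_max() {
        ; As signed values, -1 is less than 1, so the result is 1.
        %r = call i8 @llvm.smax.i8(i8 -1, i8 1)
        ret i8 %r
      }

      define i8 @unsigned_max() {
        ; As unsigned values, the bit pattern of -1 is 255, so the result
        ; is -1 (that is, 255).
        %r = call i8 @llvm.umax.i8(i8 -1, i8 1)
        ret i8 %r
      }
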
14486 '``llvm.memcpy``' Intrinsic
14487 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14492 This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
14493 integer bit width and for different address spaces. Not all targets
14494 support all bit widths however.
14498 declare void @llvm.memcpy.p0.p0.i32(ptr <dest>, ptr <src>,
14499 i32 <len>, i1 <isvolatile>)
14500 declare void @llvm.memcpy.p0.p0.i64(ptr <dest>, ptr <src>,
14501 i64 <len>, i1 <isvolatile>)
14506 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
14507 source location to the destination location.
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
14516 The first argument is a pointer to the destination, the second is a
14517 pointer to the source. The third argument is an integer argument
14518 specifying the number of bytes to copy, and the fourth is a
14519 boolean indicating a volatile access.
14521 The :ref:`align <attr_align>` parameter attribute can be provided
14522 for the first and second arguments.
14524 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
14525 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
14526 very cleanly specified and it is unwise to depend on it.
14531 The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
14532 location to the destination location, which must either be equal or
non-overlapping. It copies "len" bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an attribute on the
argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
14539 If ``<len>`` is not a well-defined value, the behavior is undefined.
14540 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
14541 otherwise the behavior is undefined.
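
As a minimal sketch, the following copies 16 bytes between two buffers that
the caller is assumed to guarantee are 8-byte aligned and non-overlapping.
The function name ``@copy16`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy16(ptr %dst, ptr %src) {
        ; The align attributes describe the known alignment of each pointer;
        ; the final argument marks the copy as non-volatile.
        call void @llvm.memcpy.p0.p0.i64(ptr align 8 %dst, ptr align 8 %src,
                                         i64 16, i1 false)
        ret void
      }
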
14543 .. _int_memcpy_inline:
14545 '``llvm.memcpy.inline``' Intrinsic
14546 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14551 This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
14552 integer bit width and for different address spaces. Not all targets
14553 support all bit widths however.
14557 declare void @llvm.memcpy.inline.p0.p0.i32(ptr <dest>, ptr <src>,
14558 i32 <len>, i1 <isvolatile>)
14559 declare void @llvm.memcpy.inline.p0.p0.i64(ptr <dest>, ptr <src>,
14560 i64 <len>, i1 <isvolatile>)
The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
functions are called.

Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
14576 The first argument is a pointer to the destination, the second is a
14577 pointer to the source. The third argument is a constant integer argument
14578 specifying the number of bytes to copy, and the fourth is a
14579 boolean indicating a volatile access.
14581 The :ref:`align <attr_align>` parameter attribute can be provided
14582 for the first and second arguments.
14584 If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
14585 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
14586 very cleanly specified and it is unwise to depend on it.
14591 The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
14592 source location to the destination location, which are not allowed to
overlap. It copies "len" bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an attribute on
the argument.

14596 The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
14597 '``llvm.memcpy.*``', but the generated code is guaranteed not to call any
14598 external functions.
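
For illustration, the call below uses a constant length, as the argument
description above requires. The function name ``@copy32_inline`` is
hypothetical, and the buffers are assumed not to overlap.

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy32_inline(ptr %dst, ptr %src) {
        ; The length (32) is a constant, so the copy can be expanded inline
        ; without calling an external memcpy.
        call void @llvm.memcpy.inline.p0.p0.i64(ptr align 4 %dst, ptr align 4 %src,
                                                i64 32, i1 false)
        ret void
      }
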
14602 '``llvm.memmove``' Intrinsic
14603 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.
14614 declare void @llvm.memmove.p0.p0.i32(ptr <dest>, ptr <src>,
14615 i32 <len>, i1 <isvolatile>)
14616 declare void @llvm.memmove.p0.p0.i64(ptr <dest>, ptr <src>,
14617 i64 <len>, i1 <isvolatile>)
14622 The '``llvm.memmove.*``' intrinsics move a block of memory from the
14623 source location to the destination location. It is similar to the
'``llvm.memcpy``' intrinsic but allows the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
14634 The first argument is a pointer to the destination, the second is a
14635 pointer to the source. The third argument is an integer argument
14636 specifying the number of bytes to copy, and the fourth is a
14637 boolean indicating a volatile access.
14639 The :ref:`align <attr_align>` parameter attribute can be provided
14640 for the first and second arguments.
14642 If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
14643 is a :ref:`volatile operation <volatile>`. The detailed access behavior is
14644 not very cleanly specified and it is unwise to depend on it.
14649 The '``llvm.memmove.*``' intrinsics copy a block of memory from the
14650 source location to the destination location, which may overlap. It
copies "len" bytes of memory over. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
14657 If ``<len>`` is not a well-defined value, the behavior is undefined.
14658 If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
14659 otherwise the behavior is undefined.
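
Because the source and destination may overlap, ``llvm.memmove`` can shift
data within a single buffer. The sketch below moves ``%n`` bytes one position
toward the start of ``%buf``; the function name is hypothetical, and the
buffer is assumed to hold at least ``%n + 1`` bytes.

.. code-block:: llvm

      declare void @llvm.memmove.p0.p0.i64(ptr, ptr, i64, i1)

      define void @shift_left_by_one(ptr %buf, i64 %n) {
        ; The source starts one byte past the destination, so the two
        ; regions overlap whenever %n is greater than 1.
        %src = getelementptr inbounds i8, ptr %buf, i64 1
        call void @llvm.memmove.p0.p0.i64(ptr %buf, ptr %src, i64 %n, i1 false)
        ret void
      }
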
14663 '``llvm.memset.*``' Intrinsics
14664 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.
14675 declare void @llvm.memset.p0.i32(ptr <dest>, i8 <val>,
14676 i32 <len>, i1 <isvolatile>)
14677 declare void @llvm.memset.p0.i64(ptr <dest>, i8 <val>,
14678 i64 <len>, i1 <isvolatile>)
14683 The '``llvm.memset.*``' intrinsics fill a block of memory with a
14684 particular byte value.
14686 Note that, unlike the standard libc function, the ``llvm.memset``
14687 intrinsic does not return a value and takes an extra volatile
14688 argument. Also, the destination can be in an arbitrary address space.
14693 The first argument is a pointer to the destination to fill, the second
14694 is the byte value with which to fill it, the third argument is an
14695 integer argument specifying the number of bytes to fill, and the fourth
14696 is a boolean indicating a volatile access.
14698 The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
14701 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
14702 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
14703 very cleanly specified and it is unwise to depend on it.
14708 The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
14709 at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
14715 If ``<len>`` is not a well-defined value, the behavior is undefined.
14716 If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
14717 behavior is undefined.
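
As a minimal sketch, the following zero-fills a 64-byte buffer that is
assumed to be 16-byte aligned. The function name ``@zero64`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

      define void @zero64(ptr %buf) {
        ; Arguments: destination, fill byte, length in bytes, isvolatile.
        call void @llvm.memset.p0.i64(ptr align 16 %buf, i8 0, i64 64, i1 false)
        ret void
      }
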
14719 .. _int_memset_inline:
14721 '``llvm.memset.inline``' Intrinsic
14722 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14727 This is an overloaded intrinsic. You can use ``llvm.memset.inline`` on any
14728 integer bit width and for different address spaces. Not all targets
14729 support all bit widths however.
      declare void @llvm.memset.inline.p0.i32(ptr <dest>, i8 <val>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.inline.p0.i64(ptr <dest>, i8 <val>,
                                              i64 <len>, i1 <isvolatile>)
The '``llvm.memset.inline.*``' intrinsics fill a block of memory with a
particular byte value and guarantee that no external functions are called.
14744 Note that, unlike the standard libc function, the ``llvm.memset.inline.*``
14745 intrinsics do not return a value, take an extra isvolatile argument and the
14746 pointer can be in specified address spaces.
14751 The first argument is a pointer to the destination to fill, the second
14752 is the byte value with which to fill it, the third argument is a constant
14753 integer argument specifying the number of bytes to fill, and the fourth
14754 is a boolean indicating a volatile access.
14756 The :ref:`align <attr_align>` parameter attribute can be provided
14757 for the first argument.
14759 If the ``isvolatile`` parameter is ``true``, the ``llvm.memset.inline`` call is
14760 a :ref:`volatile operation <volatile>`. The detailed access behavior is not
14761 very cleanly specified and it is unwise to depend on it.
14766 The '``llvm.memset.inline.*``' intrinsics fill "len" bytes of memory starting
14767 at the destination location. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.

``len`` must be a constant expression.
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
the arguments.
14774 If ``<len>`` is not a well-defined value, the behavior is undefined.
14775 If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
14776 behavior is undefined.
14778 The behavior of '``llvm.memset.inline.*``' is equivalent to the behavior of
14779 '``llvm.memset.*``', but the generated code is guaranteed not to call any
14780 external functions.
14784 '``llvm.sqrt.*``' Intrinsic
14785 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14790 This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
14796 declare float @llvm.sqrt.f32(float %Val)
14797 declare double @llvm.sqrt.f64(double %Val)
14798 declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val)
14799 declare fp128 @llvm.sqrt.f128(fp128 %Val)
14800 declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
14805 The '``llvm.sqrt``' intrinsics return the square root of the specified value.
14810 The argument and return value are floating-point numbers of the same type.
14815 Return the same value as a corresponding libm '``sqrt``' function but without
14816 trapping or setting ``errno``. For types specified by IEEE-754, the result
14817 matches a conforming libm implementation.
14819 When specified with the fast-math-flag 'afn', the result may be approximated
14820 using a less accurate calculation.
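
The sketch below contrasts a plain call with one carrying the ``afn``
fast-math flag. The function names are hypothetical and used only for
illustration.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)

      define float @sqrt_exact(float %x) {
        ; No fast-math flags: the result matches a conforming libm sqrt.
        %r = call float @llvm.sqrt.f32(float %x)
        ret float %r
      }

      define float @sqrt_approx(float %x) {
        ; With afn, a less accurate approximation may be used.
        %r = call afn float @llvm.sqrt.f32(float %x)
        ret float %r
      }
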
14822 '``llvm.powi.*``' Intrinsic
14823 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
14828 This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

14832 Generally, the only supported type for the exponent is the one matching
14833 with the C type ``int``.
14837 declare float @llvm.powi.f32.i32(float %Val, i32 %power)
14838 declare double @llvm.powi.f64.i16(double %Val, i16 %power)
14839 declare x86_fp80 @llvm.powi.f80.i32(x86_fp80 %Val, i32 %power)
14840 declare fp128 @llvm.powi.f128.i32(fp128 %Val, i32 %power)
14841 declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128 %Val, i32 %power)
14846 The '``llvm.powi.*``' intrinsics return the first operand raised to the
14847 specified (positive or negative) power. The order of evaluation of
14848 multiplications is not defined. When a vector of floating-point type is
14849 used, the second argument remains a scalar integer value.
14854 The second argument is an integer power, and the first is a value to
14855 raise to that power.
14860 This function returns the first value raised to the second power with an
14861 unspecified sequence of rounding operations.
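
For illustration, the vector form below cubes each element while the
exponent stays a scalar ``i32``. The function name ``@cube_elements`` is
hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      define <4 x float> @cube_elements(<4 x float> %v) {
        ; Every element of %v is raised to the same scalar power, 3.
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 3)
        ret <4 x float> %r
      }
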
14863 '``llvm.sin.*``' Intrinsic
14864 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14869 This is an overloaded intrinsic. You can use ``llvm.sin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
14875 declare float @llvm.sin.f32(float %Val)
14876 declare double @llvm.sin.f64(double %Val)
14877 declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val)
14878 declare fp128 @llvm.sin.f128(fp128 %Val)
14879 declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)
14884 The '``llvm.sin.*``' intrinsics return the sine of the operand.
14889 The argument and return value are floating-point numbers of the same type.
14894 Return the same value as a corresponding libm '``sin``' function but without
14895 trapping or setting ``errno``.
14897 When specified with the fast-math-flag 'afn', the result may be approximated
14898 using a less accurate calculation.
14900 '``llvm.cos.*``' Intrinsic
14901 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14906 This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
14912 declare float @llvm.cos.f32(float %Val)
14913 declare double @llvm.cos.f64(double %Val)
14914 declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val)
14915 declare fp128 @llvm.cos.f128(fp128 %Val)
14916 declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)
14921 The '``llvm.cos.*``' intrinsics return the cosine of the operand.
14926 The argument and return value are floating-point numbers of the same type.
14931 Return the same value as a corresponding libm '``cos``' function but without
14932 trapping or setting ``errno``.
14934 When specified with the fast-math-flag 'afn', the result may be approximated
14935 using a less accurate calculation.
14937 '``llvm.pow.*``' Intrinsic
14938 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14943 This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
14949 declare float @llvm.pow.f32(float %Val, float %Power)
14950 declare double @llvm.pow.f64(double %Val, double %Power)
14951 declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
14952 declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)
14958 The '``llvm.pow.*``' intrinsics return the first operand raised to the
14959 specified (positive or negative) power.
14964 The arguments and return value are floating-point numbers of the same type.
14969 Return the same value as a corresponding libm '``pow``' function but without
14970 trapping or setting ``errno``.
14972 When specified with the fast-math-flag 'afn', the result may be approximated
14973 using a less accurate calculation.
14977 '``llvm.exp.*``' Intrinsic
14978 ^^^^^^^^^^^^^^^^^^^^^^^^^^
14983 This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
14989 declare float @llvm.exp.f32(float %Val)
14990 declare double @llvm.exp.f64(double %Val)
14991 declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val)
14992 declare fp128 @llvm.exp.f128(fp128 %Val)
14993 declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)
The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.
15004 The argument and return value are floating-point numbers of the same type.
15009 Return the same value as a corresponding libm '``exp``' function but without
15010 trapping or setting ``errno``.
15012 When specified with the fast-math-flag 'afn', the result may be approximated
15013 using a less accurate calculation.
15017 '``llvm.exp2.*``' Intrinsic
15018 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15023 This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15029 declare float @llvm.exp2.f32(float %Val)
15030 declare double @llvm.exp2.f64(double %Val)
15031 declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val)
15032 declare fp128 @llvm.exp2.f128(fp128 %Val)
15033 declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)
The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
specified value.
15044 The argument and return value are floating-point numbers of the same type.
15049 Return the same value as a corresponding libm '``exp2``' function but without
15050 trapping or setting ``errno``.
15052 When specified with the fast-math-flag 'afn', the result may be approximated
15053 using a less accurate calculation.
15057 '``llvm.exp10.*``' Intrinsic
15058 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15063 This is an overloaded intrinsic. You can use ``llvm.exp10`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15069 declare float @llvm.exp10.f32(float %Val)
15070 declare double @llvm.exp10.f64(double %Val)
15071 declare x86_fp80 @llvm.exp10.f80(x86_fp80 %Val)
15072 declare fp128 @llvm.exp10.f128(fp128 %Val)
15073 declare ppc_fp128 @llvm.exp10.ppcf128(ppc_fp128 %Val)
The '``llvm.exp10.*``' intrinsics compute the base-10 exponential of the
specified value.
15084 The argument and return value are floating-point numbers of the same type.
15089 Return the same value as a corresponding libm '``exp10``' function but without
15090 trapping or setting ``errno``.
15092 When specified with the fast-math-flag 'afn', the result may be approximated
15093 using a less accurate calculation.
15096 '``llvm.ldexp.*``' Intrinsic
15097 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15102 This is an overloaded intrinsic. You can use ``llvm.ldexp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15108 declare float @llvm.ldexp.f32.i32(float %Val, i32 %Exp)
15109 declare double @llvm.ldexp.f64.i32(double %Val, i32 %Exp)
15110 declare x86_fp80 @llvm.ldexp.f80.i32(x86_fp80 %Val, i32 %Exp)
15111 declare fp128 @llvm.ldexp.f128.i32(fp128 %Val, i32 %Exp)
15112 declare ppc_fp128 @llvm.ldexp.ppcf128.i32(ppc_fp128 %Val, i32 %Exp)
15113 declare <2 x float> @llvm.ldexp.v2f32.v2i32(<2 x float> %Val, <2 x i32> %Exp)
15118 The '``llvm.ldexp.*``' intrinsics perform the ldexp function.
15123 The first argument and the return value are :ref:`floating-point
15124 <t_floating>` or :ref:`vector <t_vector>` of floating-point values of
the same type. The second argument is an integer with the same number
of elements.
This function multiplies the first argument by 2 raised to the second
argument's power. If the first argument is NaN or infinite, the same
value is returned. If the result underflows, a zero with the same sign
is returned. If the result overflows, the result is an infinity with
the same sign.
15139 '``llvm.frexp.*``' Intrinsic
15140 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15145 This is an overloaded intrinsic. You can use ``llvm.frexp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15151 declare { float, i32 } @llvm.frexp.f32.i32(float %Val)
15152 declare { double, i32 } @llvm.frexp.f64.i32(double %Val)
15153 declare { x86_fp80, i32 } @llvm.frexp.f80.i32(x86_fp80 %Val)
15154 declare { fp128, i32 } @llvm.frexp.f128.i32(fp128 %Val)
15155 declare { ppc_fp128, i32 } @llvm.frexp.ppcf128.i32(ppc_fp128 %Val)
15156 declare { <2 x float>, <2 x i32> } @llvm.frexp.v2f32.v2i32(<2 x float> %Val)
15161 The '``llvm.frexp.*``' intrinsics perform the frexp function.
15166 The argument is a :ref:`floating-point <t_floating>` or
15167 :ref:`vector <t_vector>` of floating-point values. Returns two values
15168 in a struct. The first struct field matches the argument type, and the
15169 second field is an integer or a vector of integer values with the same
15170 number of elements as the argument.
15175 This intrinsic splits a floating point value into a normalized
15176 fractional component and integral exponent.
For a non-zero argument, returns the argument multiplied by some power
of two such that the absolute value of the returned value is in the
range [0.5, 1.0), with the same sign as the argument. The second
result is an integer such that the first result multiplied by 2 raised
to the power of the second result is the input argument.
If the argument is a zero, returns a zero with the same sign and a 0
exponent.

If the argument is a NaN, a NaN is returned and the returned exponent
is unspecified.

15190 If the argument is an infinity, returns an infinity with the same sign
15191 and an unspecified exponent.
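
As a sketch of the relationship between the two results, the function below
splits a value with ``llvm.frexp`` and recombines it with ``llvm.ldexp``. The
function name ``@split_and_rebuild`` is hypothetical. For a finite, non-zero
input such as 8.0, the fraction is 0.5 and the exponent is 4, since
0.5 * 2^4 = 8.0.

.. code-block:: llvm

      declare { float, i32 } @llvm.frexp.f32.i32(float)
      declare float @llvm.ldexp.f32.i32(float, i32)

      define float @split_and_rebuild(float %x) {
        %pair = call { float, i32 } @llvm.frexp.f32.i32(float %x)
        ; Field 0 is the normalized fraction, field 1 the integral exponent.
        %fract = extractvalue { float, i32 } %pair, 0
        %exp = extractvalue { float, i32 } %pair, 1
        ; For finite, non-zero %x this computes %fract * 2^%exp, i.e. %x.
        %r = call float @llvm.ldexp.f32.i32(float %fract, i32 %exp)
        ret float %r
      }
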
15195 '``llvm.log.*``' Intrinsic
15196 ^^^^^^^^^^^^^^^^^^^^^^^^^^
15201 This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15207 declare float @llvm.log.f32(float %Val)
15208 declare double @llvm.log.f64(double %Val)
15209 declare x86_fp80 @llvm.log.f80(x86_fp80 %Val)
15210 declare fp128 @llvm.log.f128(fp128 %Val)
15211 declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)
The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
value.
15222 The argument and return value are floating-point numbers of the same type.
15227 Return the same value as a corresponding libm '``log``' function but without
15228 trapping or setting ``errno``.
15230 When specified with the fast-math-flag 'afn', the result may be approximated
15231 using a less accurate calculation.
15235 '``llvm.log10.*``' Intrinsic
15236 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15241 This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15247 declare float @llvm.log10.f32(float %Val)
15248 declare double @llvm.log10.f64(double %Val)
15249 declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val)
15250 declare fp128 @llvm.log10.f128(fp128 %Val)
15251 declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)
The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
specified value.
15262 The argument and return value are floating-point numbers of the same type.
15267 Return the same value as a corresponding libm '``log10``' function but without
15268 trapping or setting ``errno``.
15270 When specified with the fast-math-flag 'afn', the result may be approximated
15271 using a less accurate calculation.
15276 '``llvm.log2.*``' Intrinsic
15277 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15282 This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.
15288 declare float @llvm.log2.f32(float %Val)
15289 declare double @llvm.log2.f64(double %Val)
15290 declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val)
15291 declare fp128 @llvm.log2.f128(fp128 %Val)
15292 declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)
The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
value.
15303 The argument and return value are floating-point numbers of the same type.
15308 Return the same value as a corresponding libm '``log2``' function but without
15309 trapping or setting ``errno``.
15311 When specified with the fast-math-flag 'afn', the result may be approximated
15312 using a less accurate calculation.
15316 '``llvm.fma.*``' Intrinsic
15317 ^^^^^^^^^^^^^^^^^^^^^^^^^^
15322 This is an overloaded intrinsic. You can use ``llvm.fma`` on any
15323 floating-point or vector of floating-point type. Not all targets support
15328 declare float @llvm.fma.f32(float %a, float %b, float %c)
15329 declare double @llvm.fma.f64(double %a, double %b, double %c)
15330 declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
15331 declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
15332 declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
15337 The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
15342 The arguments and return value are floating-point numbers of the same type.
15347 Return the same value as a corresponding libm '``fma``' function but without
15348 trapping or setting ``errno``.
15350 When specified with the fast-math-flag 'afn', the result may be approximated
15351 using a less accurate calculation.
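
As a minimal sketch (the function name ``@fma_example`` is hypothetical), the
call below computes ``%a * %b + %c``. The libm ``fma`` function that the
intrinsic mirrors rounds once, whereas a separate ``fmul`` followed by ``fadd``
rounds twice:

.. code-block:: llvm

   declare float @llvm.fma.f32(float, float, float)

   define float @fma_example(float %a, float %b, float %c) {
     ; Fused multiply-add: %a * %b + %c with a single rounding step.
     %fused = call float @llvm.fma.f32(float %a, float %b, float %c)
     ret float %fused
   }
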
15355 '``llvm.fabs.*``' Intrinsic
15356 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15361 This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
15362 floating-point or vector of floating-point type. Not all targets support
15367 declare float @llvm.fabs.f32(float %Val)
15368 declare double @llvm.fabs.f64(double %Val)
15369 declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val)
15370 declare fp128 @llvm.fabs.f128(fp128 %Val)
15371 declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
15376 The '``llvm.fabs.*``' intrinsics return the absolute value of the
15382 The argument and return value are floating-point numbers of the same
15388 This function returns the same values as the libm ``fabs`` functions
15389 would, and handles error conditions in the same way.
15390 The returned value is completely identical to the input except for the sign bit;
15391 in particular, if the input is a NaN, then the quiet/signaling bit and payload
15392 are perfectly preserved.
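
The "identical except for the sign bit" rule can be illustrated with a
hand-written equivalent. This is only a sketch: ``@fabs_by_mask`` is a
hypothetical helper, not something LLVM provides, and it assumes the 32-bit
``float`` layout in which the sign occupies the top bit:

.. code-block:: llvm

   declare float @llvm.fabs.f32(float)

   define float @fabs_example(float %x) {
     %r = call float @llvm.fabs.f32(float %x)
     ret float %r
   }

   ; Hypothetical equivalent: clear bit 31 and leave every other bit,
   ; including any NaN payload, untouched.
   define float @fabs_by_mask(float %x) {
     %bits = bitcast float %x to i32
     %cleared = and i32 %bits, 2147483647   ; mask 0x7FFFFFFF
     %r = bitcast i32 %cleared to float
     ret float %r
   }
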
15396 '``llvm.minnum.*``' Intrinsic
15397 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15402 This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
15403 floating-point or vector of floating-point type. Not all targets support
15408 declare float @llvm.minnum.f32(float %Val0, float %Val1)
15409 declare double @llvm.minnum.f64(double %Val0, double %Val1)
15410 declare x86_fp80 @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
15411 declare fp128 @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
15412 declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
15417 The '``llvm.minnum.*``' intrinsics return the minimum of the two
15424 The arguments and return value are floating-point numbers of the same
15430 Follows the IEEE-754 semantics for minNum, except for handling of
15431 signaling NaNs. This matches the behavior of libm's fmin.
15433 If either operand is a NaN, returns the other non-NaN operand. Returns
15434 NaN only if both operands are NaN. If the operands compare equal,
15435 returns either one of the operands. For example, this means that
15436 fmin(+0.0, -0.0) returns either operand.
15438 Unlike the IEEE-754 2008 behavior, this does not distinguish between
15439 signaling and quiet NaN inputs. If a target's implementation follows
15440 the standard and returns a quiet NaN if either input is a signaling
15441 NaN, the intrinsic lowering is responsible for quieting the inputs to
15442 correctly return the non-NaN input (e.g. by using the equivalent of
15443 ``llvm.canonicalize``).
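
A short sketch of the NaN rule described above (the function name
``@minnum_example`` is hypothetical). One operand is a quiet NaN, so the other
operand is returned:

.. code-block:: llvm

   declare float @llvm.minnum.f32(float, float)

   define float @minnum_example() {
     ; 0x7FF8000000000000 is a quiet NaN; the non-NaN operand 1.0 is returned.
     %r = call float @llvm.minnum.f32(float 0x7FF8000000000000, float 1.0)
     ret float %r
   }
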
15447 '``llvm.maxnum.*``' Intrinsic
15448 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15453 This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
15454 floating-point or vector of floating-point type. Not all targets support
15459 declare float @llvm.maxnum.f32(float %Val0, float %Val1)
15460 declare double @llvm.maxnum.f64(double %Val0, double %Val1)
15461 declare x86_fp80 @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
15462 declare fp128 @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
15463 declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
15468 The '``llvm.maxnum.*``' intrinsics return the maximum of the two
15475 The arguments and return value are floating-point numbers of the same
15480 Follows the IEEE-754 semantics for maxNum except for the handling of
15481 signaling NaNs. This matches the behavior of libm's fmax.
15483 If either operand is a NaN, returns the other non-NaN operand. Returns
15484 NaN only if both operands are NaN. If the operands compare equal,
15485 returns either one of the operands. For example, this means that
15486 fmax(+0.0, -0.0) returns either -0.0 or 0.0.
15488 Unlike the IEEE-754 2008 behavior, this does not distinguish between
15489 signaling and quiet NaN inputs. If a target's implementation follows
15490 the standard and returns a quiet NaN if either input is a signaling
15491 NaN, the intrinsic lowering is responsible for quieting the inputs to
15492 correctly return the non-NaN input (e.g. by using the equivalent of
15493 ``llvm.canonicalize``).
15495 '``llvm.minimum.*``' Intrinsic
15496 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15501 This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
15502 floating-point or vector of floating-point type. Not all targets support
15507 declare float @llvm.minimum.f32(float %Val0, float %Val1)
15508 declare double @llvm.minimum.f64(double %Val0, double %Val1)
15509 declare x86_fp80 @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
15510 declare fp128 @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
15511 declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
15516 The '``llvm.minimum.*``' intrinsics return the minimum of the two
15517 arguments, propagating NaNs and treating -0.0 as less than +0.0.
15523 The arguments and return value are floating-point numbers of the same
15528 If either operand is a NaN, returns NaN. Otherwise returns the lesser
15529 of the two arguments. -0.0 is considered to be less than +0.0 for this
15530 intrinsic. Note that these are the semantics specified in the draft of
15533 '``llvm.maximum.*``' Intrinsic
15534 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15539 This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
15540 floating-point or vector of floating-point type. Not all targets support
15545 declare float @llvm.maximum.f32(float %Val0, float %Val1)
15546 declare double @llvm.maximum.f64(double %Val0, double %Val1)
15547 declare x86_fp80 @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
15548 declare fp128 @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
15549 declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
15554 The '``llvm.maximum.*``' intrinsics return the maximum of the two
15555 arguments, propagating NaNs and treating -0.0 as less than +0.0.
15561 The arguments and return value are floating-point numbers of the same
15566 If either operand is a NaN, returns NaN. Otherwise returns the greater
15567 of the two arguments. -0.0 is considered to be less than +0.0 for this
15568 intrinsic. Note that these are the semantics specified in the draft of
15573 '``llvm.copysign.*``' Intrinsic
15574 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15579 This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
15580 floating-point or vector of floating-point type. Not all targets support
15585 declare float @llvm.copysign.f32(float %Mag, float %Sgn)
15586 declare double @llvm.copysign.f64(double %Mag, double %Sgn)
15587 declare x86_fp80 @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
15588 declare fp128 @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
15589 declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)
15594 The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
15595 first operand and the sign of the second operand.
15600 The arguments and return value are floating-point numbers of the same
15606 This function returns the same values as the libm ``copysign``
15607 functions would, and handles error conditions in the same way.
15608 The returned value is completely identical to the first operand except for the
15609 sign bit; in particular, if the input is a NaN, then the quiet/signaling bit and
15610 payload are perfectly preserved.
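
For illustration (the function name ``@copysign_example`` is hypothetical),
only the sign of the second operand matters, so even a negative zero
contributes its sign:

.. code-block:: llvm

   declare double @llvm.copysign.f64(double, double)

   define double @copysign_example() {
     ; Magnitude 3.0 combined with the sign of -0.0 gives -3.0.
     %r = call double @llvm.copysign.f64(double 3.0, double -0.0)
     ret double %r
   }
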
15614 '``llvm.floor.*``' Intrinsic
15615 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15620 This is an overloaded intrinsic. You can use ``llvm.floor`` on any
15621 floating-point or vector of floating-point type. Not all targets support
15626 declare float @llvm.floor.f32(float %Val)
15627 declare double @llvm.floor.f64(double %Val)
15628 declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val)
15629 declare fp128 @llvm.floor.f128(fp128 %Val)
15630 declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)
15635 The '``llvm.floor.*``' intrinsics return the floor of the operand.
15640 The argument and return value are floating-point numbers of the same
15646 This function returns the same values as the libm ``floor`` functions
15647 would, and handles error conditions in the same way.
15651 '``llvm.ceil.*``' Intrinsic
15652 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15657 This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
15658 floating-point or vector of floating-point type. Not all targets support
15663 declare float @llvm.ceil.f32(float %Val)
15664 declare double @llvm.ceil.f64(double %Val)
15665 declare x86_fp80 @llvm.ceil.f80(x86_fp80 %Val)
15666 declare fp128 @llvm.ceil.f128(fp128 %Val)
15667 declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)
15672 The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
15677 The argument and return value are floating-point numbers of the same
15683 This function returns the same values as the libm ``ceil`` functions
15684 would, and handles error conditions in the same way.
15687 .. _int_llvm_trunc:
15689 '``llvm.trunc.*``' Intrinsic
15690 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15695 This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
15696 floating-point or vector of floating-point type. Not all targets support
15701 declare float @llvm.trunc.f32(float %Val)
15702 declare double @llvm.trunc.f64(double %Val)
15703 declare x86_fp80 @llvm.trunc.f80(x86_fp80 %Val)
15704 declare fp128 @llvm.trunc.f128(fp128 %Val)
15705 declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)
15710 The '``llvm.trunc.*``' intrinsics return the operand rounded to the
15711 nearest integer not larger in magnitude than the operand.
15716 The argument and return value are floating-point numbers of the same
15722 This function returns the same values as the libm ``trunc`` functions
15723 would, and handles error conditions in the same way.
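
The difference between ``llvm.trunc`` and the ``llvm.floor`` and ``llvm.ceil``
intrinsics described earlier shows up on negative inputs. The sketch below uses
hypothetical function names and the input ``-2.5``:

.. code-block:: llvm

   declare float @llvm.floor.f32(float)
   declare float @llvm.ceil.f32(float)
   declare float @llvm.trunc.f32(float)

   define float @floor_example() {
     ; Rounds toward negative infinity: -3.0.
     %r = call float @llvm.floor.f32(float -2.5)
     ret float %r
   }

   define float @ceil_example() {
     ; Rounds toward positive infinity: -2.0.
     %r = call float @llvm.ceil.f32(float -2.5)
     ret float %r
   }

   define float @trunc_example() {
     ; Rounds toward zero: -2.0.
     %r = call float @llvm.trunc.f32(float -2.5)
     ret float %r
   }
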
15727 '``llvm.rint.*``' Intrinsic
15728 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
15733 This is an overloaded intrinsic. You can use ``llvm.rint`` on any
15734 floating-point or vector of floating-point type. Not all targets support
15739 declare float @llvm.rint.f32(float %Val)
15740 declare double @llvm.rint.f64(double %Val)
15741 declare x86_fp80 @llvm.rint.f80(x86_fp80 %Val)
15742 declare fp128 @llvm.rint.f128(fp128 %Val)
15743 declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)
15748 The '``llvm.rint.*``' intrinsics return the operand rounded to the
15749 nearest integer. They may raise an inexact floating-point exception if the
15750 operand is not an integer.
15755 The argument and return value are floating-point numbers of the same
15761 This function returns the same values as the libm ``rint`` functions
15762 would, and handles error conditions in the same way. Since LLVM assumes the
15763 :ref:`default floating-point environment <floatenv>`, the rounding mode is
15764 assumed to be set to "nearest", so halfway cases are rounded to the even
15765 integer. Use :ref:`Constrained Floating-Point Intrinsics <constrainedfp>`
15766 to avoid that assumption.
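
A sketch of the halfway-case behavior under the assumed "nearest" rounding mode
(the function names are hypothetical):

.. code-block:: llvm

   declare double @llvm.rint.f64(double)

   define double @rint_low_tie() {
     ; 2.5 is halfway between 2 and 3; the even neighbor 2.0 is returned.
     %r = call double @llvm.rint.f64(double 2.5)
     ret double %r
   }

   define double @rint_high_tie() {
     ; 3.5 is halfway between 3 and 4; the even neighbor 4.0 is returned.
     %r = call double @llvm.rint.f64(double 3.5)
     ret double %r
   }
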
15770 '``llvm.nearbyint.*``' Intrinsic
15771 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15776 This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
15777 floating-point or vector of floating-point type. Not all targets support
15782 declare float @llvm.nearbyint.f32(float %Val)
15783 declare double @llvm.nearbyint.f64(double %Val)
15784 declare x86_fp80 @llvm.nearbyint.f80(x86_fp80 %Val)
15785 declare fp128 @llvm.nearbyint.f128(fp128 %Val)
15786 declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)
15791 The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
15797 The argument and return value are floating-point numbers of the same
15803 This function returns the same values as the libm ``nearbyint``
15804 functions would, and handles error conditions in the same way. Since LLVM
15805 assumes the :ref:`default floating-point environment <floatenv>`, the rounding
15806 mode is assumed to be set to "nearest", so halfway cases are rounded to the even
15807 integer. Use :ref:`Constrained Floating-Point Intrinsics <constrainedfp>` to
15808 avoid that assumption.
15812 '``llvm.round.*``' Intrinsic
15813 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15818 This is an overloaded intrinsic. You can use ``llvm.round`` on any
15819 floating-point or vector of floating-point type. Not all targets support
15824 declare float @llvm.round.f32(float %Val)
15825 declare double @llvm.round.f64(double %Val)
15826 declare x86_fp80 @llvm.round.f80(x86_fp80 %Val)
15827 declare fp128 @llvm.round.f128(fp128 %Val)
15828 declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)
15833 The '``llvm.round.*``' intrinsics return the operand rounded to the
15839 The argument and return value are floating-point numbers of the same
15845 This function returns the same values as the libm ``round``
15846 functions would, and handles error conditions in the same way.
15850 '``llvm.roundeven.*``' Intrinsic
15851 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15856 This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
15857 floating-point or vector of floating-point type. Not all targets support
15862 declare float @llvm.roundeven.f32(float %Val)
15863 declare double @llvm.roundeven.f64(double %Val)
15864 declare x86_fp80 @llvm.roundeven.f80(x86_fp80 %Val)
15865 declare fp128 @llvm.roundeven.f128(fp128 %Val)
15866 declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128 %Val)
15871 The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
15872 integer in floating-point format rounding halfway cases to even (that is, to the
15873 nearest value that is an even integer).
15878 The argument and return value are floating-point numbers of the same type.
15883 This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``. It
15884 also behaves in the same way as the C standard function ``roundeven``, except that
15885 it does not raise floating-point exceptions.
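
The tie-breaking rule is the only difference from ``llvm.round``, which follows
libm ``round`` and sends halfway cases away from zero. A small sketch with
hypothetical function names:

.. code-block:: llvm

   declare float @llvm.round.f32(float)
   declare float @llvm.roundeven.f32(float)

   define float @round_tie() {
     ; Ties away from zero: 2.5 becomes 3.0.
     %r = call float @llvm.round.f32(float 2.5)
     ret float %r
   }

   define float @roundeven_tie() {
     ; Ties to even: 2.5 becomes 2.0.
     %r = call float @llvm.roundeven.f32(float 2.5)
     ret float %r
   }
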
15888 '``llvm.lround.*``' Intrinsic
15889 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15894 This is an overloaded intrinsic. You can use ``llvm.lround`` on any
15895 floating-point type. Not all targets support all types however.
15899 declare i32 @llvm.lround.i32.f32(float %Val)
15900 declare i32 @llvm.lround.i32.f64(double %Val)
15901 declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
15902 declare i32 @llvm.lround.i32.f128(fp128 %Val)
15903 declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)
15905 declare i64 @llvm.lround.i64.f32(float %Val)
15906 declare i64 @llvm.lround.i64.f64(double %Val)
15907 declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
15908 declare i64 @llvm.lround.i64.f128(fp128 %Val)
15909 declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
15914 The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
15915 integer with ties away from zero.
15921 The argument is a floating-point number and the return value is an integer
15927 This function returns the same values as the libm ``lround`` functions
15928 would, but without setting ``errno``. If the rounded value is too large to
15929 be stored in the result type, the return value is a non-deterministic
15930 value (equivalent to ``freeze poison``).
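
As a brief sketch (the function name ``@lround_example`` is hypothetical), a
halfway case rounds away from zero and the result is an integer rather than a
floating-point value:

.. code-block:: llvm

   declare i32 @llvm.lround.i32.f32(float)

   define i32 @lround_example() {
     ; 2.5 is a tie, so it rounds away from zero to the integer 3.
     %r = call i32 @llvm.lround.i32.f32(float 2.5)
     ret i32 %r
   }
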
15932 '``llvm.llround.*``' Intrinsic
15933 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15938 This is an overloaded intrinsic. You can use ``llvm.llround`` on any
15939 floating-point type. Not all targets support all types however.
15943 declare i64 @llvm.llround.i64.f32(float %Val)
15944 declare i64 @llvm.llround.i64.f64(double %Val)
15945 declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
15946 declare i64 @llvm.llround.i64.f128(fp128 %Val)
15947 declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
15952 The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
15953 integer with ties away from zero.
15958 The argument is a floating-point number and the return value is an integer
15964 This function returns the same values as the libm ``llround``
15965 functions would, but without setting ``errno``. If the rounded value is
15966 too large to be stored in the result type, the return value is a
15967 non-deterministic value (equivalent to ``freeze poison``).
15969 '``llvm.lrint.*``' Intrinsic
15970 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15975 This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
15976 floating-point type or vector of floating-point type. Not all targets
15977 support all types however.
15981 declare i32 @llvm.lrint.i32.f32(float %Val)
15982 declare i32 @llvm.lrint.i32.f64(double %Val)
15983 declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
15984 declare i32 @llvm.lrint.i32.f128(fp128 %Val)
15985 declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)
15987 declare i64 @llvm.lrint.i64.f32(float %Val)
15988 declare i64 @llvm.lrint.i64.f64(double %Val)
15989 declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
15990 declare i64 @llvm.lrint.i64.f128(fp128 %Val)
15991 declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
15996 The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
16003 The argument is a floating-point number and the return value is an integer
16009 This function returns the same values as the libm ``lrint`` functions
16010 would, but without setting ``errno``. If the rounded value is too large to
16011 be stored in the result type, the return value is a non-deterministic
16012 value (equivalent to ``freeze poison``).
16014 '``llvm.llrint.*``' Intrinsic
16015 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16020 This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
16021 floating-point type or vector of floating-point type. Not all targets
16022 support all types however.
16026 declare i64 @llvm.llrint.i64.f32(float %Val)
16027 declare i64 @llvm.llrint.i64.f64(double %Val)
16028 declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
16029 declare i64 @llvm.llrint.i64.f128(fp128 %Val)
16030 declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
16035 The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
16041 The argument is a floating-point number and the return value is an integer
16047 This function returns the same values as the libm ``llrint`` functions
16048 would, but without setting ``errno``. If the rounded value is too large to
16049 be stored in the result type, the return value is a non-deterministic
16050 value (equivalent to ``freeze poison``).
16052 Bit Manipulation Intrinsics
16053 ---------------------------
16055 LLVM provides intrinsics for a few important bit manipulation
16056 operations. These allow efficient code generation for some algorithms.
16058 .. _int_bitreverse:
16060 '``llvm.bitreverse.*``' Intrinsics
16061 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16066 This is an overloaded intrinsic function. You can use bitreverse on any
16071 declare i16 @llvm.bitreverse.i16(i16 <id>)
16072 declare i32 @llvm.bitreverse.i32(i32 <id>)
16073 declare i64 @llvm.bitreverse.i64(i64 <id>)
16074 declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
16079 The '``llvm.bitreverse``' family of intrinsics is used to reverse the
16080 bitpattern of an integer value or vector of integer values; for example
16081 ``0b10110110`` becomes ``0b01101101``.
16086 The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
16087 ``M`` in the input moved to bit ``N-M-1`` in the output. The vector
16088 intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
16089 basis and the element order is not affected.
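
The bit pattern from the overview can be written as a call on ``i8``. This is a
sketch; the function name ``@bitreverse_example`` is hypothetical:

.. code-block:: llvm

   declare i8 @llvm.bitreverse.i8(i8)

   define i8 @bitreverse_example() {
     ; 182 is 0b10110110; reversing the bits gives 0b01101101, which is 109.
     %r = call i8 @llvm.bitreverse.i8(i8 182)
     ret i8 %r
   }
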
16093 '``llvm.bswap.*``' Intrinsics
16094 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16099 This is an overloaded intrinsic function. You can use bswap on any
16100 integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
16104 declare i16 @llvm.bswap.i16(i16 <id>)
16105 declare i32 @llvm.bswap.i32(i32 <id>)
16106 declare i64 @llvm.bswap.i64(i64 <id>)
16107 declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
16112 The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
16113 value or vector of integer values with an even number of bytes (positive
16114 multiple of 16 bits).
16119 The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
16120 and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
16121 intrinsic returns an i32 value that has the four bytes of the input i32
16122 swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
16123 returned i32 will have its bytes in 3, 2, 1, 0 order. The
16124 ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
16125 concept to additional even-byte lengths (6 bytes, 8 bytes and more,
16126 respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
16127 operate on a per-element basis and the element order is not affected.
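
A worked ``i32`` case, as a sketch with the hypothetical function name
``@bswap_example``. The constants are decimal because LLVM assembly does not
accept plain ``0x`` hexadecimal integer literals; the hexadecimal values appear
in the comments:

.. code-block:: llvm

   declare i32 @llvm.bswap.i32(i32)

   define i32 @bswap_example() {
     ; Input 287454020 is 0x11223344.
     ; The result is 1144201745, which is 0x44332211.
     %r = call i32 @llvm.bswap.i32(i32 287454020)
     ret i32 %r
   }
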
16131 '``llvm.ctpop.*``' Intrinsic
16132 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16137 This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
16138 bit width, or on any vector with integer elements. Not all targets
16139 support all bit widths or vector types, however.
16143 declare i8 @llvm.ctpop.i8(i8 <src>)
16144 declare i16 @llvm.ctpop.i16(i16 <src>)
16145 declare i32 @llvm.ctpop.i32(i32 <src>)
16146 declare i64 @llvm.ctpop.i64(i64 <src>)
16147 declare i256 @llvm.ctpop.i256(i256 <src>)
16148 declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
16153 The '``llvm.ctpop``' family of intrinsics counts the number of bits set
16159 The only argument is the value to be counted. The argument may be of any
16160 integer type, or a vector with integer elements. The return type must
16161 match the argument type.
16166 The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
16167 each element of a vector.
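
For example (a sketch; the function name ``@ctpop_example`` is hypothetical):

.. code-block:: llvm

   declare i8 @llvm.ctpop.i8(i8)

   define i8 @ctpop_example() {
     ; 182 is 0b10110110, which has five bits set, so the result is 5.
     %r = call i8 @llvm.ctpop.i8(i8 182)
     ret i8 %r
   }
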
16171 '``llvm.ctlz.*``' Intrinsic
16172 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16177 This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
16178 integer bit width, or any vector whose elements are integers. Not all
16179 targets support all bit widths or vector types, however.
16183 declare i8 @llvm.ctlz.i8 (i8 <src>, i1 <is_zero_poison>)
16184 declare <2 x i37> @llvm.ctlz.v2i37(<2 x i37> <src>, i1 <is_zero_poison>)
16189 The '``llvm.ctlz``' family of intrinsic functions counts the number of
16190 leading zeros in a variable.
16195 The first argument is the value to be counted. This argument may be of
16196 any integer type, or a vector with integer element type. The return
16197 type must match the first argument type.
16199 The second argument is a constant flag that indicates whether the intrinsic
16200 returns a valid result if the first argument is zero. If the first
16201 argument is zero and the second argument is true, the result is poison.
16202 Historically some architectures did not provide a defined result for zero
16203 values as efficiently, and many algorithms are now predicated on avoiding
16209 The '``llvm.ctlz``' intrinsic counts the leading (most significant)
16210 zeros in a variable, or within each element of the vector. If
16211 ``src == 0`` then the result is the size in bits of the type of ``src``
16212 if ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
16213 ``llvm.ctlz(i32 2) = 30``.
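
Both cases described above can be sketched as follows (the function names are
hypothetical). Passing ``false`` for ``is_zero_poison`` keeps the zero input
well defined:

.. code-block:: llvm

   declare i32 @llvm.ctlz.i32(i32, i1)

   define i32 @ctlz_of_two() {
     ; 2 has 30 zero bits above its single set bit, so the result is 30.
     %r = call i32 @llvm.ctlz.i32(i32 2, i1 false)
     ret i32 %r
   }

   define i32 @ctlz_of_zero() {
     ; is_zero_poison is false, so a zero input yields the bit width, 32.
     %r = call i32 @llvm.ctlz.i32(i32 0, i1 false)
     ret i32 %r
   }
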
16217 '``llvm.cttz.*``' Intrinsic
16218 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16223 This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
16224 integer bit width, or any vector of integer elements. Not all targets
16225 support all bit widths or vector types, however.
16229 declare i42 @llvm.cttz.i42 (i42 <src>, i1 <is_zero_poison>)
16230 declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_poison>)
16235 The '``llvm.cttz``' family of intrinsic functions counts the number of
16241 The first argument is the value to be counted. This argument may be of
16242 any integer type, or a vector with integer element type. The return
16243 type must match the first argument type.
16245 The second argument is a constant flag that indicates whether the intrinsic
16246 returns a valid result if the first argument is zero. If the first
16247 argument is zero and the second argument is true, the result is poison.
16248 Historically some architectures did not provide a defined result for zero
16249 values as efficiently, and many algorithms are now predicated on avoiding
16255 The '``llvm.cttz``' intrinsic counts the trailing (least significant)
16256 zeros in a variable, or within each element of a vector. If ``src == 0``
16257 then the result is the size in bits of the type of ``src`` if
16258 ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
16259 ``llvm.cttz(2) = 1``.
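
The per-element behavior on vectors can be sketched like this (the function
name ``@cttz_vector_example`` is hypothetical):

.. code-block:: llvm

   declare <2 x i32> @llvm.cttz.v2i32(<2 x i32>, i1)

   define <2 x i32> @cttz_vector_example() {
     ; Element 0: 8 is 0b1000, so the count is 3.
     ; Element 1: the input is 0 and is_zero_poison is false, so the count is 32.
     %r = call <2 x i32> @llvm.cttz.v2i32(<2 x i32> <i32 8, i32 0>, i1 false)
     ret <2 x i32> %r
   }
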
16265 '``llvm.fshl.*``' Intrinsic
16266 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16271 This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
16272 integer bit width or any vector of integer elements. Not all targets
16273 support all bit widths or vector types, however.
16277 declare i8 @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
16278 declare i64 @llvm.fshl.i64(i64 %a, i64 %b, i64 %c)
16279 declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
16284 The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
16285 the first two values are concatenated as { %a : %b } (%a is the most significant
16286 bits of the wide value), the combined value is shifted left, and the most
16287 significant bits are extracted to produce a result that is the same size as the
16288 original arguments. If the first 2 arguments are identical, this is equivalent
16289 to a rotate left operation. For vector types, the operation occurs for each
16290 element of the vector. The shift argument is treated as an unsigned amount
16291 modulo the element size of the arguments.
16296 The first two arguments are the values to be concatenated. The third
16297 argument is the shift amount. The arguments may be any integer type or a
16298 vector with integer element type. All arguments and the return value must
16299 have the same type.
16304 .. code-block:: text
16306 %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z) ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
16307 %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15) ; %r = i8: 128 (0b10000000)
16308 %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11) ; %r = i8: 120 (0b01111000)
16309 %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8) ; %r = i8: 0 (0b00000000)
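
The rotate-left equivalence mentioned in the overview can be sketched as a
small wrapper. The name ``@rotl32`` is hypothetical and not part of LLVM:

.. code-block:: llvm

   declare i32 @llvm.fshl.i32(i32, i32, i32)

   define i32 @rotl32(i32 %x, i32 %amt) {
     ; Passing %x as both inputs turns the funnel shift into a rotate left.
     ; The shift amount is taken modulo 32, so any %amt is acceptable.
     %r = call i32 @llvm.fshl.i32(i32 %x, i32 %x, i32 %amt)
     ret i32 %r
   }
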
16313 '``llvm.fshr.*``' Intrinsic
16314 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
16319 This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
16320 integer bit width or any vector of integer elements. Not all targets
16321 support all bit widths or vector types, however.
16325 declare i8 @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
16326 declare i64 @llvm.fshr.i64(i64 %a, i64 %b, i64 %c)
16327 declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
16332 The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
16333 the first two values are concatenated as { %a : %b } (%a is the most significant
16334 bits of the wide value), the combined value is shifted right, and the least
16335 significant bits are extracted to produce a result that is the same size as the
16336 original arguments. If the first 2 arguments are identical, this is equivalent
16337 to a rotate right operation. For vector types, the operation occurs for each
16338 element of the vector. The shift argument is treated as an unsigned amount
16339 modulo the element size of the arguments.
16344 The first two arguments are the values to be concatenated. The third
16345 argument is the shift amount. The arguments may be any integer type or a
16346 vector with integer element type. All arguments and the return value must
16347 have the same type.
16352 .. code-block:: text
16354 %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z) ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
16355 %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15) ; %r = i8: 254 (0b11111110)
16356 %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11) ; %r = i8: 225 (0b11100001)
16357 %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8) ; %r = i8: 255 (0b11111111)
16359 Arithmetic with Overflow Intrinsics
16360 -----------------------------------
16362 LLVM provides intrinsics for fast arithmetic overflow checking.
16364 Each of these intrinsics returns a two-element struct. The first
16365 element of this struct contains the result of the corresponding
16366 arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
16367 the result. Therefore, for example, the first element of the struct
16368 returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
16369 result of a 32-bit ``add`` instruction with the same operands, where
16370 the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
16372 The second element of the result is an ``i1`` that is 1 if the
16373 arithmetic operation overflowed and 0 otherwise. An operation
16374 overflows if, for any values of its operands ``A`` and ``B`` and for
16375 any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
16376 not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
16377 ``sext`` for signed overflow and ``zext`` for unsigned overflow, and
16378 ``op`` is the underlying arithmetic operation.
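
To make the definition concrete, the sketch below adds 127 and 1 in ``i8``
under both interpretations. The function names are hypothetical:

.. code-block:: llvm

   declare {i8, i1} @llvm.sadd.with.overflow.i8(i8, i8)
   declare {i8, i1} @llvm.uadd.with.overflow.i8(i8, i8)

   define {i8, i1} @signed_wrap() {
     ; The i8 sum wraps to -128 and the overflow bit is 1: in a wider type,
     ; sext(127) + sext(1) is 128, but sext(-128) is -128.
     %res = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 127, i8 1)
     ret {i8, i1} %res
   }

   define {i8, i1} @unsigned_fits() {
     ; Read as unsigned, 127 + 1 is 128, which fits in i8, so the
     ; overflow bit is 0.
     %res = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 127, i8 1)
     ret {i8, i1} %res
   }
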
16380 The behavior of these intrinsics is well-defined for all argument
16383 '``llvm.sadd.with.overflow.*``' Intrinsics
16384 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16389 This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
16390 on any integer bit width or vectors of integers.
16394 declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
16395 declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
16396 declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
16397 declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
16402 The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
16403 a signed addition of the two arguments, and indicates whether an overflow
16404 occurred during the signed summation.
16409 The arguments (``%a`` and ``%b``) and the first element of the result structure
16410 may be of integer types of any bit width, but they must have the same
16411 bit width. The second element of the result structure must be of type
16412 ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
16418 The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
16419 a signed addition of the two arguments. It returns a structure whose
16420 first element is the signed sum and whose second element is a bit
16421 specifying whether the signed summation resulted in an
16427 .. code-block:: llvm
16429 %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
16430 %sum = extractvalue {i32, i1} %res, 0
16431 %obit = extractvalue {i32, i1} %res, 1
16432 br i1 %obit, label %overflow, label %normal
16434 '``llvm.uadd.with.overflow.*``' Intrinsics
16435 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16440 This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
16441 on any integer bit width or vectors of integers.
16445 declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
16446 declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
16447 declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
16448 declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
16453 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
16454 an unsigned addition of the two arguments, and indicate whether a carry
16455 occurred during the unsigned summation.
16460 The arguments (%a and %b) and the first element of the result structure
16461 may be of integer types of any bit width, but they must have the same
16462 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned addition.
16469 The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
16470 an unsigned addition of the two arguments. They return a structure --- the
16471 first element of which is the sum, and the second element of which is a
16472 bit specifying if the unsigned summation resulted in a carry.
16477 .. code-block:: llvm
16479 %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
16480 %sum = extractvalue {i32, i1} %res, 0
16481 %obit = extractvalue {i32, i1} %res, 1
16482 br i1 %obit, label %carry, label %normal
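A common use of the carry bit is multi-word addition. The Python sketch below is only a model under stated assumptions: ``uadd_with_overflow`` and ``add_u64_as_limbs`` are hypothetical helper names, operands are assumed to be non-negative and in range, and a 64-bit value is split into two 32-bit limbs purely for illustration.

.. code-block:: python

    MASK32 = (1 << 32) - 1

    def uadd_with_overflow(a, b, bits=32):
        """Return (sum modulo 2**bits, carry) for unsigned operands below 2**bits."""
        exact = a + b
        return exact & ((1 << bits) - 1), (exact >> bits) != 0

    def add_u64_as_limbs(x, y):
        """Add two 64-bit values using 32-bit limbs and explicit carry propagation."""
        lo, carry = uadd_with_overflow(x & MASK32, y & MASK32)
        hi, c1 = uadd_with_overflow(x >> 32, y >> 32)
        hi, c2 = uadd_with_overflow(hi, int(carry))   # fold the low-limb carry in
        return (hi << 32) | lo, c1 or c2

    # The carry moves from the low limb into the high limb.
    print(add_u64_as_limbs(0xFFFFFFFF, 1))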
16484 '``llvm.ssub.with.overflow.*``' Intrinsics
16485 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16490 This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
16491 on any integer bit width or vectors of integers.
16495 declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
16496 declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
16497 declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
16498 declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
16503 The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
16504 a signed subtraction of the two arguments, and indicate whether an
16505 overflow occurred during the signed subtraction.
16510 The arguments (%a and %b) and the first element of the result structure
16511 may be of integer types of any bit width, but they must have the same
16512 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed subtraction.
The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the signed subtraction resulted in an overflow.
.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
16535 '``llvm.usub.with.overflow.*``' Intrinsics
16536 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16541 This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
16542 on any integer bit width or vectors of integers.
16546 declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
16547 declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
16548 declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
16549 declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
16554 The '``llvm.usub.with.overflow``' family of intrinsic functions perform
16555 an unsigned subtraction of the two arguments, and indicate whether an
16556 overflow occurred during the unsigned subtraction.
16561 The arguments (%a and %b) and the first element of the result structure
16562 may be of integer types of any bit width, but they must have the same
16563 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned subtraction.
The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the unsigned subtraction resulted in an overflow.
.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
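For unsigned operands, the subtraction overflows exactly when the second operand is larger than the first. A minimal Python model, assuming in-range non-negative operands and a wrapped difference in the first element (``usub_with_overflow`` is a hypothetical helper name), might look like this:

.. code-block:: python

    def usub_with_overflow(a, b, bits):
        """Return (a - b modulo 2**bits, borrow) for unsigned operands."""
        mask = (1 << bits) - 1
        return (a - b) & mask, a < b

    print(usub_with_overflow(5, 3, 8))   # no borrow needed
    print(usub_with_overflow(3, 5, 8))   # borrow: the difference wraps around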
16586 '``llvm.smul.with.overflow.*``' Intrinsics
16587 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16592 This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
16593 on any integer bit width or vectors of integers.
16597 declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
16598 declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
16599 declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
16600 declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
16605 The '``llvm.smul.with.overflow``' family of intrinsic functions perform
16606 a signed multiplication of the two arguments, and indicate whether an
16607 overflow occurred during the signed multiplication.
16612 The arguments (%a and %b) and the first element of the result structure
16613 may be of integer types of any bit width, but they must have the same
16614 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed multiplication.
The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments. They return a structure whose
first element is the product and whose second element is a bit
specifying whether the signed multiplication resulted in an overflow.
.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
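Signed multiplication has one easily overlooked overflow case: the smallest representable value times -1. The Python sketch below models this for an arbitrary width. The name ``smul_with_overflow`` is hypothetical, and the wrapped first element is an assumption of the model.

.. code-block:: python

    def smul_with_overflow(a, b, bits):
        """Model a signed multiply on `bits`-wide integers; returns (product, overflow)."""
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        exact = a * b                                # exact, unbounded product
        wrapped = (exact - lo) % (1 << bits) + lo    # assumed two's complement wrap
        return wrapped, not (lo <= exact <= hi)

    print(smul_with_overflow(11, 11, 8))     # 121 fits in i8
    print(smul_with_overflow(-128, -1, 8))   # +128 is not representable in i8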
16637 '``llvm.umul.with.overflow.*``' Intrinsics
16638 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16643 This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
16644 on any integer bit width or vectors of integers.
16648 declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
16649 declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
16650 declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
16651 declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
16656 The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
16658 overflow occurred during the unsigned multiplication.
16663 The arguments (%a and %b) and the first element of the result structure
16664 may be of integer types of any bit width, but they must have the same
16665 bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned multiplication.
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments. They return a structure whose
first element is the product and whose second element is a bit
specifying whether the unsigned multiplication resulted in an overflow.
.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
16688 Saturation Arithmetic Intrinsics
16689 ---------------------------------
Saturation arithmetic is a version of arithmetic in which operations are
limited to a fixed range between a minimum and maximum value. If the result of
an operation is greater than the maximum value, the result is set (or
"clamped") to this maximum. If it is below the minimum, it is clamped to this
minimum.
16698 '``llvm.sadd.sat.*``' Intrinsics
16699 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16704 This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
16705 on any integer bit width or vectors of integers.
16709 declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
16710 declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
16711 declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
16712 declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16717 The '``llvm.sadd.sat``' family of intrinsic functions perform signed
16718 saturating addition on the 2 arguments.
16723 The arguments (%a and %b) and the result may be of integer types of any bit
16724 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16725 values that will undergo signed addition.
16730 The maximum value this operation can clamp to is the largest signed value
16731 representable by the bit width of the arguments. The minimum value is the
16732 smallest signed value representable by this bit width.
16738 .. code-block:: llvm
16740 %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2) ; %res = 3
16741 %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6) ; %res = 7
16742 %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2) ; %res = -2
16743 %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5) ; %res = -8
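The four ``i4`` results above can be reproduced with a small Python model that computes the exact sum and then clamps it to the signed range. The helper name ``sadd_sat`` and its ``bits`` parameter are hypothetical and only serve this sketch.

.. code-block:: python

    def sadd_sat(a, b, bits):
        """Signed saturating add: clamp the exact sum to the signed range."""
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        return max(lo, min(hi, a + b))

    for a, b in [(1, 2), (5, 6), (-4, 2), (-4, -5)]:
        print(a, b, sadd_sat(a, b, 4))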
16746 '``llvm.uadd.sat.*``' Intrinsics
16747 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16752 This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
16753 on any integer bit width or vectors of integers.
16757 declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
16758 declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
16759 declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
16760 declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16765 The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
16766 saturating addition on the 2 arguments.
16771 The arguments (%a and %b) and the result may be of integer types of any bit
16772 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16773 values that will undergo unsigned addition.
16778 The maximum value this operation can clamp to is the largest unsigned value
16779 representable by the bit width of the arguments. Because this is an unsigned
16780 operation, the result will never saturate towards zero.
16786 .. code-block:: llvm
16788 %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2) ; %res = 3
16789 %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6) ; %res = 11
16790 %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8) ; %res = 15
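One way to think about the unsigned case is as a wrapping add followed by a select on the carry. The Python sketch below follows that formulation. ``uadd_sat`` is a hypothetical helper, and operands are assumed to be in the range ``[0, 2**bits)``.

.. code-block:: python

    def uadd_sat(a, b, bits):
        """Unsigned saturating add built from a wrapping add and a carry test."""
        mask = (1 << bits) - 1
        wrapped = (a + b) & mask
        carry = wrapped < a      # a wrapped sum is smaller than either operand
        return mask if carry else wrapped

    for a, b in [(1, 2), (5, 6), (8, 8)]:
        print(a, b, uadd_sat(a, b, 4))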
16793 '``llvm.ssub.sat.*``' Intrinsics
16794 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16799 This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
16800 on any integer bit width or vectors of integers.
16804 declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
16805 declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
16806 declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
16807 declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16812 The '``llvm.ssub.sat``' family of intrinsic functions perform signed
16813 saturating subtraction on the 2 arguments.
16818 The arguments (%a and %b) and the result may be of integer types of any bit
16819 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16820 values that will undergo signed subtraction.
16825 The maximum value this operation can clamp to is the largest signed value
16826 representable by the bit width of the arguments. The minimum value is the
16827 smallest signed value representable by this bit width.
16833 .. code-block:: llvm
16835 %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1) ; %res = 1
16836 %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6) ; %res = -4
16837 %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5) ; %res = -8
16838 %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5) ; %res = 7
16841 '``llvm.usub.sat.*``' Intrinsics
16842 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16847 This is an overloaded intrinsic. You can use ``llvm.usub.sat``
16848 on any integer bit width or vectors of integers.
16852 declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
16853 declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
16854 declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
16855 declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16860 The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
16861 saturating subtraction on the 2 arguments.
16866 The arguments (%a and %b) and the result may be of integer types of any bit
16867 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16868 values that will undergo unsigned subtraction.
16873 The minimum value this operation can clamp to is 0, which is the smallest
16874 unsigned value representable by the bit width of the unsigned arguments.
16875 Because this is an unsigned operation, the result will never saturate towards
16876 the largest possible value representable by this bit width.
16882 .. code-block:: llvm
16884 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1) ; %res = 1
16885 %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6) ; %res = 0
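For in-range unsigned operands, saturating subtraction can equivalently be written as ``max(a, b) - b``, which never goes below zero. The Python sketch below uses that identity. The name ``usub_sat`` is hypothetical, and the sketch is a model rather than a description of how any target lowers the intrinsic.

.. code-block:: python

    def usub_sat(a, b):
        """Unsigned saturating subtract for non-negative operands."""
        return max(a, b) - b

    print(usub_sat(2, 1))   # ordinary difference
    print(usub_sat(2, 6))   # would be negative, so it clamps to zero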
16888 '``llvm.sshl.sat.*``' Intrinsics
16889 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16894 This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
16895 on integers or vectors of integers of any bit width.
16899 declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
16900 declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
16901 declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
16902 declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16907 The '``llvm.sshl.sat``' family of intrinsic functions perform signed
16908 saturating left shift on the first argument.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
16925 The maximum value this operation can clamp to is the largest signed value
16926 representable by the bit width of the arguments. The minimum value is the
16927 smallest signed value representable by this bit width.
16933 .. code-block:: llvm
16935 %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1) ; %res = 4
16936 %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2) ; %res = 7
16937 %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1) ; %res = -8
16938 %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1) ; %res = -2
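A Python model of the signed case computes the exact shifted value and clamps it to the signed range. The helper ``sshl_sat`` is hypothetical. Because a poison value has no Python counterpart, this sketch simply rejects shift amounts that are not smaller than the bit width.

.. code-block:: python

    def sshl_sat(a, b, bits):
        """Signed saturating left shift on `bits`-wide integers."""
        if not 0 <= b < bits:
            raise ValueError("shift amount out of range (poison in LLVM)")
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        return max(lo, min(hi, a << b))   # a << b is exact on Python ints

    for a, b in [(2, 1), (2, 2), (-5, 1), (-1, 1)]:
        print(a, b, sshl_sat(a, b, 4))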
16941 '``llvm.ushl.sat.*``' Intrinsics
16942 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16947 This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
16948 on integers or vectors of integers of any bit width.
16952 declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
16953 declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
16954 declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
16955 declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
16960 The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
16961 saturating left shift on the first argument.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
16977 The maximum value this operation can clamp to is the largest unsigned value
16978 representable by the bit width of the arguments.
16984 .. code-block:: llvm
16986 %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1) ; %res = 4
16987 %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3) ; %res = 15
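In the unsigned case, overflow can be detected by shifting the wrapped result back and comparing it with the original operand. The Python sketch below illustrates that check. ``ushl_sat`` is a hypothetical helper, ``a`` is assumed to lie in ``[0, 2**bits)``, and out-of-range shift amounts are rejected instead of modelling poison.

.. code-block:: python

    def ushl_sat(a, b, bits):
        """Unsigned saturating left shift on `bits`-wide integers."""
        if not 0 <= b < bits:
            raise ValueError("shift amount out of range (poison in LLVM)")
        mask = (1 << bits) - 1
        shifted = (a << b) & mask          # wrapping shift, as a plain shl would do
        # If bits were shifted out, shifting back cannot recover the operand.
        return shifted if shifted >> b == a else mask

    print(ushl_sat(2, 1, 4))
    print(ushl_sat(3, 3, 4))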
16990 Fixed Point Arithmetic Intrinsics
16991 ---------------------------------
A fixed point number represents a real number that has a fixed number of
digits after a radix point (equivalent to the decimal point '.'). The number
of digits after the radix point is referred to as the `scale`. Fixed point
numbers are useful for representing fractional values to a specific precision.
The following intrinsics perform fixed point arithmetic operations on 2
operands of the same scale, which is specified as the third argument.
17000 The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
17001 of fixed point numbers through scaled integers. Therefore, fixed point
17002 multiplication can be represented as
.. code-block:: llvm

        %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)

        ; Expands to
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %mul = mul nsw i8 %a2, %b2  ; the product of two sign-extended i4 values cannot overflow i8 as a signed value
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
        %result = trunc i8 %r to i4
The ``llvm.*div.fix`` family of intrinsic functions represents a division of
fixed point numbers through scaled integers. Fixed point division can be
represented as:

.. code-block:: llvm

        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)

        ; Expands to
        %a2 = sext i4 %a to i8
        %b2 = sext i4 %b to i8
        %scale2 = trunc i32 %scale to i8
        %a3 = shl i8 %a2, %scale2
        %r = sdiv i8 %a3, %b2  ; this is for a target rounding towards zero
        %result = trunc i8 %r to i4
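The two expansions above can be modelled in Python, where integers are unbounded and the widening and truncation steps therefore disappear. This is only a sketch: ``smul_fix`` and ``sdiv_fix`` are hypothetical helper names, results are assumed to fit the narrow type, and the rounding choices mirror the comments in the expansions (towards negative infinity for the multiply, towards zero for the divide), which other targets need not share.

.. code-block:: python

    def smul_fix(a, b, scale):
        """Multiply, then shift right; Python's >> floors like ashr."""
        return (a * b) >> scale

    def sdiv_fix(a, b, scale):
        """Pre-shift the dividend, then divide truncating towards zero like sdiv."""
        n = a << scale
        q = abs(n) // abs(b)
        return q if (n < 0) == (b < 0) else -q

    print(smul_fix(3, -3, 1))   # 1.5 x -1.5 = -2.25, floored to -2.5 (raw -5)
    print(sdiv_fix(3, 4, 1))    # 1.5 / 2 = 0.75, truncated to 0.5 (raw 1)

Python's ``//`` operator floors, so the division helper works on magnitudes and restores the sign afterwards to obtain truncation.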
For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is left
unspecified here because the preferred rounding may vary between targets;
instead, it is specified through a target hook. Different pipelines should
legalize or optimize this using the rounding specified by this hook if it is
provided. Operations like constant folding, instruction combining, KnownBits,
and ValueTracking should also use this hook, if provided, and not assume the
direction of rounding. A rounded result must always be within one unit of
precision from the true result. That is, the error between the returned result
and the true result must be less than 1/2^(scale).
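The error bound can be checked with exact rational arithmetic. The Python sketch below does so for a fixed point multiplication. ``within_one_unit`` is a hypothetical helper, and all three values are raw scaled integers at the same ``scale``.

.. code-block:: python

    from fractions import Fraction

    def within_one_unit(a, b, result, scale):
        """Check |result - a*b| < 1/2**scale with all values interpreted at `scale`."""
        unit = Fraction(1, 1 << scale)
        true_value = (a * unit) * (b * unit)
        return abs(result * unit - true_value) < unit

    # 1.5 x -1.5 = -2.25 at scale 1: raw -5 (-2.5) and raw -4 (-2.0) are both
    # acceptable, while raw -6 (-3.0) is too far from the true result.
    print(within_one_unit(3, -3, -5, 1))
    print(within_one_unit(3, -3, -4, 1))
    print(within_one_unit(3, -3, -6, 1))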
17044 '``llvm.smul.fix.*``' Intrinsics
17045 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17050 This is an overloaded intrinsic. You can use ``llvm.smul.fix``
17051 on any integer bit width or vectors of integers.
17055 declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
17056 declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
17057 declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
17058 declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
17063 The '``llvm.smul.fix``' family of intrinsic functions perform signed
17064 fixed point multiplication on 2 arguments of the same scale.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element size. ``%a`` and ``%b`` are the
two values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
17079 This operation performs fixed point multiplication on the 2 arguments of a
17080 specified scale. The result will also be returned in the same scale specified
17081 in the third argument.
17083 If the result value cannot be precisely represented in the given scale, the
17084 value is rounded up or down to the closest representable value. The rounding
17085 direction is unspecified.
17087 It is undefined behavior if the result value does not fit within the range of
17088 the fixed point type.
17094 .. code-block:: llvm
17096 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
17097 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
17098 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5)
17100 ; The result in the following could be rounded up to -2 or down to -2.5
17101 %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
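Because the rounding direction is unspecified, each call has either one or two permissible results. The Python sketch below enumerates them by rounding the exact product both down and up. ``smul_fix_candidates`` is a hypothetical helper, and the result is assumed to fit the fixed point type.

.. code-block:: python

    def smul_fix_candidates(a, b, scale):
        """Return the set of raw results permitted for a fixed point multiply."""
        product = a * b
        down = product >> scale          # round towards negative infinity
        up = -((-product) >> scale)      # round towards positive infinity
        return {down, up}

    print(smul_fix_candidates(3, 2, 1))    # exact, so a single candidate
    print(smul_fix_candidates(3, -3, 1))   # inexact, so two candidates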
17104 '``llvm.umul.fix.*``' Intrinsics
17105 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17110 This is an overloaded intrinsic. You can use ``llvm.umul.fix``
17111 on any integer bit width or vectors of integers.
17115 declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
17116 declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
17117 declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
17118 declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
17123 The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
17124 fixed point multiplication on 2 arguments of the same scale.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element size. ``%a`` and ``%b`` are the
two values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
17139 This operation performs unsigned fixed point multiplication on the 2 arguments of a
17140 specified scale. The result will also be returned in the same scale specified
17141 in the third argument.
17143 If the result value cannot be precisely represented in the given scale, the
17144 value is rounded up or down to the closest representable value. The rounding
17145 direction is unspecified.
17147 It is undefined behavior if the result value does not fit within the range of
17148 the fixed point type.
17154 .. code-block:: llvm
17156 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
17157 %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
17159 ; The result in the following could be rounded down to 3.5 or up to 4
17160 %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1) ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
17163 '``llvm.smul.fix.sat.*``' Intrinsics
17164 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17169 This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
17170 on any integer bit width or vectors of integers.
17174 declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
17175 declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
17176 declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
17177 declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
17182 The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
17183 fixed point saturating multiplication on 2 arguments of the same scale.
17188 The arguments (%a and %b) and the result may be of integer types of any bit
17189 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
17190 values that will undergo signed fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant integer.
17197 This operation performs fixed point multiplication on the 2 arguments of a
17198 specified scale. The result will also be returned in the same scale specified
17199 in the third argument.
17201 If the result value cannot be precisely represented in the given scale, the
17202 value is rounded up or down to the closest representable value. The rounding
17203 direction is unspecified.
17205 The maximum value this operation can clamp to is the largest signed value
17206 representable by the bit width of the first 2 arguments. The minimum value is the
17207 smallest signed value representable by this bit width.
17213 .. code-block:: llvm
17215 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0) ; %res = 6 (2 x 3 = 6)
17216 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1) ; %res = 3 (1.5 x 1 = 1.5)
17217 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 x -1 = -1.5)
17219 ; The result in the following could be rounded up to -2 or down to -2.5
17220 %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1) ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
17223 %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0) ; %res = 7
17224 %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2) ; %res = 7
17225 %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2) ; %res = -8
17226 %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1) ; %res = 7
17228 ; Scale can affect the saturation result
17229 %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0) ; %res = 7 (2 x 4 -> clamped to 7)
17230 %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1) ; %res = 4 (1 x 2 = 2)
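The saturating examples above can be reproduced by scaling the exact product first and clamping afterwards. In the Python sketch below, ``smul_fix_sat`` is a hypothetical helper, and rounding towards negative infinity is an arbitrary choice among the permitted directions.

.. code-block:: python

    def smul_fix_sat(a, b, scale, bits):
        """Signed saturating fixed point multiply on `bits`-wide raw values."""
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        scaled = (a * b) >> scale        # exact product, floored to the scale
        return max(lo, min(hi, scaled))

    print(smul_fix_sat(2, 4, 0, 4))   # 2 x 4 = 8 exceeds 7, so it clamps
    print(smul_fix_sat(2, 4, 1, 4))   # 1 x 2 = 2 is representable as raw 4

Clamping happens after the shift, which is why the same raw operands saturate at scale 0 but not at scale 1.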
17233 '``llvm.umul.fix.sat.*``' Intrinsics
17234 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17239 This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
17240 on any integer bit width or vectors of integers.
17244 declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
17245 declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
17246 declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
17247 declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
17252 The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
17253 fixed point saturating multiplication on 2 arguments of the same scale.
17258 The arguments (%a and %b) and the result may be of integer types of any bit
17259 width, but they must have the same bit width. ``%a`` and ``%b`` are the two
17260 values that will undergo unsigned fixed point multiplication. The argument
``%scale`` represents the scale of both operands, and must be a constant integer.
17267 This operation performs fixed point multiplication on the 2 arguments of a
17268 specified scale. The result will also be returned in the same scale specified
17269 in the third argument.
17271 If the result value cannot be precisely represented in the given scale, the
17272 value is rounded up or down to the closest representable value. The rounding
17273 direction is unspecified.
17275 The maximum value this operation can clamp to is the largest unsigned value
17276 representable by the bit width of the first 2 arguments. The minimum value is the
17277 smallest unsigned value representable by this bit width (zero).
.. code-block:: llvm

      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)

      ; The result in the following could be rounded down to 2 or up to 2.5
      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)

      ; Saturation
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)

      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)
17300 '``llvm.sdiv.fix.*``' Intrinsics
17301 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17306 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
17307 on any integer bit width or vectors of integers.
17311 declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
17312 declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
17313 declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
17314 declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
17319 The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
17320 fixed point division on 2 arguments of the same scale.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element size. ``%a`` and ``%b`` are the
two values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
17335 This operation performs fixed point division on the 2 arguments of a
17336 specified scale. The result will also be returned in the same scale specified
17337 in the third argument.
17339 If the result value cannot be precisely represented in the given scale, the
17340 value is rounded up or down to the closest representable value. The rounding
17341 direction is unspecified.
17343 It is undefined behavior if the result value does not fit within the range of
17344 the fixed point type, or if the second argument is zero.
17350 .. code-block:: llvm
17352 %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
17353 %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
17354 %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
17356 ; The result in the following could be rounded up to 1 or down to 0.5
17357 %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
17360 '``llvm.udiv.fix.*``' Intrinsics
17361 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17366 This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
17367 on any integer bit width or vectors of integers.
17371 declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
17372 declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
17373 declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
17374 declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
17379 The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
17380 fixed point division on 2 arguments of the same scale.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. The arguments may also be
integer vectors of the same length and element size. ``%a`` and ``%b`` are the
two values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
17395 This operation performs fixed point division on the 2 arguments of a
17396 specified scale. The result will also be returned in the same scale specified
17397 in the third argument.
17399 If the result value cannot be precisely represented in the given scale, the
17400 value is rounded up or down to the closest representable value. The rounding
17401 direction is unspecified.
17403 It is undefined behavior if the result value does not fit within the range of
17404 the fixed point type, or if the second argument is zero.
17410 .. code-block:: llvm
17412 %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
17413 %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
17414 %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
17416 ; The result in the following could be rounded up to 1 or down to 0.5
17417 %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
17420 '``llvm.sdiv.fix.sat.*``' Intrinsics
17421 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17426 This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
17427 on any integer bit width or vectors of integers.
17431 declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
17432 declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
17433 declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
17434 declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
17439 The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
17440 fixed point saturating division on 2 arguments of the same scale.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
17454 This operation performs fixed point division on the 2 arguments of a
17455 specified scale. The result will also be returned in the same scale specified
17456 in the third argument.
17458 If the result value cannot be precisely represented in the given scale, the
17459 value is rounded up or down to the closest representable value. The rounding
17460 direction is unspecified.
17462 The maximum value this operation can clamp to is the largest signed value
17463 representable by the bit width of the first 2 arguments. The minimum value is the
17464 smallest signed value representable by this bit width.
17466 It is undefined behavior if the second argument is zero.
17472 .. code-block:: llvm
17474 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
17475 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
17476 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
17478 ; The result in the following could be rounded up to 1 or down to 0.5
17479 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1) ; %res = 2 (or 1) (1.5 / 2 = 0.75)
17482 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0) ; %res = 7 (-8 / -1 = 8 => 7)
17483 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2) ; %res = 7 (1 / 0.5 = 2 => 1.75)
17484 %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2) ; %res = -8 (-1 / 0.25 = -4 => -2)
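
As a further illustration, the sketch below divides two values held in a signed
Q16.16 format (``i32`` with a scale of 16). The function name ``@div_q16`` and
the choice of format are assumptions made for this example and are not part of
the intrinsic. Because ``%scale`` must be a constant, it is written as an
immediate at the call site.

.. code-block:: llvm

      declare i32 @llvm.sdiv.fix.sat.i32(i32, i32, i32)

      ; Divide two Q16.16 values, clamping to the i32 range on overflow.
      ; The caller must ensure that %b is not zero.
      define i32 @div_q16(i32 %a, i32 %b) {
        %q = call i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 16)
        ret i32 %q
      }

      ; @div_q16(i32 196608, i32 98304) yields 131072 (3.0 / 1.5 = 2.0)
      ; @div_q16(i32 -2147483648, i32 1) yields -2147483648 (-32768.0 / 2^-16 saturates)
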
17487 '``llvm.udiv.fix.sat.*``' Intrinsics
17488 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17493 This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
17494 on any integer bit width or vectors of integers.
17498 declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
17499 declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
17500 declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
17501 declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
The '``llvm.udiv.fix.sat``' family of intrinsic functions performs unsigned
fixed point saturating division on 2 arguments of the same scale.
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
17521 This operation performs fixed point division on the 2 arguments of a
17522 specified scale. The result will also be returned in the same scale specified
17523 in the third argument.
17525 If the result value cannot be precisely represented in the given scale, the
17526 value is rounded up or down to the closest representable value. The rounding
17527 direction is unspecified.
17529 The maximum value this operation can clamp to is the largest unsigned value
17530 representable by the bit width of the first 2 arguments. The minimum value is the
17531 smallest unsigned value representable by this bit width (zero).
17533 It is undefined behavior if the second argument is zero.
17538 .. code-block:: llvm
17540 %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0) ; %res = 3 (6 / 2 = 3)
17541 %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1) ; %res = 3 (3 / 2 = 1.5)
17543 ; The result in the following could be rounded down to 0.5 or up to 1
17544 %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1) ; %res = 1 (or 2) (1.5 / 2 = 0.75)
17547 %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2) ; %res = 15 (2 / 0.5 = 4 => 3.75)
17550 Specialised Arithmetic Intrinsics
17551 ---------------------------------
17553 .. _i_intr_llvm_canonicalize:
17555 '``llvm.canonicalize.*``' Intrinsic
17556 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17563 declare float @llvm.canonicalize.f32(float %a)
17564 declare double @llvm.canonicalize.f64(double %b)
The '``llvm.canonicalize.*``' intrinsic returns the platform-specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as ``frexp``. The canonical
encoding is defined by IEEE-754-2008 to be:
17576 2.1.8 canonical encoding: The preferred encoding of a floating-point
17577 representation in a format. Applied to declets, significands of finite
17578 numbers, infinities, and NaNs, especially in decimal formats.
17580 This operation can also be considered equivalent to the IEEE-754-2008
17581 conversion of a floating-point value to the same format. NaNs are handled
17582 according to section 6.2.
17584 Examples of non-canonical encodings:
- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating-point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.
Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.
17598 This function should always be implementable as multiplication by 1.0, provided
17599 that the compiler does not constant fold the operation. Likewise, division by
17600 1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
17601 -0.0 is also sufficient provided that the rounding mode is not -Infinity.
17603 ``@llvm.canonicalize`` must preserve the equality relation. That is:
- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent
  to ``(x == y)``
17609 Additionally, the sign of zero must be conserved:
17610 ``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.
17617 The canonicalization operation may be optimized away if:
- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating-point value are not examined.
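
A minimal usage sketch follows. The function name ``@canonical_bits`` is
hypothetical. Because the result is reinterpreted as an integer, the bits of the
value are examined, so the second condition above does not apply here.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      ; Return the bit pattern of %x after canonicalization.
      define i32 @canonical_bits(float %x) {
        %c = call float @llvm.canonicalize.f32(float %x)
        %bits = bitcast float %c to i32
        ret i32 %bits
      }
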
17626 '``llvm.fmuladd.*``' Intrinsic
17627 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17634 declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
17635 declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
17640 The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
17641 expressions that can be fused if the code generator determines that (a) the
17642 target instruction set has support for a fused operation, and (b) that the
17643 fused operation is more efficient than the equivalent, separate pair of mul
17644 and add instructions.
17649 The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
17650 multiplicands, a and b, and an addend c.
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
17661 is equivalent to the expression a \* b + c, except that it is unspecified
17662 whether rounding will be performed between the multiplication and addition
17663 steps. Fusion is not guaranteed, even if the target platform supports it.
17664 If a fused multiply-add is required, the corresponding
17665 :ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets errno.
17671 .. code-block:: llvm
17673 %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
17676 Hardware-Loop Intrinsics
17677 ------------------------
LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required either to lower these intrinsics further
to target-specific instructions, or to revert the hardware-loop to a normal loop
if target-specific restrictions are not met and a hardware-loop can't be
generated.
17684 These intrinsics may be modified in the future and are not intended to be used
17685 outside the backend. Thus, front-end and mid-level optimizations should not be
17686 generating these intrinsics.
17689 '``llvm.set.loop.iterations.*``' Intrinsic
17690 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17695 This is an overloaded intrinsic.
17699 declare void @llvm.set.loop.iterations.i32(i32)
17700 declare void @llvm.set.loop.iterations.i64(i64)
The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
hardware-loop trip count. They are placed in the loop preheader basic block and
are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
instructions.
17713 The integer operand is the loop trip count of the hardware-loop, and thus
17714 not e.g. the loop back-edge taken count.
The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are hints that the backend can use to set up the
hardware-loop count with a target-specific instruction, usually a move of this
value to a special register or a hardware-loop instruction.
17725 '``llvm.start.loop.iterations.*``' Intrinsic
17726 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17731 This is an overloaded intrinsic.
17735 declare i32 @llvm.start.loop.iterations.i32(i32)
17736 declare i64 @llvm.start.loop.iterations.i64(i64)
17741 The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
17742 '``llvm.set.loop.iterations.*``' intrinsics, used to specify the
17743 hardware-loop trip count but also produce a value identical to the input
17744 that can be used as the input to the loop. They are placed in the loop
17745 preheader basic block and the output is expected to be the input to the
17746 phi for the induction variable of the loop, decremented by the
17747 '``llvm.loop.decrement.reg.*``'.
17752 The integer operand is the loop trip count of the hardware-loop, and thus
17753 not e.g. the loop back-edge taken count.
The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are hints that the backend can use to set up the
hardware-loop count with a target-specific instruction, usually a move of this
value to a special register or a hardware-loop instruction.
17763 '``llvm.test.set.loop.iterations.*``' Intrinsic
17764 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17769 This is an overloaded intrinsic.
17773 declare i1 @llvm.test.set.loop.iterations.i32(i32)
17774 declare i1 @llvm.test.set.loop.iterations.i64(i64)
The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
them to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
17788 The integer operand is the loop trip count of the hardware-loop, and thus
17789 not e.g. the loop back-edge taken count.
The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are hints that the backend can use to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
The result is an ``i1`` that is true if the given count is not zero.
17801 '``llvm.test.start.loop.iterations.*``' Intrinsic
17802 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17807 This is an overloaded intrinsic.
17811 declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
17812 declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
17817 The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
17818 '``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
17819 intrinsics, used to specify the hardware-loop trip count, but also produce a
17820 value identical to the input that can be used as the input to the loop. The
17821 second i1 output controls entry to a while-loop.
17826 The integer operand is the loop trip count of the hardware-loop, and thus
17827 not e.g. the loop back-edge taken count.
The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are hints that the backend can use to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
The result is a pair consisting of the input value and an ``i1`` that is true
if the given count is not zero.
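
The following sketch shows roughly what a while-loop controlled by these
intrinsics might look like. It is illustrative only: the function name
``@while_loop``, the block names, and the loop body are assumptions, and this is
not the exact output of any particular pass. The ``i1`` result guards entry to
the loop, while the integer result seeds the counter that
'``llvm.loop.decrement.reg.*``' updates.

.. code-block:: llvm

      declare { i32, i1 } @llvm.test.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define void @while_loop(ptr %p, i32 %n) {
      entry:
        %res = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %n)
        %count = extractvalue { i32, i1 } %res, 0
        %nonzero = extractvalue { i32, i1 } %res, 1
        ; Skip the loop entirely when the trip count is zero.
        br i1 %nonzero, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        %remaining = phi i32 [ %count, %preheader ], [ %next, %loop ]
        %addr = phi ptr [ %p, %preheader ], [ %addr.next, %loop ]
        store i32 0, ptr %addr
        %addr.next = getelementptr inbounds i32, ptr %addr, i32 1
        ; One iteration has been executed; compute how many remain.
        %next = call i32 @llvm.loop.decrement.reg.i32(i32 %remaining, i32 1)
        %continue = icmp ne i32 %next, 0
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }
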
17840 '``llvm.loop.decrement.reg.*``' Intrinsic
17841 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17846 This is an overloaded intrinsic.
17850 declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
17851 declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
iteration counter and return an updated value that will be used in the next
loop test check.
17863 Both arguments must have identical integer types. The first operand is the
17864 loop iteration counter. The second operand is the maximum number of elements
17865 processed in an iteration.
The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisations are allowed to treat it as a ``SUB``, and it is supported by
SCEV, so it's the backend's responsibility to handle cases where it may be
optimised. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
17880 '``llvm.loop.decrement.*``' Intrinsic
17881 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17886 This is an overloaded intrinsic.
17890 declare i1 @llvm.loop.decrement.i32(i32)
17891 declare i1 @llvm.loop.decrement.i64(i64)
17896 The HardwareLoops pass allows the loop decrement value to be specified with an
17897 option. It defaults to a loop decrement value of 1, but it can be an unsigned
17898 integer value provided by this option. The '``llvm.loop.decrement.*``'
17899 intrinsics decrement the loop iteration counter with this value, and return a
17900 false predicate if the loop should exit, and true otherwise.
17901 This is emitted if the loop counter is not updated via a ``PHI`` node, which
17902 can also be controlled with an option.
The integer argument is the loop decrement value used to decrement the loop
iteration counter.
The '``llvm.loop.decrement.*``' intrinsics do a ``SUB`` of the loop iteration
counter with the given loop decrement value, and return false if the loop
should exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.
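
A minimal sketch of how '``llvm.set.loop.iterations.*``' and
'``llvm.loop.decrement.*``' might appear together is shown below. The function
name ``@zero_fill`` and the loop body are hypothetical, the decrement value of 1
is the default mentioned above, and the sketch assumes that ``%n`` is not zero
so that the body executes at least once.

.. code-block:: llvm

      declare void @llvm.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define void @zero_fill(ptr %p, i32 %n) {
      entry:
        ; Preheader: record the trip count for the hardware loop.
        call void @llvm.set.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        %addr = phi ptr [ %p, %entry ], [ %addr.next, %loop ]
        store i32 0, ptr %addr
        %addr.next = getelementptr inbounds i32, ptr %addr, i32 1
        ; The counter is not carried in a PHI node; the intrinsic decrements
        ; it and returns false when the loop should exit.
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }
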
17919 Vector Reduction Intrinsics
17920 ---------------------------
17922 Horizontal reductions of vectors can be expressed using the following
17923 intrinsics. Each one takes a vector operand as an input and applies its
17924 respective operation across all elements of the vector, returning a single
17925 scalar result of the same element type.
17927 .. _int_vector_reduce_add:
17929 '``llvm.vector.reduce.add.*``' Intrinsic
17930 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17937 declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
17938 declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
17943 The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
17944 reduction of a vector, returning the result as a scalar. The return type matches
17945 the element-type of the vector input.
17949 The argument to this intrinsic must be a vector of integer values.
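
For illustration, a small function that sums the lanes of a vector is sketched
below. The function name ``@sum4`` is an assumption made for this example.

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)

      define i32 @sum4(<4 x i32> %v) {
        %sum = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %v)
        ret i32 %sum
      }

      ; For %v = <i32 1, i32 2, i32 3, i32 4>, %sum is 10.
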
17951 .. _int_vector_reduce_fadd:
17953 '``llvm.vector.reduce.fadd.*``' Intrinsic
17954 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17961 declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
17962 declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
17967 The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
17968 ``ADD`` reduction of a vector, returning the result as a scalar. The return type
17969 matches the element-type of the vector input.
17971 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
17972 preserve the associativity of an equivalent scalarized counterpart. Otherwise
17973 the reduction will be *sequential*, thus implying that the operation respects
17974 the associativity of a scalarized reduction. That is, the reduction begins with
17975 the start value and performs an fadd operation with consecutively increasing
17976 vector element indices. See the following pseudocode:
    float sequential_fadd(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector)
        result = result + input_vector[i]
      return result
17989 The first argument to this intrinsic is a scalar start value for the reduction.
17990 The type of the start value matches the element-type of the vector input.
17991 The second argument must be a vector of floating-point values.
17993 To ignore the start value, negative zero (``-0.0``) can be used, as it is
17994 the neutral value of floating point addition.
18001 %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
18002 %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
18005 .. _int_vector_reduce_mul:
18007 '``llvm.vector.reduce.mul.*``' Intrinsic
18008 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18015 declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
18016 declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
18021 The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
18022 reduction of a vector, returning the result as a scalar. The return type matches
18023 the element-type of the vector input.
18027 The argument to this intrinsic must be a vector of integer values.
18029 .. _int_vector_reduce_fmul:
18031 '``llvm.vector.reduce.fmul.*``' Intrinsic
18032 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18039 declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
18040 declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
18045 The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
18046 ``MUL`` reduction of a vector, returning the result as a scalar. The return type
18047 matches the element-type of the vector input.
18049 If the intrinsic call has the 'reassoc' flag set, then the reduction will not
18050 preserve the associativity of an equivalent scalarized counterpart. Otherwise
18051 the reduction will be *sequential*, thus implying that the operation respects
18052 the associativity of a scalarized reduction. That is, the reduction begins with
18053 the start value and performs an fmul operation with consecutively increasing
18054 vector element indices. See the following pseudocode:
    float sequential_fmul(start_value, input_vector)
      result = start_value
      for i = 0 to length(input_vector)
        result = result * input_vector[i]
      return result
18067 The first argument to this intrinsic is a scalar start value for the reduction.
18068 The type of the start value matches the element-type of the vector input.
18069 The second argument must be a vector of floating-point values.
18071 To ignore the start value, one (``1.0``) can be used, as it is the neutral
18072 value of floating point multiplication.
18079 %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
18080 %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
18082 .. _int_vector_reduce_and:
18084 '``llvm.vector.reduce.and.*``' Intrinsic
18085 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18092 declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
18097 The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
18098 reduction of a vector, returning the result as a scalar. The return type matches
18099 the element-type of the vector input.
18103 The argument to this intrinsic must be a vector of integer values.
18105 .. _int_vector_reduce_or:
18107 '``llvm.vector.reduce.or.*``' Intrinsic
18108 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18115 declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
18120 The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
18121 of a vector, returning the result as a scalar. The return type matches the
18122 element-type of the vector input.
18126 The argument to this intrinsic must be a vector of integer values.
18128 .. _int_vector_reduce_xor:
18130 '``llvm.vector.reduce.xor.*``' Intrinsic
18131 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18138 declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
18143 The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
18144 reduction of a vector, returning the result as a scalar. The return type matches
18145 the element-type of the vector input.
18149 The argument to this intrinsic must be a vector of integer values.
18151 .. _int_vector_reduce_smax:
18153 '``llvm.vector.reduce.smax.*``' Intrinsic
18154 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18161 declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
18166 The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
18167 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
18168 matches the element-type of the vector input.
18172 The argument to this intrinsic must be a vector of integer values.
18174 .. _int_vector_reduce_smin:
18176 '``llvm.vector.reduce.smin.*``' Intrinsic
18177 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18184 declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
18189 The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
18190 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
18191 matches the element-type of the vector input.
18195 The argument to this intrinsic must be a vector of integer values.
18197 .. _int_vector_reduce_umax:
18199 '``llvm.vector.reduce.umax.*``' Intrinsic
18200 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18207 declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
18212 The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
18213 integer ``MAX`` reduction of a vector, returning the result as a scalar. The
18214 return type matches the element-type of the vector input.
18218 The argument to this intrinsic must be a vector of integer values.
18220 .. _int_vector_reduce_umin:
18222 '``llvm.vector.reduce.umin.*``' Intrinsic
18223 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18230 declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
18235 The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
18236 integer ``MIN`` reduction of a vector, returning the result as a scalar. The
18237 return type matches the element-type of the vector input.
18241 The argument to this intrinsic must be a vector of integer values.
18243 .. _int_vector_reduce_fmax:
18245 '``llvm.vector.reduce.fmax.*``' Intrinsic
18246 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18253 declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
18254 declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
18259 The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
18260 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
18261 matches the element-type of the vector input.
18263 This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
18264 intrinsic. That is, the result will always be a number unless all elements of
18265 the vector are NaN. For a vector with maximum element magnitude 0.0 and
18266 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
18268 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
18269 assume that NaNs are not present in the input vector.
18273 The argument to this intrinsic must be a vector of floating-point values.
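
The sketch below contrasts a plain call with one carrying the ``nnan`` flag.
The function names are hypothetical, and the comment on the expected result
assumes that the NaN element is a quiet NaN.

.. code-block:: llvm

      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float>)

      ; NaN elements are ignored unless every element is NaN.
      define float @max4(<4 x float> %v) {
        %m = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %v)
        ret float %m
      }

      ; The optimizer may assume that %v contains no NaN elements.
      define float @max4_nnan(<4 x float> %v) {
        %m = call nnan float @llvm.vector.reduce.fmax.v4f32(<4 x float> %v)
        ret float %m
      }

      ; For %v = <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>,
      ; @max4 returns 3.0.
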
18275 .. _int_vector_reduce_fmin:
18277 '``llvm.vector.reduce.fmin.*``' Intrinsic
18278 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18282 This is an overloaded intrinsic.
18286 declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
18287 declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
18292 The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
18293 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
18294 matches the element-type of the vector input.
18296 This instruction has the same comparison semantics as the '``llvm.minnum.*``'
18297 intrinsic. That is, the result will always be a number unless all elements of
18298 the vector are NaN. For a vector with minimum element magnitude 0.0 and
18299 containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
18301 If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
18302 assume that NaNs are not present in the input vector.
18306 The argument to this intrinsic must be a vector of floating-point values.
18308 .. _int_vector_reduce_fmaximum:
18310 '``llvm.vector.reduce.fmaximum.*``' Intrinsic
18311 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18315 This is an overloaded intrinsic.
18319 declare float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %a)
18320 declare double @llvm.vector.reduce.fmaximum.v2f64(<2 x double> %a)
18325 The '``llvm.vector.reduce.fmaximum.*``' intrinsics do a floating-point
18326 ``MAX`` reduction of a vector, returning the result as a scalar. The return type
18327 matches the element-type of the vector input.
18329 This instruction has the same comparison semantics as the '``llvm.maximum.*``'
18330 intrinsic. That is, this intrinsic propagates NaNs and +0.0 is considered
18331 greater than -0.0. If any element of the vector is a NaN, the result is NaN.
18335 The argument to this intrinsic must be a vector of floating-point values.
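
A brief sketch of the NaN-propagating behavior follows. The function name
``@max_or_nan`` is an assumption made for this example.

.. code-block:: llvm

      declare float @llvm.vector.reduce.fmaximum.v4f32(<4 x float>)

      define float @max_or_nan(<4 x float> %v) {
        %m = call float @llvm.vector.reduce.fmaximum.v4f32(<4 x float> %v)
        ret float %m
      }

      ; For %v = <float 1.0, float 0x7FF8000000000000, float 3.0, float 2.0>,
      ; the result is NaN.
      ; For %v = <float -0.0, float 0.0, float -0.0, float -0.0>, the result is +0.0.
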
18337 .. _int_vector_reduce_fminimum:
18339 '``llvm.vector.reduce.fminimum.*``' Intrinsic
18340 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18344 This is an overloaded intrinsic.
18348 declare float @llvm.vector.reduce.fminimum.v4f32(<4 x float> %a)
18349 declare double @llvm.vector.reduce.fminimum.v2f64(<2 x double> %a)
18354 The '``llvm.vector.reduce.fminimum.*``' intrinsics do a floating-point
18355 ``MIN`` reduction of a vector, returning the result as a scalar. The return type
18356 matches the element-type of the vector input.
18358 This instruction has the same comparison semantics as the '``llvm.minimum.*``'
18359 intrinsic. That is, this intrinsic propagates NaNs and -0.0 is considered less
18360 than +0.0. If any element of the vector is a NaN, the result is NaN.
18364 The argument to this intrinsic must be a vector of floating-point values.
18366 '``llvm.vector.insert``' Intrinsic
18367 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18371 This is an overloaded intrinsic.
18375 ; Insert fixed type into scalable type
18376 declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 <idx>)
18377 declare <vscale x 2 x double> @llvm.vector.insert.nxv2f64.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 <idx>)
18379 ; Insert scalable type into scalable type
      declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.nxv2f32(<vscale x 4 x float> %vec, <vscale x 2 x float> %subvec, i64 <idx>)
18382 ; Insert fixed type into fixed type
18383 declare <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %vec, <2 x double> %subvec, i64 <idx>)
The '``llvm.vector.insert.*``' intrinsics insert a vector into another vector
starting from a given index. The return type matches the type of the vector we
insert into. Conceptually, this can be used to build a scalable vector out of
non-scalable vectors; however, this intrinsic can also be used on purely fixed
types.
18394 Scalable vectors can only be inserted into other scalable vectors.
18399 The ``vec`` is the vector which ``subvec`` will be inserted into.
18400 The ``subvec`` is the vector that will be inserted.
18402 ``idx`` represents the starting element number at which ``subvec`` will be
18403 inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
18404 vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
18405 the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
18406 ``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
18407 num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
18408 cannot be determined statically but is false at runtime, then the result vector
18409 is a :ref:`poison value <poisonvalues>`.
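
As a sketch of the indexing rule, the hypothetical function ``@set_high`` below
overwrites the upper half of a fixed-width vector. The index 2 is a multiple of
the length of ``subvec`` (2), and elements 2 through 3 are valid indices of
``vec``.

.. code-block:: llvm

      declare <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double>, <2 x double>, i64)

      ; Replace elements 2 and 3 of %vec with the two elements of %sub.
      define <4 x double> @set_high(<4 x double> %vec, <2 x double> %sub) {
        %r = call <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %vec, <2 x double> %sub, i64 2)
        ret <4 x double> %r
      }
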
18412 '``llvm.vector.extract``' Intrinsic
18413 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18417 This is an overloaded intrinsic.
18421 ; Extract fixed type from scalable type
18422 declare <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
18423 declare <2 x double> @llvm.vector.extract.v2f64.nxv2f64(<vscale x 2 x double> %vec, i64 <idx>)
18425 ; Extract scalable type from scalable type
18426 declare <vscale x 2 x float> @llvm.vector.extract.nxv2f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
18428 ; Extract fixed type from fixed type
18429 declare <2 x double> @llvm.vector.extract.v2f64.v4f64(<4 x double> %vec, i64 <idx>)
The '``llvm.vector.extract.*``' intrinsics extract a vector from within another
vector starting from a given index. The return type must be explicitly
specified. Conceptually, this can be used to decompose a scalable vector into
non-scalable parts; however, this intrinsic can also be used on purely fixed
types.
18440 Scalable vectors can only be extracted from other scalable vectors.
18445 The ``vec`` is the vector from which we will extract a subvector.
18447 The ``idx`` specifies the starting element number within ``vec`` from which a
18448 subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
18449 vector length of the result type. If the result type is a scalable vector,
18450 ``idx`` is first scaled by the result type's runtime scaling factor. Elements
18451 ``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
18452 indices. If this condition cannot be determined statically but is false at
18453 runtime, then the result vector is a :ref:`poison value <poisonvalues>`. The
18454 ``idx`` parameter must be a vector index constant type (for most targets this
18455 will be an integer pointer type).
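
The sketch below shows one fixed-width and one scalable extraction. The
function names ``@high_half`` and ``@first_four`` are assumptions made for this
example.

.. code-block:: llvm

      declare <2 x double> @llvm.vector.extract.v2f64.v4f64(<4 x double>, i64)
      declare <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float>, i64)

      ; Elements 2 and 3 of a fixed-width vector.
      define <2 x double> @high_half(<4 x double> %vec) {
        %hi = call <2 x double> @llvm.vector.extract.v2f64.v4f64(<4 x double> %vec, i64 2)
        ret <2 x double> %hi
      }

      ; The first four elements of a scalable vector. These exist for any
      ; vscale of at least 1.
      define <4 x float> @first_four(<vscale x 4 x float> %vec) {
        %lo = call <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float> %vec, i64 0)
        ret <4 x float> %lo
      }
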
18457 '``llvm.experimental.vector.reverse``' Intrinsic
18458 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18462 This is an overloaded intrinsic.
18466 declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
18467 declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
18472 The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
18473 The intrinsic takes a single vector and returns a vector of matching type but
18474 with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
recommended way to express reverse operations for fixed-width vectors is still
to use a shufflevector, as that may allow for more optimization opportunities.
18482 The argument to this intrinsic must be a vector.
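
For illustration, the two hypothetical functions below are intended to compute
the same result for a fixed-width ``<4 x i32>`` input, one with the intrinsic
and one with the recommended ``shufflevector`` form.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32>)

      define <4 x i32> @reverse_intrinsic(<4 x i32> %a) {
        %rev = call <4 x i32> @llvm.experimental.vector.reverse.v4i32(<4 x i32> %a)
        ret <4 x i32> %rev
      }

      define <4 x i32> @reverse_shuffle(<4 x i32> %a) {
        ; Lane i of the result is lane 3 - i of %a.
        %rev = shufflevector <4 x i32> %a, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %rev
      }
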
18484 '``llvm.experimental.vector.deinterleave2``' Intrinsic
18485 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18489 This is an overloaded intrinsic.
18493 declare {<2 x double>, <2 x double>} @llvm.experimental.vector.deinterleave2.v4f64(<4 x double> %vec1)
18494 declare {<vscale x 4 x i32>, <vscale x 4 x i32>} @llvm.experimental.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> %vec1)
18499 The '``llvm.experimental.vector.deinterleave2``' intrinsic constructs two
18500 vectors by deinterleaving the even and odd lanes of the input vector.
18502 This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
18504 fixed-width vectors is still to use a shufflevector, as that may allow for more
18505 optimization opportunities.
18509 .. code-block:: text
18511 {<2 x i64>, <2 x i64>} llvm.experimental.vector.deinterleave2.v4i64(<4 x i64> <i64 0, i64 1, i64 2, i64 3>); ==> {<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>}
18516 The argument is a vector whose type corresponds to the logical concatenation of
18517 the two result types.
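
The following sketch shows how the two results might be unpacked from the
returned struct with ``extractvalue``. The function name ``@sum_even_odd`` and
the final addition are only there to give the values a use.

.. code-block:: llvm

      declare {<2 x i64>, <2 x i64>} @llvm.experimental.vector.deinterleave2.v4i64(<4 x i64>)

      define <2 x i64> @sum_even_odd(<4 x i64> %vec) {
        %res = call {<2 x i64>, <2 x i64>} @llvm.experimental.vector.deinterleave2.v4i64(<4 x i64> %vec)
        ; Field 0 holds the even lanes (0 and 2) of %vec.
        %even = extractvalue {<2 x i64>, <2 x i64>} %res, 0
        ; Field 1 holds the odd lanes (1 and 3) of %vec.
        %odd = extractvalue {<2 x i64>, <2 x i64>} %res, 1
        %sum = add <2 x i64> %even, %odd
        ret <2 x i64> %sum
      }

For fixed-width vectors, the same lanes can be selected with two
``shufflevector`` instructions using the masks ``<i32 0, i32 2>`` and
``<i32 1, i32 3>``.
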
18519 '``llvm.experimental.vector.interleave2``' Intrinsic
18520 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18524 This is an overloaded intrinsic.
18528 declare <4 x double> @llvm.experimental.vector.interleave2.v4f64(<2 x double> %vec1, <2 x double> %vec2)
18529 declare <vscale x 8 x i32> @llvm.experimental.vector.interleave2.nxv8i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2)
18534 The '``llvm.experimental.vector.interleave2``' intrinsic constructs a vector
18535 by interleaving two input vectors.
18537 This intrinsic works for both fixed and scalable vectors. While this intrinsic
supports all vector types, the recommended way to express this operation for
18539 fixed-width vectors is still to use a shufflevector, as that may allow for more
18540 optimization opportunities.
18544 .. code-block:: text
18546 <4 x i64> llvm.experimental.vector.interleave2.v4i64(<2 x i64> <i64 0, i64 2>, <2 x i64> <i64 1, i64 3>); ==> <4 x i64> <i64 0, i64 1, i64 2, i64 3>
18550 Both arguments must be vectors of the same type whereby their logical
18551 concatenation matches the result type.
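
As a sketch, the two hypothetical functions below are intended to be equivalent
for fixed-width inputs. The second one spells out the lane order with a
``shufflevector`` mask.

.. code-block:: llvm

      declare <4 x i64> @llvm.experimental.vector.interleave2.v4i64(<2 x i64>, <2 x i64>)

      define <4 x i64> @interleave_intrinsic(<2 x i64> %a, <2 x i64> %b) {
        %r = call <4 x i64> @llvm.experimental.vector.interleave2.v4i64(<2 x i64> %a, <2 x i64> %b)
        ret <4 x i64> %r
      }

      define <4 x i64> @interleave_shuffle(<2 x i64> %a, <2 x i64> %b) {
        ; Mask indices 0-1 select from %a and 2-3 from %b, giving <a0, b0, a1, b1>.
        %r = shufflevector <2 x i64> %a, <2 x i64> %b, <4 x i32> <i32 0, i32 2, i32 1, i32 3>
        ret <4 x i64> %r
      }
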
18553 '``llvm.experimental.cttz.elts``' Intrinsic
18554 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is an overloaded intrinsic. You can use ``llvm.experimental.cttz.elts``
18560 on any vector of integer elements, both fixed width and scalable.
18564 declare i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1> <src>, i1 <is_zero_poison>)
18569 The '``llvm.experimental.cttz.elts``' intrinsic counts the number of trailing
18570 zero elements of a vector.
18575 The first argument is the vector to be counted. This argument must be a vector
18576 with integer element type. The return type must also be an integer type which is
18577 wide enough to hold the maximum number of elements of the source vector. The
behavior of this intrinsic is undefined if the return type is not wide enough
18579 for the number of elements in the input vector.
18581 The second argument is a constant flag that indicates whether the intrinsic
18582 returns a valid result if the first argument is all zero. If the first argument
18583 is all zero and the second argument is true, the result is poison.
The '``llvm.experimental.cttz.elts``' intrinsic counts the trailing (least
significant) zero elements in a vector. If every element of ``src`` is zero and
``is_zero_poison`` is false, the result is the number of elements in the input
vector.
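
A minimal usage sketch follows. The function name ``@first_active_lane`` is
hypothetical, and the second argument is ``false`` so that an all-zero input
produces a defined result.

.. code-block:: llvm

      declare i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1>, i1)

      define i8 @first_active_lane(<8 x i1> %mask) {
        ; With is_zero_poison = false, an all-zero %mask yields 8, the
        ; number of elements in the input vector.
        %n = call i8 @llvm.experimental.cttz.elts.i8.v8i1(<8 x i1> %mask, i1 false)
        ret i8 %n
      }

Under the reading that lane 0 is the least significant element, a mask whose
first non-zero lane is lane 2 would yield 2.
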
18592 '``llvm.experimental.vector.splice``' Intrinsic
18593 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18597 This is an overloaded intrinsic.
18601 declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
18602 declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
18607 The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
18608 concatenating elements from the first input vector with elements of the second
18609 input vector, returning a vector of the same type as the input vectors. The
18610 signed immediate, modulo the number of elements in the vector, is the index
18611 into the first vector from which to extract the result value. This means
18612 conceptually that for a positive immediate, a vector is extracted from
18613 ``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
18614 immediate, it extracts ``-imm`` trailing elements from the first vector, and
18615 the remaining elements from ``%vec2``.
18617 These intrinsics work for both fixed and scalable vectors. While this intrinsic
18618 is marked as experimental, the recommended way to express this operation for
18619 fixed-width vectors is still to use a shufflevector, as that may allow for more
18620 optimization opportunities.
18624 .. code-block:: text
18626 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1); ==> <B, C, D, E> index
18627 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3); ==> <B, C, D, E> trailing elements
18633 The first two operands are vectors with the same type. The start index is imm
18634 modulo the runtime number of elements in the source vector. For a fixed-width
18635 vector <N x eltty>, imm is a signed integer constant in the range
18636 -N <= imm < N. For a scalable vector <vscale x N x eltty>, imm is a signed
18637 integer constant in the range -X <= imm < X where X=vscale_range_min * N.
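
To illustrate a negative immediate on fixed-width vectors, the sketch below
pairs the intrinsic with a ``shufflevector`` that is intended to be equivalent.
The function names are hypothetical. With ``imm = -1``, one trailing element of
``%vec1`` is followed by the first three elements of ``%vec2``.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vector.splice.v4i32(<4 x i32>, <4 x i32>, i32)

      define <4 x i32> @splice_intrinsic(<4 x i32> %vec1, <4 x i32> %vec2) {
        %s = call <4 x i32> @llvm.experimental.vector.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 -1)
        ret <4 x i32> %s
      }

      define <4 x i32> @splice_shuffle(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; Index 3 is the last lane of %vec1; indices 4-6 are lanes 0-2 of %vec2.
        %s = shufflevector <4 x i32> %vec1, <4 x i32> %vec2, <4 x i32> <i32 3, i32 4, i32 5, i32 6>
        ret <4 x i32> %s
      }
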
18639 '``llvm.experimental.stepvector``' Intrinsic
18640 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18642 This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
18643 to generate a vector whose lane values comprise the linear sequence
18644 <0, 1, 2, ...>. It is primarily intended for scalable vectors.
18648 declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
18649 declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
18651 The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
18652 of integers whose elements contain a linear sequence of values starting from 0
18653 with a step of 1. This experimental intrinsic can only be used for vectors
18654 with integer elements that are at least 8 bits in size. If the sequence value
18655 exceeds the allowed limit for the element type then the result for that lane is
18658 These intrinsics work for both fixed and scalable vectors. While this intrinsic
18659 is marked as experimental, the recommended way to express this operation for
18660 fixed-width vectors is still to generate a constant vector instead.
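
One possible use, sketched below, is building a vector of strided offsets for a
scalable type. The function name ``@strided_offsets`` and the ``%stride``
parameter are hypothetical.

.. code-block:: llvm

      declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()

      define <vscale x 4 x i32> @strided_offsets(i32 %stride) {
        ; <0, 1, 2, ...>
        %step = call <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
        ; Broadcast %stride to every lane.
        %ins = insertelement <vscale x 4 x i32> poison, i32 %stride, i64 0
        %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
        ; <0, stride, 2 * stride, ...>
        %offsets = mul <vscale x 4 x i32> %step, %splat
        ret <vscale x 4 x i32> %offsets
      }
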
18669 '``llvm.experimental.get.vector.length``' Intrinsic
18670 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18674 This is an overloaded intrinsic.
18678 declare i32 @llvm.experimental.get.vector.length.i32(i32 %cnt, i32 immarg %vf, i1 immarg %scalable)
18679 declare i32 @llvm.experimental.get.vector.length.i64(i64 %cnt, i32 immarg %vf, i1 immarg %scalable)
18684 The '``llvm.experimental.get.vector.length.*``' intrinsics take a number of
elements to process and return how many of the elements can be processed
18686 with the requested vectorization factor.
18691 The first argument is an unsigned value of any scalar integer type and specifies
18692 the total number of elements to be processed. The second argument is an i32
18693 immediate for the vectorization factor. The third argument indicates if the
18694 vectorization factor should be multiplied by vscale.
Returns a non-negative i32 value (explicit vector length) that is unknown at compile
18700 time and depends on the hardware specification.
18701 If the result value does not fit in the result type, then the result is
18702 a :ref:`poison value <poisonvalues>`.
18704 This intrinsic is intended to be used by loop vectorization with VP intrinsics
18705 in order to get the number of elements to process on each loop iteration. The
18706 result should be used to decrease the count for the next iteration until the
18707 count reaches zero.
18709 If the count is larger than the number of lanes in the type described by the
18710 last 2 arguments, this intrinsic may return a value less than the number of
18711 lanes implied by the type. The result will be at least as large as the result
18712 will be on any later loop iteration.
18714 This intrinsic will only return 0 if the input count is also 0. A non-zero input
18715 count will produce a non-zero result.
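
The intended loop structure can be sketched as follows. The function name
``@strip_mine`` is hypothetical and the loop body is elided. The vectorization
factor of 4 with ``scalable`` set to true is an arbitrary choice for
illustration.

.. code-block:: llvm

      declare i32 @llvm.experimental.get.vector.length.i64(i64, i32, i1)

      define void @strip_mine(i64 %n) {
      entry:
        br label %loop

      loop:
        %remaining = phi i64 [ %n, %entry ], [ %next, %loop ]
        ; Ask how many elements to process this iteration for <vscale x 4 x T>.
        %evl = call i32 @llvm.experimental.get.vector.length.i64(i64 %remaining, i32 4, i1 true)
        ; VP operations using %evl as the explicit vector length would go here.
        %evl.ext = zext i32 %evl to i64
        ; Decrease the count by the number of elements just processed.
        %next = sub i64 %remaining, %evl.ext
        %done = icmp eq i64 %next, 0
        br i1 %done, label %exit, label %loop

      exit:
        ret void
      }

Because a non-zero count always produces a non-zero result, ``%remaining``
strictly decreases and the loop terminates.
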
18720 Operations on matrixes requiring shape information (like number of rows/columns
18721 or the memory layout) can be expressed using the matrix intrinsics. These
18722 intrinsics require matrix dimensions to be passed as immediate arguments, and
18723 matrixes are passed and returned as vectors. This means that for a ``R`` x
18724 ``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
18725 corresponding vector, with indices starting at 0. Currently column-major layout
18726 is assumed. The intrinsics support both integer and floating point matrixes.
18729 '``llvm.matrix.transpose.*``' Intrinsic
18730 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18734 This is an overloaded intrinsic.
18738 declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
18743 The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
18744 <Cols>`` matrix and return the transposed matrix in the result vector.
18749 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
18750 <Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
18751 number of rows and columns, respectively, and must be positive, constant
18752 integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
18753 the same float or integer element type as ``%In``.
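
As a worked sketch, consider a 2 x 3 matrix stored column-major as
``<a00, a10, a01, a11, a02, a12>``. The function names below are hypothetical,
and the mangled suffix ``v6i32`` is assumed from the usual overload naming.

.. code-block:: llvm

      declare <6 x i32> @llvm.matrix.transpose.v6i32(<6 x i32>, i32, i32)

      define <6 x i32> @transpose_2x3(<6 x i32> %m) {
        ; The result is the 3 x 2 transpose, again in column-major order:
        ; <a00, a01, a02, a10, a11, a12>.
        %t = call <6 x i32> @llvm.matrix.transpose.v6i32(<6 x i32> %m, i32 2, i32 3)
        ret <6 x i32> %t
      }

      define <6 x i32> @transpose_2x3_shuffle(<6 x i32> %m) {
        ; Element (i, j) of the input sits at index j * 2 + i, so the
        ; transpose reads indices 0, 2, 4 (row 0) and then 1, 3, 5 (row 1).
        %t = shufflevector <6 x i32> %m, <6 x i32> poison, <6 x i32> <i32 0, i32 2, i32 4, i32 1, i32 3, i32 5>
        ret <6 x i32> %t
      }
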
18755 '``llvm.matrix.multiply.*``' Intrinsic
18756 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18760 This is an overloaded intrinsic.
18764 declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
18769 The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
18770 <Inner>`` matrix, ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
18771 multiplies them. The result matrix is returned in the result vector.
18776 The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
18777 <Inner>`` elements, and the second argument ``%B`` to a matrix with
18778 ``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
18779 ``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
18780 returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
18781 Vectors ``%A``, ``%B``, and the returned vector all have the same float or
18782 integer element type.
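
A minimal sketch of a 2 x 3 by 3 x 2 product follows. The function name
``@mul_2x3_3x2`` is hypothetical, and the mangled suffix is assumed to list the
result type followed by the two operand types.

.. code-block:: llvm

      declare <4 x float> @llvm.matrix.multiply.v4f32.v6f32.v6f32(<6 x float>, <6 x float>, i32, i32, i32)

      define <4 x float> @mul_2x3_3x2(<6 x float> %A, <6 x float> %B) {
        ; OuterRows = 2, Inner = 3, OuterColumns = 2, so the result has
        ; 2 * 2 = 4 elements.
        %C = call <4 x float> @llvm.matrix.multiply.v4f32.v6f32.v6f32(<6 x float> %A, <6 x float> %B, i32 2, i32 3, i32 2)
        ret <4 x float> %C
      }
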
18785 '``llvm.matrix.column.major.load.*``' Intrinsic
18786 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18790 This is an overloaded intrinsic.
18794 declare vectorty @llvm.matrix.column.major.load.*(
18795 ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
18800 The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
18801 matrix using a stride of ``%Stride`` to compute the start address of the
18802 different columns. The offset is computed using ``%Stride``'s bitwidth. This
18803 allows for convenient loading of sub matrixes. If ``<IsVolatile>`` is true, the
18804 intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
18805 matrix is returned in the result vector. If the ``%Ptr`` argument is known to
18806 be aligned to some boundary, this can be specified as an attribute on the
18812 The first argument ``%Ptr`` is a pointer type to the returned vector type, and
18813 corresponds to the start address to load from. The second argument ``%Stride``
18814 is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
18815 to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
18817 ``<IsVolatile>`` is a boolean value. The fourth and fifth arguments,
18818 ``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
18819 respectively, and must be positive, constant integers. The returned vector must
18820 have ``<Rows> * <Cols>`` elements.
18822 The :ref:`align <attr_align>` parameter attribute can be provided for the
18823 ``%Ptr`` arguments.
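
The sub-matrix use case can be sketched as follows: load the top-left 2 x 2
block of a larger column-major matrix whose columns are 4 elements apart. The
function name ``@load_top_left_2x2``, the parameter ``%base``, and the mangled
suffix ``v4f64.i64`` are assumptions for illustration, as is the reading that
the stride is counted in elements.

.. code-block:: llvm

      declare <4 x double> @llvm.matrix.column.major.load.v4f64.i64(ptr, i64, i1, i32, i32)

      define <4 x double> @load_top_left_2x2(ptr %base) {
        ; Column 0 starts at %base + 0 * 4 and column 1 at %base + 1 * 4.
        ; Stride 4 satisfies %Stride >= <Rows> for Rows = 2.
        %sub = call <4 x double> @llvm.matrix.column.major.load.v4f64.i64(ptr align 8 %base, i64 4, i1 false, i32 2, i32 2)
        ret <4 x double> %sub
      }
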
18826 '``llvm.matrix.column.major.store.*``' Intrinsic
18827 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18834 declare void @llvm.matrix.column.major.store.*(
18835 vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
18840 The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
18841 <Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
18842 columns. The offset is computed using ``%Stride``'s bitwidth. If
18843 ``<IsVolatile>`` is true, the intrinsic is considered a
18844 :ref:`volatile memory access <volatile>`.
18846 If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
18847 specified as an attribute on the argument.
18852 The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
18853 <Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
18854 pointer to the vector type of ``%In``, and is the start address of the matrix
18855 in memory. The third argument ``%Stride`` is a positive, constant integer with
18856 ``%Stride >= <Rows>``. ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
18858 with ``%Ptr + C * %Stride``. The fourth argument ``<IsVolatile>`` is a boolean
18859 value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
18860 and columns, respectively, and must be positive, constant integers.
18862 The :ref:`align <attr_align>` parameter attribute can be provided
18863 for the ``%Ptr`` arguments.
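
As a sketch of the store direction, the following writes a 2 x 2 matrix into
the top-left block of a larger column-major matrix whose columns are 4 elements
apart. The function name ``@store_top_left_2x2``, the parameter ``%base``, and
the mangled suffix ``v4f64.i64`` are assumptions for illustration.

.. code-block:: llvm

      declare void @llvm.matrix.column.major.store.v4f64.i64(<4 x double>, ptr, i64, i1, i32, i32)

      define void @store_top_left_2x2(<4 x double> %m, ptr %base) {
        ; Column C of %m is written starting at %base + C * 4.
        call void @llvm.matrix.column.major.store.v4f64.i64(<4 x double> %m, ptr align 8 %base, i64 4, i1 false, i32 2, i32 2)
        ret void
      }
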
18866 Half Precision Floating-Point Intrinsics
18867 ----------------------------------------
18869 For most target platforms, half precision floating-point is a
18870 storage-only format. This means that it is a dense encoding (in memory)
18871 but does not support computation in the format.
18873 This means that code must first load the half-precision floating-point
18874 value as an i16, then convert it to float with
18875 :ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
18876 then be performed on the float value (including extending to double
18877 etc). To store the value back to memory, it is first converted to float
18878 if needed, then converted to i16 with
18879 :ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, then storing as an
18882 .. _int_convert_to_fp16:
18884 '``llvm.convert.to.fp16``' Intrinsic
18885 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18892 declare i16 @llvm.convert.to.fp16.f32(float %a)
18893 declare i16 @llvm.convert.to.fp16.f64(double %a)
18898 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
18899 conventional floating-point type to half precision floating-point format.
The intrinsic function takes a single argument - the value to be
18910 The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
18911 conventional floating-point format to half precision floating-point format. The
18912 return value is an ``i16`` which contains the converted number.
18917 .. code-block:: llvm
18919 %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, ptr @x, align 2
18922 .. _int_convert_from_fp16:
18924 '``llvm.convert.from.fp16``' Intrinsic
18925 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18932 declare float @llvm.convert.from.fp16.f32(i16 %a)
18933 declare double @llvm.convert.from.fp16.f64(i16 %a)
18938 The '``llvm.convert.from.fp16``' intrinsic function performs a
18939 conversion from half precision floating-point format to single precision
18940 floating-point format.
The intrinsic function takes a single argument - the value to be
18951 The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
18953 precision floating-point format. The input half-float value is
18954 represented by an ``i16`` value.
18959 .. code-block:: llvm
18961 %a = load i16, ptr @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
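
Putting the two conversion intrinsics together, a storage-only round trip might
look like the sketch below. The global ``@x`` and the function name
``@double_in_place`` are hypothetical.

.. code-block:: llvm

      @x = global i16 0

      declare float @llvm.convert.from.fp16.f32(i16)
      declare i16 @llvm.convert.to.fp16.f32(float)

      define void @double_in_place() {
        ; Load the half-precision bits as an i16 and widen to float.
        %bits = load i16, ptr @x, align 2
        %val = call float @llvm.convert.from.fp16.f32(i16 %bits)
        ; Compute in float.
        %sum = fadd float %val, %val
        ; Narrow back to half-precision bits and store as an i16.
        %out = call i16 @llvm.convert.to.fp16.f32(float %sum)
        store i16 %out, ptr @x, align 2
        ret void
      }
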
18964 Saturating floating-point to integer conversions
18965 ------------------------------------------------
18967 The ``fptoui`` and ``fptosi`` instructions return a
18968 :ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
18969 representable by the result type. These intrinsics provide an alternative
18970 conversion, which will saturate towards the smallest and largest representable
18971 integer values instead.
18973 '``llvm.fptoui.sat.*``' Intrinsic
18974 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18979 This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
18980 floating-point argument type and any integer result type, or vectors thereof.
18981 Not all targets may support all types, however.
18985 declare i32 @llvm.fptoui.sat.i32.f32(float %f)
18986 declare i19 @llvm.fptoui.sat.i19.f64(double %f)
18987 declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
18992 This intrinsic converts the argument into an unsigned integer using saturating
18998 The argument may be any floating-point or vector of floating-point type. The
18999 return value may be any integer or vector of integer type. The number of vector
19000 elements in argument and return must be the same.
19005 The conversion to integer is performed subject to the following rules:
19007 - If the argument is any NaN, zero is returned.
19008 - If the argument is smaller than zero (this includes negative infinity),
19010 - If the argument is larger than the largest representable unsigned integer of
19011 the result type (this includes positive infinity), the largest representable
19012 unsigned integer is returned.
19013 - Otherwise, the result of rounding the argument towards zero is returned.
19018 .. code-block:: text
19020 %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9) ; yields i8: 123
19021 %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7) ; yields i8: 0
19022 %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0) ; yields i8: 255
19023 %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0
19025 '``llvm.fptosi.sat.*``' Intrinsic
19026 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19031 This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
19032 floating-point argument type and any integer result type, or vectors thereof.
19033 Not all targets may support all types, however.
19037 declare i32 @llvm.fptosi.sat.i32.f32(float %f)
19038 declare i19 @llvm.fptosi.sat.i19.f64(double %f)
19039 declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
19044 This intrinsic converts the argument into a signed integer using saturating
19050 The argument may be any floating-point or vector of floating-point type. The
19051 return value may be any integer or vector of integer type. The number of vector
19052 elements in argument and return must be the same.
19057 The conversion to integer is performed subject to the following rules:
19059 - If the argument is any NaN, zero is returned.
19060 - If the argument is smaller than the smallest representable signed integer of
19061 the result type (this includes negative infinity), the smallest
19062 representable signed integer is returned.
19063 - If the argument is larger than the largest representable signed integer of
19064 the result type (this includes positive infinity), the largest representable
19065 signed integer is returned.
19066 - Otherwise, the result of rounding the argument towards zero is returned.
19071 .. code-block:: text
19073 %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9) ; yields i8: 23
19074 %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8) ; yields i8: -128
19075 %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0) ; yields i8: 127
19076 %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8: 0
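
For a non-constant input, a minimal sketch looks like this. The function name
``@clamp_to_i8`` is hypothetical.

.. code-block:: llvm

      declare i8 @llvm.fptosi.sat.i8.f32(float)

      define i8 @clamp_to_i8(float %f) {
        ; A plain fptosi yields poison when the rounded value does not fit
        ; in i8. The saturating form clamps to [-128, 127] and maps NaN to 0.
        %r = call i8 @llvm.fptosi.sat.i8.f32(float %f)
        ret i8 %r
      }
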
19078 Convergence Intrinsics
19079 ----------------------
19081 The LLVM convergence intrinsics for controlling the semantics of ``convergent``
19082 operations, which all start with the ``llvm.experimental.convergence.``
19083 prefix, are described in the :doc:`ConvergentOperations` document.
19085 .. _dbg_intrinsics:
19087 Debugger Intrinsics
19088 -------------------
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
19092 Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
19095 Exception Handling Intrinsics
19096 -----------------------------
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
19100 Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
19102 Pointer Authentication Intrinsics
19103 ---------------------------------
The LLVM pointer authentication intrinsics (which all start with the
``llvm.ptrauth.`` prefix) are described in the `Pointer Authentication
19107 <PointerAuth.html#intrinsics>`_ document.
19109 .. _int_trampoline:
19111 Trampoline Intrinsics
19112 ---------------------
19114 These intrinsics make it possible to excise one parameter, marked with
19115 the :ref:`nest <nest>` attribute, from a function. The result is a
19116 callable function pointer lacking the nest parameter - the caller does
19117 not need to provide a value for it. Instead, the value to use is stored
19118 in advance in a "trampoline", a block of memory usually allocated on the
19119 stack, which also contains code to splice the nest value into the
19120 argument list. This is used to implement the GCC nested function address
19123 For example, if the function is ``i32 f(ptr nest %c, i32 %x, i32 %y)``
19124 then the resulting function pointer has signature ``i32 (i32, i32)``.
19125 It can be created as follows:
19127 .. code-block:: llvm
19129 %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
19131 %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)
19133 The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(ptr %nval, i32 %x, i32 %y)``.
19138 '``llvm.init.trampoline``' Intrinsic
19139 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19146 declare void @llvm.init.trampoline(ptr <tramp>, ptr <func>, ptr <nval>)
19151 This fills the memory pointed to by ``tramp`` with executable code,
19152 turning it into a trampoline.
19157 The ``llvm.init.trampoline`` intrinsic takes three arguments, all
19158 pointers. The ``tramp`` argument must point to a sufficiently large and
19159 sufficiently aligned block of memory; this memory is written to by the
19160 intrinsic. Note that the size and the alignment are target-specific -
19161 LLVM currently provides no portable way of determining them, so a
19162 front-end that generates this intrinsic needs to have some
19163 target-specific knowledge. The ``func`` argument must hold a function.
19168 The block of memory pointed to by ``tramp`` is filled with target
19169 dependent code, turning it into a function. Then ``tramp`` needs to be
19170 passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
19171 be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
19172 function's signature is the same as that of ``func`` with any arguments
19173 marked with the ``nest`` attribute removed. At most one such ``nest``
19174 argument is allowed, and it must be of pointer type. Calling the new
19175 function is equivalent to calling ``func`` with the same argument list,
19176 but with ``nval`` used for the missing ``nest`` argument. If, after
19177 calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
19178 modified, then the effect of any later call to the returned function
19179 pointer is undefined.
19183 '``llvm.adjust.trampoline``' Intrinsic
19184 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19191 declare ptr @llvm.adjust.trampoline(ptr <tramp>)
19196 This performs any required machine-specific adjustment to the address of
19197 a trampoline (passed as ``tramp``).
19202 ``tramp`` must point to a block of memory which already has trampoline
19203 code filled in by a previous call to
19204 :ref:`llvm.init.trampoline <int_it>`.
19209 On some architectures the address of the code to be executed needs to be
different from the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
19213 returned can then be :ref:`bitcast and executed <int_trampoline>`.
19218 Vector Predication Intrinsics
19219 -----------------------------
19220 VP intrinsics are intended for predicated SIMD/vector code. A typical VP
19221 operation takes a vector mask and an explicit vector length parameter as in:
19225 <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
The vector mask parameter (%mask) always has a vector of ``i1`` type, for example
``<32 x i1>``. The explicit vector length parameter always has the type ``i32`` and
19229 is an unsigned integer value. The explicit vector length parameter (%evl) is in
19234 0 <= %evl <= W, where W is the number of vector elements
19236 Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
19237 length of the vector.
19239 The VP intrinsic has undefined behavior if ``%evl > W``. The explicit vector
19240 length (%evl) creates a mask, %EVLmask, with all elements ``0 <= i < %evl`` set
19241 to True, and all other lanes ``%evl <= i < W`` to False. A new mask %M is
19242 calculated with an element-wise AND from %mask and %EVLmask:
19246 M = %mask AND %EVLmask
19248 A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
19252 A <opcode> B = { A[i] <opcode> B[i] M[i] = True, and
19258 Some targets, such as AVX512, do not support the %evl parameter in hardware.
19259 The use of an effective %evl is discouraged for those targets. The function
19260 ``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
19261 has native support for %evl.
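
The effective mask described above can be spelled out with ordinary vector
instructions. The following is a minimal sketch for a fixed width of ``W = 4``.
The function name ``@effective_mask`` and all value names are hypothetical.

.. code-block:: llvm

      define <4 x i1> @effective_mask(<4 x i1> %mask, i32 %evl) {
        ; Broadcast %evl to every lane.
        %evl.ins = insertelement <4 x i32> poison, i32 %evl, i64 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
        ; %EVLmask: lane i is true exactly when i < %evl.
        %evlmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        ; M = %mask AND %EVLmask
        %m = and <4 x i1> %mask, %evlmask
        ret <4 x i1> %m
      }
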
19265 '``llvm.vp.select.*``' Intrinsics
19266 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19270 This is an overloaded intrinsic.
19274 declare <16 x i32> @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
19275 declare <vscale x 4 x i64> @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)
19280 The '``llvm.vp.select``' intrinsic is used to choose one value based on a
19281 condition vector, without IR-level branching.
19286 The first operand is a vector of ``i1`` and indicates the condition. The
19287 second operand is the value that is selected where the condition vector is
19288 true. The third operand is the value that is selected where the condition
19289 vector is false. The vectors must be of the same size. The fourth operand is
19290 the explicit vector length.
19292 #. The optional ``fast-math flags`` marker indicates that the select has one or
19293 more :ref:`fast-math flags <fastmath>`. These are optimization hints to
19294 enable otherwise unsafe floating-point optimizations. Fast-math flags are
19295 only valid for selects that return a floating-point scalar or vector type,
19296 or an array (nested to any depth) of floating-point scalar or vector types.
19301 The intrinsic selects lanes from the second and third operand depending on a
All result lanes at positions greater than or equal to ``%evl`` are undefined.
19305 For all lanes below ``%evl`` where the condition vector is true the lane is
19306 taken from the second operand. Otherwise, the lane is taken from the third
19312 .. code-block:: llvm
19314 %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)
19317 ;; Any result is legal on lanes at and above %evl.
19318 %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false
19323 '``llvm.vp.merge.*``' Intrinsics
19324 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19328 This is an overloaded intrinsic.
19332 declare <16 x i32> @llvm.vp.merge.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <pivot>)
19333 declare <vscale x 4 x i64> @llvm.vp.merge.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <pivot>)
19338 The '``llvm.vp.merge``' intrinsic is used to choose one value based on a
19339 condition vector and an index operand, without IR-level branching.
19344 The first operand is a vector of ``i1`` and indicates the condition. The
19345 second operand is the value that is merged where the condition vector is true.
19346 The third operand is the value that is selected where the condition vector is
false or the lane position is greater than or equal to the pivot. The fourth operand
19350 #. The optional ``fast-math flags`` marker indicates that the merge has one or
19351 more :ref:`fast-math flags <fastmath>`. These are optimization hints to
19352 enable otherwise unsafe floating-point optimizations. Fast-math flags are
19353 only valid for merges that return a floating-point scalar or vector type,
19354 or an array (nested to any depth) of floating-point scalar or vector types.
19359 The intrinsic selects lanes from the second and third operand depending on a
19360 condition vector and pivot value.
19362 For all lanes where the condition vector is true and the lane position is less
19363 than ``%pivot`` the lane is taken from the second operand. Otherwise, the lane
19364 is taken from the third operand.
19369 .. code-block:: llvm
19371 %r = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %pivot)
19374 ;; Lanes at and above %pivot are taken from %on_false
      %atfirst = insertelement <4 x i32> poison, i32 %pivot, i32 0
19376 %splat = shufflevector <4 x i32> %atfirst, <4 x i32> poison, <4 x i32> zeroinitializer
      %pivotmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %splat
      %mergemask = and <4 x i1> %cond, %pivotmask
19379 %also.r = select <4 x i1> %mergemask, <4 x i32> %on_true, <4 x i32> %on_false
19385 '``llvm.vp.add.*``' Intrinsics
19386 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19390 This is an overloaded intrinsic.
19394 declare <16 x i32> @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19395 declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19396 declare <256 x i64> @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19401 Predicated integer addition of two vectors of integers.
19407 The first two operands and the result have the same vector of integer type. The
19408 third operand is the vector mask and has the same number of elements as the
19409 result vector type. The fourth operand is the explicit vector length of the
19415 The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
19416 of the first and second vector operand on each enabled lane. The result on
19417 disabled lanes is a :ref:`poison value <poisonvalues>`.
19422 .. code-block:: llvm
19424 %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19425 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19427 %t = add <4 x i32> %a, %b
19428 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
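
The following sketch shows how the mask and the explicit vector length
together decide which lanes are enabled. The function name ``@add_example``
and the constant mask and length are illustrative assumptions, not part of the
intrinsic's definition.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.add.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @add_example(<4 x i32> %a, <4 x i32> %b) {
        ; mask = <true, true, false, true>, explicit vector length = 3
        %r = call <4 x i32> @llvm.vp.add.v4i32(
                 <4 x i32> %a, <4 x i32> %b,
                 <4 x i1> <i1 true, i1 true, i1 false, i1 true>,
                 i32 3)
        ; lanes 0, 1: enabled, hold %a + %b
        ; lane 2: disabled by the mask, poison
        ; lane 3: lane index is not below the vector length, poison
        ret <4 x i32> %r
      }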
19432 '``llvm.vp.sub.*``' Intrinsics
19433 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19437 This is an overloaded intrinsic.
19441 declare <16 x i32> @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19442 declare <vscale x 4 x i32> @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19443 declare <256 x i64> @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19448 Predicated integer subtraction of two vectors of integers.
19454 The first two operands and the result have the same vector of integer type. The
19455 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19462 The '``llvm.vp.sub``' intrinsic performs integer subtraction
19463 (:ref:`sub <i_sub>`) of the first and second vector operand on each enabled
19464 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
19469 .. code-block:: llvm
19471 %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19472 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19474 %t = sub <4 x i32> %a, %b
19475 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19481 '``llvm.vp.mul.*``' Intrinsics
19482 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19486 This is an overloaded intrinsic.
19490 declare <16 x i32> @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32> @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19492 declare <256 x i64> @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19497 Predicated integer multiplication of two vectors of integers.
19503 The first two operands and the result have the same vector of integer type. The
19504 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19510 The '``llvm.vp.mul``' intrinsic performs integer multiplication
19511 (:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
19512 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
19517 .. code-block:: llvm
19519 %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19520 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19522 %t = mul <4 x i32> %a, %b
19523 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19528 '``llvm.vp.sdiv.*``' Intrinsics
19529 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19533 This is an overloaded intrinsic.
19537 declare <16 x i32> @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19538 declare <vscale x 4 x i32> @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19539 declare <256 x i64> @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19544 Predicated, signed division of two vectors of integers.
19550 The first two operands and the result have the same vector of integer type. The
19551 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19558 The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
19559 of the first and second vector operand on each enabled lane. The result on
19560 disabled lanes is a :ref:`poison value <poisonvalues>`.
19565 .. code-block:: llvm
19567 %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19568 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19570 %t = sdiv <4 x i32> %a, %b
19571 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
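
An unpredicated ``sdiv`` has undefined behavior when any lane divides by zero,
so a lowering that falls back to it has to keep disabled lanes away from
arbitrary divisors. The following is a minimal sketch of one possible approach,
not a description of any particular backend: the function name
``@sdiv_safe_sketch`` is hypothetical, and ``%mask`` is assumed to already have
the ``%evl`` bound folded in.

.. code-block:: llvm

      define <4 x i32> @sdiv_safe_sketch(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask) {
        ; Replace the divisor with 1 on disabled lanes so that no disabled
        ; lane can divide by zero or compute INT_MIN / -1.
        %safe.b = select <4 x i1> %mask, <4 x i32> %b, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
        %t = sdiv <4 x i32> %a, %safe.b
        ; Disabled lanes are still poison in the final result.
        %r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
        ret <4 x i32> %r
      }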
19576 '``llvm.vp.udiv.*``' Intrinsics
19577 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19581 This is an overloaded intrinsic.
19585 declare <16 x i32> @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19586 declare <vscale x 4 x i32> @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19587 declare <256 x i64> @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19592 Predicated, unsigned division of two vectors of integers.
The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19603 The '``llvm.vp.udiv``' intrinsic performs unsigned division
19604 (:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
19605 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
19610 .. code-block:: llvm
19612 %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19613 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19615 %t = udiv <4 x i32> %a, %b
19616 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19622 '``llvm.vp.srem.*``' Intrinsics
19623 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19627 This is an overloaded intrinsic.
19631 declare <16 x i32> @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19632 declare <vscale x 4 x i32> @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19633 declare <256 x i64> @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
Predicated computation of the signed remainder of two integer vectors.
19644 The first two operands and the result have the same vector of integer type. The
19645 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19652 The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
19653 (:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
19654 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
19659 .. code-block:: llvm
19661 %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19662 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19664 %t = srem <4 x i32> %a, %b
19665 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
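
As with the unpredicated ``srem``, the remainder takes the sign of the
dividend. The sketch below illustrates this with constant operands and every
lane enabled; the function name ``@srem_example`` and the values are
illustrative assumptions.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.srem.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @srem_example() {
        %r = call <4 x i32> @llvm.vp.srem.v4i32(
                 <4 x i32> <i32 7, i32 -7, i32 7, i32 -7>,
                 <4 x i32> <i32 3, i32 3, i32 -3, i32 -3>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4)
        ; all lanes enabled: %r = <1, -1, 1, -1>
        ret <4 x i32> %r
      }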
19671 '``llvm.vp.urem.*``' Intrinsics
19672 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19676 This is an overloaded intrinsic.
19680 declare <16 x i32> @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19681 declare <vscale x 4 x i32> @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19682 declare <256 x i64> @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19687 Predicated computation of the unsigned remainder of two integer vectors.
19693 The first two operands and the result have the same vector of integer type. The
19694 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19701 The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
19702 (:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
19703 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
19708 .. code-block:: llvm
19710 %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19711 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19713 %t = urem <4 x i32> %a, %b
19714 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19719 '``llvm.vp.ashr.*``' Intrinsics
19720 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19724 This is an overloaded intrinsic.
19728 declare <16 x i32> @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19729 declare <vscale x 4 x i32> @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19730 declare <256 x i64> @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19735 Vector-predicated arithmetic right-shift.
19741 The first two operands and the result have the same vector of integer type. The
19742 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19749 The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
19750 (:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
19751 enabled lane. The result on disabled lanes is a
19752 :ref:`poison value <poisonvalues>`.
19757 .. code-block:: llvm
19759 %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19760 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19762 %t = ashr <4 x i32> %a, %b
19763 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19769 '``llvm.vp.lshr.*``' Intrinsics
19770 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19774 This is an overloaded intrinsic.
19778 declare <16 x i32> @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19779 declare <vscale x 4 x i32> @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19780 declare <256 x i64> @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19785 Vector-predicated logical right-shift.
19791 The first two operands and the result have the same vector of integer type. The
19792 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19799 The '``llvm.vp.lshr``' intrinsic computes the logical right shift
19800 (:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
19801 enabled lane. The result on disabled lanes is a
19802 :ref:`poison value <poisonvalues>`.
19807 .. code-block:: llvm
19809 %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19810 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19812 %t = lshr <4 x i32> %a, %b
19813 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
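
The two right shifts differ only in how they fill the vacated high bits, which
shows up for negative inputs. The sketch below contrasts
'``llvm.vp.ashr``' and '``llvm.vp.lshr``' on the same constant operands with
every lane enabled; the function names and values are illustrative
assumptions.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @ashr_example() {
        %r = call <4 x i32> @llvm.vp.ashr.v4i32(
                 <4 x i32> <i32 -8, i32 -8, i32 16, i32 16>,
                 <4 x i32> <i32 1, i32 2, i32 1, i32 2>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4)
        ; sign bit is replicated: %r = <-4, -2, 8, 4>
        ret <4 x i32> %r
      }

      define <4 x i32> @lshr_example() {
        %r = call <4 x i32> @llvm.vp.lshr.v4i32(
                 <4 x i32> <i32 -8, i32 -8, i32 16, i32 16>,
                 <4 x i32> <i32 1, i32 2, i32 1, i32 2>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4)
        ; zeros are shifted in: %r = <2147483644, 1073741822, 8, 4>
        ret <4 x i32> %r
      }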
19818 '``llvm.vp.shl.*``' Intrinsics
19819 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19823 This is an overloaded intrinsic.
19827 declare <16 x i32> @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19828 declare <vscale x 4 x i32> @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19829 declare <256 x i64> @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19834 Vector-predicated left shift.
19840 The first two operands and the result have the same vector of integer type. The
19841 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19848 The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
19849 the first operand by the second operand on each enabled lane. The result on
19850 disabled lanes is a :ref:`poison value <poisonvalues>`.
19855 .. code-block:: llvm
19857 %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19858 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19860 %t = shl <4 x i32> %a, %b
19861 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19866 '``llvm.vp.or.*``' Intrinsics
19867 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19871 This is an overloaded intrinsic.
19875 declare <16 x i32> @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19876 declare <vscale x 4 x i32> @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19877 declare <256 x i64> @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
Vector-predicated, bitwise or.
19888 The first two operands and the result have the same vector of integer type. The
19889 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19896 The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
19897 first two operands on each enabled lane. The result on disabled lanes is
19898 a :ref:`poison value <poisonvalues>`.
19903 .. code-block:: llvm
19905 %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19906 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19908 %t = or <4 x i32> %a, %b
19909 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19914 '``llvm.vp.and.*``' Intrinsics
19915 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19919 This is an overloaded intrinsic.
19923 declare <16 x i32> @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19924 declare <vscale x 4 x i32> @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19925 declare <256 x i64> @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
Vector-predicated, bitwise and.
19936 The first two operands and the result have the same vector of integer type. The
19937 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
19945 the first two operands on each enabled lane. The result on disabled lanes is
19946 a :ref:`poison value <poisonvalues>`.
19951 .. code-block:: llvm
19953 %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
19954 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
19956 %t = and <4 x i32> %a, %b
19957 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
19962 '``llvm.vp.xor.*``' Intrinsics
19963 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19967 This is an overloaded intrinsic.
19971 declare <16 x i32> @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
19972 declare <vscale x 4 x i32> @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
19973 declare <256 x i64> @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
19978 Vector-predicated, bitwise xor.
19984 The first two operands and the result have the same vector of integer type. The
19985 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
19992 The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
19993 the first two operands on each enabled lane.
19994 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
19999 .. code-block:: llvm
20001 %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
20002 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20004 %t = xor <4 x i32> %a, %b
20005 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
20009 '``llvm.vp.abs.*``' Intrinsics
20010 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20014 This is an overloaded intrinsic.
20018 declare <16 x i32> @llvm.vp.abs.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
20019 declare <vscale x 4 x i32> @llvm.vp.abs.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
20020 declare <256 x i64> @llvm.vp.abs.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_int_min_poison>)
20025 Predicated abs of a vector of integers.
20031 The first operand and the result have the same vector of integer type. The
20032 second operand is the vector mask and has the same number of elements as the
20033 result vector type. The third operand is the explicit vector length of the
20034 operation. The fourth argument must be a constant and is a flag to indicate
20035 whether the result value of the '``llvm.vp.abs``' intrinsic is a
20036 :ref:`poison value <poisonvalues>` if the argument is statically or dynamically
20037 an ``INT_MIN`` value.
The '``llvm.vp.abs``' intrinsic computes the absolute value
(:ref:`abs <int_abs>`) of the first operand on each enabled lane. The result on
disabled lanes is a :ref:`poison value <poisonvalues>`.
20048 .. code-block:: llvm
20050 %r = call <4 x i32> @llvm.vp.abs.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
20051 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20053 %t = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %a, i1 false)
20054 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
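
The effect of the ``is_int_min_poison`` flag is easiest to see on a lane that
holds ``INT_MIN``. The sketch below uses constant operands with every lane
enabled; the function name ``@abs_example`` and the values are illustrative
assumptions.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.abs.v4i32(<4 x i32>, <4 x i1>, i32, i1)

      define <4 x i32> @abs_example() {
        %r = call <4 x i32> @llvm.vp.abs.v4i32(
                 <4 x i32> <i32 -5, i32 -2147483648, i32 7, i32 0>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4, i1 false)
        ; flag is false: %r = <5, -2147483648, 7, 0>
        ; with i1 true instead, lane 1 (INT_MIN) would be poison
        ret <4 x i32> %r
      }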
20060 '``llvm.vp.smax.*``' Intrinsics
20061 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20065 This is an overloaded intrinsic.
20069 declare <16 x i32> @llvm.vp.smax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20070 declare <vscale x 4 x i32> @llvm.vp.smax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20071 declare <256 x i64> @llvm.vp.smax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20076 Predicated integer signed maximum of two vectors of integers.
20082 The first two operands and the result have the same vector of integer type. The
20083 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20090 The '``llvm.vp.smax``' intrinsic performs integer signed maximum (:ref:`smax <int_smax>`)
20091 of the first and second vector operand on each enabled lane. The result on
20092 disabled lanes is a :ref:`poison value <poisonvalues>`.
20097 .. code-block:: llvm
20099 %r = call <4 x i32> @llvm.vp.smax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
20100 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20102 %t = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
20103 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
20108 '``llvm.vp.smin.*``' Intrinsics
20109 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20113 This is an overloaded intrinsic.
20117 declare <16 x i32> @llvm.vp.smin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20118 declare <vscale x 4 x i32> @llvm.vp.smin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20119 declare <256 x i64> @llvm.vp.smin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20124 Predicated integer signed minimum of two vectors of integers.
20130 The first two operands and the result have the same vector of integer type. The
20131 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20138 The '``llvm.vp.smin``' intrinsic performs integer signed minimum (:ref:`smin <int_smin>`)
20139 of the first and second vector operand on each enabled lane. The result on
20140 disabled lanes is a :ref:`poison value <poisonvalues>`.
20145 .. code-block:: llvm
20147 %r = call <4 x i32> @llvm.vp.smin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
20148 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20150 %t = call <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
20151 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
20156 '``llvm.vp.umax.*``' Intrinsics
20157 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20161 This is an overloaded intrinsic.
20165 declare <16 x i32> @llvm.vp.umax.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20166 declare <vscale x 4 x i32> @llvm.vp.umax.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20167 declare <256 x i64> @llvm.vp.umax.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20172 Predicated integer unsigned maximum of two vectors of integers.
20178 The first two operands and the result have the same vector of integer type. The
20179 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20186 The '``llvm.vp.umax``' intrinsic performs integer unsigned maximum (:ref:`umax <int_umax>`)
20187 of the first and second vector operand on each enabled lane. The result on
20188 disabled lanes is a :ref:`poison value <poisonvalues>`.
20193 .. code-block:: llvm
20195 %r = call <4 x i32> @llvm.vp.umax.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
20196 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20198 %t = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
20199 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
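
The signed and unsigned variants interpret the same bits differently, so they
can disagree whenever an operand has its top bit set. The sketch below
contrasts '``llvm.vp.smax``' and '``llvm.vp.umax``' on the same constant
operands with every lane enabled; the function names and values are
illustrative assumptions.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.smax.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.umax.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      define <4 x i32> @smax_example() {
        %r = call <4 x i32> @llvm.vp.smax.v4i32(
                 <4 x i32> <i32 -1, i32 -1, i32 3, i32 3>,
                 <4 x i32> <i32 1, i32 -2, i32 4, i32 2>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4)
        ; signed comparison: %r = <1, -1, 4, 3>
        ret <4 x i32> %r
      }

      define <4 x i32> @umax_example() {
        %r = call <4 x i32> @llvm.vp.umax.v4i32(
                 <4 x i32> <i32 -1, i32 -1, i32 3, i32 3>,
                 <4 x i32> <i32 1, i32 -2, i32 4, i32 2>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4)
        ; unsigned comparison, -1 is the all-ones value: %r = <-1, -1, 4, 3>
        ret <4 x i32> %r
      }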
20204 '``llvm.vp.umin.*``' Intrinsics
20205 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20209 This is an overloaded intrinsic.
20213 declare <16 x i32> @llvm.vp.umin.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20214 declare <vscale x 4 x i32> @llvm.vp.umin.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20215 declare <256 x i64> @llvm.vp.umin.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20220 Predicated integer unsigned minimum of two vectors of integers.
20226 The first two operands and the result have the same vector of integer type. The
20227 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20234 The '``llvm.vp.umin``' intrinsic performs integer unsigned minimum (:ref:`umin <int_umin>`)
20235 of the first and second vector operand on each enabled lane. The result on
20236 disabled lanes is a :ref:`poison value <poisonvalues>`.
20241 .. code-block:: llvm
20243 %r = call <4 x i32> @llvm.vp.umin.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
20244 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20246 %t = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
20247 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
20250 .. _int_vp_copysign:
20252 '``llvm.vp.copysign.*``' Intrinsics
20253 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20257 This is an overloaded intrinsic.
20261 declare <16 x float> @llvm.vp.copysign.v16f32 (<16 x float> <mag_op>, <16 x float> <sign_op>, <16 x i1> <mask>, i32 <vector_length>)
20262 declare <vscale x 4 x float> @llvm.vp.copysign.nxv4f32 (<vscale x 4 x float> <mag_op>, <vscale x 4 x float> <sign_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20263 declare <256 x double> @llvm.vp.copysign.v256f64 (<256 x double> <mag_op>, <256 x double> <sign_op>, <256 x i1> <mask>, i32 <vector_length>)
20268 Predicated floating-point copysign of two vectors of floating-point values.
20274 The first two operands and the result have the same vector of floating-point type. The
20275 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20282 The '``llvm.vp.copysign``' intrinsic performs floating-point copysign (:ref:`copysign <int_copysign>`)
20283 of the first and second vector operand on each enabled lane. The result on
20284 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20285 performed in the default floating-point environment.
20290 .. code-block:: llvm
20292 %r = call <4 x float> @llvm.vp.copysign.v4f32(<4 x float> %mag, <4 x float> %sign, <4 x i1> %mask, i32 %evl)
20293 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20295 %t = call <4 x float> @llvm.copysign.v4f32(<4 x float> %mag, <4 x float> %sign)
20296 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
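
The result takes its magnitude from the first operand and its sign from the
second, including the sign of a zero. The sketch below illustrates this with
constant operands and every lane enabled; the function name
``@copysign_example`` and the values are illustrative assumptions.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.copysign.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      define <4 x float> @copysign_example() {
        %r = call <4 x float> @llvm.vp.copysign.v4f32(
                 <4 x float> <float 1.5, float -2.0, float 3.0, float 0.0>,
                 <4 x float> <float -0.0, float 1.0, float -1.0, float -4.0>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4)
        ; %r = <-1.5, 2.0, -3.0, -0.0>
        ret <4 x float> %r
      }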
20301 '``llvm.vp.minnum.*``' Intrinsics
20302 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20306 This is an overloaded intrinsic.
20310 declare <16 x float> @llvm.vp.minnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20311 declare <vscale x 4 x float> @llvm.vp.minnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20312 declare <256 x double> @llvm.vp.minnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20317 Predicated floating-point IEEE-754 minNum of two vectors of floating-point values.
20323 The first two operands and the result have the same vector of floating-point type. The
20324 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20331 The '``llvm.vp.minnum``' intrinsic performs floating-point minimum (:ref:`minnum <i_minnum>`)
20332 of the first and second vector operand on each enabled lane. The result on
20333 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20334 performed in the default floating-point environment.
20339 .. code-block:: llvm
20341 %r = call <4 x float> @llvm.vp.minnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
20342 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20344 %t = call <4 x float> @llvm.minnum.v4f32(<4 x float> %a, <4 x float> %b)
20345 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
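
Following the unpredicated ``llvm.minnum``, a quiet NaN in one operand yields
the other operand, and the result is NaN only when both operands are NaN. The
sketch below illustrates this with constant operands and every lane enabled;
the function name ``@minnum_example`` and the values are illustrative
assumptions, and ``0x7FF8000000000000`` denotes a quiet NaN.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.minnum.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      define <4 x float> @minnum_example() {
        %r = call <4 x float> @llvm.vp.minnum.v4f32(
                 <4 x float> <float 0x7FF8000000000000, float 1.0, float 2.0, float 0x7FF8000000000000>,
                 <4 x float> <float 5.0, float 0x7FF8000000000000, float 3.0, float 0x7FF8000000000000>,
                 <4 x i1> <i1 true, i1 true, i1 true, i1 true>,
                 i32 4)
        ; lane 0: NaN vs 5.0 -> 5.0
        ; lane 1: 1.0 vs NaN -> 1.0
        ; lane 2: 2.0 vs 3.0 -> 2.0
        ; lane 3: NaN vs NaN -> NaN
        ret <4 x float> %r
      }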
20350 '``llvm.vp.maxnum.*``' Intrinsics
20351 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20355 This is an overloaded intrinsic.
20359 declare <16 x float> @llvm.vp.maxnum.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20360 declare <vscale x 4 x float> @llvm.vp.maxnum.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20361 declare <256 x double> @llvm.vp.maxnum.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20366 Predicated floating-point IEEE-754 maxNum of two vectors of floating-point values.
20372 The first two operands and the result have the same vector of floating-point type. The
20373 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20380 The '``llvm.vp.maxnum``' intrinsic performs floating-point maximum (:ref:`maxnum <i_maxnum>`)
20381 of the first and second vector operand on each enabled lane. The result on
20382 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20383 performed in the default floating-point environment.
20388 .. code-block:: llvm
20390 %r = call <4 x float> @llvm.vp.maxnum.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
20391 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
      %t = call <4 x float> @llvm.maxnum.v4f32(<4 x float> %a, <4 x float> %b)
20394 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20399 '``llvm.vp.fadd.*``' Intrinsics
20400 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20404 This is an overloaded intrinsic.
20408 declare <16 x float> @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20409 declare <vscale x 4 x float> @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20410 declare <256 x double> @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20415 Predicated floating-point addition of two vectors of floating-point values.
20421 The first two operands and the result have the same vector of floating-point type. The
20422 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20429 The '``llvm.vp.fadd``' intrinsic performs floating-point addition (:ref:`fadd <i_fadd>`)
20430 of the first and second vector operand on each enabled lane. The result on
20431 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20432 performed in the default floating-point environment.
20437 .. code-block:: llvm
20439 %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
20440 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20442 %t = fadd <4 x float> %a, %b
20443 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20448 '``llvm.vp.fsub.*``' Intrinsics
20449 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20453 This is an overloaded intrinsic.
20457 declare <16 x float> @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20458 declare <vscale x 4 x float> @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20459 declare <256 x double> @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20464 Predicated floating-point subtraction of two vectors of floating-point values.
20470 The first two operands and the result have the same vector of floating-point type. The
20471 third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
20478 The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
20479 of the first and second vector operand on each enabled lane. The result on
20480 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20481 performed in the default floating-point environment.
20486 .. code-block:: llvm
20488 %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
20489 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20491 %t = fsub <4 x float> %a, %b
20492 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20497 '``llvm.vp.fmul.*``' Intrinsics
20498 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20502 This is an overloaded intrinsic.
20506 declare <16 x float> @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20507 declare <vscale x 4 x float> @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20508 declare <256 x double> @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20513 Predicated floating-point multiplication of two vectors of floating-point values.
20519 The first two operands and the result have the same vector of floating-point type. The
20520 third operand is the vector mask and has the same number of elements as the
20521 result vector type. The fourth operand is the explicit vector length of the operation.
20527 The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
20528 of the first and second vector operand on each enabled lane. The result on
20529 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20530 performed in the default floating-point environment.
20535 .. code-block:: llvm
20537 %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
20538 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20540 %t = fmul <4 x float> %a, %b
20541 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20546 '``llvm.vp.fdiv.*``' Intrinsics
20547 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20551 This is an overloaded intrinsic.
20555 declare <16 x float> @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20556 declare <vscale x 4 x float> @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20557 declare <256 x double> @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20562 Predicated floating-point division of two vectors of floating-point values.
20568 The first two operands and the result have the same vector of floating-point type. The
20569 third operand is the vector mask and has the same number of elements as the
20570 result vector type. The fourth operand is the explicit vector length of the operation.
20576 The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`)
20577 of the first and second vector operand on each enabled lane. The result on
20578 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20579 performed in the default floating-point environment.
20584 .. code-block:: llvm
20586 %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
20587 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20589 %t = fdiv <4 x float> %a, %b
20590 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20595 '``llvm.vp.frem.*``' Intrinsics
20596 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20600 This is an overloaded intrinsic.
20604 declare <16 x float> @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20605 declare <vscale x 4 x float> @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20606 declare <256 x double> @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20611 Predicated floating-point remainder of two vectors of floating-point values.
20617 The first two operands and the result have the same vector of floating-point type. The
20618 third operand is the vector mask and has the same number of elements as the
20619 result vector type. The fourth operand is the explicit vector length of the operation.
20625 The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`)
20626 of the first and second vector operand on each enabled lane. The result on
20627 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20628 performed in the default floating-point environment.
20633 .. code-block:: llvm
20635 %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
20636 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20638 %t = frem <4 x float> %a, %b
20639 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20644 '``llvm.vp.fneg.*``' Intrinsics
20645 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20649 This is an overloaded intrinsic.
20653 declare <16 x float> @llvm.vp.fneg.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
20654 declare <vscale x 4 x float> @llvm.vp.fneg.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20655 declare <256 x double> @llvm.vp.fneg.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
20660 Predicated floating-point negation of a vector of floating-point values.
20666 The first operand and the result have the same vector of floating-point type.
20667 The second operand is the vector mask and has the same number of elements as the
20668 result vector type. The third operand is the explicit vector length of the operation.
20674 The '``llvm.vp.fneg``' intrinsic performs floating-point negation (:ref:`fneg <i_fneg>`)
20675 of the first vector operand on each enabled lane. The result on disabled lanes
20676 is a :ref:`poison value <poisonvalues>`.
20681 .. code-block:: llvm
20683 %r = call <4 x float> @llvm.vp.fneg.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20684 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20686 %t = fneg <4 x float> %a
20687 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20692 '``llvm.vp.fabs.*``' Intrinsics
20693 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20697 This is an overloaded intrinsic.
20701 declare <16 x float> @llvm.vp.fabs.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
20702 declare <vscale x 4 x float> @llvm.vp.fabs.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20703 declare <256 x double> @llvm.vp.fabs.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
20708 Predicated floating-point absolute value of a vector of floating-point values.
20714 The first operand and the result have the same vector of floating-point type.
20715 The second operand is the vector mask and has the same number of elements as the
20716 result vector type. The third operand is the explicit vector length of the operation.
20722 The '``llvm.vp.fabs``' intrinsic performs floating-point absolute value
20723 (:ref:`fabs <int_fabs>`) of the first vector operand on each enabled lane. The
20724 result on disabled lanes is a :ref:`poison value <poisonvalues>`.
20729 .. code-block:: llvm
20731 %r = call <4 x float> @llvm.vp.fabs.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20732 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20734 %t = call <4 x float> @llvm.fabs.v4f32(<4 x float> %a)
20735 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20740 '``llvm.vp.sqrt.*``' Intrinsics
20741 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20745 This is an overloaded intrinsic.
20749 declare <16 x float> @llvm.vp.sqrt.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
20750 declare <vscale x 4 x float> @llvm.vp.sqrt.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20751 declare <256 x double> @llvm.vp.sqrt.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
20756 Predicated floating-point square root of a vector of floating-point values.
20762 The first operand and the result have the same vector of floating-point type.
20763 The second operand is the vector mask and has the same number of elements as the
20764 result vector type. The third operand is the explicit vector length of the operation.
20770 The '``llvm.vp.sqrt``' intrinsic performs floating-point square root (:ref:`sqrt <int_sqrt>`) of
20771 the first vector operand on each enabled lane. The result on disabled lanes is
20772 a :ref:`poison value <poisonvalues>`. The operation is performed in the default
20773 floating-point environment.
20778 .. code-block:: llvm
20780 %r = call <4 x float> @llvm.vp.sqrt.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20781 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20783 %t = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %a)
20784 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20789 '``llvm.vp.fma.*``' Intrinsics
20790 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20794 This is an overloaded intrinsic.
20798 declare <16 x float> @llvm.vp.fma.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20799 declare <vscale x 4 x float> @llvm.vp.fma.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20800 declare <256 x double> @llvm.vp.fma.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20805 Predicated floating-point fused multiply-add of three vectors of floating-point values.
20811 The first three operands and the result have the same vector of floating-point type. The
20812 fourth operand is the vector mask and has the same number of elements as the
20813 result vector type. The fifth operand is the explicit vector length of the operation.
20819 The '``llvm.vp.fma``' intrinsic performs floating-point fused multiply-add (:ref:`llvm.fma <int_fma>`)
20820 of the first, second, and third vector operand on each enabled lane. The result on
20821 disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20822 performed in the default floating-point environment.
20827 .. code-block:: llvm
20829 %r = call <4 x float> @llvm.vp.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
20830 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20832 %t = call <4 x float> @llvm.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
20833 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
20836 .. _int_vp_fmuladd:
20838 '``llvm.vp.fmuladd.*``' Intrinsics
20839 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20843 This is an overloaded intrinsic.
20847 declare <16 x float> @llvm.vp.fmuladd.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
20848 declare <vscale x 4 x float> @llvm.vp.fmuladd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20849 declare <256 x double> @llvm.vp.fmuladd.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
20854 Predicated floating-point multiply-add of three vectors of floating-point values
20855 that can be fused if the code generator determines that (a) the target instruction
20856 set has support for a fused operation, and (b) that the fused operation is more
20857 efficient than the equivalent, separate pair of mul and add instructions.
20862 The first three operands and the result have the same vector of floating-point
20863 type. The fourth operand is the vector mask and has the same number of elements
20864 as the result vector type. The fifth operand is the explicit vector length of the operation.
20870 The '``llvm.vp.fmuladd``' intrinsic performs floating-point multiply-add (:ref:`llvm.fmuladd <int_fmuladd>`)
20871 of the first, second, and third vector operand on each enabled lane. The result
20872 on disabled lanes is a :ref:`poison value <poisonvalues>`. The operation is
20873 performed in the default floating-point environment.
20878 .. code-block:: llvm
20880 %r = call <4 x float> @llvm.vp.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
20881 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20883 %t = call <4 x float> @llvm.fmuladd.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
20884 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
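Whether the multiply and add are fused can change the numeric result of a
lane, because a fused operation rounds once while a separate multiply and add
round twice. The following sketch illustrates this for a single lane. It is an
illustration only: the helper names are hypothetical, it uses Python doubles
rather than ``float`` lanes, and it emulates the fused form with exact
rational arithmetic.

.. code-block:: python

   from fractions import Fraction


   def fused_multiply_add(a, b, c):
       """Compute a*b + c exactly, then round once to the nearest double."""
       return float(Fraction(a) * Fraction(b) + Fraction(c))


   def separate_multiply_add(a, b, c):
       """Round the product to a double first, then round the sum."""
       return a * b + c


   a = 1.0 + 2.0 ** -27
   b = 1.0 - 2.0 ** -27
   c = -1.0

   # The exact product is 1 - 2**-54, which is not representable as a double.
   fused = fused_multiply_add(a, b, c)
   separate = separate_multiply_add(a, b, c)

Here the fused form keeps the small residual ``-2**-54``, while the separate
form first rounds the product to ``1.0`` and returns ``0.0``. Code that uses
'``llvm.vp.fmuladd``' should therefore tolerate either result on an enabled
lane.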
20887 .. _int_vp_reduce_add:
20889 '``llvm.vp.reduce.add.*``' Intrinsics
20890 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20894 This is an overloaded intrinsic.
20898 declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
20899 declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20904 Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
20905 returning the result as a scalar.
20910 The first operand is the start value of the reduction, which must be a scalar
20911 integer type equal to the result type. The second operand is the vector on
20912 which the reduction is performed and must be a vector of integer values whose
20913 element type is the result/start type. The third operand is the vector mask and
20914 is a vector of boolean values with the same number of elements as the vector
20915 operand. The fourth operand is the explicit vector length of the operation.
20920 The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
20921 (:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector operand
20922 ``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
20923 lanes are treated as containing the neutral value ``0`` (i.e. having no effect
20924 on the reduction operation). If the vector length is zero, the result is equal
20925 to ``start_value``.
20927 To ignore the start value, the neutral value can be used.
20932 .. code-block:: llvm
20934 %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
20935 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20936 ; are treated as though %mask were false for those lanes.
20938 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
20939 %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
20940 %also.r = add i32 %reduction, %start
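The reduction semantics above can be sketched as a scalar reference model.
This is a hedged illustration, not LLVM code: ``vp_reduce_add`` is a
hypothetical helper, integers wrap modulo ``2**bits``, and the result is
reported as an unsigned bit pattern.

.. code-block:: python

   def vp_reduce_add(start, val, mask, evl, bits=32):
       """Model an integer ADD reduction with a mask and a vector length."""
       if len(val) != len(mask):
           raise ValueError("vector and mask must have the same lane count")
       total = start
       for i, (x, m) in enumerate(zip(val, mask)):
           enabled = i < evl and m
           # Disabled lanes contribute the neutral value 0.
           total += x if enabled else 0
       return total % (1 << bits)


   val = [1, 2, 3, 4]
   mask = [True, False, True, True]

   partial = vp_reduce_add(10, val, mask, 3)           # 10 + 1 + 3
   empty = vp_reduce_add(10, val, mask, 0)             # no lanes enabled
   wrapped = vp_reduce_add(250, [10, 0, 0, 0], [True] * 4, 4, bits=8)

With a vector length of zero the model returns the start value unchanged,
matching the rule stated above.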
20943 .. _int_vp_reduce_fadd:
20945 '``llvm.vp.reduce.fadd.*``' Intrinsics
20946 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20950 This is an overloaded intrinsic.
20954 declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
20955 declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
20960 Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
20961 value, returning the result as a scalar.
20966 The first operand is the start value of the reduction, which must be a scalar
20967 floating-point type equal to the result type. The second operand is the vector
20968 on which the reduction is performed and must be a vector of floating-point
20969 values whose element type is the result/start type. The third operand is the
20970 vector mask and is a vector of boolean values with the same number of elements
20971 as the vector operand. The fourth operand is the explicit vector length of the operation.
20977 The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
20978 reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
20979 vector operand ``val`` on each enabled lane, adding it to the scalar
20980 ``start_value``. Disabled lanes are treated as containing the neutral value
20981 ``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
20982 enabled, the resulting value will be equal to ``start_value``.
20984 To ignore the start value, the neutral value can be used.
20986 See the unpredicated version (:ref:`llvm.vector.reduce.fadd
20987 <int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.
20992 .. code-block:: llvm
20994 %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
20995 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
20996 ; are treated as though %mask were false for those lanes.
20998 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
20999 %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)
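The choice of ``-0.0`` rather than ``+0.0`` as the neutral value matters for
the sign of a zero result. The sketch below illustrates this under stated
assumptions: ``vp_reduce_fadd`` is a hypothetical helper, it uses Python
doubles, and it accumulates strictly in lane order, which is only one of the
evaluation orders discussed for the unpredicated intrinsic.

.. code-block:: python

   import math


   def vp_reduce_fadd(start, val, mask, evl):
       """Model a sequential floating-point ADD reduction with predication."""
       if len(val) != len(mask):
           raise ValueError("vector and mask must have the same lane count")
       acc = start
       for i, (x, m) in enumerate(zip(val, mask)):
           # Disabled lanes are replaced by the neutral value -0.0.
           acc = acc + (x if (i < evl and m) else -0.0)
       return acc


   total = vp_reduce_fadd(1.0, [1.0, 2.0, 4.0, 8.0], [True, True, False, True], 3)

   # With every lane disabled, a start value of -0.0 keeps its sign.
   kept = vp_reduce_fadd(-0.0, [1.0, 2.0], [False, False], 2)
   kept_sign = math.copysign(1.0, kept)

   # Padding with +0.0 instead would turn -0.0 into +0.0.
   lost_sign = math.copysign(1.0, -0.0 + 0.0)

Adding ``-0.0`` leaves every value, including ``-0.0`` itself, unchanged,
whereas adding ``+0.0`` to ``-0.0`` yields ``+0.0`` under round-to-nearest.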
21002 .. _int_vp_reduce_mul:
21004 '``llvm.vp.reduce.mul.*``' Intrinsics
21005 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21009 This is an overloaded intrinsic.
21013 declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
21014 declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21019 Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
21020 returning the result as a scalar.
21026 The first operand is the start value of the reduction, which must be a scalar
21027 integer type equal to the result type. The second operand is the vector on
21028 which the reduction is performed and must be a vector of integer values whose
21029 element type is the result/start type. The third operand is the vector mask and
21030 is a vector of boolean values with the same number of elements as the vector
21031 operand. The fourth operand is the explicit vector length of the operation.
21036 The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
21037 (:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector operand ``val``
21038 on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
21039 lanes are treated as containing the neutral value ``1`` (i.e. having no effect
21040 on the reduction operation). If the vector length is zero, the result is the start value.
21043 To ignore the start value, the neutral value can be used.
21048 .. code-block:: llvm
21050 %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
21051 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21052 ; are treated as though %mask were false for those lanes.
21054 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
21055 %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
21056 %also.r = mul i32 %reduction, %start
21058 .. _int_vp_reduce_fmul:
21060 '``llvm.vp.reduce.fmul.*``' Intrinsics
21061 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21065 This is an overloaded intrinsic.
21069 declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
21070 declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21075 Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
21076 value, returning the result as a scalar.
21082 The first operand is the start value of the reduction, which must be a scalar
21083 floating-point type equal to the result type. The second operand is the vector
21084 on which the reduction is performed and must be a vector of floating-point
21085 values whose element type is the result/start type. The third operand is the
21086 vector mask and is a vector of boolean values with the same number of elements
21087 as the vector operand. The fourth operand is the explicit vector length of the operation.
21093 The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
21094 reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
21095 vector operand ``val`` on each enabled lane, multiplying it by the scalar
21096 ``start_value``. Disabled lanes are treated as containing the neutral value
21097 ``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
21098 enabled, the resulting value will be equal to the starting value.
21100 To ignore the start value, the neutral value can be used.
21102 See the unpredicated version (:ref:`llvm.vector.reduce.fmul
21103 <int_vector_reduce_fmul>`) for more detail on the semantics.
21108 .. code-block:: llvm
21110 %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
21111 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21112 ; are treated as though %mask were false for those lanes.
21114 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
21115 %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)
21118 .. _int_vp_reduce_and:
21120 '``llvm.vp.reduce.and.*``' Intrinsics
21121 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21125 This is an overloaded intrinsic.
21129 declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
21130 declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21135 Predicated integer ``AND`` reduction of a vector and a scalar starting value,
21136 returning the result as a scalar.
21142 The first operand is the start value of the reduction, which must be a scalar
21143 integer type equal to the result type. The second operand is the vector on
21144 which the reduction is performed and must be a vector of integer values whose
21145 element type is the result/start type. The third operand is the vector mask and
21146 is a vector of boolean values with the same number of elements as the vector
21147 operand. The fourth operand is the explicit vector length of the operation.
21152 The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
21153 (:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector operand
21154 ``val`` on each enabled lane, performing an '``and``' of that with the
21155 scalar ``start_value``. Disabled lanes are treated as containing the neutral
21156 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
21157 operation). If the vector length is zero, the result is the start value.
21159 To ignore the start value, the neutral value can be used.
21164 .. code-block:: llvm
21166 %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
21167 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21168 ; are treated as though %mask were false for those lanes.
21170 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
21171 %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
21172 %also.r = and i32 %reduction, %start
21175 .. _int_vp_reduce_or:
21177 '``llvm.vp.reduce.or.*``' Intrinsics
21178 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21182 This is an overloaded intrinsic.
21186 declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
21187 declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21192 Predicated integer ``OR`` reduction of a vector and a scalar starting value,
21193 returning the result as a scalar.
21199 The first operand is the start value of the reduction, which must be a scalar
21200 integer type equal to the result type. The second operand is the vector on
21201 which the reduction is performed and must be a vector of integer values whose
21202 element type is the result/start type. The third operand is the vector mask and
21203 is a vector of boolean values with the same number of elements as the vector
21204 operand. The fourth operand is the explicit vector length of the operation.
21209 The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
21210 (:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector operand
21211 ``val`` on each enabled lane, performing an '``or``' of that with the scalar
21212 ``start_value``. Disabled lanes are treated as containing the neutral value
21213 ``0`` (i.e. having no effect on the reduction operation). If the vector length
21214 is zero, the result is the start value.
21216 To ignore the start value, the neutral value can be used.
21221 .. code-block:: llvm
21223 %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
21224 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21225 ; are treated as though %mask were false for those lanes.
21227 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
21228 %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
21229 %also.r = or i32 %reduction, %start
21231 .. _int_vp_reduce_xor:
21233 '``llvm.vp.reduce.xor.*``' Intrinsics
21234 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21238 This is an overloaded intrinsic.
21242 declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
21243 declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21248 Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
21249 returning the result as a scalar.
21255 The first operand is the start value of the reduction, which must be a scalar
21256 integer type equal to the result type. The second operand is the vector on
21257 which the reduction is performed and must be a vector of integer values whose
21258 element type is the result/start type. The third operand is the vector mask and
21259 is a vector of boolean values with the same number of elements as the vector
21260 operand. The fourth operand is the explicit vector length of the operation.
21265 The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
21266 (:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector operand
21267 ``val`` on each enabled lane, performing an '``xor``' of that with the scalar
21268 ``start_value``. Disabled lanes are treated as containing the neutral value
21269 ``0`` (i.e. having no effect on the reduction operation). If the vector length
21270 is zero, the result is the start value.
21272 To ignore the start value, the neutral value can be used.
21277 .. code-block:: llvm
21279 %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
21280 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21281 ; are treated as though %mask were false for those lanes.
21283 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
21284 %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
21285 %also.r = xor i32 %reduction, %start
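The three bitwise reductions (``and``, ``or``, ``xor``) differ only in the
operator and in the neutral value substituted for disabled lanes. A minimal
table-driven sketch follows. The names ``BITWISE_OPS`` and
``vp_reduce_bitwise`` are hypothetical, and values are handled as unsigned bit
patterns of the given width.

.. code-block:: python

   import operator
   from functools import reduce

   # Operator and neutral value (as a function of the bit width) per reduction.
   BITWISE_OPS = {
       "and": (operator.and_, lambda bits: (1 << bits) - 1),  # all ones (-1)
       "or": (operator.or_, lambda bits: 0),
       "xor": (operator.xor, lambda bits: 0),
   }


   def vp_reduce_bitwise(name, start, val, mask, evl, bits=32):
       """Model a predicated AND, OR, or XOR reduction."""
       if len(val) != len(mask):
           raise ValueError("vector and mask must have the same lane count")
       op, neutral_for = BITWISE_OPS[name]
       width_mask = (1 << bits) - 1
       lanes = [
           (x & width_mask) if (i < evl and m) else neutral_for(bits)
           for i, (x, m) in enumerate(zip(val, mask))
       ]
       return reduce(op, lanes, start & width_mask)


   val = [0x0F, 0x00, 0x3C, 0xF0]
   mask = [True, False, True, True]

   # Only lanes 0 and 2 are enabled: lane 1 is masked off, lane 3 >= evl.
   and_result = vp_reduce_bitwise("and", 0xFF, val, mask, 3)
   or_result = vp_reduce_bitwise("or", 0x00, val, mask, 3)
   xor_result = vp_reduce_bitwise("xor", 0x00, val, mask, 3)

Each call passes the operation's own neutral value as the start value (``-1``
truncated to the bit width for ``and``, ``0`` for ``or`` and ``xor``), which
is how the start value is ignored.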
21288 .. _int_vp_reduce_smax:
21290 '``llvm.vp.reduce.smax.*``' Intrinsics
21291 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21295 This is an overloaded intrinsic.
21299 declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
21300 declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21305 Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
21306 value, returning the result as a scalar.
21312 The first operand is the start value of the reduction, which must be a scalar
21313 integer type equal to the result type. The second operand is the vector on
21314 which the reduction is performed and must be a vector of integer values whose
21315 element type is the result/start type. The third operand is the vector mask and
21316 is a vector of boolean values with the same number of elements as the vector
21317 operand. The fourth operand is the explicit vector length of the operation.
21322 The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
21323 reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
21324 vector operand ``val`` on each enabled lane, taking the maximum of that and
21325 the scalar ``start_value``. Disabled lanes are treated as containing the
21326 neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
21327 If the vector length is zero, the result is the start value.
21329 To ignore the start value, the neutral value can be used.
21334 .. code-block:: llvm
21336 %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
21337 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21338 ; are treated as though %mask were false for those lanes.
21340 %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
21341 %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
21342 %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
21345 .. _int_vp_reduce_smin:
21347 '``llvm.vp.reduce.smin.*``' Intrinsics
21348 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21352 This is an overloaded intrinsic.
21356 declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
21357 declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21362 Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
21363 value, returning the result as a scalar.
21369 The first operand is the start value of the reduction, which must be a scalar
21370 integer type equal to the result type. The second operand is the vector on
21371 which the reduction is performed and must be a vector of integer values whose
21372 element type is the result/start type. The third operand is the vector mask and
21373 is a vector of boolean values with the same number of elements as the vector
21374 operand. The fourth operand is the explicit vector length of the operation.
21379 The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
21380 reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
21381 vector operand ``val`` on each enabled lane, taking the minimum of that and
21382 the scalar ``start_value``. Disabled lanes are treated as containing the
21383 neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
21384 If the vector length is zero, the result is the start value.
21386 To ignore the start value, the neutral value can be used.
21391 .. code-block:: llvm
21393 %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
21394 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21395 ; are treated as though %mask were false for those lanes.
21397 %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
21398 %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
21399 %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
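For the signed reductions, the neutral values ``INT_MIN`` and ``INT_MAX``
depend on the element width. The sketch below models both reductions for
``i8`` elements, matching the width used in the examples. The helper names
are hypothetical and inputs are assumed to already lie in the signed range of
the chosen width.

.. code-block:: python

   def signed_bounds(bits):
       """Return (INT_MIN, INT_MAX) for a two's complement integer width."""
       return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


   def vp_reduce_smax(start, val, mask, evl, bits=8):
       """Model a predicated signed MAX reduction."""
       int_min, _ = signed_bounds(bits)
       lanes = [
           x if (i < evl and m) else int_min  # neutral value for MAX
           for i, (x, m) in enumerate(zip(val, mask))
       ]
       return max([start] + lanes)


   def vp_reduce_smin(start, val, mask, evl, bits=8):
       """Model a predicated signed MIN reduction."""
       _, int_max = signed_bounds(bits)
       lanes = [
           x if (i < evl and m) else int_max  # neutral value for MIN
           for i, (x, m) in enumerate(zip(val, mask))
       ]
       return min([start] + lanes)


   val = [-5, 100, 7, -128]
   mask = [True, False, True, True]

   # Enabled lanes are 0 and 2; lane 1 is masked off and lane 3 is >= evl.
   smax_result = vp_reduce_smax(-20, val, mask, 3)
   smin_result = vp_reduce_smin(-20, val, mask, 3)

Because a disabled lane holds the extreme value opposite to the operation, it
can never win the comparison, so only the enabled lanes and the start value
determine the result.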
21402 .. _int_vp_reduce_umax:
21404 '``llvm.vp.reduce.umax.*``' Intrinsics
21405 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21409 This is an overloaded intrinsic.
21413 declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
21414 declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21419 Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
21420 value, returning the result as a scalar.
21426 The first operand is the start value of the reduction, which must be a scalar
21427 integer type equal to the result type. The second operand is the vector on
21428 which the reduction is performed and must be a vector of integer values whose
21429 element type is the result/start type. The third operand is the vector mask and
21430 is a vector of boolean values with the same number of elements as the vector
21431 operand. The fourth operand is the explicit vector length of the operation.
21436 The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
21437 reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
21439 the scalar ``start_value``. Disabled lanes are treated as containing the
21440 neutral value ``0`` (i.e. having no effect on the reduction operation). If the
21441 vector length is zero, the result is the start value.
21443 To ignore the start value, the neutral value can be used.
21448 .. code-block:: llvm
21450 %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
21451 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21452 ; are treated as though %mask were false for those lanes.
21454 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
21455 %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
21456 %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
21459 .. _int_vp_reduce_umin:
21461 '``llvm.vp.reduce.umin.*``' Intrinsics
21462 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21466 This is an overloaded intrinsic.
21470 declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
21471 declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21476 Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
21477 value, returning the result as a scalar.
21483 The first operand is the start value of the reduction, which must be a scalar
21484 integer type equal to the result type. The second operand is the vector on
21485 which the reduction is performed and must be a vector of integer values whose
21486 element type is the result/start type. The third operand is the vector mask and
21487 is a vector of boolean values with the same number of elements as the vector
21488 operand. The fourth operand is the explicit vector length of the operation.
21493 The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
21494 reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
21495 vector operand ``val`` on each enabled lane, taking the minimum of that and the
21496 scalar ``start_value``. Disabled lanes are treated as containing the neutral
21497 value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
21498 operation). If the vector length is zero, the result is the start value.
21500 To ignore the start value, the neutral value can be used.
21505 .. code-block:: llvm
21507 %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
21508 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21509 ; are treated as though %mask were false for those lanes.
21511 %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
21512 %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
21513 %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
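
The unsigned reductions (``umax`` and ``umin``) differ only in the comparison
and in the neutral value, and the ``-1`` spelling of ``UINT_MAX`` is simply
the same bit pattern read as unsigned. A rough Python sketch of both, using
the hypothetical helpers ``as_unsigned`` and ``vp_reduce_unsigned`` (names
chosen for this example, not LLVM APIs):

.. code-block:: python

   def as_unsigned(x, bits=32):
       """Reinterpret a two's-complement bit pattern as an unsigned integer."""
       return x & ((1 << bits) - 1)


   def vp_reduce_unsigned(op, start, val, mask, evl, bits=32):
       """Model umax (op=max) or umin (op=min) with mask and evl applied."""
       neutral = 0 if op is max else as_unsigned(-1, bits)  # 0 or UINT_MAX
       lanes = [as_unsigned(v, bits) if (m and i < evl) else neutral
                for i, (v, m) in enumerate(zip(val, mask))]
       return op([as_unsigned(start, bits)] + lanes)
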
21516 .. _int_vp_reduce_fmax:
21518 '``llvm.vp.reduce.fmax.*``' Intrinsics
21519 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21523 This is an overloaded intrinsic.
declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
21528 declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21533 Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
21534 value, returning the result as a scalar.
21540 The first operand is the start value of the reduction, which must be a scalar
21541 floating-point type equal to the result type. The second operand is the vector
21542 on which the reduction is performed and must be a vector of floating-point
21543 values whose element type is the result/start type. The third operand is the
21544 vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.
21551 The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
21552 reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
21553 vector operand ``val`` on each enabled lane, taking the maximum of that and the
21554 scalar ``start_value``. Disabled lanes are treated as containing the neutral
21555 value (i.e. having no effect on the reduction operation). If the vector length
21556 is zero, the result is the start value.
21558 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
21559 flags are set, the neutral value is ``-QNAN``. If ``nnan`` and ``ninf`` are
21560 both set, then the neutral value is the smallest floating-point value for the
21561 result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
21563 This instruction has the same comparison semantics as the
21564 :ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
21565 '``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
21566 unless all elements of the vector and the starting value are ``NaN``. For a
21567 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
21568 ``-0.0`` elements, the sign of the result is unspecified.
21570 To ignore the start value, the neutral value can be used.
21575 .. code-block:: llvm
%r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
21578 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21579 ; are treated as though %mask were false for those lanes.
21581 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
21582 %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
21583 %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
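
The NaN behaviour described above (a quiet NaN as the neutral value when no
fast-math flags are set) can be sketched in Python. This is a simplified
model under stated assumptions: ``maxnum`` and ``vp_reduce_fmax`` are names
invented for the example, signaling NaNs are not distinguished from quiet
ones, and the unspecified sign of a zero result is not modelled.

.. code-block:: python

   import math


   def maxnum(a, b):
       """If exactly one operand is NaN, return the other; else the larger."""
       if math.isnan(a):
           return b
       if math.isnan(b):
           return a
       return max(a, b)


   def vp_reduce_fmax(start, val, mask, evl):
       """Fold maxnum over the enabled lanes, starting from `start`."""
       nan = float("nan")  # neutral value assumed for the no-flags case
       acc = start
       for i, (v, m) in enumerate(zip(val, mask)):
           acc = maxnum(acc, v if (m and i < evl) else nan)
       return acc

In this model the result is NaN only when the start value and every enabled
lane are NaN.
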
21586 .. _int_vp_reduce_fmin:
21588 '``llvm.vp.reduce.fmin.*``' Intrinsics
21589 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21593 This is an overloaded intrinsic.
declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
21598 declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
21603 Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
21604 value, returning the result as a scalar.
21610 The first operand is the start value of the reduction, which must be a scalar
21611 floating-point type equal to the result type. The second operand is the vector
21612 on which the reduction is performed and must be a vector of floating-point
21613 values whose element type is the result/start type. The third operand is the
21614 vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.
21621 The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
21622 reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
21623 vector operand ``val`` on each enabled lane, taking the minimum of that and the
21624 scalar ``start_value``. Disabled lanes are treated as containing the neutral
21625 value (i.e. having no effect on the reduction operation). If the vector length
21626 is zero, the result is the start value.
21628 The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
21629 flags are set, the neutral value is ``+QNAN``. If ``nnan`` and ``ninf`` are
21630 both set, then the neutral value is the largest floating-point value for the
21631 result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
21633 This instruction has the same comparison semantics as the
21634 :ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
21635 '``llvm.minnum.*``' intrinsic). That is, the result will always be a number
21636 unless all elements of the vector and the starting value are ``NaN``. For a
21637 vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
21638 ``-0.0`` elements, the sign of the result is unspecified.
21640 To ignore the start value, the neutral value can be used.
21645 .. code-block:: llvm
21647 %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
21648 ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
21649 ; are treated as though %mask were false for those lanes.
21651 %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
21652 %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
21653 %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
21656 .. _int_get_active_lane_mask:
21658 '``llvm.get.active.lane.mask.*``' Intrinsics
21659 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21663 This is an overloaded intrinsic.
21667 declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
21668 declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
21669 declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
21670 declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
21676 Create a mask representing active and inactive vector lanes.
21682 Both operands have the same scalar integer type. The result is a vector with
21683 the i1 element type.
The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
to:
21693 %m[i] = icmp ult (%base + i), %n
21695 where ``%m`` is a vector (mask) of active/inactive lanes with its elements
21696 indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
21698 the unsigned less-than comparison operator. Overflow cannot occur in
21699 ``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
21700 numbers and not in machine numbers. If ``%n`` is ``0``, then the result is a
21701 poison value. The above is equivalent to:
21705 %m = @llvm.get.active.lane.mask(%base, %n)
21707 This can, for example, be emitted by the loop vectorizer in which case
21708 ``%base`` is the first element of the vector induction variable (VIV) and
21709 ``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
21710 less than comparison of VIV with the loop tripcount, producing a mask of
21711 true/false values representing active/inactive vector lanes, except if the VIV
21712 overflows in which case they return false in the lanes where the VIV overflows.
21713 The arguments are scalar types to accommodate scalable vector types, for which
it is unknown what type the step vector would need in order to enumerate its
lanes without overflow.
21717 This mask ``%m`` can e.g. be used in masked load/store instructions. These
21718 intrinsics provide a hint to the backend. I.e., for a vector loop, the
back-edge taken count of the original scalar loop is explicit as the second
argument.
21726 .. code-block:: llvm
21728 %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
%wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> poison)
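
The "integer numbers, not machine numbers" rule is easy to see in a language
with unbounded integers. The following Python sketch (the function name and
the use of ``None`` for a poison result are assumptions of this example)
evaluates ``%m[i] = icmp ult (%base + i), %n`` without any wrap-around:

.. code-block:: python

   def get_active_lane_mask(base, n, num_lanes):
       """Model of the lane mask for non-negative `base` and `n`."""
       if n == 0:
           return None  # stand-in for the poison result described above
       # Python integers do not wrap, so base + i never overflows here.
       return [(base + i) < n for i in range(num_lanes)]


   # Last iteration of a loop with trip count 429 and four lanes per vector.
   tail_mask = get_active_lane_mask(428, 429, 4)

With ``base`` close to the maximum 64-bit value, a wrapping addition would
incorrectly re-enable high lanes; the model keeps them inactive.
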
21732 .. _int_experimental_vp_splice:
21734 '``llvm.experimental.vp.splice``' Intrinsic
21735 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21739 This is an overloaded intrinsic.
21743 declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
21744 declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)
21749 The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
21750 predicated version of the '``llvm.experimental.vector.splice.*``' intrinsic.
21755 The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
21756 the same type. The third argument ``imm`` is an immediate signed integer that
21757 indicates the offset index. The fourth argument ``mask`` is a vector mask and
21758 has the same number of elements as the result. The last two arguments ``evl1``
21759 and ``evl2`` are unsigned integers indicating the explicit vector lengths of
21760 ``vec1`` and ``vec2`` respectively. ``imm``, ``evl1`` and ``evl2`` should
21761 respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
21762 and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
21763 constraints are not satisfied the intrinsic has undefined behaviour.
21768 Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
21769 ``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
21770 window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
21771 the concatenated vector. Elements in the result vector beyond ``evl2`` are
21772 ``undef``. If ``imm`` is negative the starting index is ``evl1 + imm``. The result
21773 vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
21774 negative ``imm``) elements from indices ``[imm..evl1 - 1]``
21775 (``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the
21776 first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
21777 ``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
21778 elements are considered and the remaining are ``undef``. The lanes in the result
21779 vector disabled by ``mask`` are ``poison``.
21784 .. code-block:: text
21786 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3); ==> <B, E, F, poison> index
21787 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2); ==> <B, C, poison, poison> trailing elements
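
The concatenate-then-window description can be checked against the two
examples with a small Python model. This is a sketch under assumptions:
``vp_splice`` is a hypothetical helper, the fixed vector length is taken from
``len(vec1)``, and a single ``"poison"`` placeholder is used for every lane
outside the window as well as for lanes disabled by the mask.

.. code-block:: python

   POISON = "poison"


   def vp_splice(vec1, vec2, imm, evl1, evl2, mask=None):
       vl = len(vec1)
       # Constraints from the text; violating them is undefined behaviour.
       assert -evl1 <= imm < evl1 and 0 <= evl1 <= vl and 0 <= evl2 <= vl
       concat = vec1[:evl1] + vec2[:evl2]
       start = imm if imm >= 0 else evl1 + imm
       window = concat[start:start + evl2]
       result = window + [POISON] * (vl - len(window))
       if mask is not None:
           result = [r if m else POISON for r, m in zip(result, mask)]
       return result


   first = vp_splice(list("ABCD"), list("EFGH"), 1, 2, 3)
   second = vp_splice(list("ABCD"), list("EFGH"), -2, 3, 2)
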
21790 .. _int_experimental_vp_reverse:
21793 '``llvm.experimental.vp.reverse``' Intrinsic
21794 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21798 This is an overloaded intrinsic.
21802 declare <2 x double> @llvm.experimental.vp.reverse.v2f64(<2 x double> %vec, <2 x i1> %mask, i32 %evl)
21803 declare <vscale x 4 x i32> @llvm.experimental.vp.reverse.nxv4i32(<vscale x 4 x i32> %vec, <vscale x 4 x i1> %mask, i32 %evl)
21808 The '``llvm.experimental.vp.reverse.*``' intrinsic is the vector length
21809 predicated version of the '``llvm.experimental.vector.reverse.*``' intrinsic.
21814 The result and the first argument ``vec`` are vectors with the same type.
21815 The second argument ``mask`` is a vector mask and has the same number of
elements as the result. The third argument is the explicit vector length of
the operation.
21822 This intrinsic reverses the order of the first ``evl`` elements in a vector.
21823 The lanes in the result vector disabled by ``mask`` are ``poison``. The
elements past ``evl`` are ``poison``.
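
A minimal Python sketch of these rules, assuming a hypothetical helper
``vp_reverse`` and using ``None`` as a stand-in for ``poison``:

.. code-block:: python

   def vp_reverse(vec, mask, evl):
       """Reverse the first evl elements; other result lanes are poison."""
       reversed_head = vec[:evl][::-1]
       result = reversed_head + [None] * (len(vec) - evl)
       # The mask is applied to lanes of the result vector.
       return [r if m else None for r, m in zip(result, mask)]


   out = vp_reverse([1, 2, 3, 4], [True, False, True, True], 3)
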
21828 '``llvm.vp.load``' Intrinsic
21829 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21833 This is an overloaded intrinsic.
21837 declare <4 x float> @llvm.vp.load.v4f32.p0(ptr %ptr, <4 x i1> %mask, i32 %evl)
21838 declare <vscale x 2 x i16> @llvm.vp.load.nxv2i16.p0(ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
21839 declare <8 x float> @llvm.vp.load.v8f32.p1(ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
21840 declare <vscale x 1 x i64> @llvm.vp.load.nxv1i64.p6(ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
21845 The '``llvm.vp.load.*``' intrinsic is the vector length predicated version of
21846 the :ref:`llvm.masked.load <int_mload>` intrinsic.
21851 The first operand is the base pointer for the load. The second operand is a
21852 vector of boolean values with the same number of elements as the return type.
21853 The third is the explicit vector length of the operation. The return type and
21854 underlying type of the base pointer are the same vector types.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
operand.
21862 The '``llvm.vp.load``' intrinsic reads a vector from memory in the same way as
21863 the '``llvm.masked.load``' intrinsic, where the mask is taken from the
21864 combination of the '``mask``' and '``evl``' operands in the usual VP way.
21865 Certain '``llvm.masked.load``' operands do not have corresponding operands in
21866 '``llvm.vp.load``': the '``passthru``' operand is implicitly ``poison``; the
21867 '``alignment``' operand is taken as the ``align`` parameter attribute, if
21868 provided. The default alignment is taken as the ABI alignment of the return
21869 type as specified by the :ref:`datalayout string<langref_datalayout>`.
21874 .. code-block:: text
21876 %r = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 2 %ptr, <8 x i1> %mask, i32 %evl)
21877 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21879 %also.r = call <8 x i8> @llvm.masked.load.v8i8.p0(ptr %ptr, i32 2, <8 x i1> %mask, <8 x i8> poison)
21884 '``llvm.vp.store``' Intrinsic
21885 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21889 This is an overloaded intrinsic.
21893 declare void @llvm.vp.store.v4f32.p0(<4 x float> %val, ptr %ptr, <4 x i1> %mask, i32 %evl)
21894 declare void @llvm.vp.store.nxv2i16.p0(<vscale x 2 x i16> %val, ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
21895 declare void @llvm.vp.store.v8f32.p1(<8 x float> %val, ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
21896 declare void @llvm.vp.store.nxv1i64.p6(<vscale x 1 x i64> %val, ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
21901 The '``llvm.vp.store.*``' intrinsic is the vector length predicated version of
21902 the :ref:`llvm.masked.store <int_mstore>` intrinsic.
21907 The first operand is the vector value to be written to memory. The second
21908 operand is the base pointer for the store. It has the same underlying type as
21909 the value operand. The third operand is a vector of boolean values with the
same number of elements as the value operand. The fourth is the explicit vector
21911 length of the operation.
The :ref:`align <attr_align>` parameter attribute can be provided for the
second operand.
The '``llvm.vp.store``' intrinsic writes a vector to memory in the same way as
21920 the '``llvm.masked.store``' intrinsic, where the mask is taken from the
21921 combination of the '``mask``' and '``evl``' operands in the usual VP way. The
21922 alignment of the operation (corresponding to the '``alignment``' operand of
21923 '``llvm.masked.store``') is specified by the ``align`` parameter attribute (see
21924 above). If it is not provided then the ABI alignment of the type of the
21925 '``value``' operand as specified by the :ref:`datalayout
21926 string<langref_datalayout>` is used instead.
21931 .. code-block:: text
21933 call void @llvm.vp.store.v8i8.p0(<8 x i8> %val, ptr align 4 %ptr, <8 x i1> %mask, i32 %evl)
21934 ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
21936 call void @llvm.masked.store.v8i8.p0(<8 x i8> %val, ptr %ptr, i32 4, <8 x i1> %mask)
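
The way ``mask`` and ``evl`` combine for '``llvm.vp.load``' and
'``llvm.vp.store``' can be sketched with a Python list standing in for memory.
Everything here is an assumption made for illustration: memory is indexed per
element rather than per byte, ``None`` stands in for ``poison``, and the
helper names are not LLVM APIs.

.. code-block:: python

   def lane_enabled(mask, evl, i):
       """A lane participates only if its mask bit is set and i < evl."""
       return mask[i] and i < evl


   def vp_load(memory, base, mask, evl):
       # Disabled lanes do not touch memory and yield poison (None).
       return [memory[base + i] if lane_enabled(mask, evl, i) else None
               for i in range(len(mask))]


   def vp_store(memory, val, base, mask, evl):
       # Only enabled lanes are written; other locations are left unchanged.
       for i, v in enumerate(val):
           if lane_enabled(mask, evl, i):
               memory[base + i] = v


   mem = [10, 11, 12, 13]
   loaded = vp_load(mem, 0, [True, False, True, True], 3)
   vp_store(mem, [7, 8, 9, 6], 0, [True, True, False, True], 2)
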
21939 .. _int_experimental_vp_strided_load:
21941 '``llvm.experimental.vp.strided.load``' Intrinsic
21942 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21946 This is an overloaded intrinsic.
21950 declare <4 x float> @llvm.experimental.vp.strided.load.v4f32.i64(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
21951 declare <vscale x 2 x i16> @llvm.experimental.vp.strided.load.nxv2i16.i64(ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
21956 The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, scalar values from
21957 memory locations evenly spaced apart by '``stride``' number of bytes, starting from '``ptr``'.
21962 The first operand is the base pointer for the load. The second operand is the stride
21963 value expressed in bytes. The third operand is a vector of boolean values
21964 with the same number of elements as the return type. The fourth is the explicit
21965 vector length of the operation. The base pointer underlying type matches the type of the scalar
21966 elements of the return operand.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
operand.
21974 The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, multiple scalar
21975 values from memory in the same way as the :ref:`llvm.vp.gather <int_vp_gather>` intrinsic,
21976 where the vector of pointers is in the form:
21978 ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
21981 integer and all arithmetic occurring in the pointer type.
21986 .. code-block:: text
%r = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.i64(i64* %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
21989 ;; The operation can also be expressed like this:
21991 %addr = bitcast i64* %ptr to i8*
21992 ;; Create a vector of pointers %addrs in the form:
21993 ;; %addrs = <%addr, %addr + %stride, %addr + 2 * %stride, ...>
21994 %ptrs = bitcast <8 x i8* > %addrs to <8 x i64* >
%also.r = call <8 x i64> @llvm.vp.gather.v8i64.v8p0i64(<8 x i64* > %ptrs, <8 x i1> %mask, i32 %evl)
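
The address vector ``<%ptr, %ptr + %stride, %ptr + 2 * %stride, ...>`` and the
signed interpretation of the stride can be sketched in Python as follows. The
helper names are hypothetical, memory is a ``bytes`` object, elements are
assumed to be little-endian (in LLVM this depends on the data layout), and
pointer wrap-around is not modelled.

.. code-block:: python

   def strided_addresses(ptr, stride, num_lanes, stride_bits=64):
       """Byte addresses <ptr, ptr + stride, ptr + 2 * stride, ...>."""
       if stride >= 1 << (stride_bits - 1):
           stride -= 1 << stride_bits  # reinterpret the bit pattern as signed
       return [ptr + i * stride for i in range(num_lanes)]


   def vp_strided_load(memory, ptr, stride, mask, evl, elem_bytes=8):
       """Load one element per enabled lane; None models poison."""
       addrs = strided_addresses(ptr, stride, len(mask))
       return [int.from_bytes(memory[a:a + elem_bytes], "little")
               if (m and i < evl) else None
               for i, (a, m) in enumerate(zip(addrs, mask))]


   # Eight consecutive 64-bit values 0..7; a stride of 16 bytes skips every other one.
   memory = b"".join(i.to_bytes(8, "little") for i in range(8))
   every_other = vp_strided_load(memory, 0, 16, [True] * 4, 3)

A stride whose bit pattern is ``2**64 - 8`` is read as ``-8`` in this model,
so the lanes walk backwards through memory.
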
21998 .. _int_experimental_vp_strided_store:
22000 '``llvm.experimental.vp.strided.store``' Intrinsic
22001 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22005 This is an overloaded intrinsic.
22009 declare void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float> %val, ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
22010 declare void @llvm.experimental.vp.strided.store.nxv2i16.i64(<vscale x 2 x i16> %val, ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
22016 '``val``' into memory locations evenly spaced apart by '``stride``' number of
22017 bytes, starting from '``ptr``'.
22022 The first operand is the vector value to be written to memory. The second
22023 operand is the base pointer for the store. Its underlying type matches the
22024 scalar element type of the value operand. The third operand is the stride value
22025 expressed in bytes. The fourth operand is a vector of boolean values with the
same number of elements as the value operand. The fifth is the explicit vector
22027 length of the operation.
The :ref:`align <attr_align>` parameter attribute can be provided for the
second operand.
22035 The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
22036 '``val``' in the same way as the :ref:`llvm.vp.scatter <int_vp_scatter>` intrinsic,
22037 where the vector of pointers is in the form:
22039 ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
22042 integer and all arithmetic occurring in the pointer type.
22047 .. code-block:: text
22049 call void @llvm.experimental.vp.strided.store.v8i64.i64(<8 x i64> %val, i64* %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
22050 ;; The operation can also be expressed like this:
22052 %addr = bitcast i64* %ptr to i8*
22053 ;; Create a vector of pointers %addrs in the form:
22054 ;; %addrs = <%addr, %addr + %stride, %addr + 2 * %stride, ...>
22055 %ptrs = bitcast <8 x i8* > %addrs to <8 x i64* >
22056 call void @llvm.vp.scatter.v8i64.v8p0i64(<8 x i64> %val, <8 x i64*> %ptrs, <8 x i1> %mask, i32 %evl)
22061 '``llvm.vp.gather``' Intrinsic
22062 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22066 This is an overloaded intrinsic.
22070 declare <4 x double> @llvm.vp.gather.v4f64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
22071 declare <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0(<vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
22072 declare <2 x float> @llvm.vp.gather.v2f32.v2p2(<2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
22073 declare <vscale x 4 x i32> @llvm.vp.gather.nxv4i32.nxv4p4(<vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
22078 The '``llvm.vp.gather.*``' intrinsic is the vector length predicated version of
22079 the :ref:`llvm.masked.gather <int_mgather>` intrinsic.
22084 The first operand is a vector of pointers which holds all memory addresses to
22085 read. The second operand is a vector of boolean values with the same number of
22086 elements as the return type. The third is the explicit vector length of the
22087 operation. The return type and underlying type of the vector of pointers are
22088 the same vector types.
The :ref:`align <attr_align>` parameter attribute can be provided for the first
operand.
22096 The '``llvm.vp.gather``' intrinsic reads multiple scalar values from memory in
22097 the same way as the '``llvm.masked.gather``' intrinsic, where the mask is taken
22098 from the combination of the '``mask``' and '``evl``' operands in the usual VP
22099 way. Certain '``llvm.masked.gather``' operands do not have corresponding
22100 operands in '``llvm.vp.gather``': the '``passthru``' operand is implicitly
22101 ``poison``; the '``alignment``' operand is taken as the ``align`` parameter, if
22102 provided. The default alignment is taken as the ABI alignment of the source
22103 addresses as specified by the :ref:`datalayout string<langref_datalayout>`.
22108 .. code-block:: text
22110 %r = call <8 x i8> @llvm.vp.gather.v8i8.v8p0(<8 x ptr> align 8 %ptrs, <8 x i1> %mask, i32 %evl)
22111 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22113 %also.r = call <8 x i8> @llvm.masked.gather.v8i8.v8p0(<8 x ptr> %ptrs, i32 8, <8 x i1> %mask, <8 x i8> poison)
22116 .. _int_vp_scatter:
22118 '``llvm.vp.scatter``' Intrinsic
22119 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22123 This is an overloaded intrinsic.
22127 declare void @llvm.vp.scatter.v4f64.v4p0(<4 x double> %val, <4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
22128 declare void @llvm.vp.scatter.nxv2i8.nxv2p0(<vscale x 2 x i8> %val, <vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
22129 declare void @llvm.vp.scatter.v2f32.v2p2(<2 x float> %val, <2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
22130 declare void @llvm.vp.scatter.nxv4i32.nxv4p4(<vscale x 4 x i32> %val, <vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
22135 The '``llvm.vp.scatter.*``' intrinsic is the vector length predicated version of
22136 the :ref:`llvm.masked.scatter <int_mscatter>` intrinsic.
22141 The first operand is a vector value to be written to memory. The second operand
22142 is a vector of pointers, pointing to where the value elements should be stored.
22143 The third operand is a vector of boolean values with the same number of
elements as the value operand. The fourth is the explicit vector length of the
operation.
The :ref:`align <attr_align>` parameter attribute can be provided for the
second operand.
22153 The '``llvm.vp.scatter``' intrinsic writes multiple scalar values to memory in
22154 the same way as the '``llvm.masked.scatter``' intrinsic, where the mask is
22155 taken from the combination of the '``mask``' and '``evl``' operands in the
22156 usual VP way. The '``alignment``' operand of the '``llvm.masked.scatter``' does
22157 not have a corresponding operand in '``llvm.vp.scatter``': it is instead
22158 provided via the optional ``align`` parameter attribute on the
22159 vector-of-pointers operand. Otherwise it is taken as the ABI alignment of the
22160 destination addresses as specified by the :ref:`datalayout
22161 string<langref_datalayout>`.
22166 .. code-block:: text
22168 call void @llvm.vp.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> align 1 %ptrs, <8 x i1> %mask, i32 %evl)
22169 ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
22171 call void @llvm.masked.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> %ptrs, i32 1, <8 x i1> %mask)
22176 '``llvm.vp.trunc.*``' Intrinsics
22177 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22181 This is an overloaded intrinsic.
22185 declare <16 x i16> @llvm.vp.trunc.v16i16.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
22186 declare <vscale x 4 x i16> @llvm.vp.trunc.nxv4i16.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22191 The '``llvm.vp.trunc``' intrinsic truncates its first operand to the return
22192 type. The operation has a mask and an explicit vector length parameter.
22198 The '``llvm.vp.trunc``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
22200 :ref:`integer <t_integer>` type. The bit size of the value must be larger than
22201 the bit size of the return type. The second operand is the vector mask. The
22202 return type, the value to cast, and the vector mask have the same number of
22203 elements. The third operand is the explicit vector length of the operation.
The '``llvm.vp.trunc``' intrinsic truncates the high order bits of the value and
converts the remaining bits to the return type. Since the source size must be larger
22210 than the destination size, '``llvm.vp.trunc``' cannot be a *no-op cast*. It will
22211 always truncate bits. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
22218 .. code-block:: llvm
22220 %r = call <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
22221 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22223 %t = trunc <4 x i32> %a to <4 x i16>
22224 %also.r = select <4 x i1> %mask, <4 x i16> %t, <4 x i16> poison
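
On the bit level, truncation keeps only the low bits of each enabled lane. A
short Python sketch, assuming a hypothetical helper ``vp_trunc`` and ``None``
as a stand-in for ``poison``:

.. code-block:: python

   def vp_trunc(val, mask, evl, dst_bits=16):
       """Keep the low dst_bits of each enabled lane."""
       keep = (1 << dst_bits) - 1
       return [(v & keep) if (m and i < evl) else None
               for i, (v, m) in enumerate(zip(val, mask))]


   # Lane 2 is masked off; lane 3 is at or beyond evl.
   truncated = vp_trunc([0x12345678, 0x1FFFF, 70000, 5],
                        [True, True, False, True], 3)
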
22229 '``llvm.vp.zext.*``' Intrinsics
22230 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22234 This is an overloaded intrinsic.
22238 declare <16 x i32> @llvm.vp.zext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
22239 declare <vscale x 4 x i32> @llvm.vp.zext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22244 The '``llvm.vp.zext``' intrinsic zero extends its first operand to the return
22245 type. The operation has a mask and an explicit vector length parameter.
22251 The '``llvm.vp.zext``' intrinsic takes a value to cast as its first operand.
22252 The return type is the type to cast the value to. Both types must be vectors of
22253 :ref:`integer <t_integer>` type. The bit size of the value must be smaller than
22254 the bit size of the return type. The second operand is the vector mask. The
22255 return type, the value to cast, and the vector mask have the same number of
22256 elements. The third operand is the explicit vector length of the operation.
The '``llvm.vp.zext``' intrinsic fills the high order bits of the value with zero
22262 bits until it reaches the size of the return type. When zero extending from i1,
22263 the result will always be either 0 or 1. The conversion is performed on lane
22264 positions below the explicit vector length and where the vector mask is true.
22265 Masked-off lanes are ``poison``.
22270 .. code-block:: llvm
22272 %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
22273 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22275 %t = zext <4 x i16> %a to <4 x i32>
22276 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
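
As an illustrative sketch, the following module shows which lanes are enabled
when both the mask and the explicit vector length are constants. The function
name ``@zext_first_three`` and the chosen constants are assumptions made for
this example only. The lane comments follow the general vector-predication rule
that a lane is enabled only if it is below the vector length and its mask bit
is true.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.zext.v4i32.v4i16(<4 x i16>, <4 x i1>, i32)

      define <4 x i32> @zext_first_three(<4 x i16> %a) {
        ; Lane 1 is masked off; lane 3 is at the explicit vector length of 3
        ; and is therefore disabled as well.
        %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i16(<4 x i16> %a, <4 x i1> <i1 true, i1 false, i1 true, i1 true>, i32 3)
        ; Lanes 0 and 2 hold the zero-extended elements of %a.
        ; Lanes 1 and 3 are poison.
        ret <4 x i32> %r
      }
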
22281 '``llvm.vp.sext.*``' Intrinsics
22282 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22286 This is an overloaded intrinsic.
22290 declare <16 x i32> @llvm.vp.sext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
22291 declare <vscale x 4 x i32> @llvm.vp.sext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22296 The '``llvm.vp.sext``' intrinsic sign extends its first operand to the return
22297 type. The operation has a mask and an explicit vector length parameter.
22303 The '``llvm.vp.sext``' intrinsic takes a value to cast as its first operand.
22304 The return type is the type to cast the value to. Both types must be vectors of
22305 :ref:`integer <t_integer>` type. The bit size of the value must be smaller than
22306 the bit size of the return type. The second operand is the vector mask. The
22307 return type, the value to cast, and the vector mask have the same number of
22308 elements. The third operand is the explicit vector length of the operation.
22313 The '``llvm.vp.sext``' intrinsic performs a sign extension by copying the sign
22314 bit (highest order bit) of the value until it reaches the size of the return
22315 type. When sign extending from i1, the result will always be either -1 or 0.
22316 The conversion is performed on lane positions below the explicit vector length
22317 and where the vector mask is true. Masked-off lanes are ``poison``.
22322 .. code-block:: llvm
22324 %r = call <4 x i32> @llvm.vp.sext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
22325 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22327 %t = sext <4 x i16> %a to <4 x i32>
22328 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
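
The ``i1`` case mentioned above can be sketched with constant operands. The
function name ``@sext_bools`` is hypothetical, and the all-true mask and the
vector length of 4 are chosen so that every lane is enabled.

.. code-block:: llvm

      declare <4 x i8> @llvm.vp.sext.v4i8.v4i1(<4 x i1>, <4 x i1>, i32)

      define <4 x i8> @sext_bools() {
        %r = call <4 x i8> @llvm.vp.sext.v4i8.v4i1(<4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ; Sign extending i1 yields -1 for true and 0 for false:
        ; %r is <i8 -1, i8 0, i8 -1, i8 0>.
        ret <4 x i8> %r
      }
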
22331 .. _int_vp_fptrunc:
22333 '``llvm.vp.fptrunc.*``' Intrinsics
22334 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22338 This is an overloaded intrinsic.
22342 declare <16 x float> @llvm.vp.fptrunc.v16f32.v16f64 (<16 x double> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float> @llvm.vp.fptrunc.nxv4f32.nxv4f64 (<vscale x 4 x double> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22348 The '``llvm.vp.fptrunc``' intrinsic truncates its first operand to the return
22349 type. The operation has a mask and an explicit vector length parameter.
22355 The '``llvm.vp.fptrunc``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
22357 :ref:`floating-point <t_floating>` type. The bit size of the value must be
22358 larger than the bit size of the return type. This implies that
22359 '``llvm.vp.fptrunc``' cannot be used to make a *no-op cast*. The second operand
22360 is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements. The third operand is the explicit vector length of
the operation.
22367 The '``llvm.vp.fptrunc``' intrinsic casts a ``value`` from a larger
22368 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
22369 <t_floating>` type.
This intrinsic is assumed to execute in the default :ref:`floating-point
environment <floatenv>`. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
22378 .. code-block:: llvm
22380 %r = call <4 x float> @llvm.vp.fptrunc.v4f32.v4f64(<4 x double> %a, <4 x i1> %mask, i32 %evl)
22381 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22383 %t = fptrunc <4 x double> %a to <4 x float>
22384 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
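
The same operation on scalable vectors might look as follows. This is a minimal
sketch; the function name ``@narrow`` is hypothetical, and the intrinsic name
suffixes follow the usual overload mangling (``nxv2f32`` for the result type,
``nxv2f64`` for the operand type).

.. code-block:: llvm

      declare <vscale x 2 x float> @llvm.vp.fptrunc.nxv2f32.nxv2f64(<vscale x 2 x double>, <vscale x 2 x i1>, i32)

      define <vscale x 2 x float> @narrow(<vscale x 2 x double> %a, <vscale x 2 x i1> %mask, i32 %evl) {
        ; Only lanes below %evl whose mask bit is true are converted.
        %r = call <vscale x 2 x float> @llvm.vp.fptrunc.nxv2f32.nxv2f64(<vscale x 2 x double> %a, <vscale x 2 x i1> %mask, i32 %evl)
        ret <vscale x 2 x float> %r
      }
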
22389 '``llvm.vp.fpext.*``' Intrinsics
22390 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22394 This is an overloaded intrinsic.
22398 declare <16 x double> @llvm.vp.fpext.v16f64.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22399 declare <vscale x 4 x double> @llvm.vp.fpext.nxv4f64.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22404 The '``llvm.vp.fpext``' intrinsic extends its first operand to the return
22405 type. The operation has a mask and an explicit vector length parameter.
22411 The '``llvm.vp.fpext``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
22413 :ref:`floating-point <t_floating>` type. The bit size of the value must be
22414 smaller than the bit size of the return type. This implies that
22415 '``llvm.vp.fpext``' cannot be used to make a *no-op cast*. The second operand
22416 is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements. The third operand is the explicit vector length of
the operation.
22423 The '``llvm.vp.fpext``' intrinsic extends the ``value`` from a smaller
22424 :ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
22425 <t_floating>` type. The '``llvm.vp.fpext``' cannot be used to make a
22426 *no-op cast* because it always changes bits. Use ``bitcast`` to make a
22427 *no-op cast* for a floating-point cast.
22428 The conversion is performed on lane positions below the explicit vector length
22429 and where the vector mask is true. Masked-off lanes are ``poison``.
22434 .. code-block:: llvm
22436 %r = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22437 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22439 %t = fpext <4 x float> %a to <4 x double>
22440 %also.r = select <4 x i1> %mask, <4 x double> %t, <4 x double> poison
22445 '``llvm.vp.fptoui.*``' Intrinsics
22446 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22450 This is an overloaded intrinsic.
22454 declare <16 x i32> @llvm.vp.fptoui.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22455 declare <vscale x 4 x i32> @llvm.vp.fptoui.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22456 declare <256 x i64> @llvm.vp.fptoui.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22461 The '``llvm.vp.fptoui``' intrinsic converts the :ref:`floating-point
22462 <t_floating>` operand to the unsigned integer return type.
22463 The operation has a mask and an explicit vector length parameter.
22469 The '``llvm.vp.fptoui``' intrinsic takes a value to cast as its first operand.
22470 The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type. The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third operand is the explicit vector length of the
operation.
22480 The '``llvm.vp.fptoui``' intrinsic converts its :ref:`floating-point
22481 <t_floating>` operand into the nearest (rounding towards zero) unsigned integer
22482 value where the lane position is below the explicit vector length and the
22483 vector mask is true. Masked-off lanes are ``poison``. On enabled lanes where
22484 conversion takes place and the value cannot fit in the return type, the result
22485 on that lane is a :ref:`poison value <poisonvalues>`.
22490 .. code-block:: llvm
22492 %r = call <4 x i32> @llvm.vp.fptoui.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22493 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22495 %t = fptoui <4 x float> %a to <4 x i32>
22496 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
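
The two sources of ``poison`` described above (a disabled lane and a value that
does not fit) can be sketched with constant operands. The function name
``@fptoui_lanes`` and the constants are assumptions chosen for illustration.

.. code-block:: llvm

      declare <4 x i8> @llvm.vp.fptoui.v4i8.v4f32(<4 x float>, <4 x i1>, i32)

      define <4 x i8> @fptoui_lanes() {
        %r = call <4 x i8> @llvm.vp.fptoui.v4i8.v4f32(<4 x float> <float 1.5, float 300.0, float 2.0, float 255.0>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 4)
        ; lane 0: 1       (1.5 rounds towards zero)
        ; lane 1: poison  (300 does not fit in i8)
        ; lane 2: poison  (masked off)
        ; lane 3: 255     (largest value representable in an unsigned i8)
        ret <4 x i8> %r
      }
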
22501 '``llvm.vp.fptosi.*``' Intrinsics
22502 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22506 This is an overloaded intrinsic.
22510 declare <16 x i32> @llvm.vp.fptosi.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22511 declare <vscale x 4 x i32> @llvm.vp.fptosi.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22512 declare <256 x i64> @llvm.vp.fptosi.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22517 The '``llvm.vp.fptosi``' intrinsic converts the :ref:`floating-point
22518 <t_floating>` operand to the signed integer return type.
22519 The operation has a mask and an explicit vector length parameter.
22525 The '``llvm.vp.fptosi``' intrinsic takes a value to cast as its first operand.
22526 The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type. The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third operand is the explicit vector length of the
operation.
22536 The '``llvm.vp.fptosi``' intrinsic converts its :ref:`floating-point
22537 <t_floating>` operand into the nearest (rounding towards zero) signed integer
22538 value where the lane position is below the explicit vector length and the
22539 vector mask is true. Masked-off lanes are ``poison``. On enabled lanes where
22540 conversion takes place and the value cannot fit in the return type, the result
22541 on that lane is a :ref:`poison value <poisonvalues>`.
22546 .. code-block:: llvm
22548 %r = call <4 x i32> @llvm.vp.fptosi.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22549 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22551 %t = fptosi <4 x float> %a to <4 x i32>
22552 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
22557 '``llvm.vp.uitofp.*``' Intrinsics
22558 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22562 This is an overloaded intrinsic.
22566 declare <16 x float> @llvm.vp.uitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
22567 declare <vscale x 4 x float> @llvm.vp.uitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22568 declare <256 x double> @llvm.vp.uitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
22573 The '``llvm.vp.uitofp``' intrinsic converts its unsigned integer operand to the
22574 :ref:`floating-point <t_floating>` return type. The operation has a mask and
22575 an explicit vector length parameter.
22581 The '``llvm.vp.uitofp``' intrinsic takes a value to cast as its first operand.
The value to cast must be a vector of :ref:`integer <t_integer>` type. The
return type is the type to cast the value to. The return type must be a vector
of :ref:`floating-point <t_floating>` type. The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third operand is the explicit vector length of the
operation.
22592 The '``llvm.vp.uitofp``' intrinsic interprets its first operand as an unsigned
22593 integer quantity and converts it to the corresponding floating-point value. If
22594 the value cannot be exactly represented, it is rounded using the default
22595 rounding mode. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
22602 .. code-block:: llvm
22604 %r = call <4 x float> @llvm.vp.uitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
22605 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22607 %t = uitofp <4 x i32> %a to <4 x float>
22608 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22613 '``llvm.vp.sitofp.*``' Intrinsics
22614 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22618 This is an overloaded intrinsic.
22622 declare <16 x float> @llvm.vp.sitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
22623 declare <vscale x 4 x float> @llvm.vp.sitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22624 declare <256 x double> @llvm.vp.sitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
22629 The '``llvm.vp.sitofp``' intrinsic converts its signed integer operand to the
22630 :ref:`floating-point <t_floating>` return type. The operation has a mask and
22631 an explicit vector length parameter.
22637 The '``llvm.vp.sitofp``' intrinsic takes a value to cast as its first operand.
The value to cast must be a vector of :ref:`integer <t_integer>` type. The
return type is the type to cast the value to. The return type must be a vector
of :ref:`floating-point <t_floating>` type. The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements. The third operand is the explicit vector length of the
operation.
22648 The '``llvm.vp.sitofp``' intrinsic interprets its first operand as a signed
22649 integer quantity and converts it to the corresponding floating-point value. If
22650 the value cannot be exactly represented, it is rounded using the default
22651 rounding mode. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true. Masked-off lanes are
``poison``.
22658 .. code-block:: llvm
22660 %r = call <4 x float> @llvm.vp.sitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
22661 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22663 %t = sitofp <4 x i32> %a to <4 x float>
22664 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
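
Because disabled lanes are ``poison``, a caller that needs a defined value in
every lane has to supply one. One possible approach, sketched here under the
assumption that a pass-through vector is available, is to combine the result
with '``llvm.vp.merge``' using the same mask and vector length. The function
name ``@sitofp_or_passthru`` and the ``%passthru`` operand are hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.sitofp.v4f32.v4i32(<4 x i32>, <4 x i1>, i32)
      declare <4 x float> @llvm.vp.merge.v4f32(<4 x i1>, <4 x float>, <4 x float>, i32)

      define <4 x float> @sitofp_or_passthru(<4 x i32> %a, <4 x float> %passthru, <4 x i1> %mask, i32 %evl) {
        %r = call <4 x float> @llvm.vp.sitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ; Lanes below %evl with a true mask bit take the converted value;
        ; every other lane takes the corresponding element of %passthru.
        %m = call <4 x float> @llvm.vp.merge.v4f32(<4 x i1> %mask, <4 x float> %r, <4 x float> %passthru, i32 %evl)
        ret <4 x float> %m
      }
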
22667 .. _int_vp_ptrtoint:
22669 '``llvm.vp.ptrtoint.*``' Intrinsics
22670 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22674 This is an overloaded intrinsic.
22678 declare <16 x i8> @llvm.vp.ptrtoint.v16i8.v16p0(<16 x ptr> <op>, <16 x i1> <mask>, i32 <vector_length>)
22679 declare <vscale x 4 x i8> @llvm.vp.ptrtoint.nxv4i8.nxv4p0(<vscale x 4 x ptr> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64> @llvm.vp.ptrtoint.v256i64.v256p0(<256 x ptr> <op>, <256 x i1> <mask>, i32 <vector_length>)
The '``llvm.vp.ptrtoint``' intrinsic converts its pointer operand to the integer return
22686 type. The operation has a mask and an explicit vector length parameter.
The '``llvm.vp.ptrtoint``' intrinsic takes a value to cast as its first operand,
which must be a vector of pointers. The return type is the type to cast the
value to and must be a vector of :ref:`integer <t_integer>` type.
22695 The second operand is the vector mask. The return type, the value to cast, and
22696 the vector mask have the same number of elements.
22697 The third operand is the explicit vector length of the operation.
The '``llvm.vp.ptrtoint``' intrinsic converts ``value`` to the return type by
interpreting the pointer value as an integer and either truncating or zero
extending that value to the size of the integer type.
If ``value`` is smaller than the return type, then a zero extension is done. If
``value`` is larger than the return type, then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.
22709 The conversion is performed on lane positions below the explicit vector length
22710 and where the vector mask is true. Masked-off lanes are ``poison``.
22715 .. code-block:: llvm
      %r = call <4 x i8> @llvm.vp.ptrtoint.v4i8.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
22718 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22720 %t = ptrtoint <4 x ptr> %a to <4 x i8>
22721 %also.r = select <4 x i1> %mask, <4 x i8> %t, <4 x i8> poison
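
Whether the conversion truncates, zero extends, or is a no-op depends on the
pointer size given by the module's data layout. As a sketch, assuming a target
whose default address space uses 64-bit pointers, converting to ``i64`` is a
no-op cast on each enabled lane. The function name ``@addresses`` is
hypothetical.

.. code-block:: llvm

      declare <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr>, <4 x i1>, i32)

      define <4 x i64> @addresses(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl) {
        ; With 64-bit pointers (an assumption about the data layout), each
        ; enabled lane holds the pointer's integer value unchanged.
        %r = call <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
        ret <4 x i64> %r
      }
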
22724 .. _int_vp_inttoptr:
22726 '``llvm.vp.inttoptr.*``' Intrinsics
22727 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22731 This is an overloaded intrinsic.
22735 declare <16 x ptr> @llvm.vp.inttoptr.v16p0.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
22736 declare <vscale x 4 x ptr> @llvm.vp.inttoptr.nxv4p0.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22737 declare <256 x ptr> @llvm.vp.inttoptr.v256p0.v256i32 (<256 x i32> <op>, <256 x i1> <mask>, i32 <vector_length>)
The '``llvm.vp.inttoptr``' intrinsic converts its integer value to the pointer
22743 return type. The operation has a mask and an explicit vector length parameter.
The '``llvm.vp.inttoptr``' intrinsic takes a value to cast as its first operand,
which must be a vector of :ref:`integer <t_integer>` type. The return type is
the type to cast the value to and must be a vector of pointers.
22752 The second operand is the vector mask. The return type, the value to cast, and
22753 the vector mask have the same number of elements.
22754 The third operand is the explicit vector length of the operation.
The '``llvm.vp.inttoptr``' intrinsic converts ``value`` to the return type by
22760 applying either a zero extension or a truncation depending on the size of the
22761 integer ``value``. If ``value`` is larger than the size of a pointer, then a
22762 truncation is done. If ``value`` is smaller than the size of a pointer, then a
22763 zero extension is done. If they are the same size, nothing is done (*no-op cast*).
22764 The conversion is performed on lane positions below the explicit vector length
22765 and where the vector mask is true. Masked-off lanes are ``poison``.
22770 .. code-block:: llvm
      %r = call <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
22773 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22775 %t = inttoptr <4 x i32> %a to <4 x ptr>
22776 %also.r = select <4 x i1> %mask, <4 x ptr> %t, <4 x ptr> poison
22781 '``llvm.vp.fcmp.*``' Intrinsics
22782 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22786 This is an overloaded intrinsic.
22790 declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float> <left_op>, <16 x float> <right_op>, metadata <condition code>, <16 x i1> <mask>, i32 <vector_length>)
22791 declare <vscale x 4 x i1> @llvm.vp.fcmp.nxv4f32(<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, metadata <condition code>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22792 declare <256 x i1> @llvm.vp.fcmp.v256f64(<256 x double> <left_op>, <256 x double> <right_op>, metadata <condition code>, <256 x i1> <mask>, i32 <vector_length>)
22797 The '``llvm.vp.fcmp``' intrinsic returns a vector of boolean values based on
the comparison of its operands. The operation has a mask and an explicit vector
length parameter.
22805 The '``llvm.vp.fcmp``' intrinsic takes the two values to compare as its first
22806 and second operands. These two values must be vectors of :ref:`floating-point
22807 <t_floating>` types.
22808 The return type is the result of the comparison. The return type must be a
22809 vector of :ref:`i1 <t_integer>` type. The fourth operand is the vector mask.
22810 The return type, the values to compare, and the vector mask have the same
22811 number of elements. The third operand is the condition code indicating the kind
22812 of comparison to perform. It must be a metadata string with :ref:`one of the
22813 supported floating-point condition code values <fcmp_md_cc>`. The fifth operand
22814 is the explicit vector length of the operation.
22819 The '``llvm.vp.fcmp``' compares its first two operands according to the
22820 condition code given as the third operand. The operands are compared element by
22821 element on each enabled lane, where the semantics of the comparison are
22822 defined :ref:`according to the condition code <fcmp_md_cc_sem>`. Masked-off
22823 lanes are ``poison``.
22828 .. code-block:: llvm
22830 %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %b, metadata !"oeq", <4 x i1> %mask, i32 %evl)
22831 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22833 %t = fcmp oeq <4 x float> %a, %b
22834 %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison
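
As a further sketch, the ``"uno"`` condition code can be used to compare a
vector with itself, which yields ``true`` on exactly those enabled lanes that
hold a NaN. The function name ``@is_nan`` is hypothetical.

.. code-block:: llvm

      declare <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float>, <4 x float>, metadata, <4 x i1>, i32)

      define <4 x i1> @is_nan(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; "uno" is true if either operand is a NaN, so comparing %a with
        ; itself tests each enabled lane for NaN.
        %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %a, metadata !"uno", <4 x i1> %mask, i32 %evl)
        ret <4 x i1> %r
      }
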
22839 '``llvm.vp.icmp.*``' Intrinsics
22840 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22844 This is an overloaded intrinsic.
22848 declare <32 x i1> @llvm.vp.icmp.v32i32(<32 x i32> <left_op>, <32 x i32> <right_op>, metadata <condition code>, <32 x i1> <mask>, i32 <vector_length>)
22849 declare <vscale x 2 x i1> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32> <left_op>, <vscale x 2 x i32> <right_op>, metadata <condition code>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
22850 declare <128 x i1> @llvm.vp.icmp.v128i8(<128 x i8> <left_op>, <128 x i8> <right_op>, metadata <condition code>, <128 x i1> <mask>, i32 <vector_length>)
22855 The '``llvm.vp.icmp``' intrinsic returns a vector of boolean values based on
the comparison of its operands. The operation has a mask and an explicit vector
length parameter.
22863 The '``llvm.vp.icmp``' intrinsic takes the two values to compare as its first
22864 and second operands. These two values must be vectors of :ref:`integer
22865 <t_integer>` types.
22866 The return type is the result of the comparison. The return type must be a
22867 vector of :ref:`i1 <t_integer>` type. The fourth operand is the vector mask.
22868 The return type, the values to compare, and the vector mask have the same
22869 number of elements. The third operand is the condition code indicating the kind
22870 of comparison to perform. It must be a metadata string with :ref:`one of the
22871 supported integer condition code values <icmp_md_cc>`. The fifth operand is the
22872 explicit vector length of the operation.
22877 The '``llvm.vp.icmp``' compares its first two operands according to the
22878 condition code given as the third operand. The operands are compared element by
22879 element on each enabled lane, where the semantics of the comparison are
22880 defined :ref:`according to the condition code <icmp_md_cc_sem>`. Masked-off
22881 lanes are ``poison``.
22886 .. code-block:: llvm
22888 %r = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"ne", <4 x i1> %mask, i32 %evl)
22889 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22891 %t = icmp ne <4 x i32> %a, %b
22892 %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> poison
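
The comparison result is often consumed by another predicated operation. The
following sketch computes an unsigned maximum by feeding the result into
'``llvm.vp.select``' with the same vector length. The function name ``@umax``
is hypothetical, and the all-true mask is chosen so that every lane below
``%evl`` has a defined condition bit.

.. code-block:: llvm

      declare <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32>, <4 x i32>, metadata, <4 x i1>, i32)
      declare <4 x i32> @llvm.vp.select.v4i32(<4 x i1>, <4 x i32>, <4 x i32>, i32)

      define <4 x i32> @umax(<4 x i32> %a, <4 x i32> %b, i32 %evl) {
        %gt = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"ugt", <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
        ; For lanes below %evl, pick %a where %a > %b (unsigned), else %b.
        ; Lanes at or above %evl are poison.
        %max = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %gt, <4 x i32> %a, <4 x i32> %b, i32 %evl)
        ret <4 x i32> %max
      }
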
22896 '``llvm.vp.ceil.*``' Intrinsics
22897 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22901 This is an overloaded intrinsic.
22905 declare <16 x float> @llvm.vp.ceil.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22906 declare <vscale x 4 x float> @llvm.vp.ceil.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22907 declare <256 x double> @llvm.vp.ceil.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22912 Predicated floating-point ceiling of a vector of floating-point values.
22918 The first operand and the result have the same vector of floating-point type.
22919 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
22926 The '``llvm.vp.ceil``' intrinsic performs floating-point ceiling
22927 (:ref:`ceil <int_ceil>`) of the first vector operand on each enabled lane. The
22928 result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22933 .. code-block:: llvm
22935 %r = call <4 x float> @llvm.vp.ceil.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22936 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22938 %t = call <4 x float> @llvm.ceil.v4f32(<4 x float> %a)
22939 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22943 '``llvm.vp.floor.*``' Intrinsics
22944 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22948 This is an overloaded intrinsic.
22952 declare <16 x float> @llvm.vp.floor.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
22953 declare <vscale x 4 x float> @llvm.vp.floor.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
22954 declare <256 x double> @llvm.vp.floor.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
22959 Predicated floating-point floor of a vector of floating-point values.
22965 The first operand and the result have the same vector of floating-point type.
22966 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
22973 The '``llvm.vp.floor``' intrinsic performs floating-point floor
22974 (:ref:`floor <int_floor>`) of the first vector operand on each enabled lane.
22975 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
22980 .. code-block:: llvm
22982 %r = call <4 x float> @llvm.vp.floor.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
22983 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
22985 %t = call <4 x float> @llvm.floor.v4f32(<4 x float> %a)
22986 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
22990 '``llvm.vp.rint.*``' Intrinsics
22991 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22995 This is an overloaded intrinsic.
22999 declare <16 x float> @llvm.vp.rint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
23000 declare <vscale x 4 x float> @llvm.vp.rint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23001 declare <256 x double> @llvm.vp.rint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
23006 Predicated floating-point rint of a vector of floating-point values.
23012 The first operand and the result have the same vector of floating-point type.
23013 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
23020 The '``llvm.vp.rint``' intrinsic performs floating-point rint
23021 (:ref:`rint <int_rint>`) of the first vector operand on each enabled lane.
23022 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23027 .. code-block:: llvm
23029 %r = call <4 x float> @llvm.vp.rint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
23030 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23032 %t = call <4 x float> @llvm.rint.v4f32(<4 x float> %a)
23033 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
23035 .. _int_vp_nearbyint:
23037 '``llvm.vp.nearbyint.*``' Intrinsics
23038 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23042 This is an overloaded intrinsic.
23046 declare <16 x float> @llvm.vp.nearbyint.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
23047 declare <vscale x 4 x float> @llvm.vp.nearbyint.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23048 declare <256 x double> @llvm.vp.nearbyint.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
23053 Predicated floating-point nearbyint of a vector of floating-point values.
23059 The first operand and the result have the same vector of floating-point type.
23060 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
23067 The '``llvm.vp.nearbyint``' intrinsic performs floating-point nearbyint
23068 (:ref:`nearbyint <int_nearbyint>`) of the first vector operand on each enabled lane.
23069 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23074 .. code-block:: llvm
23076 %r = call <4 x float> @llvm.vp.nearbyint.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
23077 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23079 %t = call <4 x float> @llvm.nearbyint.v4f32(<4 x float> %a)
23080 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
23084 '``llvm.vp.round.*``' Intrinsics
23085 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23089 This is an overloaded intrinsic.
23093 declare <16 x float> @llvm.vp.round.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
23094 declare <vscale x 4 x float> @llvm.vp.round.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23095 declare <256 x double> @llvm.vp.round.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
23100 Predicated floating-point round of a vector of floating-point values.
23106 The first operand and the result have the same vector of floating-point type.
23107 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
23114 The '``llvm.vp.round``' intrinsic performs floating-point round
23115 (:ref:`round <int_round>`) of the first vector operand on each enabled lane.
23116 The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23121 .. code-block:: llvm
23123 %r = call <4 x float> @llvm.vp.round.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
23124 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23126 %t = call <4 x float> @llvm.round.v4f32(<4 x float> %a)
23127 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
23129 .. _int_vp_roundeven:
23131 '``llvm.vp.roundeven.*``' Intrinsics
23132 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23136 This is an overloaded intrinsic.
23140 declare <16 x float> @llvm.vp.roundeven.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
23141 declare <vscale x 4 x float> @llvm.vp.roundeven.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23142 declare <256 x double> @llvm.vp.roundeven.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
23147 Predicated floating-point roundeven of a vector of floating-point values.
23153 The first operand and the result have the same vector of floating-point type.
23154 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
23161 The '``llvm.vp.roundeven``' intrinsic performs floating-point roundeven
23162 (:ref:`roundeven <int_roundeven>`) of the first vector operand on each enabled
23163 lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23168 .. code-block:: llvm
23170 %r = call <4 x float> @llvm.vp.roundeven.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
23171 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23173 %t = call <4 x float> @llvm.roundeven.v4f32(<4 x float> %a)
23174 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
23176 .. _int_vp_roundtozero:
23178 '``llvm.vp.roundtozero.*``' Intrinsics
23179 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23183 This is an overloaded intrinsic.
23187 declare <16 x float> @llvm.vp.roundtozero.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
23188 declare <vscale x 4 x float> @llvm.vp.roundtozero.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23189 declare <256 x double> @llvm.vp.roundtozero.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
23194 Predicated floating-point round-to-zero of a vector of floating-point values.
23200 The first operand and the result have the same vector of floating-point type.
23201 The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.
The '``llvm.vp.roundtozero``' intrinsic performs floating-point round-to-zero
23209 (:ref:`llvm.trunc <int_llvm_trunc>`) of the first vector operand on each enabled lane. The
23210 result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23215 .. code-block:: llvm
23217 %r = call <4 x float> @llvm.vp.roundtozero.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
23218 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23220 %t = call <4 x float> @llvm.trunc.v4f32(<4 x float> %a)
23221 %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> poison
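
Rounding towards zero differs from floor on negative inputs. The following
sketch uses constant operands with every lane enabled. The function name
``@roundtozero_lanes`` and the constants are assumptions chosen for
illustration.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.roundtozero.v4f32(<4 x float>, <4 x i1>, i32)

      define <4 x float> @roundtozero_lanes() {
        %r = call <4 x float> @llvm.vp.roundtozero.v4f32(<4 x float> <float -1.5, float 1.5, float -0.5, float 2.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 4)
        ; lane 0: -1.0  (floor would give -2.0)
        ; lane 1:  1.0
        ; lane 2: -0.0  (the sign of the input is preserved)
        ; lane 3:  2.0
        ret <4 x float> %r
      }
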
23223 .. _int_vp_bitreverse:
23225 '``llvm.vp.bitreverse.*``' Intrinsics
23226 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23230 This is an overloaded intrinsic.
23234 declare <16 x i32> @llvm.vp.bitreverse.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
23235 declare <vscale x 4 x i32> @llvm.vp.bitreverse.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23236 declare <256 x i64> @llvm.vp.bitreverse.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
23241 Predicated bitreverse of a vector of integers.
23247 The first operand and the result have the same vector of integer type. The
23248 second operand is the vector mask and has the same number of elements as the
23249 result vector type. The third operand is the explicit vector length of the
23255 The '``llvm.vp.bitreverse``' intrinsic performs bitreverse (:ref:`bitreverse <int_bitreverse>`) of the first operand on each
23256 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23261 .. code-block:: llvm
23263 %r = call <4 x i32> @llvm.vp.bitreverse.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
23264 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23266 %t = call <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> %a)
23267 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
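
The lane-wise behavior described above can be modeled informally in scalar C.
This is a sketch under stated assumptions, not how any target lowers the
intrinsic: the names ``bitreverse32`` and ``vp_bitreverse_v4i32`` are
hypothetical, the vector is fixed at four ``i32`` lanes, and disabled lanes
(mask bit off, or lane index at or beyond the explicit vector length) are left
unwritten because C cannot express a poison value.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Reverse the bit order of one 32-bit lane. */
   static uint32_t bitreverse32(uint32_t x) {
       uint32_t r = 0;
       for (int i = 0; i < 32; ++i) {
           r = (r << 1) | (x & 1u);
           x >>= 1;
       }
       return r;
   }

   /* Hypothetical scalar model of llvm.vp.bitreverse.v4i32.
      A lane is enabled when its mask bit is set and its index is below evl.
      Disabled lanes of out are not written; callers must not rely on them. */
   static void vp_bitreverse_v4i32(const uint32_t a[4], const bool mask[4],
                                   uint32_t evl, uint32_t out[4]) {
       for (uint32_t i = 0; i < 4; ++i) {
           if (i < evl && mask[i])
               out[i] = bitreverse32(a[i]);
       }
   }

The same mask-and-length loop applies to the other unary predicated integer
intrinsics in this section; only the per-lane helper changes.
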
23272 '``llvm.vp.bswap.*``' Intrinsics
23273 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23277 This is an overloaded intrinsic.
23281 declare <16 x i32> @llvm.vp.bswap.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
23282 declare <vscale x 4 x i32> @llvm.vp.bswap.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23283 declare <256 x i64> @llvm.vp.bswap.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
23288 Predicated bswap of a vector of integers.
23294 The first operand and the result have the same vector of integer type. The
23295 second operand is the vector mask and has the same number of elements as the
23296 result vector type. The third operand is the explicit vector length of the
23302 The '``llvm.vp.bswap``' intrinsic performs bswap (:ref:`bswap <int_bswap>`) of the first operand on each
23303 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23308 .. code-block:: llvm
23310 %r = call <4 x i32> @llvm.vp.bswap.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
23311 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23313 %t = call <4 x i32> @llvm.bswap.v4i32(<4 x i32> %a)
23314 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
23319 '``llvm.vp.ctpop.*``' Intrinsics
23320 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23324 This is an overloaded intrinsic.
23328 declare <16 x i32> @llvm.vp.ctpop.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
23329 declare <vscale x 4 x i32> @llvm.vp.ctpop.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23330 declare <256 x i64> @llvm.vp.ctpop.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
23335 Predicated ctpop of a vector of integers.
23341 The first operand and the result have the same vector of integer type. The
23342 second operand is the vector mask and has the same number of elements as the
23343 result vector type. The third operand is the explicit vector length of the
23349 The '``llvm.vp.ctpop``' intrinsic performs ctpop (:ref:`ctpop <int_ctpop>`) of the first operand on each
23350 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23355 .. code-block:: llvm
23357 %r = call <4 x i32> @llvm.vp.ctpop.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
23358 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23360 %t = call <4 x i32> @llvm.ctpop.v4i32(<4 x i32> %a)
23361 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
23366 '``llvm.vp.ctlz.*``' Intrinsics
23367 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23371 This is an overloaded intrinsic.
23375 declare <16 x i32> @llvm.vp.ctlz.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
23376 declare <vscale x 4 x i32> @llvm.vp.ctlz.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
23377 declare <256 x i64> @llvm.vp.ctlz.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
23382 Predicated ctlz of a vector of integers.
23388 The first operand and the result have the same vector of integer type. The
23389 second operand is the vector mask and has the same number of elements as the
23390 result vector type. The third operand is the explicit vector length of the
23396 The '``llvm.vp.ctlz``' intrinsic performs ctlz (:ref:`ctlz <int_ctlz>`) of the first operand on each
23397 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23402 .. code-block:: llvm
23404 %r = call <4 x i32> @llvm.vp.ctlz.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
23405 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23407 %t = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %a, i1 false)
23408 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
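
The role of the ``is_zero_poison`` flag can be illustrated with a scalar C
model. This is only a sketch: the helper names ``ctlz32`` and
``vp_ctlz_v4i32`` are hypothetical, the model covers only the case where the
flag is ``false`` (as in the example above), in which a zero input yields the
bit width, and disabled lanes are left unwritten instead of being poison.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Count leading zero bits of one lane; a zero input yields 32,
      which models is_zero_poison == false. */
   static uint32_t ctlz32(uint32_t x) {
       uint32_t n = 0;
       if (x == 0)
           return 32;
       while ((x & 0x80000000u) == 0) {
           x <<= 1;
           ++n;
       }
       return n;
   }

   /* Hypothetical scalar model of llvm.vp.ctlz.v4i32 with is_zero_poison
      false. Only lanes with mask[i] set and i < evl are written. */
   static void vp_ctlz_v4i32(const uint32_t a[4], const bool mask[4],
                             uint32_t evl, uint32_t out[4]) {
       for (uint32_t i = 0; i < 4; ++i) {
           if (i < evl && mask[i])
               out[i] = ctlz32(a[i]);
       }
   }
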
23413 '``llvm.vp.cttz.*``' Intrinsics
23414 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23418 This is an overloaded intrinsic.
23422 declare <16 x i32> @llvm.vp.cttz.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
23423 declare <vscale x 4 x i32> @llvm.vp.cttz.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
23424 declare <256 x i64> @llvm.vp.cttz.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>, i1 <is_zero_poison>)
23429 Predicated cttz of a vector of integers.
23435 The first operand and the result have the same vector of integer type. The
23436 second operand is the vector mask and has the same number of elements as the
23437 result vector type. The third operand is the explicit vector length of the
23443 The '``llvm.vp.cttz``' intrinsic performs cttz (:ref:`cttz <int_cttz>`) of the first operand on each
23444 enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23449 .. code-block:: llvm
23451 %r = call <4 x i32> @llvm.vp.cttz.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl, i1 false)
23452 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23454 %t = call <4 x i32> @llvm.cttz.v4i32(<4 x i32> %a, i1 false)
23455 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
23460 '``llvm.vp.fshl.*``' Intrinsics
23461 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23465 This is an overloaded intrinsic.
23469 declare <16 x i32> @llvm.vp.fshl.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
23470 declare <vscale x 4 x i32> @llvm.vp.fshl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23471 declare <256 x i64> @llvm.vp.fshl.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
23476 Predicated fshl of three vectors of integers.
The first three operands and the result have the same vector of integer type. The
23483 fourth operand is the vector mask and has the same number of elements as the
23484 result vector type. The fifth operand is the explicit vector length of the
The '``llvm.vp.fshl``' intrinsic performs fshl (:ref:`fshl <int_fshl>`) of the first, second, and third
vector operands on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23497 .. code-block:: llvm
23499 %r = call <4 x i32> @llvm.vp.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
23500 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23502 %t = call <4 x i32> @llvm.fshl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
23503 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
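
For readers unfamiliar with funnel shifts, the per-lane operation can be
sketched in scalar C. The names ``fshl32`` and ``vp_fshl_v4i32`` are
hypothetical, the lanes are assumed to be 32 bits wide, and disabled lanes
are simply left unwritten in this model.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Funnel shift left of one lane: concatenate a (high half) with b
      (low half), shift left by c modulo 32, and keep the high 32 bits. */
   static uint32_t fshl32(uint32_t a, uint32_t b, uint32_t c) {
       uint32_t s = c % 32u;
       if (s == 0)
           return a; /* avoids an undefined 32-bit shift by 32 in C */
       return (a << s) | (b >> (32u - s));
   }

   /* Hypothetical scalar model of llvm.vp.fshl.v4i32. */
   static void vp_fshl_v4i32(const uint32_t a[4], const uint32_t b[4],
                             const uint32_t c[4], const bool mask[4],
                             uint32_t evl, uint32_t out[4]) {
       for (uint32_t i = 0; i < 4; ++i) {
           if (i < evl && mask[i])
               out[i] = fshl32(a[i], b[i], c[i]);
       }
   }

Passing the same value for the first two operands turns the funnel shift into
a rotate left.
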
23506 '``llvm.vp.fshr.*``' Intrinsics
23507 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23511 This is an overloaded intrinsic.
23515 declare <16 x i32> @llvm.vp.fshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <middle_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
23516 declare <vscale x 4 x i32> @llvm.vp.fshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <middle_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
23517 declare <256 x i64> @llvm.vp.fshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <middle_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
23522 Predicated fshr of three vectors of integers.
The first three operands and the result have the same vector of integer type. The
23529 fourth operand is the vector mask and has the same number of elements as the
23530 result vector type. The fifth operand is the explicit vector length of the
The '``llvm.vp.fshr``' intrinsic performs fshr (:ref:`fshr <int_fshr>`) of the first, second, and third
vector operands on each enabled lane. The result on disabled lanes is a :ref:`poison value <poisonvalues>`.
23543 .. code-block:: llvm
23545 %r = call <4 x i32> @llvm.vp.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i1> %mask, i32 %evl)
23546 ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
23548 %t = call <4 x i32> @llvm.fshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c)
23549 %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> poison
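
The per-lane funnel shift right can be sketched in scalar C as follows. The
helper name ``fshr32`` is hypothetical and a 32-bit lane is assumed. The
predicated intrinsic applies this operation only to enabled lanes.

.. code-block:: c

   #include <stdint.h>

   /* Funnel shift right of one lane: concatenate a (high half) with b
      (low half), shift right by c modulo 32, and keep the low 32 bits. */
   static uint32_t fshr32(uint32_t a, uint32_t b, uint32_t c) {
       uint32_t s = c % 32u;
       if (s == 0)
           return b; /* avoids an undefined 32-bit shift by 32 in C */
       return (a << (32u - s)) | (b >> s);
   }

Passing the same value for the first two operands yields a rotate right.
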
23551 '``llvm.vp.is.fpclass.*``' Intrinsics
23552 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23556 This is an overloaded intrinsic.
23560 declare <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f32(<vscale x 2 x float> <op>, i32 <test>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
23561 declare <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> <op>, i32 <test>, <2 x i1> <mask>, i32 <vector_length>)
Predicated :ref:`llvm.is.fpclass <llvm.is.fpclass>`.
The first operand is a floating-point vector. The result type is a vector of
booleans with the same number of elements as the first operand. The second
operand specifies which tests to perform (see :ref:`llvm.is.fpclass <llvm.is.fpclass>`).
23574 The third operand is the vector mask and has the same number of elements as the
23575 result vector type. The fourth operand is the explicit vector length of the
23581 The '``llvm.vp.is.fpclass``' intrinsic performs llvm.is.fpclass (:ref:`llvm.is.fpclass <llvm.is.fpclass>`).
23587 .. code-block:: llvm
23589 %r = call <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> %x, i32 3, <2 x i1> %m, i32 %evl)
23590 %t = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f16(<vscale x 2 x half> %x, i32 3, <vscale x 2 x i1> %m, i32 %evl)
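
The example above passes ``i32 3`` as the test operand, which selects the
signaling-NaN and quiet-NaN classes (the two lowest bits of the
``llvm.is.fpclass`` test mask). A scalar C sketch of that one case follows.
It is only a model under stated assumptions: the name ``vp_is_nan_v2f32`` is
hypothetical, ``float`` stands in for ``half``, other test masks are not
modeled, and disabled lanes are left unwritten.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical scalar model of llvm.vp.is.fpclass with test mask 3
      (signaling NaN or quiet NaN). A lane is enabled when its mask bit is
      set and its index is below evl. */
   static void vp_is_nan_v2f32(const float x[2], const bool mask[2],
                               uint32_t evl, bool out[2]) {
       for (uint32_t i = 0; i < 2; ++i) {
           if (i < evl && mask[i])
               out[i] = (x[i] != x[i]); /* only a NaN compares unequal to itself */
       }
   }
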
23592 .. _int_mload_mstore:
23594 Masked Vector Load and Store Intrinsics
23595 ---------------------------------------
23597 LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
23601 '``llvm.masked.load.*``' Intrinsics
23602 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23606 This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
23610 declare <16 x float> @llvm.masked.load.v16f32.p0(ptr <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
23611 declare <2 x double> @llvm.masked.load.v2f64.p0(ptr <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
23612 ;; The data is a vector of pointers
23613 declare <8 x ptr> @llvm.masked.load.v8p0.p0(ptr <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)
23618 Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
23624 The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
23629 The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
23630 The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
23635 %res = call <16 x float> @llvm.masked.load.v16f32.p0(ptr %ptr, i32 4, <16 x i1>%mask, <16 x float> %passthru)
23637 ;; The result of the two following instructions is identical aside from potential memory access exception
23638 %loadlal = load <16 x float>, ptr %ptr, align 4
23639 %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru
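
The difference from the load-plus-select sequence can be seen in a scalar C
model, where memory for a masked-off lane is never read. This is a sketch
only: the name ``masked_load_v4f32`` is hypothetical, the vector is fixed at
four ``float`` lanes, and the alignment operand is not modeled.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical scalar model of llvm.masked.load for <4 x float>.
      ptr[i] is read only when mask[i] is set; other lanes take passthru. */
   static void masked_load_v4f32(const float *ptr, const bool mask[4],
                                 const float passthru[4], float out[4]) {
       for (int i = 0; i < 4; ++i)
           out[i] = mask[i] ? ptr[i] : passthru[i];
   }

Because masked-off elements are not touched, the model may be called with a
buffer that is only valid for the enabled lanes.
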
23643 '``llvm.masked.store.*``' Intrinsics
23644 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23648 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
23652 declare void @llvm.masked.store.v8i32.p0 (<8 x i32> <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
23653 declare void @llvm.masked.store.v16f32.p0(<16 x float> <value>, ptr <ptr>, i32 <alignment>, <16 x i1> <mask>)
23654 ;; The data is a vector of pointers
23655 declare void @llvm.masked.store.v8p0.p0 (<8 x ptr> <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
23660 Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
23672 The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
23676 call void @llvm.masked.store.v16f32.p0(<16 x float> %value, ptr %ptr, i32 4, <16 x i1> %mask)
23678 ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
23679 %oldval = load <16 x float>, ptr %ptr, align 4
23680 %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
23681 store <16 x float> %res, ptr %ptr, align 4
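
A scalar C model makes the contrast with the load-modify-store sequence
concrete: memory for masked-off lanes is neither read nor written. The name
``masked_store_v4i32`` is hypothetical, the vector is fixed at four ``i32``
lanes, and the alignment operand is not modeled.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical scalar model of llvm.masked.store for <4 x i32>.
      Only elements whose mask bit is set are written. */
   static void masked_store_v4i32(const int32_t value[4], int32_t *ptr,
                                  const bool mask[4]) {
       for (int i = 0; i < 4; ++i) {
           if (mask[i])
               ptr[i] = value[i];
       }
   }
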
23684 Masked Vector Gather and Scatter Intrinsics
23685 -------------------------------------------
23687 LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
23691 '``llvm.masked.gather.*``' Intrinsics
23692 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23696 This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
23700 declare <16 x float> @llvm.masked.gather.v16f32.v16p0(<16 x ptr> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
23701 declare <2 x double> @llvm.masked.gather.v2f64.v2p1(<2 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
23702 declare <8 x ptr> @llvm.masked.gather.v8p0.v8p0(<8 x ptr> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)
23707 Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
23713 The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.
23718 The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
23724 %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0(<4 x ptr> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> poison)
23726 ;; The gather with all-true mask is equivalent to the following instruction sequence
23727 %ptr0 = extractelement <4 x ptr> %ptrs, i32 0
23728 %ptr1 = extractelement <4 x ptr> %ptrs, i32 1
23729 %ptr2 = extractelement <4 x ptr> %ptrs, i32 2
23730 %ptr3 = extractelement <4 x ptr> %ptrs, i32 3
23732 %val0 = load double, ptr %ptr0, align 8
23733 %val1 = load double, ptr %ptr1, align 8
23734 %val2 = load double, ptr %ptr2, align 8
23735 %val3 = load double, ptr %ptr3, align 8
%vec0 = insertelement <4 x double> poison, double %val0, i32 0
%vec01 = insertelement <4 x double> %vec0, double %val1, i32 1
%vec012 = insertelement <4 x double> %vec01, double %val2, i32 2
%vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
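
The example above uses an all-true mask. The general case, with some lanes
masked off, can be sketched in scalar C. The name ``masked_gather_v4f64`` is
hypothetical, the vector is fixed at four ``double`` lanes, and the alignment
operand is not modeled.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical scalar model of llvm.masked.gather for <4 x double>.
      ptrs[i] is dereferenced only when mask[i] is set; masked-off lanes
      take the corresponding passthru element. */
   static void masked_gather_v4f64(const double *ptrs[4], const bool mask[4],
                                   const double passthru[4], double out[4]) {
       for (int i = 0; i < 4; ++i)
           out[i] = mask[i] ? *ptrs[i] : passthru[i];
   }

Since pointers in masked-off lanes are never dereferenced, they may be null
or otherwise invalid in this model.
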
23744 '``llvm.masked.scatter.*``' Intrinsics
23745 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23749 This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
23753 declare void @llvm.masked.scatter.v8i32.v8p0 (<8 x i32> <value>, <8 x ptr> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
23754 declare void @llvm.masked.scatter.v16f32.v16p1(<16 x float> <value>, <16 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
23755 declare void @llvm.masked.scatter.v4p0.v4p0 (<4 x ptr> <value>, <4 x ptr> <ptrs>, i32 <alignment>, <4 x i1> <mask>)
23760 Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
23765 The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation is conditional when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
23774 ;; This instruction unconditionally stores data vector in multiple addresses
call void @llvm.masked.scatter.v8i32.v8p0(<8 x i32> %value, <8 x ptr> %ptrs, i32 4, <8 x i1> <true, true, .. true>)
23777 ;; It is equivalent to a list of scalar stores
23778 %val0 = extractelement <8 x i32> %value, i32 0
23779 %val1 = extractelement <8 x i32> %value, i32 1
23781 %val7 = extractelement <8 x i32> %value, i32 7
23782 %ptr0 = extractelement <8 x ptr> %ptrs, i32 0
23783 %ptr1 = extractelement <8 x ptr> %ptrs, i32 1
23785 %ptr7 = extractelement <8 x ptr> %ptrs, i32 7
23786 ;; Note: the order of the following stores is important when they overlap:
23787 store i32 %val0, ptr %ptr0, align 4
23788 store i32 %val1, ptr %ptr1, align 4
23790 store i32 %val7, ptr %ptr7, align 4
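
The ordering guarantee for overlapping addresses can be modeled in scalar C
by storing lanes in increasing index order, so that the highest enabled lane
targeting an address determines its final contents. The name
``masked_scatter_v4i32`` is hypothetical, the vector is fixed at four ``i32``
lanes, and the alignment operand is not modeled.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical scalar model of llvm.masked.scatter for <4 x i32>.
      Lanes are stored from index 0 upward; masked-off lanes are skipped. */
   static void masked_scatter_v4i32(const int32_t value[4], int32_t *ptrs[4],
                                    const bool mask[4]) {
       for (int i = 0; i < 4; ++i) {
           if (mask[i])
               *ptrs[i] = value[i];
       }
   }
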
23793 Masked Vector Expanding Load and Compressing Store Intrinsics
23794 -------------------------------------------------------------
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
23798 .. _int_expandload:
23800 '``llvm.masked.expandload.*``' Intrinsics
23801 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23805 This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
23809 declare <16 x float> @llvm.masked.expandload.v16f32 (ptr <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
23810 declare <2 x i64> @llvm.masked.expandload.v2i64 (ptr <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)
23815 Reads a number of scalar values sequentially from memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
23821 The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies, as in the following example:
23830 // In this loop we load from B and spread the elements into array A.
double *A, *B; int *C;
23832 for (int i = 0; i < size; ++i) {
23838 .. code-block:: llvm
23840 ; Load several elements from array B and expand them in a vector.
23841 ; The number of loaded elements is equal to the number of '1' elements in the Mask.
23842 %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(ptr %Bptr, <8 x i1> %Mask, <8 x double> poison)
23843 ; Store the result in A
23844 call void @llvm.masked.store.v8f64.p0(<8 x double> %Tmp, ptr %Aptr, i32 8, <8 x i1> %Mask)
23846 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
23847 %MaskI = bitcast <8 x i1> %Mask to i8
23848 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
23849 %MaskI64 = zext i8 %MaskIPopcnt to i64
23850 %BNextInd = add i64 %BInd, %MaskI64
23853 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
23854 If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
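
The '10010001' mask example from the overview can be reproduced with a scalar
C model. This is a sketch only: the name ``expandload_v8f64`` is
hypothetical, the vector is fixed at eight ``double`` lanes, and the returned
element count is an addition of this model, not a result of the intrinsic.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>

   /* Hypothetical scalar model of llvm.masked.expandload for <8 x double>.
      Consecutive elements are read from ptr, one per enabled lane; disabled
      lanes take the corresponding passthru element.
      Returns the number of elements read from memory. */
   static size_t expandload_v8f64(const double *ptr, const bool mask[8],
                                  const double passthru[8], double out[8]) {
       size_t j = 0;
       for (int i = 0; i < 8; ++i)
           out[i] = mask[i] ? ptr[j++] : passthru[i];
       return j;
   }

With lanes 0, 3 and 7 enabled, the model reads ``ptr[0]``, ``ptr[1]`` and
``ptr[2]`` and places them in lanes 0, 3 and 7.
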
23856 .. _int_compressstore:
23858 '``llvm.masked.compressstore.*``' Intrinsics
23859 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23863 This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
23867 declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, ptr <ptr>, <8 x i1> <mask>)
23868 declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, ptr <ptr>, <16 x i1> <mask>)
Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences, as in the following example:
23888 // In this loop we load elements from A and store them consecutively in B
double *A, *B; int *C;
23890 for (int i = 0; i < size; ++i) {
23896 .. code-block:: llvm
23898 ; Load elements from A.
23899 %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0(ptr %Aptr, i32 8, <8 x i1> %Mask, <8 x double> poison)
23900 ; Store all selected elements consecutively in array B
call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, ptr %Bptr, <8 x i1> %Mask)
23903 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
23904 %MaskI = bitcast <8 x i1> %Mask to i8
23905 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
23906 %MaskI64 = zext i8 %MaskIPopcnt to i64
23907 %BNextInd = add i64 %BInd, %MaskI64
23910 Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
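
The compressing behavior can be sketched with a scalar C model. The name
``compressstore_v8f64`` is hypothetical, the vector is fixed at eight
``double`` lanes, and the returned element count is an addition of this
model; in IR the count is derived separately, as in the ``ctpop`` sequence
above.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>

   /* Hypothetical scalar model of llvm.masked.compressstore for <8 x double>.
      Enabled lanes are written to consecutive addresses starting at ptr,
      from the lowest lane to the highest.
      Returns the number of elements written. */
   static size_t compressstore_v8f64(const double value[8], double *ptr,
                                     const bool mask[8]) {
       size_t j = 0;
       for (int i = 0; i < 8; ++i) {
           if (mask[i])
               ptr[j++] = value[i];
       }
       return j;
   }
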
23916 This class of intrinsics provides information about the
23917 :ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
23922 '``llvm.lifetime.start``' Intrinsic
23923 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23930 declare void @llvm.lifetime.start(i64 <size>, ptr nocapture <ptr>)
23935 The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
23941 The first argument is a constant integer representing the size of the
23942 object, or -1 if it is variable sized. The second argument is a pointer
23948 If ``ptr`` is a stack-allocated object and it points to the first byte of
23949 the object, the object is initially marked as dead.
23950 ``ptr`` is conservatively considered as a non-stack-allocated object if
23951 the stack coloring algorithm that is used in the optimization pipeline cannot
23952 conclude that ``ptr`` is a stack-allocated object.
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is marked
23955 as alive and has an uninitialized value.
23956 The stack object is marked as dead when either
23957 :ref:`llvm.lifetime.end <int_lifeend>` to the alloca is executed or the
23960 After :ref:`llvm.lifetime.end <int_lifeend>` is called,
23961 '``llvm.lifetime.start``' on the stack object can be called again.
23962 The second '``llvm.lifetime.start``' call marks the object as alive, but it
23963 does not change the address of the object.
If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive, the intrinsic
simply fills all bytes of the object with ``poison``.
23972 '``llvm.lifetime.end``' Intrinsic
23973 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23980 declare void @llvm.lifetime.end(i64 <size>, ptr nocapture <ptr>)
23985 The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
23991 The first argument is a constant integer representing the size of the
23992 object, or -1 if it is variable sized. The second argument is a pointer
23998 If ``ptr`` is a stack-allocated object and it points to the first byte of the
23999 object, the object is dead.
24000 ``ptr`` is conservatively considered as a non-stack-allocated object if
24001 the stack coloring algorithm that is used in the optimization pipeline cannot
24002 conclude that ``ptr`` is a stack-allocated object.
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
24006 If ``ptr`` is a non-stack-allocated object or it does not point to the first
24007 byte of the object, it is equivalent to simply filling all bytes of the object
24011 '``llvm.invariant.start``' Intrinsic
24012 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24016 This is an overloaded intrinsic. The memory object can belong to any address space.
24020 declare ptr @llvm.invariant.start.p0(i64 <size>, ptr nocapture <ptr>)
24025 The '``llvm.invariant.start``' intrinsic specifies that the contents of
24026 a memory object will not change.
24031 The first argument is a constant integer representing the size of the
24032 object, or -1 if it is variable sized. The second argument is a pointer
24038 This intrinsic indicates that until an ``llvm.invariant.end`` that uses
24039 the return value, the referenced memory location is constant and
24042 '``llvm.invariant.end``' Intrinsic
24043 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24047 This is an overloaded intrinsic. The memory object can belong to any address space.
24051 declare void @llvm.invariant.end.p0(ptr <start>, i64 <size>, ptr nocapture <ptr>)
24056 The '``llvm.invariant.end``' intrinsic specifies that the contents of a
24057 memory object are mutable.
24062 The first argument is the matching ``llvm.invariant.start`` intrinsic.
24063 The second argument is a constant integer representing the size of the
24064 object, or -1 if it is variable sized and the third argument is a
24065 pointer to the object.
24070 This intrinsic indicates that the memory is mutable again.
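
The pairing of ``llvm.invariant.start`` and ``llvm.invariant.end`` might be
sketched as follows. The function name ``@read_then_update`` and the 4-byte
size are assumptions made for illustration only.

.. code-block:: llvm

      declare ptr @llvm.invariant.start.p0(i64, ptr nocapture)
      declare void @llvm.invariant.end.p0(ptr, i64, ptr nocapture)

      define i32 @read_then_update(ptr %p) {
      entry:
        store i32 7, ptr %p
        ; From here until the matching end, the 4 bytes at %p do not change.
        %inv = call ptr @llvm.invariant.start.p0(i64 4, ptr %p)
        %a = load i32, ptr %p
        ; The first argument is the value returned by the matching start.
        call void @llvm.invariant.end.p0(ptr %inv, i64 4, ptr %p)
        ; The memory is mutable again, so this store is allowed.
        store i32 8, ptr %p
        ret i32 %a
      }
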
24072 '``llvm.launder.invariant.group``' Intrinsic
24073 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24077 This is an overloaded intrinsic. The memory object can belong to any address
24078 space. The returned pointer must belong to the same address space as the
24083 declare ptr @llvm.launder.invariant.group.p0(ptr <ptr>)
24088 The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
24089 established by ``invariant.group`` metadata no longer holds, to obtain a new
24090 pointer value that carries fresh invariant group information. It is an
24091 experimental intrinsic, which means that its semantics might change in the
24098 The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
24104 Returns another pointer that aliases its argument but which is considered different
24105 for the purposes of ``load``/``store`` ``invariant.group`` metadata.
24106 It does not read any accessible memory and the execution can be speculated.
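
A small, hedged illustration of laundering a pointer: the function name
``@reload`` is hypothetical, and ``!0`` is simply an empty metadata node used
as the ``invariant.group`` tag.

.. code-block:: llvm

      declare ptr @llvm.launder.invariant.group.p0(ptr)

      define i8 @reload(ptr %obj) {
      entry:
        %old = load i8, ptr %obj, !invariant.group !0
        ; %fresh aliases %obj but is considered a different pointer for the
        ; purposes of invariant.group metadata on loads and stores.
        %fresh = call ptr @llvm.launder.invariant.group.p0(ptr %obj)
        %new = load i8, ptr %fresh, !invariant.group !0
        ret i8 %new
      }

      !0 = !{}
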
24108 '``llvm.strip.invariant.group``' Intrinsic
24109 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24113 This is an overloaded intrinsic. The memory object can belong to any address
24114 space. The returned pointer must belong to the same address space as the
24119 declare ptr @llvm.strip.invariant.group.p0(ptr <ptr>)
24124 The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
24125 established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
24126 value that does not carry the invariant information. It is an experimental
24127 intrinsic, which means that its semantics might change in the future.
24133 The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
24139 Returns another pointer that aliases its argument but which has no associated
24140 ``invariant.group`` metadata.
24141 It does not read any memory and can be speculated.
24147 Constrained Floating-Point Intrinsics
24148 -------------------------------------
24150 These intrinsics are used to provide special handling of floating-point
24151 operations when specific rounding mode or floating-point exception behavior is
24152 required. By default, LLVM optimization passes assume that the rounding mode is
24153 round-to-nearest and that floating-point exceptions will not be monitored.
24154 Constrained FP intrinsics are used to support non-default rounding modes and
24155 accurately preserve exception behavior without compromising LLVM's ability to
24156 optimize FP code when the default behavior is used.
24158 If any FP operation in a function is constrained then they all must be
24159 constrained. This is required for correct LLVM IR. Optimizations that
24160 move code around can create miscompiles if mixing of constrained and normal
24161 operations is done. The correct way to mix constrained and less constrained
24162 operations is to use the rounding mode and exception handling metadata to
24163 mark constrained intrinsics as having LLVM's default behavior.
24165 Each of these intrinsics corresponds to a normal floating-point operation. The
24166 data arguments and the return value are the same as the corresponding FP
24169 The rounding mode argument is a metadata string specifying what
24170 assumptions, if any, the optimizer can make when transforming constant
24171 values. Some constrained FP intrinsics omit this argument. If required
24172 by the intrinsic, this argument must be one of the following strings:
24181 "round.tonearestaway"
24183 If this argument is "round.dynamic" optimization passes must assume that the
24184 rounding mode is unknown and may change at runtime. No transformations that
24185 depend on rounding mode may be performed in this case.
24187 The other possible values for the rounding mode argument correspond to the
24188 similarly named IEEE rounding modes. If the argument is any of these values
24189 optimization passes may perform transformations as long as they are consistent
24190 with the specified rounding mode.
24192 For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
24193 "round.downward" or "round.dynamic" because if the value of 'x' is +0 then
24194 'x-0' should evaluate to '-0' when rounding downward. However, this
24195 transformation is legal for all other rounding modes.
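
The 'x-0' case above can be written with a constrained intrinsic roughly as
follows. This is a sketch; the function name ``@sub_zero`` is an assumption
chosen for illustration.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fsub.f64(double, double,
                                                             metadata, metadata)

      define double @sub_zero(double %x) strictfp {
      entry:
        ; Under "round.downward", (+0.0) - (+0.0) evaluates to -0.0, so this
        ; call must not be simplified to %x.
        %r = call double @llvm.experimental.constrained.fsub.f64(
                              double %x, double 0.0,
                              metadata !"round.downward",
                              metadata !"fpexcept.ignore")
        ret double %r
      }
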
24197 For values other than "round.dynamic" optimization passes may assume that the
24198 actual runtime rounding mode (as defined in a target-specific manner) matches
24199 the specified rounding mode, but this is not guaranteed. Using a specific
24200 non-dynamic rounding mode which does not match the actual rounding mode at
24201 runtime results in undefined behavior.
24203 The exception behavior argument is a metadata string describing the floating
24204 point exception semantics that are required for the intrinsic. This argument
24205 must be one of the following strings:
24213 If this argument is "fpexcept.ignore" optimization passes may assume that the
24214 exception status flags will not be read and that floating-point exceptions will
24215 be masked. This allows transformations to be performed that may change the
24216 exception semantics of the original code. For example, FP operations may be
24217 speculatively executed in this case whereas they must not be for either of the
24218 other possible values of this argument.
24220 If the exception behavior argument is "fpexcept.maytrap" optimization passes
24221 must avoid transformations that may raise exceptions that would not have been
24222 raised by the original code (such as speculatively executing FP operations), but
24223 passes are not required to preserve all exceptions that are implied by the
24224 original code. For example, exceptions may be potentially hidden by constant
24227 If the exception behavior argument is "fpexcept.strict" all transformations must
24228 strictly preserve the floating-point exception semantics of the original code.
24229 Any FP exception that would have been raised by the original code must be raised
24230 by the transformed code, and the transformed code must not raise any FP
24231 exceptions that would not have been raised by the original code. This is the
24232 exception behavior argument that will be used if the code being compiled reads
24233 the FP exception status flags, but this mode can also be used with code that
24234 unmasks FP exceptions.
24236 The number and order of floating-point exceptions are NOT guaranteed. For
24237 example, a series of FP operations that each may raise exceptions may be
24238 vectorized into a single instruction that raises each unique exception a single
24241 Proper :ref:`function attributes <fnattrs>` usage is required for the
24242 constrained intrinsics to function correctly.
24244 All function *calls* done in a function that uses constrained floating
24245 point intrinsics must have the ``strictfp`` attribute either on the
24246 calling instruction or on the declaration or definition of the function
24249 All function *definitions* that use constrained floating point intrinsics
24250 must have the ``strictfp`` attribute.
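
The attribute requirements above might look like this in practice. The names
``@helper`` and ``@div_then_call`` are hypothetical; the sketch only shows where
``strictfp`` is placed.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fdiv.f64(double, double,
                                                             metadata, metadata)
      ; An ordinary external function declared without strictfp.
      declare double @helper(double)

      ; The definition uses a constrained intrinsic, so it is marked strictfp.
      define double @div_then_call(double %a, double %b) strictfp {
      entry:
        %q = call double @llvm.experimental.constrained.fdiv.f64(
                              double %a, double %b,
                              metadata !"round.dynamic",
                              metadata !"fpexcept.strict")
        ; @helper's declaration lacks strictfp, so the call site carries it.
        %r = call double @helper(double %q) strictfp
        ret double %r
      }
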
24252 '``llvm.experimental.constrained.fadd``' Intrinsic
24253 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24261 @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
24262 metadata <rounding mode>,
24263 metadata <exception behavior>)
24268 The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
24275 The first two arguments to the '``llvm.experimental.constrained.fadd``'
24276 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
24277 of floating-point values. Both arguments must have identical types.
24279 The third and fourth arguments specify the rounding mode and exception
24280 behavior as described above.
24285 The value produced is the floating-point sum of the two value operands and has
24286 the same type as the operands.
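
For instance, a vector addition could be expressed as in the sketch below. The
function name ``@vec_add`` and the choice of ``<4 x float>`` are illustrative
assumptions.

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                              <4 x float>, <4 x float>, metadata, metadata)

      define <4 x float> @vec_add(<4 x float> %a, <4 x float> %b) strictfp {
      entry:
        ; Both value operands and the result share the same vector type.
        %sum = call <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                              <4 x float> %a, <4 x float> %b,
                              metadata !"round.tonearest",
                              metadata !"fpexcept.strict")
        ret <4 x float> %sum
      }
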
24289 '``llvm.experimental.constrained.fsub``' Intrinsic
24290 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24298 @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
24299 metadata <rounding mode>,
24300 metadata <exception behavior>)
24305 The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
24306 of its two operands.
24312 The first two arguments to the '``llvm.experimental.constrained.fsub``'
24313 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
24314 of floating-point values. Both arguments must have identical types.
24316 The third and fourth arguments specify the rounding mode and exception
24317 behavior as described above.
24322 The value produced is the floating-point difference of the two value operands
24323 and has the same type as the operands.
24326 '``llvm.experimental.constrained.fmul``' Intrinsic
24327 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24335 @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
24336 metadata <rounding mode>,
24337 metadata <exception behavior>)
24342 The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
24349 The first two arguments to the '``llvm.experimental.constrained.fmul``'
24350 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
24351 of floating-point values. Both arguments must have identical types.
24353 The third and fourth arguments specify the rounding mode and exception
24354 behavior as described above.
24359 The value produced is the floating-point product of the two value operands and
24360 has the same type as the operands.
24363 '``llvm.experimental.constrained.fdiv``' Intrinsic
24364 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24372 @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
24373 metadata <rounding mode>,
24374 metadata <exception behavior>)
24379 The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
24386 The first two arguments to the '``llvm.experimental.constrained.fdiv``'
24387 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
24388 of floating-point values. Both arguments must have identical types.
24390 The third and fourth arguments specify the rounding mode and exception
24391 behavior as described above.
24396 The value produced is the floating-point quotient of the two value operands and
24397 has the same type as the operands.
24400 '``llvm.experimental.constrained.frem``' Intrinsic
24401 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24409 @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
24410 metadata <rounding mode>,
24411 metadata <exception behavior>)
24416 The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
24417 from the division of its two operands.
24423 The first two arguments to the '``llvm.experimental.constrained.frem``'
24424 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
24425 of floating-point values. Both arguments must have identical types.
24427 The third and fourth arguments specify the rounding mode and exception
24428 behavior as described above. The rounding mode argument has no effect, since
24429 the result of frem is never rounded, but the argument is included for
24430 consistency with the other constrained floating-point intrinsics.
24435 The value produced is the floating-point remainder from the division of the two
24436 value operands and has the same type as the operands. The remainder has the
24437 same sign as the dividend.
24439 '``llvm.experimental.constrained.fma``' Intrinsic
24440 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24448 @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
24449 metadata <rounding mode>,
24450 metadata <exception behavior>)
24455 The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
24456 fused-multiply-add operation on its operands.
24461 The first three arguments to the '``llvm.experimental.constrained.fma``'
24462 intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
24463 <t_vector>` of floating-point values. All arguments must have identical types.
24465 The fourth and fifth arguments specify the rounding mode and exception behavior
24466 as described above.
24471 The result produced is the product of the first two operands added to the third
24472 operand computed with infinite precision, and then rounded to the target
24475 '``llvm.experimental.constrained.fptoui``' Intrinsic
24476 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24484 @llvm.experimental.constrained.fptoui(<type> <value>,
24485 metadata <exception behavior>)
24490 The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
24491 floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
24496 The first argument to the '``llvm.experimental.constrained.fptoui``'
24497 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
24498 <t_vector>` of floating point values.
24500 The second argument specifies the exception behavior as described above.
24505 The result produced is an unsigned integer converted from the floating
24506 point operand. The value is truncated, so it is rounded towards zero.
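
A brief sketch of a constrained conversion to an unsigned integer follows. The
function name ``@to_u32`` is hypothetical. Note that only an exception behavior
argument is passed, since the conversion always truncates.

.. code-block:: llvm

      declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)

      define i32 @to_u32(double %x) strictfp {
      entry:
        ; The value is rounded toward zero; no rounding mode argument is taken.
        %i = call i32 @llvm.experimental.constrained.fptoui.i32.f64(
                              double %x,
                              metadata !"fpexcept.strict")
        ret i32 %i
      }
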
24508 '``llvm.experimental.constrained.fptosi``' Intrinsic
24509 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24517 @llvm.experimental.constrained.fptosi(<type> <value>,
24518 metadata <exception behavior>)
24523 The '``llvm.experimental.constrained.fptosi``' intrinsic converts
24524 :ref:`floating-point <t_floating>` ``value`` to type ``ty2``.
24529 The first argument to the '``llvm.experimental.constrained.fptosi``'
24530 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
24531 <t_vector>` of floating point values.
24533 The second argument specifies the exception behavior as described above.
24538 The result produced is a signed integer converted from the floating
24539 point operand. The value is truncated, so it is rounded towards zero.
24541 '``llvm.experimental.constrained.uitofp``' Intrinsic
24542 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24550 @llvm.experimental.constrained.uitofp(<type> <value>,
24551 metadata <rounding mode>,
24552 metadata <exception behavior>)
24557 The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
24558 unsigned integer ``value`` to a floating-point of type ``ty2``.
24563 The first argument to the '``llvm.experimental.constrained.uitofp``'
24564 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
24565 <t_vector>` of integer values.
24567 The second and third arguments specify the rounding mode and exception
24568 behavior as described above.
24573 An inexact floating-point exception will be raised if rounding is required.
24574 Any result produced is a floating point value converted from the input
24577 '``llvm.experimental.constrained.sitofp``' Intrinsic
24578 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24586 @llvm.experimental.constrained.sitofp(<type> <value>,
24587 metadata <rounding mode>,
24588 metadata <exception behavior>)
24593 The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
24594 signed integer ``value`` to a floating-point of type ``ty2``.
24599 The first argument to the '``llvm.experimental.constrained.sitofp``'
24600 intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
24601 <t_vector>` of integer values.
24603 The second and third arguments specify the rounding mode and exception
24604 behavior as described above.
24609 An inexact floating-point exception will be raised if rounding is required.
24610 Any result produced is a floating point value converted from the input
24613 '``llvm.experimental.constrained.fptrunc``' Intrinsic
24614 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24622 @llvm.experimental.constrained.fptrunc(<type> <value>,
24623 metadata <rounding mode>,
24624 metadata <exception behavior>)
24629 The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
24635 The first argument to the '``llvm.experimental.constrained.fptrunc``'
24636 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
24637 <t_vector>` of floating point values. This argument must be larger in size
24640 The second and third arguments specify the rounding mode and exception
24641 behavior as described above.
24646 The result produced is a floating point value truncated to be smaller in size
24649 '``llvm.experimental.constrained.fpext``' Intrinsic
24650 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24658 @llvm.experimental.constrained.fpext(<type> <value>,
24659 metadata <exception behavior>)
24664 The '``llvm.experimental.constrained.fpext``' intrinsic extends a
24665 floating-point ``value`` to a larger floating-point value.
24670 The first argument to the '``llvm.experimental.constrained.fpext``'
24671 intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
24672 <t_vector>` of floating point values. This argument must be smaller in size
24675 The second argument specifies the exception behavior as described above.
24680 The result produced is a floating point value extended to be larger in size
24681 than the operand. All restrictions that apply to the fpext instruction also
24682 apply to this intrinsic.
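
The difference in argument lists between ``fpext`` and ``fptrunc`` can be seen
in the following sketch, where ``@widen_and_narrow`` is a hypothetical function
name. Extension takes only the exception behavior, while truncation also takes
a rounding mode.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)
      declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata,
                                                                   metadata)

      define float @widen_and_narrow(float %x) strictfp {
      entry:
        ; float -> double: the operand is smaller than the result type.
        %wide = call double @llvm.experimental.constrained.fpext.f64.f32(
                              float %x,
                              metadata !"fpexcept.strict")
        ; double -> float: the rounding mode is needed because precision is lost.
        %narrow = call float @llvm.experimental.constrained.fptrunc.f32.f64(
                              double %wide,
                              metadata !"round.dynamic",
                              metadata !"fpexcept.strict")
        ret float %narrow
      }
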
24684 '``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
24685 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24693 @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
24694 metadata <condition code>,
24695 metadata <exception behavior>)
24697 @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
24698 metadata <condition code>,
24699 metadata <exception behavior>)
24704 The '``llvm.experimental.constrained.fcmp``' and
24705 '``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
24706 value or vector of boolean values based on comparison of their operands.
24708 If the operands are floating-point scalars, then the result type is a
24709 boolean (:ref:`i1 <t_integer>`).
24711 If the operands are floating-point vectors, then the result type is a
24712 vector of boolean with the same number of elements as the operands being
24715 The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
24716 comparison operation while the '``llvm.experimental.constrained.fcmps``'
24717 intrinsic performs a signaling comparison operation.
24722 The first two arguments to the '``llvm.experimental.constrained.fcmp``'
24723 and '``llvm.experimental.constrained.fcmps``' intrinsics must be
24724 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
24725 of floating-point values. Both arguments must have identical types.
24727 The third argument is the condition code indicating the kind of comparison
24728 to perform. It must be a metadata string with one of the following values:
24732 - "``oeq``": ordered and equal
24733 - "``ogt``": ordered and greater than
24734 - "``oge``": ordered and greater than or equal
24735 - "``olt``": ordered and less than
24736 - "``ole``": ordered and less than or equal
24737 - "``one``": ordered and not equal
24738 - "``ord``": ordered (no nans)
24739 - "``ueq``": unordered or equal
24740 - "``ugt``": unordered or greater than
24741 - "``uge``": unordered or greater than or equal
24742 - "``ult``": unordered or less than
24743 - "``ule``": unordered or less than or equal
24744 - "``une``": unordered or not equal
24745 - "``uno``": unordered (either nans)
24747 *Ordered* means that neither operand is a NAN while *unordered* means
24748 that either operand may be a NAN.
24750 The fourth argument specifies the exception behavior as described above.
24755 ``op1`` and ``op2`` are compared according to the condition code given
24756 as the third argument. If the operands are vectors, then the
24757 vectors are compared element by element. Each comparison performed
24758 always yields an :ref:`i1 <t_integer>` result, as follows:
24760 .. _fcmp_md_cc_sem:
24762 - "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
24763 is equal to ``op2``.
24764 - "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
24765 is greater than ``op2``.
24766 - "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
24767 is greater than or equal to ``op2``.
24768 - "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
24769 is less than ``op2``.
24770 - "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
24771 is less than or equal to ``op2``.
24772 - "``one``": yields ``true`` if both operands are not a NAN and ``op1``
24773 is not equal to ``op2``.
24774 - "``ord``": yields ``true`` if both operands are not a NAN.
24775 - "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
24777 - "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
24778 greater than ``op2``.
24779 - "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
24780 greater than or equal to ``op2``.
24781 - "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
24783 - "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
24784 less than or equal to ``op2``.
24785 - "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
24786 not equal to ``op2``.
24787 - "``uno``": yields ``true`` if either operand is a NAN.
24789 The quiet comparison operation performed by
24790 '``llvm.experimental.constrained.fcmp``' will only raise an exception
24791 if either operand is a SNAN. The signaling comparison operation
24792 performed by '``llvm.experimental.constrained.fcmps``' will raise an
24793 exception if either operand is a NAN (QNAN or SNAN). Such an exception
24794 does not preclude a result being produced (e.g. exception might only
24795 set a flag), therefore the distinction between ordered and unordered
24796 comparisons is also relevant for the
24797 '``llvm.experimental.constrained.fcmps``' intrinsic.
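
To contrast the quiet and signaling forms, the sketch below performs the same
ordered less-than comparison with each intrinsic. The function names
``@less_quiet`` and ``@less_signaling`` are assumptions for illustration.

.. code-block:: llvm

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double,
                                                         metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double,
                                                          metadata, metadata)

      define i1 @less_quiet(double %a, double %b) strictfp {
      entry:
        ; Quiet comparison: raises an exception only if an operand is a SNAN.
        %c = call i1 @llvm.experimental.constrained.fcmp.f64(
                              double %a, double %b,
                              metadata !"olt",
                              metadata !"fpexcept.strict")
        ret i1 %c
      }

      define i1 @less_signaling(double %a, double %b) strictfp {
      entry:
        ; Signaling comparison: raises an exception if an operand is any NAN.
        %c = call i1 @llvm.experimental.constrained.fcmps.f64(
                              double %a, double %b,
                              metadata !"olt",
                              metadata !"fpexcept.strict")
        ret i1 %c
      }

In both functions the result is ``false`` when either operand is a NAN, because
``olt`` is an ordered condition code.
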
24799 '``llvm.experimental.constrained.fmuladd``' Intrinsic
24800 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24808 @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
24810 metadata <rounding mode>,
24811 metadata <exception behavior>)
24816 The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
24817 multiply-add expressions that can be fused if the code generator determines
24818 that (a) the target instruction set has support for a fused operation,
24819 and (b) that the fused operation is more efficient than the equivalent,
24820 separate pair of mul and add instructions.
24825 The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
24826 intrinsic must be floating-point or vector of floating-point values.
24827 All three arguments must have identical types.
24829 The fourth and fifth arguments specify the rounding mode and exception behavior
24830 as described above.
24839 %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c,
24840 metadata <rounding mode>,
24841 metadata <exception behavior>)
24843 is equivalent to the expression:
24847 %0 = call float @llvm.experimental.constrained.fmul.f32(%a, %b,
24848 metadata <rounding mode>,
24849 metadata <exception behavior>)
24850 %1 = call float @llvm.experimental.constrained.fadd.f32(%0, %c,
24851 metadata <rounding mode>,
24852 metadata <exception behavior>)
24854 except that it is unspecified whether rounding will be performed between the
24855 multiplication and addition steps. Fusion is not guaranteed, even if the target
24856 platform supports it.
24857 If a fused multiply-add is required, the corresponding
24858 :ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
24860 This never sets errno, just like '``llvm.experimental.constrained.fma.*``'.
24862 Constrained libm-equivalent Intrinsics
24863 --------------------------------------
24865 In addition to the basic floating-point operations for which constrained
24866 intrinsics are described above, there are constrained versions of various
24867 operations which provide equivalent behavior to a corresponding libm function.
24868 These intrinsics allow the precise behavior of these operations with respect to
24869 rounding mode and exception behavior to be controlled.
24871 As with the basic constrained floating-point intrinsics, the rounding mode
24872 and exception behavior arguments only control the behavior of the optimizer.
24873 They do not change the runtime floating-point environment.
24876 '``llvm.experimental.constrained.sqrt``' Intrinsic
24877 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24885 @llvm.experimental.constrained.sqrt(<type> <op1>,
24886 metadata <rounding mode>,
24887 metadata <exception behavior>)
24892 The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
24893 of the specified value, returning the same value as the libm '``sqrt``'
24894 functions would, but without setting ``errno``.
24899 The first argument and the return type are floating-point numbers of the same
24902 The second and third arguments specify the rounding mode and exception
24903 behavior as described above.
24908 This function returns the nonnegative square root of the specified value.
24909 If the value is less than negative zero, a floating-point exception occurs
24910 and the return value is architecture specific.
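
A call to the constrained square root might be sketched as follows, with
``@root`` as a hypothetical function name.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.sqrt.f64(double, metadata,
                                                             metadata)

      define double @root(double %x) strictfp {
      entry:
        ; errno is not set, even for inputs for which libm sqrt would set it.
        %r = call double @llvm.experimental.constrained.sqrt.f64(
                              double %x,
                              metadata !"round.dynamic",
                              metadata !"fpexcept.strict")
        ret double %r
      }
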
24913 '``llvm.experimental.constrained.pow``' Intrinsic
24914 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24922 @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
24923 metadata <rounding mode>,
24924 metadata <exception behavior>)
24929 The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
24930 raised to the (positive or negative) power specified by the second operand.
24935 The first two arguments and the return value are floating-point numbers of the
24936 same type. The second argument specifies the power to which the first argument
24939 The third and fourth arguments specify the rounding mode and exception
24940 behavior as described above.
24945 This function returns the first value raised to the second power,
24946 returning the same values as the libm ``pow`` functions would, and
24947 handles error conditions in the same way.
24950 '``llvm.experimental.constrained.powi``' Intrinsic
24951 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24959 @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
24960 metadata <rounding mode>,
24961 metadata <exception behavior>)
24966 The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
24967 raised to the (positive or negative) power specified by the second operand. The
24968 order of evaluation of multiplications is not defined. When a vector of
24969 floating-point type is used, the second argument remains a scalar integer value.
24975 The first argument and the return value are floating-point numbers of the same
24976 type. The second argument is a 32-bit signed integer specifying the power to
24977 which the first argument should be raised.
24979 The third and fourth arguments specify the rounding mode and exception
24980 behavior as described above.
24985 This function returns the first value raised to the second power with an
24986 unspecified sequence of rounding operations.
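
The vector case mentioned in the overview can be illustrated with the sketch
below. The function name ``@cube_each`` and the ``<2 x double>`` type are
assumptions. The exponent stays a scalar ``i32`` even though the base is a
vector.

.. code-block:: llvm

      declare <2 x double> @llvm.experimental.constrained.powi.v2f64(
                              <2 x double>, i32, metadata, metadata)

      define <2 x double> @cube_each(<2 x double> %v) strictfp {
      entry:
        ; Each element of %v is raised to the power 3.
        %r = call <2 x double> @llvm.experimental.constrained.powi.v2f64(
                              <2 x double> %v, i32 3,
                              metadata !"round.tonearest",
                              metadata !"fpexcept.maytrap")
        ret <2 x double> %r
      }
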
24989 '``llvm.experimental.constrained.ldexp``' Intrinsic
24990 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24998 @llvm.experimental.constrained.ldexp(<type0> <op1>, <type1> <op2>,
24999 metadata <rounding mode>,
25000 metadata <exception behavior>)
25005 The '``llvm.experimental.constrained.ldexp``' intrinsic performs the ldexp function.
25011 The first argument and the return value are :ref:`floating-point
25012 <t_floating>` or :ref:`vector <t_vector>` of floating-point values of
25013 the same type. The second argument is an integer with the same number
25017 The third and fourth arguments specify the rounding mode and exception
25018 behavior as described above.
25023 This function multiplies the first argument by 2 raised to the second
25024 argument's power. If the first argument is NaN or infinite, the same
25025 value is returned. If the result underflows, a zero with the same sign
25026 is returned. If the result overflows, the result is an infinity with
25030 '``llvm.experimental.constrained.sin``' Intrinsic
25031 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25039 @llvm.experimental.constrained.sin(<type> <op1>,
25040 metadata <rounding mode>,
25041 metadata <exception behavior>)
25046 The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
25052 The first argument and the return type are floating-point numbers of the same
25055 The second and third arguments specify the rounding mode and exception
25056 behavior as described above.
25061 This function returns the sine of the specified operand, returning the
25062 same values as the libm ``sin`` functions would, and handles error
25063 conditions in the same way.
25066 '``llvm.experimental.constrained.cos``' Intrinsic
25067 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25075 @llvm.experimental.constrained.cos(<type> <op1>,
25076 metadata <rounding mode>,
25077 metadata <exception behavior>)
25082 The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
25088 The first argument and the return type are floating-point numbers of the same
25091 The second and third arguments specify the rounding mode and exception
25092 behavior as described above.
25097 This function returns the cosine of the specified operand, returning the
25098 same values as the libm ``cos`` functions would, and handles error
25099 conditions in the same way.
25102 '``llvm.experimental.constrained.exp``' Intrinsic
25103 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25111 @llvm.experimental.constrained.exp(<type> <op1>,
25112 metadata <rounding mode>,
25113 metadata <exception behavior>)
25118 The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
25119 exponential of the specified value.
25124 The first argument and the return value are floating-point numbers of the same
25127 The second and third arguments specify the rounding mode and exception
25128 behavior as described above.
25133 This function returns the same values as the libm ``exp`` functions
25134 would, and handles error conditions in the same way.
25137 '``llvm.experimental.constrained.exp2``' Intrinsic
25138 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25146 @llvm.experimental.constrained.exp2(<type> <op1>,
25147 metadata <rounding mode>,
25148 metadata <exception behavior>)
25153 The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
25154 exponential of the specified value.
25160 The first argument and the return value are floating-point numbers of the same
25163 The second and third arguments specify the rounding mode and exception
25164 behavior as described above.
25169 This function returns the same values as the libm ``exp2`` functions
25170 would, and handles error conditions in the same way.
25173 '``llvm.experimental.constrained.log``' Intrinsic
25174 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25182 @llvm.experimental.constrained.log(<type> <op1>,
25183 metadata <rounding mode>,
25184 metadata <exception behavior>)
25189 The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
25190 logarithm of the specified value.
25195 The first argument and the return value are floating-point numbers of the same
25198 The second and third arguments specify the rounding mode and exception
25199 behavior as described above.
25205 This function returns the same values as the libm ``log`` functions
25206 would, and handles error conditions in the same way.
25209 '``llvm.experimental.constrained.log10``' Intrinsic
25210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25218 @llvm.experimental.constrained.log10(<type> <op1>,
25219 metadata <rounding mode>,
25220 metadata <exception behavior>)
25225 The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
25226 logarithm of the specified value.
25231 The first argument and the return value are floating-point numbers of the same
25234 The second and third arguments specify the rounding mode and exception
25235 behavior as described above.
25240 This function returns the same values as the libm ``log10`` functions
25241 would, and handles error conditions in the same way.
25244 '``llvm.experimental.constrained.log2``' Intrinsic
25245 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25253 @llvm.experimental.constrained.log2(<type> <op1>,
25254 metadata <rounding mode>,
25255 metadata <exception behavior>)
25260 The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
25261 logarithm of the specified value.
25266 The first argument and the return value are floating-point numbers of the same
25269 The second and third arguments specify the rounding mode and exception
25270 behavior as described above.
25275 This function returns the same values as the libm ``log2`` functions
25276 would, and handles error conditions in the same way.
25279 '``llvm.experimental.constrained.rint``' Intrinsic
25280 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25288 @llvm.experimental.constrained.rint(<type> <op1>,
25289 metadata <rounding mode>,
25290 metadata <exception behavior>)
25295 The '``llvm.experimental.constrained.rint``' intrinsic returns the first
25296 operand rounded to the nearest integer. It may raise an inexact floating-point
25297 exception if the operand is not an integer.
25302 The first argument and the return value are floating-point numbers of the same
25305 The second and third arguments specify the rounding mode and exception
25306 behavior as described above.
25311 This function returns the same values as the libm ``rint`` functions
25312 would, and handles error conditions in the same way. The rounding mode is
25313 described, not determined, by the rounding mode argument. The actual rounding
25314 mode is determined by the runtime floating-point environment. The rounding
25315 mode argument is only intended as information to the compiler.
25318 '``llvm.experimental.constrained.lrint``' Intrinsic
25319 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25327 @llvm.experimental.constrained.lrint(<fptype> <op1>,
25328 metadata <rounding mode>,
25329 metadata <exception behavior>)
25334 The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
25335 operand rounded to the nearest integer. An inexact floating-point exception
25336 will be raised if the operand is not an integer. An invalid exception is
25337 raised if the result is too large to fit into a supported integer type,
25338 and in this case the result is undefined.
25343 The first argument is a floating-point number. The return value is an
25344 integer type. Not all types are supported on all targets. The supported
25345 types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
25348 The second and third arguments specify the rounding mode and exception
25349 behavior as described above.
25354 This function returns the same values as the libm ``lrint`` functions
25355 would, and handles error conditions in the same way.
25357 The rounding mode is described, not determined, by the rounding mode
25358 argument. The actual rounding mode is determined by the runtime floating-point
25359 environment. The rounding mode argument is only intended as information
25362 If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.lrint`` intrinsic.
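
A minimal sketch of a call, assuming an ``i64`` result and a ``double``
operand (the wrapper name ``@strict_lrint`` is hypothetical). Note that the
rounding mode metadata only informs the optimizer; it does not change the mode
that is in effect at run time.

.. code-block:: llvm

      declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

      define i64 @strict_lrint(double %x) strictfp {
        ; "round.tonearest" asserts which mode is in effect; it does not set it.
        %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                          double %x,
                          metadata !"round.tonearest",
                          metadata !"fpexcept.strict") strictfp
        ret i64 %r
      }
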
25366 '``llvm.experimental.constrained.llrint``' Intrinsic
25367 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25375 @llvm.experimental.constrained.llrint(<fptype> <op1>,
25376 metadata <rounding mode>,
25377 metadata <exception behavior>)
25382 The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
25383 operand rounded to the nearest integer. An inexact floating-point exception
25384 will be raised if the operand is not an integer. An invalid exception is
25385 raised if the result is too large to fit into a supported integer type,
25386 and in this case the result is undefined.
25391 The first argument is a floating-point number. The return value is an
25392 integer type. Not all types are supported on all targets. The supported
25393 types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
25396 The second and third arguments specify the rounding mode and exception
25397 behavior as described above.
25402 This function returns the same values as the libm ``llrint`` functions
25403 would, and handles error conditions in the same way.
25405 The rounding mode is described, not determined, by the rounding mode
25406 argument. The actual rounding mode is determined by the runtime floating-point
25407 environment. The rounding mode argument is only intended as information
25410 If the runtime floating-point environment is using the default rounding mode
then the results will be the same as the ``llvm.llrint`` intrinsic.
25414 '``llvm.experimental.constrained.nearbyint``' Intrinsic
25415 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25423 @llvm.experimental.constrained.nearbyint(<type> <op1>,
25424 metadata <rounding mode>,
25425 metadata <exception behavior>)
25430 The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
25431 operand rounded to the nearest integer. It will not raise an inexact
25432 floating-point exception if the operand is not an integer.
25438 The first argument and the return value are floating-point numbers of the same
25441 The second and third arguments specify the rounding mode and exception
25442 behavior as described above.
25447 This function returns the same values as the libm ``nearbyint`` functions
25448 would, and handles error conditions in the same way. The rounding mode is
25449 described, not determined, by the rounding mode argument. The actual rounding
25450 mode is determined by the runtime floating-point environment. The rounding
25451 mode argument is only intended as information to the compiler.
25454 '``llvm.experimental.constrained.maxnum``' Intrinsic
25455 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
25469 The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
25470 of the two arguments.
25475 The first two arguments and the return value are floating-point numbers
25478 The third argument specifies the exception behavior as described above.
25483 This function follows the IEEE-754 semantics for maxNum.
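
The two-operand intrinsics in this group take no rounding mode argument, only
the exception behavior. A minimal sketch for the ``float`` overload follows;
the wrapper name ``@strict_max`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)

      define float @strict_max(float %a, float %b) strictfp {
        %m = call float @llvm.experimental.constrained.maxnum.f32(
                            float %a, float %b,
                            metadata !"fpexcept.strict") strictfp
        ret float %m
      }
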
25486 '``llvm.experimental.constrained.minnum``' Intrinsic
25487 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
25501 The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
25502 of the two arguments.
25507 The first two arguments and the return value are floating-point numbers
25510 The third argument specifies the exception behavior as described above.
25515 This function follows the IEEE-754 semantics for minNum.
25518 '``llvm.experimental.constrained.maximum``' Intrinsic
25519 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
25533 The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
25534 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
25539 The first two arguments and the return value are floating-point numbers
25542 The third argument specifies the exception behavior as described above.
This function follows the semantics specified in the draft of IEEE 754-2018.
25550 '``llvm.experimental.constrained.minimum``' Intrinsic
25551 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
25565 The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
25566 of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
25571 The first two arguments and the return value are floating-point numbers
25574 The third argument specifies the exception behavior as described above.
This function follows the semantics specified in the draft of IEEE 754-2018.
25582 '``llvm.experimental.constrained.ceil``' Intrinsic
25583 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25591 @llvm.experimental.constrained.ceil(<type> <op1>,
25592 metadata <exception behavior>)
25597 The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
25603 The first argument and the return value are floating-point numbers of the same
25606 The second argument specifies the exception behavior as described above.
25611 This function returns the same values as the libm ``ceil`` functions
25612 would and handles error conditions in the same way.
25615 '``llvm.experimental.constrained.floor``' Intrinsic
25616 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25624 @llvm.experimental.constrained.floor(<type> <op1>,
25625 metadata <exception behavior>)
25630 The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
25636 The first argument and the return value are floating-point numbers of the same
25639 The second argument specifies the exception behavior as described above.
25644 This function returns the same values as the libm ``floor`` functions
25645 would and handles error conditions in the same way.
25648 '``llvm.experimental.constrained.round``' Intrinsic
25649 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25657 @llvm.experimental.constrained.round(<type> <op1>,
25658 metadata <exception behavior>)
25663 The '``llvm.experimental.constrained.round``' intrinsic returns the first
25664 operand rounded to the nearest integer.
25669 The first argument and the return value are floating-point numbers of the same
25672 The second argument specifies the exception behavior as described above.
25677 This function returns the same values as the libm ``round`` functions
25678 would and handles error conditions in the same way.
25681 '``llvm.experimental.constrained.roundeven``' Intrinsic
25682 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25690 @llvm.experimental.constrained.roundeven(<type> <op1>,
25691 metadata <exception behavior>)
25696 The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
25697 operand rounded to the nearest integer in floating-point format, rounding
25698 halfway cases to even (that is, to the nearest value that is an even integer),
25699 regardless of the current rounding direction.
25704 The first argument and the return value are floating-point numbers of the same
25707 The second argument specifies the exception behavior as described above.
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``. It
also behaves in the same way as the C standard function ``roundeven`` and can signal
the invalid operation exception for a signaling NaN operand.
25717 '``llvm.experimental.constrained.lround``' Intrinsic
25718 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25726 @llvm.experimental.constrained.lround(<fptype> <op1>,
25727 metadata <exception behavior>)
25732 The '``llvm.experimental.constrained.lround``' intrinsic returns the first
25733 operand rounded to the nearest integer with ties away from zero. It will
25734 raise an inexact floating-point exception if the operand is not an integer.
25735 An invalid exception is raised if the result is too large to fit into a
25736 supported integer type, and in this case the result is undefined.
25741 The first argument is a floating-point number. The return value is an
25742 integer type. Not all types are supported on all targets. The supported
25743 types are the same as the ``llvm.lround`` intrinsic and the ``lround``
25746 The second argument specifies the exception behavior as described above.
25751 This function returns the same values as the libm ``lround`` functions
25752 would and handles error conditions in the same way.
25755 '``llvm.experimental.constrained.llround``' Intrinsic
25756 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25764 @llvm.experimental.constrained.llround(<fptype> <op1>,
25765 metadata <exception behavior>)
25770 The '``llvm.experimental.constrained.llround``' intrinsic returns the first
25771 operand rounded to the nearest integer with ties away from zero. It will
25772 raise an inexact floating-point exception if the operand is not an integer.
25773 An invalid exception is raised if the result is too large to fit into a
25774 supported integer type, and in this case the result is undefined.
25779 The first argument is a floating-point number. The return value is an
25780 integer type. Not all types are supported on all targets. The supported
25781 types are the same as the ``llvm.llround`` intrinsic and the ``llround``
25784 The second argument specifies the exception behavior as described above.
25789 This function returns the same values as the libm ``llround`` functions
25790 would and handles error conditions in the same way.
25793 '``llvm.experimental.constrained.trunc``' Intrinsic
25794 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25802 @llvm.experimental.constrained.trunc(<type> <op1>,
25803 metadata <exception behavior>)
25808 The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
25809 operand rounded to the nearest integer not larger in magnitude than the
25815 The first argument and the return value are floating-point numbers of the same
25818 The second argument specifies the exception behavior as described above.
25823 This function returns the same values as the libm ``trunc`` functions
25824 would and handles error conditions in the same way.
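
The rounding intrinsics with a fixed rounding direction (``ceil``, ``floor``,
``round``, ``roundeven`` and ``trunc``) take a single metadata argument. A
minimal sketch for ``trunc`` on ``double`` follows; the wrapper name
``@strict_trunc`` and the choice of ``fpexcept.maytrap`` are illustrative
assumptions.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.trunc.f64(double, metadata)

      define double @strict_trunc(double %x) strictfp {
        %r = call double @llvm.experimental.constrained.trunc.f64(
                             double %x,
                             metadata !"fpexcept.maytrap") strictfp
        ret double %r
      }
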
25826 .. _int_experimental_noalias_scope_decl:
25828 '``llvm.experimental.noalias.scope.decl``' Intrinsic
25829 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25837 declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
25842 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
25843 noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
25845 the scope might need to be duplicated as well.
25851 The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
25852 metadata references. The format is identical to that required for ``noalias``
25853 metadata. This list must have exactly one element.
25858 The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
25859 noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
25861 the scope might need to be duplicated as well.
25863 For example, when the intrinsic is used inside a loop body, and that loop is
25864 unrolled, the associated noalias scope must also be duplicated. Otherwise, the
25865 noalias property it signifies would spill across loop iterations, whereas it
25866 was only valid within a single iteration.
25868 .. code-block:: llvm
   ; This example shows two possible positions for noalias.decl and how they impact the semantics:
25871 ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
25872 ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
   define void @decl_in_loop(ptr %a.base, ptr %b.base) {
25875 ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
25879 %a = phi ptr [ %a.base, %entry ], [ %a.inc, %loop ]
25880 %b = phi ptr [ %b.base, %entry ], [ %b.inc, %loop ]
25881 ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
25882 %val = load i8, ptr %a, !alias.scope !2
25883 store i8 %val, ptr %b, !noalias !2
25884 %a.inc = getelementptr inbounds i8, ptr %a, i64 1
25885 %b.inc = getelementptr inbounds i8, ptr %b, i64 1
25886 %cond = call i1 @cond()
25887 br i1 %cond, label %loop, label %exit
25893 !0 = !{!0} ; domain
25894 !1 = !{!1, !0} ; scope
25895 !2 = !{!1} ; scope list
Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
25898 are possible, but one should never dominate another. Violations are pointed out
25899 by the verifier as they indicate a problem in either a transformation pass or
25903 Floating Point Environment Manipulation intrinsics
25904 --------------------------------------------------
These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See :ref:`Floating Point Environment <floatenv>`.
25910 .. _int_get_rounding:
25912 '``llvm.get.rounding``' Intrinsic
25913 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25920 declare i32 @llvm.get.rounding()
25925 The '``llvm.get.rounding``' intrinsic reads the current rounding mode.
25930 The '``llvm.get.rounding``' intrinsic returns the current rounding mode.
The encoding of the returned value is the same as the result of ``FLT_ROUNDS``,
as specified by the C standard:
    0  - toward zero
    1  - to nearest, ties to even
    2  - toward positive infinity
    3  - toward negative infinity
    4  - to nearest, ties away from zero
Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.
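
As a minimal sketch of how the returned encoding might be consumed, the
hypothetical helper below tests whether the current mode is "to nearest, ties
to even" (value 1).

.. code-block:: llvm

      declare i32 @llvm.get.rounding()

      ; Hypothetical helper: true if the current rounding mode is
      ; round-to-nearest, ties-to-even (encoded as 1).
      define i1 @is_round_to_nearest_even() {
        %mode = call i32 @llvm.get.rounding()
        %is.rne = icmp eq i32 %mode, 1
        ret i1 %is.rne
      }
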
25945 '``llvm.set.rounding``' Intrinsic
25946 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25953 declare void @llvm.set.rounding(i32 <val>)
The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
The argument is the required rounding mode. The encoding of the rounding mode is
the same as that used by '``llvm.get.rounding``'.
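
A common pattern is to save the current mode, switch to another one for a
single operation, and then restore the saved mode. The sketch below assumes
that pattern; the function name ``@add_toward_pos_inf`` is hypothetical, and
the constrained ``fadd`` is used so that the addition is not evaluated under
the assumption of the default rounding mode.

.. code-block:: llvm

      declare i32 @llvm.get.rounding()
      declare void @llvm.set.rounding(i32)
      declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

      define double @add_toward_pos_inf(double %a, double %b) strictfp {
        ; Remember the mode that was active on entry.
        %saved = call i32 @llvm.get.rounding()
        ; 2 = toward positive infinity.
        call void @llvm.set.rounding(i32 2)
        %sum = call double @llvm.experimental.constrained.fadd.f64(
                               double %a, double %b,
                               metadata !"round.upward",
                               metadata !"fpexcept.ignore") strictfp
        ; Restore the caller's rounding mode.
        call void @llvm.set.rounding(i32 %saved)
        ret double %sum
      }
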
The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function ``fesetround``; however, this intrinsic does not
return any value and uses a platform-independent representation of IEEE rounding
25975 '``llvm.get.fpenv``' Intrinsic
25976 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25983 declare <integer_type> @llvm.get.fpenv()
25988 The '``llvm.get.fpenv``' intrinsic returns bits of the current floating-point
25989 environment. The return value type is platform-specific.
25994 The '``llvm.get.fpenv``' intrinsic reads the current floating-point environment
25995 and returns it as an integer value.
25998 '``llvm.set.fpenv``' Intrinsic
25999 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26006 declare void @llvm.set.fpenv(<integer_type> <val>)
26011 The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment.
26016 The argument is an integer representing the new floating-point environment. The
26017 integer type is platform-specific.
26022 The '``llvm.set.fpenv``' intrinsic sets the current floating-point environment
26023 to the state specified by the argument. The state may be previously obtained by a
26024 call to '``llvm.get.fpenv``' or synthesised in a platform-dependent way.
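
A minimal save-and-restore sketch follows. It assumes a target whose
floating-point environment fits in an ``i64``; the actual integer type is
platform-specific, and both ``@callee`` and ``@call_preserving_fpenv`` are
hypothetical names.

.. code-block:: llvm

      ; Assumption: the target represents its environment as an i64.
      declare i64 @llvm.get.fpenv.i64()
      declare void @llvm.set.fpenv.i64(i64)
      ; Hypothetical function that may modify the environment.
      declare void @callee()

      define void @call_preserving_fpenv() strictfp {
        %env = call i64 @llvm.get.fpenv.i64()
        call void @callee()
        ; Undo any changes made by the callee.
        call void @llvm.set.fpenv.i64(i64 %env)
        ret void
      }
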
26027 '``llvm.reset.fpenv``' Intrinsic
26028 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26035 declare void @llvm.reset.fpenv()
26040 The '``llvm.reset.fpenv``' intrinsic sets the default floating-point environment.
The '``llvm.reset.fpenv``' intrinsic sets the current floating-point environment
to the default state. It is similar to the call ``fesetenv(FE_DFL_ENV)``, except that it
does not return any value.
26049 .. _int_get_fpmode:
26051 '``llvm.get.fpmode``' Intrinsic
26052 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26057 The '``llvm.get.fpmode``' intrinsic returns bits of the current floating-point
26058 control modes. The return value type is platform-specific.
26062 declare <integer_type> @llvm.get.fpmode()
The '``llvm.get.fpmode``' intrinsic reads the current dynamic floating-point
control modes and returns them as an integer value.
The '``llvm.get.fpmode``' intrinsic reads the current dynamic floating-point
control modes, such as rounding direction, precision, treatment of denormals and
so on. It is similar to the C library function ``fegetmode``; however, this
function does not store the set of control modes into memory but returns it as
an integer value. Interpretation of the bits in this value is target-dependent.
26084 '``llvm.set.fpmode``' Intrinsic
26085 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26090 The '``llvm.set.fpmode``' intrinsic sets the current floating-point control modes.
26094 declare void @llvm.set.fpmode(<integer_type> <val>)
26099 The '``llvm.set.fpmode``' intrinsic sets the current dynamic floating-point
26105 The argument is a set of floating-point control modes, represented as an integer
26106 value in a target-dependent way.
The '``llvm.set.fpmode``' intrinsic sets the current dynamic floating-point
control modes to the state specified by the argument, which must be obtained by
a call to '``llvm.get.fpmode``' or constructed in a target-specific way. It is
similar to the C library function ``fesetmode``; however, this function does not
read the set of control modes from memory but receives it as an integer value.
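
The control modes can be saved and restored in the same way as the full
environment. The sketch below assumes a target that represents the modes as an
``i32``; the actual integer type is target-dependent, and ``@callee`` and
``@call_preserving_fpmode`` are hypothetical names.

.. code-block:: llvm

      ; Assumption: the target represents its control modes as an i32.
      declare i32 @llvm.get.fpmode.i32()
      declare void @llvm.set.fpmode.i32(i32)
      ; Hypothetical function that may change the control modes.
      declare void @callee()

      define void @call_preserving_fpmode() strictfp {
        %modes = call i32 @llvm.get.fpmode.i32()
        call void @callee()
        ; Restore the modes that were active before the call.
        call void @llvm.set.fpmode.i32(i32 %modes)
        ret void
      }
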
26117 '``llvm.reset.fpmode``' Intrinsic
26118 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26125 declare void @llvm.reset.fpmode()
26130 The '``llvm.reset.fpmode``' intrinsic sets the default dynamic floating-point
The '``llvm.reset.fpmode``' intrinsic sets the current dynamic floating-point
environment to the default state. It is similar to the C library function call
``fesetmode(FE_DFL_MODE)``; however, this function does not return any value.
26146 Floating-Point Test Intrinsics
26147 ------------------------------
26149 These functions get properties of floating-point values.
26152 .. _llvm.is.fpclass:
26154 '``llvm.is.fpclass``' Intrinsic
26155 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26162 declare i1 @llvm.is.fpclass(<fptype> <op>, i32 <test>)
26163 declare <N x i1> @llvm.is.fpclass(<vector-fptype> <op>, i32 <test>)
26168 The '``llvm.is.fpclass``' intrinsic returns a boolean value or vector of boolean
26169 values depending on whether the first argument satisfies the test specified by
26170 the second argument.
26172 If the first argument is a floating-point scalar, then the result type is a
26173 boolean (:ref:`i1 <t_integer>`).
26175 If the first argument is a floating-point vector, then the result type is a
26176 vector of boolean with the same number of elements as the first argument.
26181 The first argument to the '``llvm.is.fpclass``' intrinsic must be
26182 :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
26183 of floating-point values.
The second argument specifies which tests to perform. It must be a compile-time
integer constant in which each bit selects a floating-point class:
+-------+----------------------+
| Bit # | floating-point class |
+=======+======================+
| 0     | Signaling NaN        |
+-------+----------------------+
| 1     | Quiet NaN            |
+-------+----------------------+
| 2     | Negative infinity    |
+-------+----------------------+
| 3     | Negative normal      |
+-------+----------------------+
| 4     | Negative subnormal   |
+-------+----------------------+
| 5     | Negative zero        |
+-------+----------------------+
| 6     | Positive zero        |
+-------+----------------------+
| 7     | Positive subnormal   |
+-------+----------------------+
| 8     | Positive normal      |
+-------+----------------------+
| 9     | Positive infinity    |
+-------+----------------------+
The function checks if ``op`` belongs to any of the floating-point classes
specified by ``test``. If ``op`` is a vector, then the check is made element by
element. Each check yields an :ref:`i1 <t_integer>` result, which is ``true``
if the element value satisfies the specified test. The argument ``test`` is a
bit mask where each bit specifies a floating-point class to test. For example, the
value 0x108 tests for a normal value: bits 3 and 8 are set, which
means that the function returns ``true`` if ``op`` is a positive or negative
normal value. The function never raises floating-point exceptions. The
function does not canonicalize its input value and does not depend
on the floating-point environment. If the floating-point environment
has a zeroing treatment of subnormal input values (such as indicated
by the ``"denormal-fp-math"`` attribute), a subnormal value will be
observed (will not be implicitly treated as zero).
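
The mask arithmetic can be sketched as follows. The function names are
hypothetical; the masks are written in decimal, with the corresponding bits
noted in the comments.

.. code-block:: llvm

      declare i1 @llvm.is.fpclass.f64(double, i32)
      declare <4 x i1> @llvm.is.fpclass.v4f32(<4 x float>, i32)

      ; Scalar test for a normal value of either sign.
      define i1 @is_normal(double %x) {
        ; 264 = 0x108 = (1 << 3) | (1 << 8): negative or positive normal.
        %r = call i1 @llvm.is.fpclass.f64(double %x, i32 264)
        ret i1 %r
      }

      ; Element-wise test for a zero of either sign.
      define <4 x i1> @is_zero_lanes(<4 x float> %v) {
        ; 96 = 0x60 = (1 << 5) | (1 << 6): negative or positive zero.
        %r = call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> %v, i32 96)
        ret <4 x i1> %r
      }
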
26233 This class of intrinsics is designed to be generic and has no specific
26236 '``llvm.var.annotation``' Intrinsic
26237 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26244 declare void @llvm.var.annotation(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
26249 The '``llvm.var.annotation``' intrinsic.
26254 The first argument is a pointer to a value, the second is a pointer to a
26255 global string, the third is a pointer to a global string which is the
26256 source file name, and the last argument is the line number.
26261 This intrinsic allows annotation of local variables with arbitrary
26262 strings. This can be useful for special purpose optimizations that want
26263 to look for these annotations. These have no other defined use; they are
26264 ignored by code generation and optimization.
26266 '``llvm.ptr.annotation.*``' Intrinsic
26267 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26272 This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
26273 pointer to an integer of any width. *NOTE* you must specify an address space for
26274 the pointer. The identifier for the default address space is the integer
26279 declare ptr @llvm.ptr.annotation.p0(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
26280 declare ptr @llvm.ptr.annotation.p1(ptr addrspace(1) <val>, ptr <str>, ptr <str>, i32 <int>)
26285 The '``llvm.ptr.annotation``' intrinsic.
26290 The first argument is a pointer to an integer value of arbitrary bitwidth
26291 (result of some expression), the second is a pointer to a global string, the
26292 third is a pointer to a global string which is the source file name, and the
26293 last argument is the line number. It returns the value of the first argument.
26298 This intrinsic allows annotation of a pointer to an integer with arbitrary
26299 strings. This can be useful for special purpose optimizations that want to look
26300 for these annotations. These have no other defined use; transformations preserve
26301 annotations on a best-effort basis but are allowed to replace the intrinsic with
26302 its first argument without breaking semantics and the intrinsic is completely
26303 dropped during instruction selection.
26305 '``llvm.annotation.*``' Intrinsic
26306 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26311 This is an overloaded intrinsic. You can use '``llvm.annotation``' on
26312 any integer bit width.
26316 declare i8 @llvm.annotation.i8(i8 <val>, ptr <str>, ptr <str>, i32 <int>)
26317 declare i16 @llvm.annotation.i16(i16 <val>, ptr <str>, ptr <str>, i32 <int>)
26318 declare i32 @llvm.annotation.i32(i32 <val>, ptr <str>, ptr <str>, i32 <int>)
26319 declare i64 @llvm.annotation.i64(i64 <val>, ptr <str>, ptr <str>, i32 <int>)
26320 declare i256 @llvm.annotation.i256(i256 <val>, ptr <str>, ptr <str>, i32 <int>)
26325 The '``llvm.annotation``' intrinsic.
26330 The first argument is an integer value (result of some expression), the
26331 second is a pointer to a global string, the third is a pointer to a
26332 global string which is the source file name, and the last argument is
26333 the line number. It returns the value of the first argument.
26338 This intrinsic allows annotations to be put on arbitrary expressions with
26339 arbitrary strings. This can be useful for special purpose optimizations that
26340 want to look for these annotations. These have no other defined use;
26341 transformations preserve annotations on a best-effort basis but are allowed to
26342 replace the intrinsic with its first argument without breaking semantics and the
26343 intrinsic is completely dropped during instruction selection.
26345 '``llvm.codeview.annotation``' Intrinsic
26346 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26351 This annotation emits a label at its program point and an associated
26352 ``S_ANNOTATION`` codeview record with some additional string metadata. This is
26353 used to implement MSVC's ``__annotation`` intrinsic. It is marked
26354 ``noduplicate``, so calls to this intrinsic prevent inlining and should be
26355 considered expensive.
26359 declare void @llvm.codeview.annotation(metadata)
26364 The argument should be an MDTuple containing any number of MDStrings.
26366 '``llvm.trap``' Intrinsic
26367 ^^^^^^^^^^^^^^^^^^^^^^^^^
26374 declare void @llvm.trap() cold noreturn nounwind
26379 The '``llvm.trap``' intrinsic.
This intrinsic is lowered to the target-dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.
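
A typical use is to guard an operation that must not be reached with invalid
input. The following sketch, with the hypothetical function name
``@checked_udiv``, traps instead of dividing by zero.

.. code-block:: llvm

      declare void @llvm.trap()

      define i32 @checked_udiv(i32 %n, i32 %d) {
      entry:
        %is.zero = icmp eq i32 %d, 0
        br i1 %is.zero, label %fail, label %ok

      fail:
        ; The trap does not return, so the block ends in unreachable.
        call void @llvm.trap()
        unreachable

      ok:
        %q = udiv i32 %n, %d
        ret i32 %q
      }
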
26393 '``llvm.debugtrap``' Intrinsic
26394 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26401 declare void @llvm.debugtrap() nounwind
26406 The '``llvm.debugtrap``' intrinsic.
26416 This intrinsic is lowered to code which is intended to cause an
26417 execution trap with the intention of requesting the attention of a
26420 '``llvm.ubsantrap``' Intrinsic
26421 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26428 declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
26433 The '``llvm.ubsantrap``' intrinsic.
26438 An integer describing the kind of failure detected.
This intrinsic is lowered to code which is intended to cause an execution trap,
embedding the argument into the encoding of that trap, where possible, so that
crashes can be told apart.
26447 Equivalent to ``@llvm.trap`` for targets that do not support this behaviour.
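
A minimal sketch of a checked addition that traps on signed overflow is shown
below. The function name ``@checked_add`` is hypothetical, and the failure
kind ``0`` is an arbitrary constant chosen for illustration.

.. code-block:: llvm

      declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)
      declare void @llvm.ubsantrap(i8 immarg)

      define i32 @checked_add(i32 %a, i32 %b) {
      entry:
        %res = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %ovf = extractvalue { i32, i1 } %res, 1
        br i1 %ovf, label %fail, label %ok

      fail:
        ; The argument must be an immediate; 0 is an arbitrary kind here.
        call void @llvm.ubsantrap(i8 0)
        unreachable

      ok:
        %sum = extractvalue { i32, i1 } %res, 0
        ret i32 %sum
      }
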
26449 '``llvm.stackprotector``' Intrinsic
26450 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26457 declare void @llvm.stackprotector(ptr <guard>, ptr <slot>)
26462 The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
26463 onto the stack at ``slot``. The stack slot is adjusted to ensure that it
26464 is placed on the stack before local variables.
26469 The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.
26477 This intrinsic causes the prologue/epilogue inserter to force the position of
26478 the ``AllocaInst`` stack slot to be before local variables on the stack. This is
26479 to ensure that if a local variable on the stack is overwritten, it will destroy
26480 the value of the guard. When the function exits, the guard on the stack is
26481 checked against the original guard by ``llvm.stackprotectorcheck``. If they are
26482 different, then ``llvm.stackprotectorcheck`` causes the program to abort by
26483 calling the ``__stack_chk_fail()`` function.
26485 '``llvm.stackguard``' Intrinsic
26486 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26493 declare ptr @llvm.stackguard()
26498 The ``llvm.stackguard`` intrinsic returns the system stack guard value.
26500 It should not be generated by frontends, since it is only for internal usage.
26501 This intrinsic exists because the IR form of the stack protector is still
26502 supported in FastISel.
26512 On some platforms, the value returned by this intrinsic remains unchanged
26513 between loads in the same thread. On other platforms, it returns the same
26514 global variable value, if any, e.g. ``@__stack_chk_guard``.
26516 Currently some platforms have IR-level customized stack guard loading (e.g.
26517 X86 Linux) that is not handled by ``llvm.stackguard()``, while they should be
26520 '``llvm.objectsize``' Intrinsic
26521 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26528 declare i32 @llvm.objectsize.i32(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
26529 declare i64 @llvm.objectsize.i64(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
26534 The ``llvm.objectsize`` intrinsic is designed to provide information to the
26535 optimizer to determine whether a) an operation (like memcpy) will overflow a
26536 buffer that corresponds to an object, or b) that a runtime check for overflow
26537 isn't necessary. An object in this context means an allocation of a specific
26538 class, structure, array, or other object.
26543 The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
26544 pointer to or into the ``object``. The second argument determines whether
26545 ``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
26546 unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
26547 in address space 0 is used as its pointer argument. If it's ``false``,
26548 ``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
26549 the ``null`` is in a non-zero address space or if ``true`` is given for the
26550 third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
26551 argument to ``llvm.objectsize`` determines if the value should be evaluated at
26554 The second, third, and fourth arguments only accept constants.
26559 The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
26560 the object concerned. If the size cannot be determined, ``llvm.objectsize``
26561 returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
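The following sketch queries the remaining size of a stack buffer. It uses the
intrinsic name exactly as declared above; the function name
``@remaining_bytes`` is hypothetical. Under these assumptions the call would
be expected to fold to 12, since the pointer is 4 bytes into a 16-byte object.

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64(ptr, i1, i1, i1)

      ; Hypothetical helper: bytes left in %buf starting at offset 4.
      define i64 @remaining_bytes() {
      entry:
        %buf = alloca [16 x i8]
        %p = getelementptr inbounds i8, ptr %buf, i64 4
        ; min = false:         report -1 if the size is unknown
        ; nullunknown = true:  treat a null pointer as unknown size
        ; dynamic = false:     do not evaluate the size at run time
        %size = call i64 @llvm.objectsize.i64(ptr %p, i1 false, i1 true, i1 false)
        ret i64 %size
      }
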
26563 '``llvm.expect``' Intrinsic
26564 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
26569 This is an overloaded intrinsic. You can use ``llvm.expect`` on any
26574 declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
26575 declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
26576 declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
26581 The ``llvm.expect`` intrinsic provides information about the expected (most
26582 probable) value of ``val``, which can be used by optimizers.
26587 The ``llvm.expect`` intrinsic takes two arguments. The first argument is
26588 a value. The second argument is an expected value.
26593 This intrinsic is lowered to the ``val``.
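For example, a frontend might mark a null check as unlikely to succeed. This
is only a sketch; the function name ``@load_or_default`` and the block labels
are hypothetical.

.. code-block:: llvm

      declare i1 @llvm.expect.i1(i1, i1)

      ; Hypothetical helper: the null case is hinted as improbable.
      define i32 @load_or_default(ptr %p) {
      entry:
        %is_null = icmp eq ptr %p, null
        ; The result equals %is_null; the second operand is only a hint.
        %hint = call i1 @llvm.expect.i1(i1 %is_null, i1 false)
        br i1 %hint, label %slow, label %fast

      fast:
        %v = load i32, ptr %p
        ret i32 %v

      slow:
        ret i32 -1
      }
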
26595 '``llvm.expect.with.probability``' Intrinsic
26596 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26601 This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
26602 You can use ``llvm.expect.with.probability`` on any integer bit width.
26606 declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
26607 declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
26608 declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
26613 The ``llvm.expect.with.probability`` intrinsic provides information about the
26614 expected value of ``val`` with probability (or confidence) ``prob``, which can
26615 be used by optimizers.
26620 The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
26621 argument is a value. The second argument is an expected value. The third
26622 argument is a probability.
26627 This intrinsic is lowered to the ``val``.
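A minimal sketch with an explicit probability follows. The function name
``@is_common_case`` and the value ``0.75`` are assumptions chosen for
illustration.

.. code-block:: llvm

      declare i32 @llvm.expect.with.probability.i32(i32, i32, double)

      ; Hypothetical helper: %x is hinted to be 0 with probability 0.75.
      define i1 @is_common_case(i32 %x) {
      entry:
        %hinted = call i32 @llvm.expect.with.probability.i32(i32 %x, i32 0, double 7.500000e-01)
        ; %hinted has the same value as %x; only the hint is new.
        %cmp = icmp eq i32 %hinted, 0
        ret i1 %cmp
      }
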
26631 '``llvm.assume``' Intrinsic
26632 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26639 declare void @llvm.assume(i1 %cond)
26644 The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
26645 condition is true. This information can then be used in simplifying other parts
26648 More complex assumptions can be encoded as
26649 :ref:`assume operand bundles <assume_opbundles>`.
26654 The argument of the call is the condition which the optimizer may assume is
26660 The intrinsic allows the optimizer to assume that the provided condition is
26661 always true whenever the control flow reaches the intrinsic call. No code is
26662 generated for this intrinsic, and instructions that contribute only to the
26663 provided condition are not used for code generation. If the condition is
26664 violated during execution, the behavior is undefined.
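As a sketch of the kind of fact an assumption can convey, the example below
states a range for an index. The function name ``@wrap_index`` is
hypothetical. Given the assumption, an optimizer may be able to simplify the
mask away, because ``%idx`` is already known to be less than 16.

.. code-block:: llvm

      declare void @llvm.assume(i1)

      ; Hypothetical helper: the caller promises %idx is in [0, 16).
      define i32 @wrap_index(i32 %idx) {
      entry:
        %in_range = icmp ult i32 %idx, 16
        ; Undefined behavior if %in_range is false when control reaches here.
        call void @llvm.assume(i1 %in_range)
        ; With the assumption, this 'and' does not change the value.
        %masked = and i32 %idx, 15
        ret i32 %masked
      }
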
26666 Note that the optimizer might limit the transformations performed on values
26667 used by the ``llvm.assume`` intrinsic in order to preserve the instructions
26668 only used to form the intrinsic's input argument. This might prove undesirable
26669 if the extra information provided by the ``llvm.assume`` intrinsic does not cause
26670 sufficient overall improvement in code quality. For this reason,
26671 ``llvm.assume`` should not be used to document basic mathematical invariants
26672 that the optimizer can otherwise deduce or facts that are of little use to the
26677 '``llvm.ssa.copy``' Intrinsic
26678 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26685 declare type @llvm.ssa.copy(type returned %operand) memory(none)
26690 The first argument is an operand which is used as the returned value.
26695 The ``llvm.ssa.copy`` intrinsic can be used to attach information to
26696 operations by copying them and giving them new names. For example,
26697 the PredicateInfo utility uses it to build Extended SSA form, and
26698 attach various forms of information to operands that dominate specific
26699 uses. It is not meant for general use, only for building temporary
26700 renaming forms that require value splits at certain points.
26704 '``llvm.type.test``' Intrinsic
26705 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26712 declare i1 @llvm.type.test(ptr %ptr, metadata %type) nounwind memory(none)
26718 The first argument is a pointer to be tested. The second argument is a
26719 metadata object representing a :doc:`type identifier <TypeMetadata>`.
26724 The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
26725 with the given type identifier.
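The sketch below guards an indirect call with a type test. The type
identifier ``!"typeid1"`` and the function name ``@call_checked`` are
hypothetical; a real frontend chooses its own identifiers as described in
:doc:`TypeMetadata`.

.. code-block:: llvm

      declare i1 @llvm.type.test(ptr, metadata)
      declare void @llvm.trap()

      ; Hypothetical helper: only call %fp if it carries the expected type id.
      define void @call_checked(ptr %fp) {
      entry:
        %ok = call i1 @llvm.type.test(ptr %fp, metadata !"typeid1")
        br i1 %ok, label %cont, label %fail

      cont:
        call void %fp()
        ret void

      fail:
        call void @llvm.trap()
        unreachable
      }
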
26727 .. _type.checked.load:
26729 '``llvm.type.checked.load``' Intrinsic
26730 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26737 declare {ptr, i1} @llvm.type.checked.load(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)
26743 The first argument is a pointer from which to load a function pointer. The
26744 second argument is the byte offset from which to load the function pointer. The
26745 third argument is a metadata object representing a :doc:`type identifier
26751 The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
26752 virtual table pointer using type metadata. This intrinsic is used to implement
26753 control flow integrity in conjunction with virtual call optimization. The
26754 virtual call optimization pass will optimize away ``llvm.type.checked.load``
26755 intrinsics associated with devirtualized calls, thereby removing the type
26756 check in cases where it is not needed to enforce the control flow integrity
26759 If the given pointer is associated with a type metadata identifier, this
26760 function returns true as the second element of its return value. (Note that
26761 the function may also return true if the given pointer is not associated
26762 with a type metadata identifier.) If the function's return value's second
26763 element is true, the following rules apply to the first element:
26765 - If the given pointer is associated with the given type metadata identifier,
26766 it is the function pointer loaded from the given byte offset from the given
26769 - If the given pointer is not associated with the given type metadata
26770 identifier, it is one of the following (the choice of which is unspecified):
26772 1. The function pointer that would have been loaded from an arbitrarily chosen
26773 (through an unspecified mechanism) pointer associated with the type
26776 2. If the function has a non-void return type, a pointer to a function that
26777 returns an unspecified value without causing side effects.
26779 If the function's return value's second element is false, the value of the
26780 first element is undefined.
26782 .. _type.checked.load.relative:
26784 '``llvm.type.checked.load.relative``' Intrinsic
26785 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26792 declare {ptr, i1} @llvm.type.checked.load.relative(ptr %ptr, i32 %offset, metadata %type) nounwind memory(argmem: read)
26797 The ``llvm.type.checked.load.relative`` intrinsic loads a relative pointer to a
26798 function from a virtual table pointer using metadata. Otherwise, its semantics are
26799 identical to the ``llvm.type.checked.load`` intrinsic.
26801 A relative pointer is a pointer to an offset to the pointed-to value. The
26802 address of the underlying pointer of the relative pointer is obtained by adding
26803 the offset to the address of the offset value.
26805 '``llvm.arithmetic.fence``' Intrinsic
26806 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26814 declare <type> @llvm.arithmetic.fence(<type> <op>)
26819 The purpose of the ``llvm.arithmetic.fence`` intrinsic
26820 is to prevent the optimizer from performing fast-math optimizations,
26821 particularly reassociation,
26822 between the argument and the expression that contains the argument.
26823 It can be used to preserve the parentheses in the source language.
26828 The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
26829 The argument and the return value are floating-point numbers,
26830 or vector floating-point numbers, of the same type.
26835 This intrinsic returns the value of its operand. The optimizer can optimize
26836 the argument, but the optimizer cannot hoist any component of the operand
26837 to the containing context, and the optimizer cannot move the calculation of
26838 any expression in the containing context into the operand.
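To illustrate, the sketch below keeps ``(a + b)`` as a unit even though both
additions allow reassociation. The function name ``@sum_keep_parens`` is
hypothetical, and the ``.f32`` suffix assumes the usual overload naming for a
``float`` operand.

.. code-block:: llvm

      declare float @llvm.arithmetic.fence.f32(float)

      ; Hypothetical helper: computes (a + b) + c without regrouping.
      define float @sum_keep_parens(float %a, float %b, float %c) {
      entry:
        %ab = fadd reassoc float %a, %b
        ; The fence returns %ab unchanged but blocks reassociation across it.
        %fenced = call float @llvm.arithmetic.fence.f32(float %ab)
        %r = fadd reassoc float %fenced, %c
        ret float %r
      }
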
26841 '``llvm.donothing``' Intrinsic
26842 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26849 declare void @llvm.donothing() nounwind memory(none)
26854 The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
26855 three intrinsics (besides ``llvm.experimental.patchpoint`` and
26856 ``llvm.experimental.gc.statepoint``) that can be called with an invoke
26867 This intrinsic does nothing, and it's removed by optimizers and ignored
26870 '``llvm.experimental.deoptimize``' Intrinsic
26871 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26878 declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
26883 This intrinsic, together with :ref:`deoptimization operand bundles
26884 <deopt_opbundles>`, allow frontends to express transfer of control and
26885 frame-local state from the currently executing (typically more specialized,
26886 hence faster) version of a function into another (typically more generic, hence
26889 In languages with a fully integrated managed runtime like Java and JavaScript
26890 this intrinsic can be used to implement "uncommon trap" or "side exit" like
26891 functionality. In unmanaged languages like C and C++, this intrinsic can be
26892 used to represent the slow paths of specialized functions.
26898 The intrinsic takes an arbitrary number of arguments, whose meaning is
26899 decided by the :ref:`lowering strategy<deoptimize_lowering>`.
26904 The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
26905 deoptimization continuation (denoted using a :ref:`deoptimization
26906 operand bundle <deopt_opbundles>`) and returns the value returned by
26907 the deoptimization continuation. Defining the semantic properties of
26908 the continuation itself is out of scope of the language reference --
26909 as far as LLVM is concerned, the deoptimization continuation can
26910 invoke arbitrary side effects, including reading from and writing to
26913 Deoptimization continuations expressed using ``"deopt"`` operand bundles always
26914 continue execution to the end of the physical frame containing them, so all
26915 calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
26917 - ``@llvm.experimental.deoptimize`` cannot be invoked.
26918 - The call must immediately precede a :ref:`ret <i_ret>` instruction.
26919 - The ``ret`` instruction must return the value produced by the
26920 ``@llvm.experimental.deoptimize`` call if there is one, or void.
26922 Note that the above restrictions imply that the return type for a call to
26923 ``@llvm.experimental.deoptimize`` will match the return type of its immediate
26926 The inliner composes the ``"deopt"`` continuations of the caller into the
26927 ``"deopt"`` continuations present in the inlinee, and also updates calls to this
26928 intrinsic to return directly from the frame of the function it inlined into.
26930 All declarations of ``@llvm.experimental.deoptimize`` must share the
26931 same calling convention.
26933 .. _deoptimize_lowering:
26938 Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
26939 symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
26940 ensure that this symbol is defined). The call arguments to
26941 ``@llvm.experimental.deoptimize`` are lowered as if they were formal
26942 arguments of the specified types, and not as varargs.
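The tail-position rules above can be sketched as follows. The function name
``@add_one_or_deopt``, the threshold ``100``, and the contents of the
``"deopt"`` bundle are hypothetical; the ``.i32`` suffix assumes the usual
overload naming for an ``i32`` return type.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      ; Hypothetical specialized function with a slow path that deoptimizes.
      define i32 @add_one_or_deopt(i32 %x) {
      entry:
        %is_small = icmp slt i32 %x, 100
        br i1 %is_small, label %fast, label %slow

      fast:
        %r = add i32 %x, 1
        ret i32 %r

      slow:
        ; The call is not an invoke, immediately precedes 'ret', and its
        ; result is the value returned.
        %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
        ret i32 %v
      }
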
26945 '``llvm.experimental.guard``' Intrinsic
26946 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26953 declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
26958 This intrinsic, together with :ref:`deoptimization operand bundles
26959 <deopt_opbundles>`, allows frontends to express guards or checks on
26960 optimistic assumptions made during compilation. The semantics of
26961 ``@llvm.experimental.guard`` is defined in terms of
26962 ``@llvm.experimental.deoptimize`` -- its body is defined to be
26965 .. code-block:: text
26967 define void @llvm.experimental.guard(i1 %pred, <args...>) {
26968 %realPred = and i1 %pred, undef
26969 br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
26972 call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
26980 with the optional ``[, !make.implicit !{}]`` present if and only if it
26981 is present on the call site. For more details on ``!make.implicit``,
26982 see :doc:`FaultMaps`.
26984 In words, ``@llvm.experimental.guard`` executes the attached
26985 ``"deopt"`` continuation if (but **not** only if) its first argument
26986 is ``false``. Since the optimizer is allowed to replace the ``undef``
26987 with an arbitrary value, it can optimize guard to fail "spuriously",
26988 i.e. without the original condition being false (hence the "not only
26989 if"); and this allows for "check widening" type optimizations.
26991 ``@llvm.experimental.guard`` cannot be invoked.
26993 After ``@llvm.experimental.guard`` was first added, a more general
26994 formulation was found in ``@llvm.experimental.widenable.condition``.
26995 Support for ``@llvm.experimental.guard`` is slowly being rephrased in
26996 terms of this alternate.
26998 '``llvm.experimental.widenable.condition``' Intrinsic
26999 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27006 declare i1 @llvm.experimental.widenable.condition()
27011 This intrinsic represents a "widenable condition", which is a
27012 boolean expression with the following property: whether this
27013 expression is `true` or `false`, the program is correct and
27016 Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
27017 ``@llvm.experimental.widenable.condition`` allows frontends to
27018 express guards or checks on optimistic assumptions made during
27019 compilation and represent them as branch instructions on special
27022 While this may appear similar in semantics to `undef`, it is very
27023 different in that an invocation produces a particular, singular
27024 value. It is also intended to be lowered late, and remain available
27025 for specific optimizations and transforms that can benefit from its
27026 special properties.
27036 The intrinsic ``@llvm.experimental.widenable.condition()``
27037 returns either `true` or `false`. For each evaluation of a call
27038 to this intrinsic, the program must be valid and correct both if
27039 it returns `true` and if it returns `false`. This allows
27040 transformation passes to replace evaluations of this intrinsic
27041 with either value whenever one is beneficial.
27043 When used in a branch condition, it allows us to choose between
27044 two alternative correct solutions for the same problem, like
27047 .. code-block:: text
27049 %cond = call i1 @llvm.experimental.widenable.condition()
27050 br i1 %cond, label %solution_1, label %solution_2
27053 ; Apply memory-consuming but fast solution for a task.
27056 ; Cheap in memory but slow solution.
27058 Whether the result of the intrinsic call is `true` or `false`,
27059 it should be correct to pick either solution. We can switch
27060 between them by replacing the result of
27061 ``@llvm.experimental.widenable.condition`` with different
27064 This is how it can be used to represent guards as widenable branches:
27066 .. code-block:: text
27069 ; Unguarded instructions
27070 call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
27071 ; Guarded instructions
27073 This can be expressed in an equivalent form as an explicit branch using
27074 ``@llvm.experimental.widenable.condition``:
27076 .. code-block:: text
27079 ; Unguarded instructions
27080 %widenable_condition = call i1 @llvm.experimental.widenable.condition()
27081 %guard_condition = and i1 %cond, %widenable_condition
27082 br i1 %guard_condition, label %guarded, label %deopt
27085 ; Guarded instructions
27088 call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
27090 So the block `guarded` is only reachable when `%cond` is `true`,
27091 and it should be valid to go to the block `deopt` whenever `%cond`
27092 is `true` or `false`.
27094 ``@llvm.experimental.widenable.condition`` will never throw, thus
27095 it cannot be invoked.
27100 When ``@llvm.experimental.widenable.condition()`` is used in the
27101 condition of a guard represented as an explicit branch, it is
27102 legal to widen the guard's condition with any additional
27105 Guard widening looks like replacement of
27107 .. code-block:: text
27109 %widenable_cond = call i1 @llvm.experimental.widenable.condition()
27110 %guard_cond = and i1 %cond, %widenable_cond
27111 br i1 %guard_cond, label %guarded, label %deopt
27115 .. code-block:: text
27117 %widenable_cond = call i1 @llvm.experimental.widenable.condition()
27118 %new_cond = and i1 %any_other_cond, %widenable_cond
27119 %new_guard_cond = and i1 %cond, %new_cond
27120 br i1 %new_guard_cond, label %guarded, label %deopt
27122 for this branch. Here `%any_other_cond` is an arbitrarily chosen
27123 well-defined `i1` value. By widening the guard, we may
27124 impose stricter conditions on the `guarded` block and bail out to the
27125 deopt block when the new condition is not met.
27130 The default lowering strategy replaces the result of a
27131 call to ``@llvm.experimental.widenable.condition`` with the
27132 constant `true`. However, it is always correct to replace
27133 it with any other `i1` value. Any pass can
27134 freely do so if it can benefit from non-default lowering.
27137 '``llvm.load.relative``' Intrinsic
27138 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27145 declare ptr @llvm.load.relative.iN(ptr %ptr, iN %offset) nounwind memory(argmem: read)
27150 This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
27151 adds ``%ptr`` to that value and returns it. The constant folder specifically
27152 recognizes the form of this intrinsic and the constant initializers it may
27153 load from; if a loaded constant initializer is known to have the form
27154 ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
27156 LLVM provides that the calculation of such a constant initializer will
27157 not overflow at link time under the medium code model if ``x`` is an
27158 ``unnamed_addr`` function. However, it does not provide this guarantee for
27159 a constant initializer folded into a function body. This intrinsic can be
27160 used to avoid the possibility of overflows when loading from such a constant.
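A sketch of a one-entry table of relative function pointers follows. The
names ``@impl``, ``@table``, and ``@call_first`` are hypothetical, and the
example assumes 64-bit pointers for the ``ptrtoint`` conversions. The
initializer has the ``i32 trunc(x - %ptr)`` form described above, so the
call would be a candidate for folding to ``@impl``.

.. code-block:: llvm

      declare ptr @llvm.load.relative.i32(ptr, i32)

      ; Hypothetical target function.
      define void @impl() unnamed_addr {
        ret void
      }

      ; One 32-bit entry holding the offset of @impl relative to @table.
      @table = private unnamed_addr constant [1 x i32] [
        i32 trunc (i64 sub (i64 ptrtoint (ptr @impl to i64),
                            i64 ptrtoint (ptr @table to i64)) to i32)
      ]

      define void @call_first() {
      entry:
        ; Loads the i32 at @table + 0 and adds @table to it.
        %fn = call ptr @llvm.load.relative.i32(ptr @table, i32 0)
        call void %fn()
        ret void
      }
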
27162 .. _llvm_sideeffect:
27164 '``llvm.sideeffect``' Intrinsic
27165 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27172 declare void @llvm.sideeffect() inaccessiblememonly nounwind willreturn
27177 The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
27178 treat it as having side effects, so it can be inserted into a loop to
27179 indicate that the loop shouldn't be assumed to terminate (which could
27180 potentially lead to the loop being optimized away entirely), even if it's
27181 an infinite loop with no other side effects.
27191 This intrinsic actually does nothing, but optimizers must assume that it
27192 has externally observable side effects.
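For instance, a deliberately infinite loop could be written as in the sketch
below; the function name ``@spin_forever`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.sideeffect()

      ; Hypothetical helper: a loop that is meant to never terminate.
      define void @spin_forever() {
      entry:
        br label %loop

      loop:
        ; Generates no code, but is treated as having side effects.
        call void @llvm.sideeffect()
        br label %loop
      }
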
27194 '``llvm.is.constant.*``' Intrinsic
27195 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27200 This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.
27204 declare i1 @llvm.is.constant.i32(i32 %operand) nounwind memory(none)
27205 declare i1 @llvm.is.constant.f32(float %operand) nounwind memory(none)
27206 declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind memory(none)
27211 The '``llvm.is.constant``' intrinsic will return true if the argument
27212 is known to be a manifest compile-time constant. It is guaranteed to
27213 fold to either true or false before generating machine code.
27218 This intrinsic generates no code. If its argument is known to be a
27219 manifest compile-time constant value, then the intrinsic will be
27220 converted to a constant true value. Otherwise, it will be converted to
27221 a constant false value.
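The sketch below shows a query on a function parameter. The names ``@callee``
and ``@caller`` are hypothetical. Taken on its own, ``@callee`` sees a
non-constant argument; if it is inlined into ``@caller``, the operand becomes
the literal ``42`` and the query may then fold to true instead.

.. code-block:: llvm

      declare i1 @llvm.is.constant.i32(i32)

      ; Hypothetical helper: asks whether its parameter is a known constant.
      define i1 @callee(i32 %x) {
      entry:
        %c = call i1 @llvm.is.constant.i32(i32 %x)
        ret i1 %c
      }

      define i1 @caller() {
      entry:
        %r = call i1 @callee(i32 42)
        ret i1 %r
      }
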
27223 In particular, note that if the argument is a constant expression
27224 which refers to a global (the address of which _is_ a constant, but
27225 not manifest during the compile), then the intrinsic evaluates to
27228 The result also intentionally depends on the result of optimization
27229 passes -- e.g., the result can change depending on whether a
27230 function gets inlined or not. A function's parameters are
27231 obviously not constant. However, a call like
27232 ``llvm.is.constant.i32(i32 %param)`` *can* return true after the
27233 function is inlined, if the value passed to the function parameter was
27238 '``llvm.ptrmask``' Intrinsic
27239 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27246 declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) speculatable memory(none)
27251 The first argument is a pointer or vector of pointers. The second argument is
27252 an integer or vector of integers with the same bit width as the index type
27253 size of the first argument.
27258 The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
27259 This allows stripping data from tagged pointers without converting them to an
27260 integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
27261 to facilitate alias analysis and underlying-object detection.
27266 The result of ``ptrmask(%ptr, %mask)`` is equivalent to the following expansion,
27267 where ``iPtrIdx`` is the index type size of the pointer::
27269 %intptr = ptrtoint ptr %ptr to iPtrIdx ; this may truncate
27270 %masked = and iPtrIdx %intptr, %mask
27271 %diff = sub iPtrIdx %masked, %intptr
27272 %result = getelementptr i8, ptr %ptr, iPtrIdx %diff
27274 If the pointer index type size is smaller than the pointer type size, this
27275 implies that pointer bits beyond the index size are not affected by this
27276 intrinsic. For integral pointers, it behaves as if the mask were extended with
27277 1 bits to the pointer type size.
27279 Both the returned pointer(s) and the first argument are based on the same
27280 underlying object (for more information on the *based on* terminology see
27281 :ref:`the pointer aliasing rules <pointeraliasing>`).
27283 The intrinsic only captures the pointer argument through the return value.
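As an example, the sketch below clears the low three bits of a tagged
pointer. The function name ``@strip_tag`` is hypothetical, and the
``.p0.i64`` suffix assumes address space 0 with a 64-bit index type.

.. code-block:: llvm

      declare ptr @llvm.ptrmask.p0.i64(ptr, i64)

      ; Hypothetical helper: drop a 3-bit tag stored in the low bits.
      define ptr @strip_tag(ptr %tagged) {
      entry:
        ; -8 is ...11111000 in binary, so only the low three bits are cleared.
        %clean = call ptr @llvm.ptrmask.p0.i64(ptr %tagged, i64 -8)
        ret ptr %clean
      }
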
27285 .. _int_threadlocal_address:
27287 '``llvm.threadlocal.address``' Intrinsic
27288 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27295 declare ptr @llvm.threadlocal.address(ptr) nounwind willreturn memory(none)
27300 The first argument is a pointer, which refers to a thread local global.
27305 The address of a thread local global is not a constant, since it depends on
27306 the calling thread. The ``llvm.threadlocal.address`` intrinsic returns the
27307 address of the given thread local global in the calling thread.
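A minimal sketch of accessing a thread-local counter through the intrinsic
follows. The names ``@counter`` and ``@bump`` are hypothetical, and the
``.p0`` suffix assumes address space 0.

.. code-block:: llvm

      @counter = thread_local global i32 0

      declare ptr @llvm.threadlocal.address.p0(ptr)

      ; Hypothetical helper: increment the calling thread's copy of @counter.
      define i32 @bump() {
      entry:
        %addr = call ptr @llvm.threadlocal.address.p0(ptr @counter)
        %old = load i32, ptr %addr
        %new = add i32 %old, 1
        store i32 %new, ptr %addr
        ret i32 %new
      }
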
27311 '``llvm.vscale``' Intrinsic
27312 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
27319 declare i32 @llvm.vscale.i32()
27320 declare i64 @llvm.vscale.i64()
27325 The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
27326 vectors such as ``<vscale x 16 x i8>``.
27331 ``vscale`` is a positive value that is constant throughout program
27332 execution, but is unknown at compile time.
27333 If the result value does not fit in the result type, then the result is
27334 a :ref:`poison value <poisonvalues>`.
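For example, the run-time size in bytes of a ``<vscale x 16 x i8>`` vector
could be computed as in the sketch below; the function name
``@bytes_per_vector`` is hypothetical.

.. code-block:: llvm

      declare i64 @llvm.vscale.i64()

      ; Hypothetical helper: size in bytes of <vscale x 16 x i8>.
      define i64 @bytes_per_vector() {
      entry:
        %vs = call i64 @llvm.vscale.i64()
        ; 16 one-byte elements per unit of vscale.
        %bytes = mul i64 %vs, 16
        ret i64 %bytes
      }
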
27337 Stack Map Intrinsics
27338 --------------------
27340 LLVM provides experimental intrinsics to support runtime patching
27341 mechanisms commonly desired in dynamic language JITs. These intrinsics
27342 are described in :doc:`StackMaps`.
27344 Element Wise Atomic Memory Intrinsics
27345 -------------------------------------
27347 These intrinsics are similar to the standard library memory intrinsics except
27348 that they perform memory transfer as a sequence of atomic memory accesses.
27350 .. _int_memcpy_element_unordered_atomic:
27352 '``llvm.memcpy.element.unordered.atomic``' Intrinsic
27353 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27358 This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
27359 any integer bit width and for different address spaces. Not all targets
27360 support all bit widths however.
27364 declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i32(ptr <dest>,
27367 i32 <element_size>)
27368 declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr <dest>,
27371 i32 <element_size>)
27376 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
27377 '``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
27378 as arrays with elements that are exactly ``element_size`` bytes, and the copy between
27379 buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
27380 that are a positive integer multiple of the ``element_size`` in size.
27385 The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
27386 intrinsic, with the added constraint that ``len`` is required to be a positive integer
27387 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
27388 ``element_size``, then the behaviour of the intrinsic is undefined.
27390 ``element_size`` must be a compile-time constant positive power of two no greater than
27391 a target-specific atomic access size limit.
27393 For each of the input pointers, the ``align`` parameter attribute must be specified. It
27394 must be a power of two no less than the ``element_size``. The caller guarantees that
27395 both the source and destination pointers are aligned to that boundary.
27400 The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
27401 memory from the source location to the destination location. These locations are not
27402 allowed to overlap. The memory copy is performed as a sequence of load/store operations
27403 where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
27404 aligned at an ``element_size`` boundary.
27406 The order of the copy is unspecified. The same value may be read from the source
27407 buffer many times, but only one write is issued to the destination buffer per
27408 element. It is well defined to have concurrent reads and writes to both source and
27409 destination provided those reads and writes are unordered atomic when specified.
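A sketch of a call that satisfies the constraints above is shown below: the
length (64) is a positive multiple of the element size (4), and both pointers
carry an ``align`` attribute no smaller than the element size. The function
name ``@copy_words`` and the specific sizes are hypothetical.

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr, ptr, i64, i32)

      ; Hypothetical helper: copy sixteen 4-byte elements atomically per element.
      define void @copy_words(ptr %dst, ptr %src) {
      entry:
        call void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(
            ptr align 4 %dst, ptr align 4 %src, i64 64, i32 4)
        ret void
      }
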
27411 This intrinsic does not provide any additional ordering guarantees over those
27412 provided by a set of unordered loads from the source location and stores to the
27418 In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is
27419 lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
27420 is replaced with an actual element size. See :ref:`RewriteStatepointsForGC intrinsic
27421 lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
27424 The optimizer is allowed to inline the memory copy when it's profitable to do so.
27426 '``llvm.memmove.element.unordered.atomic``' Intrinsic
27427 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27432 This is an overloaded intrinsic. You can use
27433 ``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
27434 different address spaces. Not all targets support all bit widths however.
27438 declare void @llvm.memmove.element.unordered.atomic.p0.p0.i32(ptr <dest>,
27441 i32 <element_size>)
27442 declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr <dest>,
27445 i32 <element_size>)
27450 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
27451 of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
27452 ``src`` are treated as arrays with elements that are exactly ``element_size``
27453 bytes, and the copy between buffers uses a sequence of
27454 :ref:`unordered atomic <ordering>` load/store operations that are a positive
27455 integer multiple of the ``element_size`` in size.
27460 The first three arguments are the same as they are in the
27461 :ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
27462 ``len`` is required to be a positive integer multiple of the ``element_size``.
27463 If ``len`` is not a positive integer multiple of ``element_size``, then the
27464 behaviour of the intrinsic is undefined.
27466 ``element_size`` must be a compile-time constant positive power of two no
27467 greater than a target-specific atomic access size limit.
27469 For each of the input pointers, the ``align`` parameter attribute must be
27470 specified. It must be a power of two no less than the ``element_size``. The caller
27471 guarantees that both the source and destination pointers are aligned to that
27477 The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
27478 of memory from the source location to the destination location. These locations
27479 are allowed to overlap. The memory copy is performed as a sequence of load/store
27480 operations where each access is guaranteed to be a multiple of ``element_size``
27481 bytes wide and aligned at an ``element_size`` boundary.
27483 The order of the copy is unspecified. The same value may be read from the source
27484 buffer many times, but only one write is issued to the destination buffer per
27485 element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.
27489 This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.
In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced
with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
<RewriteStatepointsForGC_intrinsic_lowering>` for details on GC specific
lowering.
27503 The optimizer is allowed to inline the memory copy when it's profitable to do so.
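
As a minimal sketch of the semantics above, the following call moves sixteen
4-byte elements up by one slot within a single buffer, so the source and
destination ranges overlap. The function name ``@shift_up`` is hypothetical.
The example assumes that ``%buf`` is 4-byte aligned, that it holds at least
seventeen ``i32`` elements, and that the target supports 4-byte unordered
atomic accesses.

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr, ptr, i64, i32 immarg)

      ; Hypothetical helper: copy elements [0, 16) of %buf to [1, 17).
      define void @shift_up(ptr %buf) {
        ; Destination starts one i32 element past the source.
        %dst = getelementptr inbounds i32, ptr %buf, i64 1
        ; len = 64 bytes is a positive multiple of element_size = 4, and both
        ; pointers carry an align attribute no less than element_size.
        call void @llvm.memmove.element.unordered.atomic.p0.p0.i64(
            ptr align 4 %dst, ptr align 4 %buf, i64 64, i32 4)
        ret void
      }

Passing a ``len`` that is not a multiple of 4 here would make the behaviour
undefined, as described above.
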
27505 .. _int_memset_element_unordered_atomic:
27507 '``llvm.memset.element.unordered.atomic``' Intrinsic
27508 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27513 This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
27514 any integer bit width and for different address spaces. Not all targets
27515 support all bit widths however.
      declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr <dest>,
                                                                i8 <value>,
                                                                i32 <len>,
                                                                i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr <dest>,
                                                                i8 <value>,
                                                                i64 <len>,
                                                                i32 <element_size>)
27531 The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
27532 '``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
27533 with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
27535 that are a positive integer multiple of the ``element_size`` in size.
27540 The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
27541 intrinsic, with the added constraint that ``len`` is required to be a positive integer
27542 multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
27543 ``element_size``, then the behaviour of the intrinsic is undefined.
27545 ``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.
The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.
27555 The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
27556 memory starting at the destination location to the given ``value``. The memory is
27557 set with a sequence of store operations where each access is guaranteed to be a
27558 multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
27560 The order of the assignment is unspecified. Only one write is issued to the
27561 destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.
27565 This intrinsic does not provide any additional ordering guarantees over those
27566 provided by a set of unordered stores to the destination.
In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memset_element_unordered_atomic_*``, where '*' is replaced with the actual
element size.
27575 The optimizer is allowed to inline the memory assignment when it's profitable to do so.
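
The following is a minimal sketch of a call that satisfies the constraints
above. The function name ``@zero_words`` is hypothetical. The example assumes
that ``%dest`` is 8-byte aligned, that it points to at least 64 writable bytes,
and that 8 bytes does not exceed the target-specific atomic access size limit.

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr, i8, i32, i32 immarg)

      ; Hypothetical helper: zero eight 8-byte elements starting at %dest.
      define void @zero_words(ptr %dest) {
        ; len = 64 bytes is a positive multiple of element_size = 8, and the
        ; align attribute on %dest is no less than element_size.
        call void @llvm.memset.element.unordered.atomic.p0.i32(
            ptr align 8 %dest, i8 0, i32 64, i32 8)
        ret void
      }
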
27577 Objective-C ARC Runtime Intrinsics
27578 ----------------------------------
27580 LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
27581 LLVM is aware of the semantics of these functions, and optimizes based on that
27582 knowledge. You can read more about the details of Objective-C ARC `here
27583 <https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
27585 '``llvm.objc.autorelease``' Intrinsic
27586 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27592 declare ptr @llvm.objc.autorelease(ptr)
27597 Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
27599 '``llvm.objc.autoreleasePoolPop``' Intrinsic
27600 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27606 declare void @llvm.objc.autoreleasePoolPop(ptr)
27611 Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
27613 '``llvm.objc.autoreleasePoolPush``' Intrinsic
27614 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27620 declare ptr @llvm.objc.autoreleasePoolPush()
27625 Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
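
As an illustrative sketch, the push and pop intrinsics are typically paired
around a region of code, with the token returned by the push passed to the pop.
The names ``@with_pool`` and ``@do_work`` are hypothetical.

.. code-block:: llvm

      declare ptr @llvm.objc.autoreleasePoolPush()
      declare void @llvm.objc.autoreleasePoolPop(ptr)
      declare void @do_work()                  ; hypothetical callee

      define void @with_pool() {
        ; Open a pool and keep the opaque token it returns.
        %pool = call ptr @llvm.objc.autoreleasePoolPush()
        call void @do_work()
        ; Close the pool that was opened above.
        call void @llvm.objc.autoreleasePoolPop(ptr %pool)
        ret void
      }
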
27627 '``llvm.objc.autoreleaseReturnValue``' Intrinsic
27628 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27634 declare ptr @llvm.objc.autoreleaseReturnValue(ptr)
27639 Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
27641 '``llvm.objc.copyWeak``' Intrinsic
27642 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27648 declare void @llvm.objc.copyWeak(ptr, ptr)
27653 Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
27655 '``llvm.objc.destroyWeak``' Intrinsic
27656 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27662 declare void @llvm.objc.destroyWeak(ptr)
27667 Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
27669 '``llvm.objc.initWeak``' Intrinsic
27670 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27676 declare ptr @llvm.objc.initWeak(ptr, ptr)
27681 Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
27683 '``llvm.objc.loadWeak``' Intrinsic
27684 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27690 declare ptr @llvm.objc.loadWeak(ptr)
27695 Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
27697 '``llvm.objc.loadWeakRetained``' Intrinsic
27698 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27704 declare ptr @llvm.objc.loadWeakRetained(ptr)
27709 Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
27711 '``llvm.objc.moveWeak``' Intrinsic
27712 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27718 declare void @llvm.objc.moveWeak(ptr, ptr)
27723 Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
27725 '``llvm.objc.release``' Intrinsic
27726 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27732 declare void @llvm.objc.release(ptr)
27737 Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
27739 '``llvm.objc.retain``' Intrinsic
27740 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27746 declare ptr @llvm.objc.retain(ptr)
27751 Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
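
A minimal sketch of a balanced retain/release pair around a use of an object
follows. The names ``@retain_use_release`` and ``@use`` are hypothetical. The
example relies on the ARC runtime convention that ``objc_retain`` returns its
argument.

.. code-block:: llvm

      declare ptr @llvm.objc.retain(ptr)
      declare void @llvm.objc.release(ptr)
      declare void @use(ptr)                   ; hypothetical callee

      define void @retain_use_release(ptr %obj) {
        ; Take a +1 reference for the duration of the use.
        %retained = call ptr @llvm.objc.retain(ptr %obj)
        call void @use(ptr %retained)
        ; Balance the retain above.
        call void @llvm.objc.release(ptr %retained)
        ret void
      }
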
27753 '``llvm.objc.retainAutorelease``' Intrinsic
27754 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27760 declare ptr @llvm.objc.retainAutorelease(ptr)
27765 Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
27767 '``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
27768 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27774 declare ptr @llvm.objc.retainAutoreleaseReturnValue(ptr)
27779 Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
27781 '``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
27782 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27788 declare ptr @llvm.objc.retainAutoreleasedReturnValue(ptr)
27793 Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
27795 '``llvm.objc.retainBlock``' Intrinsic
27796 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27802 declare ptr @llvm.objc.retainBlock(ptr)
27807 Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
27809 '``llvm.objc.storeStrong``' Intrinsic
27810 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27816 declare void @llvm.objc.storeStrong(ptr, ptr)
27821 Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
27823 '``llvm.objc.storeWeak``' Intrinsic
27824 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27830 declare ptr @llvm.objc.storeWeak(ptr, ptr)
27835 Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
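
The weak-reference intrinsics above can be combined as in the following
sketch, which initializes a weak slot, loads a retained strong reference from
it, and then tears the slot down. The function name ``@weak_roundtrip`` is
hypothetical, and the sequence is an illustration rather than the exact code a
frontend emits.

.. code-block:: llvm

      declare ptr @llvm.objc.initWeak(ptr, ptr)
      declare ptr @llvm.objc.loadWeakRetained(ptr)
      declare void @llvm.objc.release(ptr)
      declare void @llvm.objc.destroyWeak(ptr)

      define void @weak_roundtrip(ptr %obj) {
        ; Storage for the weak reference.
        %slot = alloca ptr
        %init = call ptr @llvm.objc.initWeak(ptr %slot, ptr %obj)
        ; Obtain a +1 strong reference (or null) from the weak slot.
        %strong = call ptr @llvm.objc.loadWeakRetained(ptr %slot)
        call void @llvm.objc.release(ptr %strong)
        ; Unregister the weak slot before it goes out of scope.
        call void @llvm.objc.destroyWeak(ptr %slot)
        ret void
      }
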
27837 Preserving Debug Information Intrinsics
27838 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
These intrinsics are used to carry certain debuginfo together with
IR-level operations. For example, it may be desirable to
know the structure/union name and the original user-level field
indices. Such information is lost in the IR GetElementPtr instruction,
since IR types differ from debuginfo types and unions
are converted to structs in IR.
27847 '``llvm.preserve.array.access.index``' Intrinsic
27848 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      declare <ret_type>
      @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base, i32 dim, i32 index)
27862 The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
27863 based on array base ``base``, array dimension ``dim`` and the last access index ``index``
27864 into the array. The return type ``ret_type`` is a pointer type to the array element.
Preserving the array ``dim`` and ``index`` is more robust than using a
getelementptr instruction, which may be subject to compiler transformations.
27867 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
27868 to provide array or pointer debuginfo type.
27869 The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
27870 debuginfo version of ``type``.
27875 The ``base`` is the array base address. The ``dim`` is the array dimension.
27876 The ``base`` is a pointer if ``dim`` equals 0.
27877 The ``index`` is the last access index into the array or pointer.
27879 The ``base`` argument must be annotated with an :ref:`elementtype
27880 <attr_elementtype>` attribute at the call-site. This attribute specifies the
27881 getelementptr element type.
The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``,
that is, ``dim`` zero indices followed by ``index``.
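
The equivalence above can be sketched as follows. The function name
``@third_elem`` is hypothetical. The ``.p0.p0`` suffix is an assumption: it is
the mangling expected when both the result and ``base`` are opaque pointers in
address space 0. The ``!llvm.preserve.access.index`` metadata attachment that a
frontend would add is omitted for brevity.

.. code-block:: llvm

      declare ptr @llvm.preserve.array.access.index.p0.p0(ptr, i32 immarg, i32 immarg)

      ; Hypothetical helper: address of element 2 of a [4 x i32] array.
      define ptr @third_elem(ptr %base) {
        ; dim = 1, index = 2. The elementtype attribute supplies the
        ; getelementptr element type.
        %p = call ptr @llvm.preserve.array.access.index.p0.p0(
                 ptr elementtype([4 x i32]) %base, i32 1, i32 2)
        ; Same address as:
        ;   getelementptr [4 x i32], ptr %base, i32 0, i32 2
        ret ptr %p
      }

With ``dim`` equal to 0 and ``elementtype(i32)``, the same call shape would
correspond to ``getelementptr i32, ptr %base, i32 2``.
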
27889 '``llvm.preserve.union.access.index``' Intrinsic
27890 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      declare <type>
      @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base, i32 di_index)
27903 The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
27904 ``di_index`` and returns the ``base`` address.
27905 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
27906 to provide union debuginfo type.
27907 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
27908 The return type ``type`` is the same as the ``base`` type.
27913 The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
27918 The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
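
A minimal sketch of a call follows. The function name ``@union_member`` is
hypothetical, the ``.p0.p0`` suffix assumes opaque pointers in address space 0,
and the ``!llvm.preserve.access.index`` metadata attachment is omitted for
brevity.

.. code-block:: llvm

      declare ptr @llvm.preserve.union.access.index.p0.p0(ptr, i32 immarg)

      ; Hypothetical helper: record an access to debuginfo field 1 of a union.
      define ptr @union_member(ptr %u) {
        ; The returned pointer is the same address as %u. Only the debuginfo
        ; field index (1) is carried along with the call.
        %p = call ptr @llvm.preserve.union.access.index.p0.p0(ptr %u, i32 1)
        ret ptr %p
      }
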
27920 '``llvm.preserve.struct.access.index``' Intrinsic
27921 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      declare <ret_type>
      @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base, i32 gep_index, i32 di_index)
27935 The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
27936 based on struct base ``base`` and IR struct member index ``gep_index``.
27937 The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
27938 to provide struct debuginfo type.
27939 The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
27940 The return type ``ret_type`` is a pointer type to the structure member.
27945 The ``base`` is the structure base address. The ``gep_index`` is the struct member index
27946 based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
27948 The ``base`` argument must be annotated with an :ref:`elementtype
27949 <attr_elementtype>` attribute at the call-site. This attribute specifies the
27950 getelementptr element type.
27955 The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
27956 as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
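
The equivalence above can be sketched as follows. The type ``%struct.S`` and
the function name ``@field_b`` are hypothetical. The ``.p0.p0`` suffix assumes
opaque pointers in address space 0, and the ``!llvm.preserve.access.index``
metadata attachment is omitted for brevity.

.. code-block:: llvm

      %struct.S = type { i32, i64 }

      declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32 immarg, i32 immarg)

      ; Hypothetical helper: address of the second member of %struct.S.
      define ptr @field_b(ptr %s) {
        ; gep_index = 1 (IR member index), di_index = 1 (debuginfo member index).
        %p = call ptr @llvm.preserve.struct.access.index.p0.p0(
                 ptr elementtype(%struct.S) %s, i32 1, i32 1)
        ; Same address as:
        ;   getelementptr %struct.S, ptr %s, i32 0, i32 1
        ret ptr %p
      }

The two indices coincide in this example. They can differ when the IR struct
layout does not match the source-level member order.
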
27958 '``llvm.fptrunc.round``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      declare <ty2>
      @llvm.fptrunc.round(<type> <value>, metadata <rounding mode>)
27972 The '``llvm.fptrunc.round``' intrinsic truncates
27973 :ref:`floating-point <t_floating>` ``value`` to type ``ty2``
27974 with a specified rounding mode.
The '``llvm.fptrunc.round``' intrinsic takes a :ref:`floating-point
<t_floating>` value to cast and a :ref:`floating-point <t_floating>` type
to cast it to. The type of the value must be larger in size than the result type.
27983 The second argument specifies the rounding mode as described in the constrained
27984 intrinsics section.
27985 For this intrinsic, the "round.dynamic" mode is not supported.
27990 The '``llvm.fptrunc.round``' intrinsic casts a ``value`` from a larger
27991 :ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
27992 <t_floating>` type.
27993 This intrinsic is assumed to execute in the default :ref:`floating-point
27994 environment <floatenv>` *except* for the rounding mode.
27995 This intrinsic is not supported on all targets. Some targets may not support
27996 all rounding modes.
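
A minimal sketch of a call that truncates ``float`` to ``half`` while rounding
upward follows. The function name ``@trunc_upward`` is hypothetical, and the
``.f16.f32`` suffix is an assumption based on the usual mangling of overloaded
intrinsics (result type first, then operand type). Whether this rounding mode
is supported depends on the target, as noted above.

.. code-block:: llvm

      declare half @llvm.fptrunc.round.f16.f32(float, metadata)

      ; Hypothetical helper: float -> half, rounding toward positive infinity.
      define half @trunc_upward(float %x) {
        %r = call half @llvm.fptrunc.round.f16.f32(float %x,
                                                   metadata !"round.upward")
        ret half %r
      }
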